This protocol introduces the tools available for modeling small-molecule ligands in cryoEM maps of macromolecules.
Deciphering the protein-ligand interactions in a macromolecular complex is crucial for understanding the molecular mechanism, underlying biological processes, and drug development. In recent years, cryogenic sample electron microscopy (cryoEM) has emerged as a powerful technique to determine the structures of macromolecules and to investigate the mode of ligand binding at near-atomic resolution. Identifying and modeling non-protein molecules in cryoEM maps is often challenging due to anisotropic resolution across the molecule of interest and inherent noise in the data. In this article, the readers are introduced to various software and methods currently used for ligand identification, model building, and refinement of atomic coordinates using selected macromolecules. One of the simplest ways to identify the presence of a ligand, as illustrated with the enolase enzyme, is to subtract the two maps obtained with and without the ligand. The extra density of the ligand is likely to stand out in the difference map even at a higher threshold. There are instances, as shown in the case of metabotropic Glutamate receptor mGlu5, when such simple difference maps cannot be generated. The recently introduced method of deriving the Fo-Fc omit map can serve as a tool for validating and demonstrating the presence of the ligand. Finally, using the well-studied β-galactosidase as an example, the effect of resolution on modeling the ligands and solvent molecules in cryoEM maps is analyzed, and an outlook on how cryoEM can be used in drug discovery is presented.
Cells accomplish their functions by carrying out innumerable chemical reactions simultaneously and independently, each meticulously regulated to ensure their survival and adaptability in response to environmental cues. This is achieved by molecular recognition, which enables biomolecules, especially proteins, to form transient or stable complexes with other macromolecules as well as small molecules or ligands1. Thus, protein-ligand interactions are fundamental to all processes in biology, which include the regulation of protein expression and activity, the recognition of substrates and cofactors by enzymes, as well as how cells perceive and relay signals1,2. A better understanding of the kinetic, thermodynamic, and structural properties of the protein-ligand complex reveals the molecular basis of ligand interaction and also facilitates rational drug design by optimizing drug interaction and specificity. An economical and faster approach to studying protein-ligand interaction is to use molecular docking, which is a computational method that virtually screens a diverse range of small molecules and predicts the binding mode and affinity of these ligands to target proteins3. However, experimental evidence from high-resolution structures determined by X-ray Diffraction (XRD), Nuclear Magnetic Resonance (NMR), or electron cryomicroscopy (cryoEM) provides the essential proof for such predictions and aids in the development of newer and more effective activators or inhibitors for a given target. This article uses the abbreviation 'cryoEM' as the technique is commonly referred to. However, there is an ongoing debate on choosing the correct nomenclature, and recently, the term cryogenic-sample Electron Microscopy (cryoEM) has been proposed to indicate that the sample is at cryogenic temperature and imaged with electrons4. 
Similarly, the maps derived from cryoEM have been called electron potential, electrostatic potential, or Coulomb potential, and for simplicity, here we use cryoEM maps5,6,7,8,9,10.
Although XRD has been the gold standard technique in high-resolution structure determination of protein-ligand complexes, post resolution-revolution11 cryoEM has gained momentum, as indicated by the surge of Coulomb potential maps or cryoEM maps deposited in the Electron Microscopy Data Bank (EMDB)12,13 over the last few years14. Owing to advancements in sample preparation, imaging, and data processing methods, the proportion of Protein Data Bank (PDB)14 depositions employing cryoEM increased from 0.7% to 17% between 2010 and 2020, with approximately 50% of reported structures in 2020 being determined at 3.5 Å resolution or better15,16. CryoEM has been rapidly adopted by the structural biology community, including the pharmaceutical industry, as it allows the study of flexible and non-crystalline biological macromolecules, especially membrane proteins and multi-protein complexes, at near-atomic resolution, without the need to crystallize the sample and obtain the well-diffracting crystals required for high-resolution structure determination by XRD.
Accurate modeling of the ligand in the cryoEM map is paramount, as it serves as a blueprint of the protein-ligand complex at a molecular level. There are several automated ligand-building tools used in X-ray crystallography that depend on the shape and topology of the ligand density in order to fit or build the ligand into the electron density17,18,19,20. Nevertheless, if the resolution is lower than 3 Å, these approaches tend to produce less desirable outcomes because the topological features on which they depend for recognition and building become less defined. In many instances, these methods have proven ineffective in accurately modeling ligands into cryoEM maps, as these maps have been determined in the low-to-medium resolution range, typically between 3.5 Å and 5 Å17.
The first step in 3D structure determination of a protein-ligand complex by cryoEM involves either co-purifying the ligand with the protein (when the ligand has a high binding affinity to the protein) or incubating the protein solution with the ligand for a specific duration before grid preparation. Subsequently, a small sample volume is placed onto a plasma-cleaned holey TEM grid, followed by flash freezing in liquid ethane and ultimately imaging with a cryo-TEM. The 2D projection images from hundreds of thousands to millions of individual particles are averaged to reconstruct a 3-dimensional (3D) Coulomb potential map of the macromolecule. Identifying and modeling ligands and solvent molecules in these maps pose significant challenges in many cases due to the anisotropic resolution across the map (i.e., the resolution is not uniform across the macromolecule), flexibility in the region where the ligand is bound, and the noise in the data. Many of the modeling, refinement, and visualization tools that were developed for XRD are now being adapted for use in cryoEM for the same purposes18, 19, 20, 21. In this article, an overview of various methods and software currently used to identify ligands, build models, and refine the coordinates derived from cryoEM is presented. A step-by-step protocol has been provided to illustrate the processes involved in modeling ligands using specific protein-ligand complexes with varying resolution and complexity.
The first step in modeling ligands in cryoEM maps includes the identification of the ligand density (non-protein) in the map. If ligand binding does not induce any conformational change in the protein, then calculating a simple difference map between the protein-ligand complex and the apo-protein essentially highlights the regions of extra density, suggesting the presence of the ligand. Such differences can be observed immediately, as it just requires two maps, and even intermediate maps during the process of 3D refinement can be used to check if the ligand is present. Additionally, if the resolution is high enough (<3.0 Å), then the difference map can also provide insights into the location of water molecules as well as ions interacting with the ligand and the protein residues.
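The idea of a simple difference map can be illustrated with a minimal Python sketch. It assumes that both maps have already been loaded as NumPy arrays, share the same box size and pixel size, and are aligned to each other; the function names (normalize, difference_map, peaks_above) and the 5-sigma cutoff are illustrative assumptions and not part of any package used in this protocol.

```python
import numpy as np

def normalize(density):
    """Scale a map to zero mean and unit standard deviation."""
    density = np.asarray(density, dtype=float)
    return (density - density.mean()) / density.std()

def difference_map(complex_map, apo_map):
    """Subtract the apo map from the ligand-bound map after normalization."""
    complex_map = np.asarray(complex_map, dtype=float)
    apo_map = np.asarray(apo_map, dtype=float)
    if complex_map.shape != apo_map.shape:
        raise ValueError("maps must share the same box size and sampling")
    return normalize(complex_map) - normalize(apo_map)

def peaks_above(diff, n_sigma=5.0):
    """Return indices of voxels whose difference density exceeds n_sigma
    standard deviations of the difference map (hypothetical cutoff)."""
    threshold = n_sigma * diff.std()
    return np.argwhere(diff > threshold)
```

In practice, the subtraction is usually carried out in a visualization or map-processing program; the sketch only shows why extra density that is present in one map and absent in the other stands out at a high threshold.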
In the absence of the apo-protein map, it is now possible to use Servalcat22, which is available as a standalone tool and has also been integrated into the CCP-EM software suite23,24 as part of the Refmac refinement and in CCP4 8.0 release25,26. Servalcat allows the calculation of an FSC weighted difference (Fo-Fc) map using the unsharpened half-maps and the apoprotein model as input. The Fo-Fc omit map represents the disparity between the experimental map (Fo) and the map derived from the model (Fc). In the absence of a ligand in the model, a positive density in a Fo-Fc map that overlaps with the experimental EM map typically suggests the presence of the ligand. The assumption here is that the protein chain is well-fitted in the map, and the remaining positive density indicates the location of the ligand. However, it is important to meticulously examine whether the positive density stems from modeling inaccuracies, such as the wrong rotamer of a protein side-chain.
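The Fourier shell correlation (FSC) between two half-maps, which underlies the FSC weighting mentioned above, can be sketched in Python as follows. This is only an illustration of the quantity itself, assuming two half-maps held as NumPy arrays of equal shape; it is not Servalcat's implementation, and the function name and the number of shells are hypothetical choices.

```python
import numpy as np

def fourier_shell_correlation(half1, half2, n_shells=8):
    """Return the FSC in n_shells resolution shells, from low to high
    spatial frequency, for two half-maps of equal shape."""
    half1 = np.asarray(half1, dtype=float)
    half2 = np.asarray(half2, dtype=float)
    if half1.shape != half2.shape:
        raise ValueError("half-maps must have the same shape")
    f1 = np.fft.fftn(half1)
    f2 = np.fft.fftn(half2)
    # Radial frequency index (cycles per box) of every Fourier voxel.
    freqs = [np.fft.fftfreq(n) * n for n in half1.shape]
    gx, gy, gz = np.meshgrid(*freqs, indexing="ij")
    radius = np.sqrt(gx**2 + gy**2 + gz**2)
    nyquist = min(half1.shape) / 2
    edges = np.linspace(0.0, nyquist, n_shells + 1)
    fsc = np.zeros(n_shells)
    for i in range(n_shells):
        shell = (radius >= edges[i]) & (radius < edges[i + 1])
        # Correlation of the complex Fourier coefficients within the shell.
        num = np.sum(f1[shell] * np.conj(f2[shell])).real
        den = np.sqrt(np.sum(np.abs(f1[shell]) ** 2)
                      * np.sum(np.abs(f2[shell]) ** 2))
        fsc[i] = num / den if den > 0 else 0.0
    return fsc
```

Shells in which the two half-maps agree give values close to 1, whereas shells dominated by noise give values close to 0, so weighting by the FSC reduces the contribution of poorly determined resolution shells to the difference map.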
The second step involves obtaining or creating a Cartesian coordinate file of the ligand with well-defined geometry from the available chemical information. Standard ligands (for example, ATP and NADP+) that are already available in the CCP4 monomer library can be used for refinement by retrieving the coordinate and geometry files via their monomer accession code. However, for unknown or non-standard ligands, various tools are available to create the geometry files. Examples include eLBOW27 (electronic Ligand Builder and Optimization Workbench) in Phenix28, Lidia (a built-in tool in Coot29), JLigand/ACEDRG30,31, CCP-EM23,24, and Ligprep32 (a module of Glide within the Schrodinger suite). The ligand coordinate file is then fitted in the density, guided by both the experimental cryoEM map and the difference map in Coot. This is followed by real-space refinement in Phenix28 or reciprocal-space refinement in Refmac33. A Linux workstation or a laptop equipped with a good graphics card and the above-mentioned software is required. Most of these programs are included in various suites. CCP-EM24 and Phenix28 are freely available to academic users and include a variety of tools that are used in this article, including Coot, Refmac533,34,35,36, Servalcat, phenix.real_space_refine, etc. Similarly, Chimera37 and ChimeraX38 provide free licenses to academic users.
1. Modeling phosphoenolpyruvate (PEP) in enolase from Mycobacterium tuberculosis
2. Modeling of ligands in metabotropic Glutamate receptor mGlu5
3. Modeling the inhibitor, deoxygalacto-nojirimycin (DGN), and solvent molecules in a high-resolution sharpened map of β-galactosidase
4. Effect of resolution on ligand modeling in β-galactosidase
Example 1
The enzyme enolase from M. tuberculosis catalyzes the penultimate step of glycolysis and converts 2-phosphoglycerate to phosphoenolpyruvate (PEP), which is an essential intermediate for several metabolic pathways44,45. CryoEM data for the apo-enolase and PEP-bound enolase samples were collected at the same pixel size of 1.07 Å, and image processing was performed with Relion 3.146,
The improvements in microscope hardware and software have resulted in an increase in the number of cryoEM structures in recent years. Although the highest resolution achieved at the moment in single particle cryoEM is 1.2 Å57,58,59, the majority of the structures are being determined around 3-4 Å resolution. Modeling ligands in medium to low-resolution maps can be tricky and often fraught with ambiguity. Given the wide...
The authors have nothing to disclose.
SJ is a recipient of the PhD studentship from DAE-TIFR, and the funding is acknowledged. KRV acknowledges DBT B-Life grant DBT/PR12422/MED/31/287/2014 and the support of the Department of Atomic Energy, Government of India, under Project Identification No. RTI4006.
Name | Company | Catalog Number | Comments
CCP4-8.0 | Consortium of several institutes | https://www.ccp4.ac.uk | Free for academic users and includes Coot and list of tools developed for X-ray crystallography |
CCP-EM | Consortium of several institutes | https://www.ccpem.ac.uk/download.php | Free for academic users and includes Coot, Relion and many others |
Coot | Paul Emsley, LMB, Cambridge | https://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot/ | General software for model building but also available with other suites described above
DockinMap (Phenix) | Consortium of several institutes | https://phenix-online.org/documentation/reference/dock_in_map.html | Software inside the Phenix suite for docking model into cryoEM maps |
Electron Microscopy Data Bank | Consortium of several institutes | https://www.ebi.ac.uk/emdb/ | Public Repository for Electron Microscopy maps
Falcon | Thermo Fisher Scientific | https://assets.thermofisher.com/TFS-Assets/MSD/Technical-Notes/Falcon-3EC-Datasheet.pdf | Commercial, camera from Thermo Fisher
Phenix | Consortium of several institutes | https://phenix-online.org/download | Free for academic users and includes Coot
Protein Data Bank | Consortium of several institutes | https://rcsb.org | Public database of macromolecular structures |
Pymol | Schrodinger | https://pymol.org/2/ | Molecular visualization tool. The educational version is free but comes with limitations. The full version can be obtained for a small fee.
Relion | MRC-LMB, Cambridge | https://relion.readthedocs.io/en/release-4.0/Installation.html | Software for cryoEM image processing, also available with CCP-EM |
Titan Krios | Thermo Fisher Scientific | https://www.thermofisher.com/in/en/home/electron-microscopy/products/transmission-electron-microscopes/krios-g4-cryo-tem.html | Commercial, cryoTEM from Thermo Fisher
UCSF Chimera | UCSF, USA | https://www.cgl.ucsf.edu/chimera/download.html | General purpose software for display, analysis and more |
UCSF Chimera X | UCSF, USA | https://www.cgl.ucsf.edu/chimerax/ | General purpose software for display, analysis and more |
Copyright © 2025 MyJoVE Corporation. All rights reserved