Introduction
Humans are commonly described as having five basic senses: vision, hearing, touch, smell, and taste. Each of these senses involves a specific class of biological membrane receptors in the body, and together they are essential for us to sense and interact with the external world. We have gained a great deal of knowledge of the molecular basis of vision, hearing, and touch, but we are still far from understanding the molecular basis of smell. Smell uses a limited set of olfactory receptors in a combinatorial manner to recognize an exceedingly diverse range of odorants and scents.
Human olfactory receptors (Buck and Axel, 1991) belong to a diverse family of G-protein-coupled receptors (GPCRs) (Firestein, 2001; Gaillard et al., 2004; García-Nafría and Tate, 2021) primarily involved in the detection of odorant molecules. These receptors are located on the olfactory sensory neurons in the olfactory epithelium of the nasal cavity. However, recent studies have revealed their ectopic expression in various non-olfactory tissues (Massberg and Hatt, 2018), including blood, breast, lung, intestine, skin, heart, prostate, liver, testis, and hair (Chéret et al., 2018), and their overexpression in some cancer cells (Chung et al., 2022). Studying olfactory receptors thus opens new avenues for frontier research, including cancer research.
The olfactory receptors all have seven transmembrane alpha-helices (TM1–TM7) (García-Nafría and Tate, 2021). These helices form a cylindrical structure that traverses the cell membrane’s lipid bilayer. The arrangement of these helices creates a binding pocket within the membrane where odorant molecules can interact with the receptor. The extracellular loops connecting these helices are also responsible for recognizing and binding odorant molecules. The intracellular loops and the C-terminal tail interact with G-proteins. The binding of an odorant molecule to the receptor activates the G-protein, which then triggers a signaling cascade involving adenylate cyclase, which converts ATP to cyclic AMP (cAMP). The increase in cAMP levels leads to the opening of ion channels, resulting in neuronal depolarization and the transmission of the olfactory signal to the brain (Firestein, 2001; Gaillard et al., 2004; Glezer and Malnic, 2019; Kuroda et al., 2023).
The transmembrane domains of the olfactory receptors, like those of other integral membrane proteins, are composed mostly of hydrophobic amino acids that interact with the fatty acid chains of the membrane lipids and exclude water. For this reason, membrane proteins are notoriously difficult to study: they require detergents to solubilize and stabilize them once they are removed from their membranes. This severely limits their application in medicinal and technological development (Vinothkumar and Henderson, 2010).
Olfactory receptors have been among the most difficult membrane proteins for structural studies because they are likely unstable without their odorants, which often bind with low affinities. Among ~400 human olfactory receptors (Niimura et al., 2014), only three experimentally determined structures are currently available: one olfactory receptor (Billesbølle et al., 2023), one engineered receptor, OR52cs, built from consensus protein sequences (Choi et al., 2023), and an amine odorant receptor, TAAR9 (Guo et al., 2023). Thus, alternative methods should be encouraged to study olfactory receptors.
We here selected six olfactory receptors to carry out structural bioinformatic studies.
OR51E1, also earlier known as PSGR (Prostate Specific G-Protein Coupled Receptor), is a member of the olfactory receptor family predominantly expressed in the prostate (Bax et al., 2018). It has been identified as a potential biomarker for prostate cancer because of its overexpression in malignant prostate tissues compared with benign ones (Weng et al., 2005, 2006). The activation of OR51E1 has been shown to suppress growth in human prostate cancer cells, therefore making it a possible alternative therapeutic target for prostate cancer (Massberg et al., 2016). Additionally, OR51E1 has been found to act as a tumor biomarker for lung carcinoids (LC) in somatostatin receptor-negative tumor patients (Giandomenico et al., 2013).
OR51E2, also known as PSGR2, is also a member of the olfactory receptor family predominantly expressed in the prostate. It is similarly overexpressed in malignant prostate cancer tissues compared with benign tissues (Weng et al., 2006). Activation of OR51E2 by specific ligands has been shown to evoke an intracellular calcium response and inhibit prostate cancer cell proliferation, suggesting its potential as a candidate for prostate cancer treatment (Neuhaus et al., 2009; Mermer et al., 2021).
OR1A1 is a member of the olfactory receptor family (Schmiedeberg et al., 2007). OR1A1 has been detected at significant levels on the surface of HepG2 liver cells. The activation of OR1A1 by the ligand (−)-carvone increases cyclic adenosine monophosphate (cAMP) levels without changing the intracellular Ca2+ concentration, thus inducing the protein kinase A (PKA) – cAMP response element-binding protein (CREB) – hairy and enhancer of split (HES)-1 signaling axis. In those cells where OR1A1 was activated by (−)-carvone, intracellular triglyceride levels were reduced. These results suggest that OR1A1 may modulate hepatic triglyceride metabolism (Wu et al., 2015).
OR1A2 is another olfactory receptor that is ectopically expressed in several tissues including the liver, blood, heart, and pancreas (Massberg and Hatt, 2018). OR1A2 is being explored for its therapeutic potential in reducing hepatocellular carcinoma progression. It has been found to be expressed in Huh7 cells, a hepatocellular carcinoma (HCC) cell line. Activation of OR1A2 by the monoterpene (S)-(−)-citronellal in these cells not only induces calcium signaling but also reduces cell proliferation (Massberg et al., 2015).
TAAR9 is an olfactory trace amine receptor belonging to the family of trace amine-associated receptors (TAARs). TAARs are expressed in olfactory epithelium neurons and detect diverse ethological signals including predators, decayed food, pheromones, and others (Gainetdinov et al., 2018). TAAR9 has been detected in breast cancer tissues. Recent studies show that there are correlations between the expression levels of TAAR9 and genes involved with neuroactive ligand signaling in abnormal tissue growth. This co-expression between genes in primary tumors and metastatic lesions suggests that TAAR9 may play a role in modulating breast cancer progression (Vaganova et al., 2023). Furthermore, in melanoma, deregulation of TAAR9 and other TAARs has been observed, indicating they may have significance in tumor progression (Vaganova et al., 2022).
OR52cs is a designed olfactory receptor combining the protein consensus sequences that represent 26 members of the human OR52 family (Ikegami et al., 2020). Its CryoEM structure has been elucidated (Choi et al., 2023). It is one of only three olfactory receptor structures determined so far among ~400 human olfactory receptors (Niimura et al., 2014).
In this structural bioinformatics study, we used the newly released AlphaFold3 (Abramson et al., 2024). AlphaFold3 is a deep-learning artificial intelligence model that uses a diffusion network to predict protein structures with high accuracy. Results that may previously have taken hours with AlphaFold2 were ready within minutes. AlphaFold3 is also capable of predicting complex interactions, although these predictions are not yet perfect.
Google DeepMind released AlphaFold2 in July 2021 (Jumper et al., 2021; Jumper and Hassabis, 2022; Varadi et al., 2022) and AlphaFold3 in May 2024 (Abramson et al., 2024). Both AlphaFold2 and AlphaFold3 use deep learning to predict protein structures from their amino-acid sequences. They have revolutionized 3D protein structure prediction, and AlphaFold3 is also capable of predicting protein–molecule complexes. DeepMind, in partnership with EMBL-EBI, released the AlphaFold Protein Structure Database, which contains over 214 million predicted protein structures (Varadi et al., 2024).
In comparison to the ~224,000 experimentally determined structures available through RCSB-PDB, AlphaFold3 predictions have acknowledged limitations, and need to be validated through experimental analysis. Physical structural studies of the water-soluble QTY variants of membrane proteins are still needed to validate the AlphaFold3-predicted structures.
We previously applied the QTY (Glutamine, Threonine, Tyrosine) code to design several detergent-free chemokine and cytokine receptors, all of which retained structural thermal stability and native ligand-binding and enzymatic activities despite substantial changes to the transmembrane domain (Zhang et al., 2018; Hao et al., 2020; Tegler et al., 2020; Zhang and Egli, 2022; Li et al., 2024). These water-soluble variants were then used to elucidate the mechanism of native receptor-ligand interaction and their binding abilities despite significant truncation in several chemokine receptors (Qing et al., 2019; Qing et al., 2021; Qing et al., 2022). Using the online version of AlphaFold2, we predicted the QTY variant structures of 7 chemokine receptors and 1 olfactory receptor (Skuhersky et al., 2021), 14 glucose transporters (Smorodina et al., 2022a), 13 solute carrier transporters (Smorodina et al., 2022b), 6 human ABC transporters (Pan et al., 2024), 8 human glutamate transporters (Karagöl et al., 2024a), and 7 human monoamine transporters (Karagöl et al., 2024b).
We also recently showed that the QTY code works very well for bacterial outer membrane beta-barrels (Sajeev-Sheeja et al., 2023) and for IgG monoclonal antibodies that are rich in beta-sheet structure (Li et al., 2023).
Recently, we have asked if the QTY code is applicable to other olfactory receptors. The olfactory receptors are all integral membrane proteins with seven transmembrane alpha-helices embedded in a lipid bilayer. Therefore, because of the hydrophobic properties of transmembrane domains, they are not water-soluble without the aid of detergents. We wanted to see if the QTY code could be utilized to design water-soluble variants of these olfactory receptor proteins.
Here we report the structural bioinformatic studies of three olfactory receptors with experimentally determined structures (OR51E2, OR52cs, and TAAR9) and three without (OR1A1, OR1A2, and OR51E1), together with their AlphaFold3-predicted water-soluble QTY variants. We provide superpositions of the hydrophobic native receptors and their hydrophilic QTY variants. We also compare the hydrophobic surfaces of the native structures with those of their hydrophilic QTY variants. Furthermore, we provide AlphaFold3-predicted and molecular-modeled odorant binding studies of OR1A2 with its odorant octanoate and TAAR9 with its odorant spermidine.
Results and discussion
The QTY code
The QTY code is a straightforward tool devised to create water-soluble versions of membrane proteins, which are traditionally challenging to study because of their hydrophobic nature, thereby facilitating more effective research and drug development. When applying the code, the four hydrophobic amino acids leucine (L), isoleucine (I), valine (V), and phenylalanine (F) in the transmembrane domain are replaced with the three polar, uncharged amino acids glutamine (Q), threonine (T), and tyrosine (Y). Specifically, glutamine replaces leucine, threonine replaces both isoleucine and valine, and tyrosine replaces phenylalanine. This works because of the strong similarities of the electron density maps between Q versus L, T versus I and V, and Y versus F (Zhang et al., 2018; Zhang and Egli, 2022). Consequently, applying the QTY code replaces the hydrophobic amino acids in the transmembrane alpha-helical domains with hydrophilic ones, transforming the transmembrane domain from hydrophobic to water-soluble.
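The substitution rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' design pipeline: the function name `apply_qty`, the 0-based segment convention, and the toy sequence are assumptions made for the example, and real transmembrane boundaries would have to come from an annotation or prediction step that is not shown here.

```python
# QTY substitution table taken from the text: L -> Q, I/V -> T, F -> Y.
QTY_MAP = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def apply_qty(sequence, tm_segments):
    """Return `sequence` with the QTY code applied inside the TM segments.

    `tm_segments` holds (start, end) pairs, 0-based with an exclusive end
    (an assumed convention). Residues outside those ranges are unchanged.
    """
    residues = list(sequence)
    for start, end in tm_segments:
        for i in range(start, end):
            residues[i] = QTY_MAP.get(residues[i], residues[i])
    return "".join(residues)


# Hypothetical toy sequence with one annotated segment (positions 4-13).
toy = "MKDSLIVFALGLWRK"
variant = apply_qty(toy, [(4, 14)])
print(toy)
print(variant)
```

Only L, I, V, and F inside the annotated segments change; loops and termini are left as they are, which is why the overall substitution percentage is lower than the transmembrane one.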
Olfactory receptor protein sequence alignments and other characteristics
The protein sequences of the native olfactory receptors and those of their QTY variants were aligned (Table 1, Figure 1), revealing 20.98%–25.88% alterations in the overall amino acid composition and a substantial 43.03%–50.31% replacement in the transmembrane domains. Despite these changes, the isoelectric points (pI) remain similar, differing by 0 to 0.15 units. These pI changes are inconsequential with respect to surface charges and unlikely to interfere with structures. At neutral pH, amino acids Q, T, and Y do not bear any charges; hence, they do not notably change a protein’s pI after the QTY code has been applied. Instead of hydrocarbon side chains, Q, T, and Y have polar, water-soluble side chains. The side-chain amide of glutamine (Q) can form four hydrogen bonds with water molecules (two as donors from –NH2 and two as acceptors from the oxygen of –C=O), and the side-chain –OH of threonine (T) and tyrosine (Y) can form three hydrogen bonds (two as acceptors from O and one as a donor from H). The molecular weights (MWs) of the QTY variants were slightly higher than those of the native proteins because the nitrogen (14 Da) and oxygen (16 Da) atoms in the water-soluble side chains are heavier than carbon (12 Da).
Abbreviations: isoelectric point (pI), molecular weight (Mw), transmembrane (TM), not applicable (−), and root mean square deviation (RMSD). The five olfactory receptors and TAAR9 are listed in the same order as in Figure 1. RMSDs were calculated between the native cryo-EM structures and the corresponding residues in the predicted QTY structures. The QTY amino acid substitutions in the transmembrane (TM) domains are substantial, between 43.03% and 50.31%, whereas the overall sequence changes are between 20.98% and 25.88%.
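As a rough illustration of how the substitution percentages and the molecular-weight shift relate to the sequences, the sketch below compares a native sequence with its QTY variant position by position. The function name `compare_sequences`, the segment convention, and the toy sequences are hypothetical; the residue masses are standard average values, and the sketch only handles the four QTY substitution types. Note that L to Q, V to T, and F to Y each add mass, whereas I to T removes mass, so the net change depends on composition.

```python
# Average residue masses in Da for the seven residues involved in QTY swaps.
RESIDUE_MASS = {"L": 113.1594, "I": 113.1594, "V": 99.1326, "F": 147.1766,
                "Q": 128.1307, "T": 101.1051, "Y": 163.1760}


def compare_sequences(native, variant, tm_segments):
    """Percent of changed residues (overall and in TM) and net mass change.

    `tm_segments` uses 0-based (start, end) pairs with an exclusive end,
    which is an assumed convention for this sketch.
    """
    if len(native) != len(variant):
        raise ValueError("sequences must have equal length")
    changed = [i for i, (a, b) in enumerate(zip(native, variant)) if a != b]
    tm_positions = {i for start, end in tm_segments for i in range(start, end)}
    tm_changed = [i for i in changed if i in tm_positions]
    # Raises KeyError for substitutions outside the QTY table, by design.
    delta_mass = sum(RESIDUE_MASS[variant[i]] - RESIDUE_MASS[native[i]]
                     for i in changed)
    return {
        "overall_percent": 100.0 * len(changed) / len(native),
        "tm_percent": 100.0 * len(tm_changed) / len(tm_positions),
        "delta_mass_da": delta_mass,
    }


# Hypothetical toy pair: one 10-residue segment inside a 15-residue sequence.
stats = compare_sequences("MKDSLIVFALGLWRK", "MKDSQTTYAQGQWRK", [(4, 14)])
print(stats)
```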
Side-by-side display of the CryoEM structures, AlphaFold3-predicted native olfactory receptors, and their water-soluble QTY variants
We display three types of structures for the olfactory receptors: the CryoEM structures, the AlphaFold3-predicted native structures, and the water-soluble QTY variants. They are shown side by side in the same orientation to highlight their similarity, although the native structures are hydrophobic and the QTY variant structures have become hydrophilic (cyan color) (Figure 2).
Superpositions of CryoEM structures, AlphaFold3-predicted native olfactory receptors, and their water-soluble QTY variants
We superposed the AlphaFold3-predicted native structures of the native olfactory receptors with their respective QTY variants as well as their experimentally determined CryoEM structures (only available for OR51E2, OR52cs, and TAAR9). Therefore, the superposition of the following proteins was carried out: OR51E2CryoEM versus OR51E2AF3 versus OR51E2QTY, OR52csCryoEM versus OR52csAF3 versus OR52csQTY, TAAR9CryoEM versus TAAR9AF3 versus TAAR9QTY, OR51E1AF3 versus OR51E1QTY, OR1A1AF3 versus OR1A1QTY, and OR1A2AF3 versus OR1A2QTY (Figure 3).
For the three olfactory receptors with experimentally determined CryoEM structures (OR51E2, OR52cs, and TAAR9), the native CryoEM structures and their AlphaFold3-predicted QTY variants superpose very well (Figure 3a-c). The root mean square deviation (RMSD) between the native CryoEM structures and the AlphaFold3-predicted QTY variants was between 0.987 Å and 1.275 Å. Therefore, all pairs had an RMSD of <1.30 Å: OR51E2CryoEM versus OR51E2QTY (1.275 Å), OR52csCryoEM versus OR52csQTY (1.038 Å) and TAAR9CryoEM versus TAAR9QTY (0.987 Å). These affirm the significant degree of similarity between the native olfactory receptors and their water-soluble QTY variants, as well as support AlphaFold3’s competence and power.
As shown in Figure 3d-f, these structures superposed remarkably well, sharing very similar folds despite a 43.03%–50.31% replacement of amino acids in the transmembrane domain of the QTY variants. Negligible discrepancies and variations are observed. The RMSD values between the AlphaFold3-predicted native structures of the olfactory receptors and their QTY variants are OR51E2AF3 versus OR51E2QTY (0.461 Å), OR52csAF3 versus OR52csQTY (0.441 Å), TAAR9AF3 versus TAAR9QTY (0.822 Å), OR51E1AF3 versus OR51E1QTY (0.713 Å), OR1A1AF3 versus OR1A1QTY (0.713 Å), and OR1A2AF3 versus OR1A2QTY (0.732 Å). The range of the RMSD values was between 0.441 and 0.822 Å; thus, all pairs had an RMSD of <1.00 Å. These results not only show the exceptional structural correspondence between native olfactory receptors and their water-soluble QTY variants but also highlight the accuracy of AlphaFold3’s predictions.
To further validate the accuracy of AlphaFold3, we also asked how well the CryoEM-determined native olfactory receptor structures would superpose with the AlphaFold3-predicted native structures. The CryoEM and AlphaFold3-predicted native structures superposed very well: OR51E2CryoEM versus OR51E2AF3 (1.079 Å), OR52csCryoEM versus OR52csAF3 (0.832 Å), and TAAR9CryoEM versus TAAR9AF3 (1.265 Å). All pairs had an RMSD of <1.30 Å, affirming the advanced performance of AlphaFold3.
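For readers unfamiliar with how such RMSD values arise, the following sketch computes the RMSD between two matched coordinate sets after optimal rigid-body superposition using the Kabsch algorithm. The text does not state which software or atom selection was used for the reported values, so this is only an illustrative stand-in: the function name `kabsch_rmsd` and the synthetic coordinates are assumptions, and NumPy is required.

```python
import math

import numpy as np


def kabsch_rmsd(mobile, reference):
    """RMSD (same units as the input) after optimal rigid-body superposition.

    Both inputs are (N, 3) arrays of matched atoms, for example C-alpha
    coordinates of residues present in both structures.
    """
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(reference, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("expected two matching (N, 3) coordinate arrays")

    # Remove translation by centering both coordinate sets on their centroids.
    P_centered = P - P.mean(axis=0)
    Q_centered = Q - Q.mean(axis=0)

    # Kabsch algorithm: the SVD of the 3x3 covariance matrix gives the rotation.
    H = P_centered.T @ Q_centered
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    diff = P_centered @ R.T - Q_centered
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


# Synthetic check: a rotated and shifted copy should superpose with RMSD near 0.
rng = np.random.default_rng(0)
coords = rng.normal(size=(50, 3))
theta = math.radians(30.0)
rot_z = np.array([[math.cos(theta), -math.sin(theta), 0.0],
                  [math.sin(theta), math.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
moved = coords @ rot_z.T + np.array([5.0, -2.0, 1.0])
print(f"RMSD after superposition: {kabsch_rmsd(moved, coords):.6f}")
```

In practice the result depends on which atoms are matched (for example, C-alpha only versus all backbone atoms) and on how unresolved residues in the experimental structure are handled.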
Analysis of the hydrophobic surface of native olfactory receptors and their water-soluble QTY variants
The native olfactory receptors are highly hydrophobic, particularly in the transmembrane alpha-helical domains. This is because the transmembrane alpha helices are embedded directly in the lipid bilayer, and the hydrophobic side chains of amino acids leucine (L), isoleucine (I), valine (V), and phenylalanine (F) interact with the lipid bilayer, thereby excluding water molecules. Consequently, the transmembrane domains demonstrate highly hydrophobic patches (Figure 3). Once these native proteins have been extracted from the lipid bilayer membranes, they require surfactants to solubilize and stabilize. Otherwise, they aggregate and precipitate, losing their biological functions.
After the QTY code was applied, replacing the hydrophobic amino acids L, I/V, and F with the hydrophilic amino acids Q, T, and Y, the hydrophobic patches were significantly reduced (Figure 4). Transforming the transmembrane alpha-helices from hydrophobic to hydrophilic using the QTY code did not significantly alter their transmembrane structures.
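The shift from hydrophobic to hydrophilic can also be illustrated at the sequence level with a grand average of hydropathy (GRAVY) score on the Kyte-Doolittle scale. This is a swapped-in proxy, not the surface analysis shown in the figures: the function name `gravy` and the toy segments below are assumptions made for the example. More positive scores indicate a more hydrophobic sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy of a sequence (GRAVY score)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


# Hypothetical transmembrane-like segment before and after QTY substitution.
native_segment = "LIVFALGLWR"
qty_segment = "QTTYAQGQWR"
print(f"native: {gravy(native_segment):+.2f}")
print(f"QTY:    {gravy(qty_segment):+.2f}")
```

A per-residue or sliding-window version of the same score would localize the change to the transmembrane helices rather than averaging over the whole segment.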
AlphaFold3 predictions
Over the decades, the scientific community has been attempting to predict the complex process of how proteins naturally fold instantaneously into their 3D structure, one of the most important challenges in the biological sciences. Predicting protein folding is important in advancing our understanding of disease and biological functions. Moreover, it is crucial in drug development and discovery as well as in the creation of biotechnological applications. Numerous attempts have been made to predict how proteins fold; however, this has been an extremely difficult task until the release of AlphaFold in late 2019, which is an AI and machine learning protein structure prediction software.
On May 8, 2024, DeepMind launched AlphaFold3, marking a significant milestone in allowing us to study proteins and their complexes with other molecules. AlphaFold3 goes beyond the prediction of protein structures and interactions to that of other critical biomolecules, including DNA, RNA, other proteins, peptides, and small molecular ligands. AlphaFold3 offers increased accuracy over AlphaFold2 in predicting single protein structures and predicts protein complexes with much higher precision, outperforming classical docking tools.
AlphaFold3’s predictions allow us to continue to study integral transmembrane protein structures and interactions in silico with increased speed and accuracy. We have used AlphaFold3 to predict the native structure of olfactory receptors as well as their water-soluble QTY variants. Then, we can superpose and compare the native structure of the proteins with the AlphaFold3-predicted QTY variants. Our work using AlphaFold3 has shown that the water-soluble QTY-variant structures are very similar to the native structures, indicating that the QTY code likely works for other transmembrane proteins.
AlphaFold3 offers not only increased accuracy but also unprecedented speed in predicting the structures of our native membrane proteins and their QTY variants. Past protein structure predictions could take hours, but AlphaFold3 can predict complex structures in mere minutes. This efficiency not only accelerates protein structure research and the design of new proteins but also speeds up drug discovery and screening.
Odorant binding and residue-wise analysis
The odorant octanoic acid (OCA) can bind several olfactory receptors with different affinities. The olfactory receptor OR1A2 was selected for molecular dynamics studies based on initial virtual screening results from SwissDock, where it exhibited higher docking scores for octanoic acid than the other candidate olfactory receptors, although other receptors also bind octanoic acid to a lesser extent. The structural analysis (Figure 5a) reveals that octanoic acid is nestled within a hydrophobic pocket of the membrane protein, interacting predominantly with the hydrophilic residues LYS109 and HIS159, which form hydrogen bonds, and the hydrophobic residues ILE181 and PHE206, which form hydrophobic interactions. For protein design, these key residues might need to be preserved to maintain function. The electrostatic potential surface demonstrates the compatibility of the binding pocket with the hydrophobic tail and polar head group of octanoic acid, suggesting a stable interaction within this region.
The 50 ns molecular dynamics simulations in the membrane system and subsequent Molecular Mechanics Poisson-Boltzmann Surface Area (MMPBSA) binding free energy calculations revealed insights into the binding interactions between the odorant and OR1A2. The total binding energy of octanoic acid to the protein fluctuated around an average of −13 kcal/mol, indicating a moderate to strong binding affinity (Figure 5b). The radius of gyration was also maintained throughout the simulation when compared with the OR1A2 membrane system without ligands. The moving average trendline suggests stabilization of the binding interaction over the course of the simulation, with minor fluctuations indicating transient conformational changes in the protein–ligand complex. Among the OR1A2 residues, LYS109 demonstrated the most significant contribution to binding, with a median energy of −54.435 kcal/mol (Table 2, Figure 5c). The narrow interquartile range (IQR) for LYS109, −55.775 to −52.7725 kcal/mol, indicates consistency in its binding contribution throughout the simulation. Other important contributors include HIS159 (in its protonated HSD form), ILE181, and PHE206. In substitution-based protein designs, such as the QTY code, maintaining consistent contributions from key residues might be necessary to ensure that the protein remains functional after substitutions.
a Residue contributions for octanoic acid binding (L:OCA) across 100 simulation frames (50 ns). R: receptor, OR1A2. L: ligand, octanoic acid.
b The lower quartile corresponds to the 25th percentile.
c The upper quartile corresponds to the 75th percentile.
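The median, quartiles, and moving-average trendline used to summarize per-frame energies can be sketched with the Python standard library. This is an illustration under stated assumptions: the function names `summarize` and `moving_average` and the example energies are hypothetical, and the quartile convention here (inclusive linear interpolation) may differ from the one used to produce Table 2.

```python
import statistics


def summarize(energies):
    """Median and inclusive quartiles of per-frame energies (kcal/mol)."""
    q1, _, q3 = statistics.quantiles(energies, n=4, method="inclusive")
    return {"median": statistics.median(energies),
            "q1": q1, "q3": q3, "iqr": q3 - q1}


def moving_average(values, window):
    """Sliding-window mean; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Hypothetical per-frame contributions for a single residue, not real data.
frames = [-55.8, -54.4, -52.8, -54.0, -55.1, -53.6, -54.9]
print(summarize(frames))
print(moving_average(frames, window=3))
```

A narrow interquartile range, as reported for LYS109, means the residue's contribution varies little from frame to frame, while a range that spans zero points to a contact that forms and breaks during the simulation.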
Figure 5d shows the hydrogen bonds from HIS159 and LYS109 to octanoic acid. The histidine side chain acts as a hydrogen-bond donor (2.193 Å) to one of the two oxygens of octanoic acid. The lysine side chain donates two hydrogen bonds (1.783 Å and 1.789 Å) to both oxygens of octanoic acid. Figure 5e shows the hydrophobic interactions of ILE181 (3.330 and 3.661 Å) and PHE206 (3.611 Å) with the carbon chain of octanoic acid.
Interestingly, some residues showed wider IQRs, suggesting a degree of flexibility in the OR1A2 binding pocket. For instance, residues PHE206, VAL203, and MET101 show particularly wide IQRs, spanning both negative and positive energy contributions. This flexibility could be indicative of the receptor’s potential to accommodate other odorants. This is particularly relevant for designing variants that retain the ability to bind multiple ligands. The binding interactions of octanoate, the deprotonated form of octanoic acid, with the OR1A2, were analyzed to understand the similarities and differences in their binding patterns. Despite the difference in their protonation states, the overall binding orientation and key residue interactions remain consistent, suggesting that the binding site is capable of stabilizing both forms of the ligand. Octanoate displayed a similar energy profile to octanoic acid, indicating that the deprotonation of the carboxyl group does not significantly alter the binding strength. Furthermore, the moving average trendlines for both ligands suggest that their binding interactions are stable over time (Figure 5b).
An amine odorant spermidine-TAAR9 binding structural bioinformatic study
The odorant binding pattern extends beyond the common olfactory receptor family, as evidenced by the binding of the amine odorant spermidine to the trace amine receptor TAAR9, which uses a topologically similar pocket (Figure 6a). Molecular Mechanics Poisson–Boltzmann Surface Area (MMPBSA) binding free energy calculations revealed that both systems were energetically favorable (Figure 6b). This cross-receptor spatial conservation of binding sites suggests a potentially convergent evolutionary mechanism in odorant-sensing G protein-coupled receptors. Interestingly, comparative assessment of the hydrogen bonding networks demonstrated notable differences between the two ligands, despite their spatial overlap within the binding pocket. The predominant positive electrostatic potential (red coloration) in the spermidine pocket of TAAR9 suggests a binding site optimized for interaction with negatively charged or electronegative ligands, consistent with its role in recognizing biogenic amines (Figure 6a).
Conversely, the negative electrostatic potential (blue coloration) characterizing the octanoic acid binding pocket in OR1A2 indicates a binding potential for electropositive or partially positive charged regions of odorant molecules (Figure 5a). This electrostatic complementarity between receptor-binding pockets and their respective ligands indicates the molecular basis for selective olfactory recognition, where the spatial distribution of charge plays a crucial role in determining receptor–ligand specificity. Concurrently, interacting residues were also diverse, as ASP 112 showed a significant contribution to binding energy for TAAR9 (Figure 6c). Protonated histidine residues showed contributions for both ligands (Figure 6d,e).
Future scopes and the potential applications
The consistent binding trends observed across our simulations provide a computational foundation for odorant interactions and could guide future experimental investigations. While we identified topologically similar binding pockets in these receptors, the specific molecular interactions driving ligand recognition were found to be distinct. Further studies could address pocket flexibility and conformational changes, especially because predicting the binding pocket flexibility of these receptors is a daunting task (Wang et al., 2024). Lipid distortion profiling could produce additional insights into the membrane-specific impact on protein interactions (Karagöl et al., 2024c), including odorant binding and pocket flexibility. Further molecular dynamics simulations on QTY variants and a comparative analysis could also be helpful in understanding structural flexibility in odorant binding. Recent simulations on QTY-variant glutamate transporters revealed lipid impact and structural flexibility in the transporter architecture (Karagöl et al., 2024c). Regardless, the stable binding energies observed during simulations indicate physiologically relevant interactions, opening possibilities for medical applications in treating olfactory disorders and developing new drug delivery systems targeting membrane proteins. Olfactory receptors are also distributed in non-olfactory tissues, making them promising targets for a range of diseases (Wu et al., 2024).
These findings could guide the understanding of the evolutionary relationships between olfactory and trace amine-associated receptors and develop broader computational frameworks for predicting ligand specificity across related receptor families. Our results highlight the significant role of electrostatic potential in ligand interactions within these binding pockets. Notably, QTY-variants, which preserve residue charge, hold the potential for designing artificial receptors. The detailed binding characteristics uncovered in this study may also support advancements in artificial olfactory sensor development. To this extent, odorant receptor structure construction could be used to identify potential modulators (Wang et al., 2024). Alongside the olfactory system, these findings could provide insights into developing new delivery systems for drugs targeting membrane proteins. The observed plasticity in hydrogen bonding patterns may also inform structure-based drug design strategies for other G protein-coupled receptors where similar mechanical principles might apply.
Conclusion
In our study, we selected six human olfactory receptors, OR51E2, OR52cs, TAAR9, OR51E1, OR1A1, and OR1A2, that have been linked with research for the possible development of cancer treatment and detection methods. We applied the QTY code to the six olfactory proteins to convert the hydrophobic alpha helices to hydrophilic alpha helices and thus create water-soluble QTY variants. Then, we used AlphaFold3 to predict the structures of both the native proteins and their QTY variants. By superposing the QTY variants and the native structures and calculating the RMSD values, we found that despite the substantial replacement of the amino acids in the transmembrane domains, the structures of the QTY variants were remarkably similar to those of the native proteins. This suggests that the AlphaFold3-predicted QTY variants of the native olfactory receptors are likely to retain their properties. Using bioinformatic computational tools, we supported this by analyzing calculated characteristics associated with protein stability and water solubility. Finally, we found that the surfaces of the QTY variants were markedly more hydrophilic than those of the native proteins. We also carried out an odorant-binding molecular simulation study, which revealed that the odorant octanoic acid forms a complex with the olfactory receptor OR1A2 through specific amino acid interactions. These water-soluble olfactory receptor QTY variants may now be engineered and developed into biomimetic sensing devices (Qing et al., Reference Qing, Xue, Zhao, Wu, Breitwieser, Smorodina, Schubert, Azzellino, Jin, Kong, Palacios, Sleytr and Zhang2023). Our current studies further demonstrate that the QTY code is a valid method to accurately design water-soluble variants of olfactory receptors. We believe these hydrophilic QTY variants of the olfactory receptors have potential for cancer detection technology and the discovery of therapeutic treatments for various cancerous diseases.
Methods
Protein sequence alignments and other characteristics
The native protein sequences for OR51E1, OR51E2, TAAR9, OR1A1, and OR1A2 were obtained from UniProt (https://www.uniprot.org). Professor Hee-Jung Choi of Seoul National University, Korea, kindly provided us with the native protein sequence for OR52cs after we could not find it on UniProt or in any cited references. The native protein sequences and those of their QTY variants were aligned using the methods previously described. We used the Expasy website (https://web.expasy.org/compute_pi/) to calculate the molecular weights (MWs) and pI values of the proteins.
AlphaFold3 predictions
We predicted the structures of the native proteins and their QTY variants using the AlphaFold3 website (https://alphafoldserver.com), following the included instructions. We also used the UniProt website (https://www.uniprot.org) to obtain protein ID, entry name, description, and FASTA sequence for each native protein. We then applied the QTY code to the FASTA sequences by manually replacing amino acids in the transmembrane domains found on the UniProt website and confirmed using the Protter website (https://wlab.ethz.ch/protter/start/).
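The manual substitution step described above can be illustrated with a minimal Python sketch. It assumes the QTY code mapping of leucine to glutamine, isoleucine and valine to threonine, and phenylalanine to tyrosine, applied only inside the transmembrane segments. The function names, the toy sequence, and the segment boundaries are hypothetical and are not taken from any of the six receptors; in practice the boundaries would come from the UniProt and Protter annotations.

```python
# Hypothetical helper illustrating the QTY substitution rule; not the authors' script.
QTY_MAP = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def apply_qty(sequence, tm_segments):
    """Return the QTY variant of `sequence`.

    tm_segments is a list of (start, end) residue ranges, 1-based and inclusive.
    Residues outside these ranges are left unchanged.
    """
    residues = list(sequence.upper())
    for start, end in tm_segments:
        if not 1 <= start <= end <= len(residues):
            raise ValueError(f"segment {(start, end)} is outside the sequence")
        for i in range(start - 1, end):
            residues[i] = QTY_MAP.get(residues[i], residues[i])
    return "".join(residues)


def tm_substitution_percent(native, variant, tm_segments):
    """Percentage of transmembrane residues that differ between the two sequences."""
    tm_positions = [i for s, e in tm_segments for i in range(s - 1, e)]
    changed = sum(native[i] != variant[i] for i in tm_positions)
    return 100.0 * changed / len(tm_positions)


# Toy sequence (assumption, not a real receptor); residues 4-9 play the role of one TM helix.
toy_native = "MKLLIVFAGLK"
toy_segments = [(4, 9)]
toy_variant = apply_qty(toy_native, toy_segments)
print(toy_variant)
print(tm_substitution_percent(toy_native, toy_variant, toy_segments))
```

The second function mirrors the kind of transmembrane substitution percentage reported for the variants, under the assumption that it is counted over transmembrane residues only.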
Superposed structures
PDB files for the native protein structures determined experimentally by cryogenic electron microscopy (cryo-EM), namely OR51E2, OR52cs, and TAAR9, were taken from the RCSB PDB. We used AlphaFold3 (https://alphafoldserver.com) to predict the native structures of the olfactory receptors as well as their QTY variants. PyMOL (https://pymol.org) was then used to superpose these structures and calculate their RMSD values. For OR51E2, TAAR9, and OR52cs, the cryo-EM structures model the heterodimer, whereas the AlphaFold3-predicted QTY variants model only the monomer; we therefore removed unstructured loops and other protein monomers from the figures for clarity.
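For readers unfamiliar with the RMSD metric, the following minimal Python sketch shows the final formula only: the root of the mean squared distance between paired atoms. It assumes the two coordinate sets have already been superposed and matched atom by atom, which is the part PyMOL handles. The function name and the toy coordinates are hypothetical, and PyMOL's own alignment may additionally reject outlier atom pairs, which this sketch does not do.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD, in the input units, between paired and already-superposed coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates in angstroms (assumption): every atom is shifted by 1.0 along z.
toy_native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_variant = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(rmsd(toy_native, toy_variant))
```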
Structure visualization
We first used PyMOL (https://pymol.org) to superpose the native predicted protein structures, their QTY variants, and the cryo-EM-determined structures for those proteins where these existed. We then used UCSF Chimera (https://www.cgl.ucsf.edu/chimera/) to render each protein model with hydrophobicity patches.
Odorant docking and molecular dynamics simulations
For the docking studies, octanoic acid was selected as a ligand and represented using SMILES notation. AlphaFold2 database predicted proteins were used as receptors. Molecular docking was performed using the Attracting Cavities 2.0 (AC) method available on the SwissDock server (http://www.swissdock.ch/) (Röhrig et al., Reference Röhrig, Goullieux, Bugnon and Zoete2023; Bugnon et al., Reference Bugnon, Röhrig, Goullieux, Perez, Daina, Michielin and Zoete2024). OR1A2 was chosen as the target receptor for molecular dynamics studies because of its higher scores in SwissParam assessments. Molecular dynamics simulations were conducted for the membrane systems of OR1A2, the OR1A2-octanoic acid complex, and the OR1A2-octanoate complex, starting from the best molecular docking results according to SwissParam scores. For TAAR9, spermidine was selected as a ligand because it is an identified agonist (Xu and Li, Reference Xu and Li2020). The internal binding pocket was predicted using the SwissDock server, based on SwissParam scoring. All MD simulations and analyses were executed on Google Colab (https://colab.research.google.com), via Ubuntu 2021 4.2, utilizing a total of 96 core v2 TPUs and 334GB RAM. The simulations were parallelized across multiple processors or cores within the VM. Configuration files and Linux bash scripts for the simulations are publicly available with step-by-step instructions.
Membrane-protein systems were constructed using CHARMM-GUI’s web-based membrane builder (Jo et al., Reference Jo, Kim, Iyer and Im2008; Wu et al., Reference Wu, Cheng, Jo, Rui, Song, Dávila-Contreras, Qi, Lee, Monje-Galvan, Venable, Klauda and Im2014). The protein portion was centered in a rectangular box and protonation states were assigned based on the local pH. The spatial orientation relative to the lipid bilayer was optimized via the PPM 2.0 method (Lomize et al., Reference Lomize, Pogozheva, Joo, Mosberg and Lomize2012). This method accounts for the anisotropic water-lipid environment, characterized by dielectric constant and hydrogen-bonding profiles (Lomize et al., Reference Lomize, Pogozheva, Joo, Mosberg and Lomize2012). The membrane models generated consisted of 70% 1-palmitoyl-2-oleoyl-glycero-3-phosphocholine (POPC) and 30% cholesterol, as it represents a simplified model of the plasma membrane. The system was solvated in TIP3P water with 150 mM KCl. All molecular dynamics (MD) simulations were conducted using GROMACS 2021.4 (Abraham et al., Reference Abraham, Murtola, Schulz, Páll, Smith, Hess and Lindahl2015) with the CHARMM36m all-atom force field (Huang et al., Reference Huang, Rauscher, Nawrocki, Ran, Feig, de Groot, Grubmüller and MacKerell2017). System energy was minimized using the steepest descent until maximum forces converged below 1000 kJ/mol/nm. Electrostatics were handled with Particle Mesh Ewald (PME), with both Coulomb and van der Waals interaction cutoffs set at 1.2 nm. A multistep minimization and equilibration process was employed to relax the protein-membrane systems. The standard six-step CHARMM-GUI protocol (Jo et al., Reference Jo, Kim, Iyer and Im2008) was used for 125-ps equilibration simulations. Temperature and pressure were maintained at 303.15 K and 1 bar, respectively, using the Nose–Hoover thermostat and Parrinello–Rahman barostat with semi-isotropic coupling. 
Following NVT and NPT equilibration, a 50-ns production MD simulation was run, with timestamps every 500 ps. Trajectories were subsequently combined using gmx traj (Abraham et al., Reference Abraham, Murtola, Schulz, Páll, Smith, Hess and Lindahl2015).
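As a rough sanity check on the 150 mM KCl solvation described above, the expected number of ion pairs follows from the concentration, the solvent volume, and Avogadro's number. The sketch below is only this arithmetic. The solvent volume is a hypothetical value, not one taken from the simulated systems, and CHARMM-GUI determines the actual ion counts itself, including any extra ions needed to neutralize the system.

```python
AVOGADRO = 6.02214076e23   # particles per mole
LITRES_PER_NM3 = 1e-24     # 1 nm^3 = 1e-27 m^3 = 1e-24 L


def ion_pairs_for_concentration(molar, solvent_volume_nm3):
    """Approximate number of salt ion pairs for a given molarity and solvent volume."""
    if molar < 0 or solvent_volume_nm3 < 0:
        raise ValueError("concentration and volume must be non-negative")
    return round(molar * solvent_volume_nm3 * LITRES_PER_NM3 * AVOGADRO)


# Hypothetical solvent volume of 500 nm^3 at 150 mM KCl (assumption for illustration).
print(ion_pairs_for_concentration(0.150, 500.0))
```

At this concentration, one ion pair corresponds to roughly 11 nm^3 of solvent, which gives a quick way to judge whether the ion counts in a generated system are plausible.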
System stability was assessed through trajectory analysis. This included calculating the protein radius of gyration (gmx gyrate tool) and the residue root mean square fluctuation (RMSF) for protein Cα atoms. The solvent-accessible surface area (SASA) of protein residue side chains was determined using gmx sasa, with a solvent probe radius of 1.4 Å (Huang et al., Reference Huang, Rauscher, Nawrocki, Ran, Feig, de Groot, Grubmüller and MacKerell2017). The resulting plots were rendered using Grace (https://plasma-gate.weizmann.ac.il/Grace/). MMPBSA binding free energy estimations were performed on the full 50 ns of equilibrated MD trajectories.
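The two stability metrics named above can be written down compactly. The following Python sketch is an illustration of the definitions, not a replacement for the GROMACS tools: the radius of gyration is the mass-weighted root mean square distance from the center of mass, and the RMSF is the per-atom root mean square deviation from the time-averaged position. The function names and toy inputs are hypothetical, and the RMSF function assumes the frames have already been fitted to a common reference.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one frame, in the input length units."""
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    weighted = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total)


def rmsf(frames):
    """Per-atom RMSF; frames[t][i] is the (x, y, z) position of atom i in frame t."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy data (assumption): two equal-mass atoms, and a two-frame, two-atom trajectory.
print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [12.0, 12.0]))
print(rmsf([[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]]))
```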
MMPBSA calculations for membrane-protein systems
The binding free energy of the complexes was calculated with the Molecular Mechanics Poisson-Boltzmann Surface Area (MMPBSA) method using the gmx_MMPBSA tool, following the Amber reference manual for membrane-bound protein systems (Miller et al., Reference Miller, McGee, Swails, Homeyer, Gohlke and Roitberg2012; Valdés-Tresanco et al., Reference Valdés-Tresanco, Valdés-Tresanco, Valiente and Moreno2021). An implicit membrane region was incorporated into the solvation calculations. With the default options, the program computed solvent-excluded surfaces using both the water probe (prbrad = 1.40) and the membrane probe (mprob = 2.70). Electrostatic energy and forces were computed using the particle-particle particle-mesh (P3M) method (Botello-Smith and Luo, Reference Botello-Smith and Luo2015). Binding energies were calculated for each time step, with averages and standard deviations reported. The standard error of the mean (SEM) was determined using error propagation. Per-residue decomposition analysis (idecomp = 2) was conducted to assess individual residue contributions to binding energy. This included 1–4 EEL and 1–4 VDW terms in total EEL and VDW potentials, respectively. Residues within 4 Å of both receptor and ligand were included in the output. Binding surface compositions were further analyzed and visualized using the gmx_MMPBSA_ana tool.
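The statistics mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a description of what gmx_MMPBSA does internally: the SEM is taken as the sample standard deviation divided by the square root of the number of frames, and uncertainties of terms that are added or subtracted are combined in quadrature, which assumes the terms are independent. The function names and all energy values are hypothetical.

```python
import math
import statistics


def mean_sd_sem(values):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return mean, sd, sd / math.sqrt(n)


def propagate(*errors):
    """Combine independent uncertainties of summed or subtracted terms in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


# Hypothetical per-frame binding energies in kcal/mol (not values from this study).
toy_energies = [-20.0, -22.0, -24.0]
print(mean_sd_sem(toy_energies))

# Hypothetical uncertainties of two energy terms that are summed.
print(propagate(3.0, 4.0))
```

Because consecutive MD frames are correlated, an SEM computed this way tends to underestimate the true uncertainty, so it is best read as a lower bound.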
Open peer review
To view the open peer review materials for this article, please visit http://doi.org/10.1017/qrd.2024.18.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/qrd.2024.18.
Data availability statement
The AlphaFold2-predicted protein structures are available at the European Bioinformatics Institute (EBI), https://alphafold.ebi.ac.uk. Protein characteristics used in the analysis are available on UniProt, https://www.uniprot.org/. The native cryo-EM-determined olfactory receptor structures are available in the RCSB PDB repository, https://www.rcsb.org/. The QTY code-designed water-soluble variants of the proteins are published in this article and are available at https://github.com/karagol-taner/Olfactory-receptors-QTY.
Acknowledgements
We thank Prof. Hee-Jung Choi of the Department of Biological Sciences, Seoul National University, Republic of Korea for kindly providing us with the protein sequence of consensus OR52cs by email. Prof. Hee-Jung Choi’s lab determined the CryoEM structure of OR52cs in December 2023.
Author contribution
Conceptualization: S.Z.; Formal analysis: F.J., A.K., T.K.; Investigation: Methodology: F.J., A.K., T.K.; Validation: F.J., A.K., T.K.; Data curation: A.K., T.K.; Writing—original draft preparation: F.J., A.K., T.K., S.Z.; Review and editing: F.J., A.K., T.K., and S.Z.
Financial support
Finn Johnsson is a high school student in London, UK. Taner Karagöl and Alper Karagöl are medical school students. This digital structural bioinformatic study received no financial support and used only free online tools.
Competing interest
Massachusetts Institute of Technology (MIT) filed several patent applications for the QTY code for GPCRs excluding the olfactory receptors. OH2Laboratories licensed the technology from MIT to work on water-soluble GPCR variants. S.Z. is an inventor of the QTY code and has a minor equity in OH2Laboratories. S.Z. is a Scientific Advisor and has minor shares for a startup RealNose to develop a sensing device based on olfactory receptors. S.Z. founded a startup 511 Therapeutics to generate therapeutic monoclonal antibodies against solute carrier transporters to treat pancreatic cancer. S.Z. has majority equity in 511 Therapeutics. All other authors have no competing interests.
Additional statement
1) All methods were carried out in accordance with relevant guidelines and regulations. 2) All experimental protocols were approved by a named institutional and licensing committee. 3) Neither human biological samples nor human subjects were used in the study. This is a completely digital structural bioinformatic study using the publicly available AlphaFold3 machine learning program.
Comments
Prof. Bengt Nordén
Editor in Chief
QRB Discovery
Dear Bengt,
I herewith submit a manuscript titled: “Structural bioinformatic study of six human olfactory receptors and their AlphaFold3 predicted water-soluble QTY variants and OR1A2 with an odorant octanoate” for consideration.
The molecular mechanism of olfaction, namely, how we use a limited number of olfactory receptors to recognize seemingly unlimited scents, still remains unknown despite recent advances in chemistry and in chemical, structural, and molecular biology. Olfactory receptors are notoriously difficult to study because they are fully embedded in the cell membrane. After decades of effort and significant funding, only three olfactory receptor structures are known. In order to understand olfaction, we carried out a structural bioinformatic study of six human olfactory receptors, OR51E1, OR51E2, OR52cs, OR1A1, OR1A2, and TAAR9, and their AlphaFold3-predicted water-soluble QTY variants with odorants. We applied the QTY code to replace leucine (L) with glutamine (Q), isoleucine (I) and valine (V) with threonine (T), and phenylalanine (F) with tyrosine (Y) only in the transmembrane helices. As a result, these QTY variants become water-soluble. We also present the superimposed structures of native olfactory receptors and their water-soluble QTY variants. The superimposed structures show remarkable similarity, with RMSDs between 0.441 Å and 1.275 Å, despite significant changes to the protein sequence of the transmembrane domains (43.03–50.31%). We also show the differences in surface hydrophobicity between the native olfactory receptors and their QTY variants. Furthermore, we used AlphaFold3 and molecular dynamics to study the odorant octanoate with OR1A2. Our bioinformatics studies provide insight into the differences between the hydrophobic helices and hydrophilic helices, and will likely further stimulate designs of water-soluble integral transmembrane proteins for new technology and device designs.
Finn Johnsson is currently a high school student, and Alper Karagöl and Taner Karagöl are currently medical students at Istanbul Medical University. Thus, there is no research funding to carry out the current online digital-based study. We hope to apply for a discount on the very expensive publication charge.
If you have any questions, please contact me.
Yours sincerely,
Shuguang Zhang, Ph.D.