preprints are available on bioRxiv
Computational Protein Design
Katherine I. Albanese, Sophie Barbe, Shunsuke Tagami, Derek N. Woolfson, Thomas Schiex. Nat Rev Methods Primers, 2025. https://doi.org/10.1038/s43586-025-00383-1
Combining molecular modelling, machine-learned models and an increasingly detailed understanding of protein chemistry and physics, computational protein design and human expertise have been able to produce new protein structures, assemblies and functions that do not exist in nature. Currently, generative deep-learning-based methods, which exploit large databases of protein sequences and structures, are revolutionizing the field, leading to new capabilities, improved reliability and democratized access in protein design. This Primer provides an introduction to the main approaches in computational protein design, covering both physics-based and machine-learning-based tools. It aims to be accessible to biological, physical and computer scientists alike. Emphasis is placed on understanding the practical challenges arising from limitations in our fundamental understanding of protein structure and function and on recent developments and new ideas that may help transcend these.
De Novo Design of Parallel and Antiparallel A3B3 Heterohexameric α-Helical Barrels
Joel J. Chubb*, Katherine I. Albanese*, Alison Rodger, Derek N. Woolfson. Biochemistry, 2025. https://doi.org/10.1021/acs.biochem.4c00584
*these authors contributed equally
The de novo design of α-helical coiled-coil peptides is advanced. Using established sequence-to-structure relationships, it is possible to generate various coiled-coil assemblies with predictable numbers and orientations of helices. Here, we target new assemblies, namely, A3B3 heterohexamer α-helical barrels. These designs are based on pairs of sequences with three heptad repeats (abcdefg), programmed with a = Leu, d = Ile, e = Ala, and g = Ser, and b = c = Glu to make the acidic (A) chains and b = c = Lys in the basic (B) chains. These design rules ensure that the desired oligomeric state and stoichiometry are readily achieved. However, controlling the orientation of neighboring helices (parallel or antiparallel) is less straightforward. Surprisingly, we find that assembly and helix orientation are sensitive to the length of the overhang between helices. To study this, cyclically permutated peptide sequences with three heptad repeats (the register) in the peptide sequences were analyzed. Peptides starting at g (g-register) form a parallel 6-helix barrel in solution and in an X-ray crystal structure, whereas the b- and c-register peptides form an antiparallel complex. In lieu of experimental X-ray structures for b- and c-register peptides, AlphaFold-Multimer is used to predict atomistic models. However, considerably more sampling than the default value is required to match the models and the experimental data, as many confidently predicted and plausible models are generated with incorrect helix orientations. This work reveals the previously unknown influence of the heptad register on helical overhang and the orientation of α-helical coiled-coil peptides and provides insights for the modeling of oligopeptide coiled-coil complexes with AlphaFold.
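The heptad rules and registers described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' design code: the function name `build_peptide`, the dictionaries, and the use of `X` as a placeholder are assumptions made for illustration. The abstract does not specify the residue at position f, so that position is left as `X`, and any terminal capping or additional residues used in the real peptides are omitted.

```python
# Illustrative sketch only: expands heptad-position rules into a sequence
# and cyclically permutes the starting position (the register).
HEPTAD = "abcdefg"

def build_peptide(rules, start="a", n_heptads=3, unknown="X"):
    """Return a one-letter sequence of n_heptads heptads beginning at `start`.

    Positions missing from `rules` are written as `unknown`.
    """
    offset = HEPTAD.index(start)
    positions = [HEPTAD[(offset + i) % 7] for i in range(7 * n_heptads)]
    return "".join(rules.get(p, unknown) for p in positions)

# Rules quoted in the abstract: a = Leu, d = Ile, e = Ala, g = Ser.
CORE = {"a": "L", "d": "I", "e": "A", "g": "S"}
ACIDIC = {**CORE, "b": "E", "c": "E"}  # A chains: b = c = Glu
BASIC = {**CORE, "b": "K", "c": "K"}   # B chains: b = c = Lys

# The three registers discussed in the abstract.
for register in "gbc":
    print(register, build_peptide(ACIDIC, register), build_peptide(BASIC, register))
```

Changing `start` shifts which heptad position sits at the peptide terminus while leaving the repeating pattern unchanged, which is the cyclic permutation the abstract refers to as the register.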
Rationally seeded computational protein design of α-helical barrels
Katherine I. Albanese*, Rokas Petrenas*, Fabio Pirro, Elise A. Naudin, Ufuk Borucu, William M. Dawson, D. Arne Scott, Graham J. Leggett, Orion D. Weiner, Thomas A. A. Oliver & Derek N. Woolfson. Nat Chem Biol, 2024. https://doi.org/10.1038/s41589-024-01642-0
*these authors contributed equally
Computational protein design is advancing rapidly. Here we describe efficient routes starting from validated parallel and antiparallel peptide assemblies to design two families of α-helical barrel proteins with central channels that bind small molecules. Computational designs are seeded by the sequences and structures of defined de novo oligomeric barrel-forming peptides, and adjacent helices are connected by loop building. For targets with antiparallel helices, short loops are sufficient. However, targets with parallel helices require longer connectors; namely, an outer layer of helix–turn–helix–turn–helix motifs that are packed onto the barrels. Throughout these computational pipelines, residues that define open states of the barrels are maintained. This minimizes sequence sampling, accelerating the design process. For each of six targets, just two to six synthetic genes are made for expression in Escherichia coli. On average, 70% of these genes express to give soluble monomeric proteins that are fully characterized, including high-resolution structures for most targets that match the design models with high accuracy.
Trimethyllysine reader proteins exhibit widespread charge-agnostic binding via different mechanisms to cationic and neutral ligands
Christopher R. Travis, Kelsey M. Kean, Katherine I. Albanese, Hanne C. Henriksen, Joseph W. Treacy, Elaine Y. Chao, K. N. Houk, and Marcey L. Waters. J. Am. Chem. Soc. 2024 146 (5), 3086-3093. https://doi.org/10.1021/jacs.3c10031
In the last 40 years, cation−π interactions have become part of the lexicon of noncovalent forces that drive protein binding. Indeed, tetraalkylammoniums are universally bound by aromatic cages in proteins, suggesting that cation−π interactions are a privileged mechanism for binding these ligands. A prominent example is the recognition of histone trimethyllysine (Kme3) by the conserved aromatic cage of reader proteins, dictating gene expression. However, two proteins have recently been suggested as possible exceptions to the conventional understanding of tetraalkylammonium recognition. To broadly interrogate the role of cation−π interactions in protein binding interactions, we report the first large-scale comparative evaluation of reader proteins for a neutral Kme3 isostere, experimental and computational mechanistic studies, and structural analysis. We find unexpected widespread binding of readers to a neutral isostere with the first examples of readers that bind the neutral isostere more tightly than Kme3. We find that no single factor dictates the charge selectivity, demonstrating the challenge of predicting such interactions. Further, readers that bind both cationic and neutral ligands differ in mechanism: binding Kme3 via cation−π interactions and the neutral isostere through the hydrophobic effect in the same aromatic cage. This discovery explains apparently contradictory results in previous studies, challenges traditional understanding of molecular recognition of tetraalkylammoniums by aromatic cages in myriad protein–ligand interactions, and establishes a new framework for selective inhibitor design by exploiting differences in charge dependence.
Comparative analysis of sulfonium-π, ammonium-π, and sulfur-π interactions and relevance to SAM-dependent methyltransferases
Katherine I Albanese, Andrew Leaver-Fay, Joseph W Treacy, Rodney Park, KN Houk, Brian Kuhlman, Marcey L Waters. J. Am. Chem. Soc. 2022, 144, 6, 2535–2545. https://doi.org/10.1021/jacs.1c09902
We report the measurement and analysis of sulfonium−π, thioether−π, and ammonium−π interactions in a β-hairpin peptide model system, coupled with computational investigation and PDB analysis. These studies indicated that the sulfonium−π interaction is the strongest and that polarizability contributes to the stronger interaction with sulfonium relative to ammonium. Computational studies demonstrate that differences in solvation of the trimethylsulfonium versus the trimethylammonium group also contribute to the stronger sulfonium−π interaction. In comparing sulfonium−π versus sulfur−π interactions in proteins, analysis of SAM- and SAH-bound enzymes in the PDB suggests that aromatic residues are enriched in close proximity to the sulfur of both SAM and SAH, but the populations of aromatic interactions of the two cofactors are not significantly different, with the exception of the Me−π interactions in SAM, which are the most prevalent interaction in SAM but are not possible for SAH. This suggests that the weaker interaction energies due to loss of the cation−π interaction in going from SAM to SAH may contribute to turnover of the cofactor.
From peptides to proteins: coiled-coil tetramers to single-chain 4-helix bundles
Elise A Naudin, Katherine I Albanese, Abigail J Smith, Bram Mylemans, Emily G Baker, Orion D Weiner, David M Andrews, Natalie Tigue, Nigel J Savery, Derek N Woolfson. Chem. Sci. 2022, 13, 11330-11340. https://doi.org/10.1039/d2sc04479j
The design of completely synthetic proteins from first principles—de novo protein design—is challenging. This is because, despite recent advances in computational protein–structure prediction and design, we do not understand fully the sequence-to-structure relationships for protein folding, assembly, and stabilization. Antiparallel 4-helix bundles are amongst the most studied scaffolds for de novo protein design. We set out to re-examine this target, and to determine clear sequence-to-structure relationships, or design rules, for the structure. Our aim was to determine a common and robust sequence background for designing multiple de novo 4-helix bundles. In turn, this could be used in chemical and synthetic biology to direct protein–protein interactions and as scaffolds for functional protein design. Our approach starts by analyzing known antiparallel 4-helix coiled-coil structures to deduce design rules. In terms of the heptad repeat, abcdefg—i.e., the sequence signature of many helical bundles—the key features that we identify are: a = Leu, d = Ile, e = Ala, g = Gln, and the use of complementary charged residues at b and c. Next, we implement these rules in the rational design of synthetic peptides to form antiparallel homo- and heterotetramers. Finally, we use the sequence of the homotetramer to derive in one step a single-chain 4-helix-bundle protein for recombinant production in E. coli. All of the assembled designs are confirmed in aqueous solution using biophysical methods, and ultimately by determining high-resolution X-ray crystal structures. Our route from peptides to proteins provides an understanding of the role of each residue in each design.
Contributions of methionine to recognition of trimethyllysine in aromatic cage of PHD domains: implications in polarizability, hydrophobicity, and charge on binding
Katherine I Albanese, Marcey L Waters. Chem. Sci. 2021, 12, 8900-8908. https://doi.org/10.1039/D1SC02175C
Recognition of trimethyllysine (Kme3) by reader proteins is an important regulator of gene expression. This recognition event is mediated by an aromatic cage made up of 2–4 aromatic residues in the reader proteins that bind Kme3 via cation–π interactions. A small subset of reader proteins contain a methionine (Met) residue in place of an aromatic sidechain in the binding pocket. The unique role of sulfur in molecular recognition has been demonstrated in a number of noncovalent interactions recently, including interactions of thiols, thioethers, and sulfoxides with aromatic rings. However, the interaction of a thioether with an ammonium ion has not previously been investigated and the role of Met in binding Kme3 has not yet been explored. Herein, we systematically vary the Met in two reader proteins, DIDO1 and TAF3, and the ligand, Kme3 or its neutral analog tert-butyl norleucine (tBuNle), to determine the role of Met in the recognition of the cationic Kme3. Our studies demonstrate that Met contributes to binding via dispersion forces, with about an equal contribution to binding Kme3 and tBuNle, indicating that electrostatic interactions do not play a role. During the course of these studies, we also discovered that DIDO1 exhibits equivalent binding to tBuNle and Kme3 through a change in the mechanism of binding.
Thermodynamic consequences of Tyr to Trp mutations in the cation–π-mediated binding of trimethyllysine by the HP1 chromodomain
Mackenzie W Krone*, Katherine I Albanese*, Gage O Leighton, Cyndi Qixin He, Ga Young Lee, Marc Garcia-Borràs, Alex J Guseman, David C Williams, KN Houk, Eric M Brustad, Marcey L Waters. Chem. Sci. 2020, 11, 3495-3500. https://doi.org/10.1039/D0SC00227E
*these authors contributed equally
Evolution has converged on cation–π interactions for recognition of quaternary alkyl ammonium groups such as trimethyllysine (Kme3). While computational modelling indicates that Trp provides the strongest cation–π interaction of the native aromatic amino acids, there is limited corroborative data from measurements within proteins. Herein we investigate a Tyr to Trp mutation in the binding pocket of the HP1 chromodomain, a reader protein that recognizes Kme3. Binding studies demonstrate that the Trp-mediated cation–π interaction is about −5 kcal mol⁻¹ stronger, and the Y24W crystal structure shows that the mutation is not perturbing. Quantum mechanical calculations indicate that greater enthalpic binding is predominantly due to increased cation–π interactions. NMR studies indicate that differences in the unbound state of the Y24W mutation lead to enthalpy–entropy compensation. These results provide direct experimental quantification of Trp versus Tyr in a cation–π interaction and afford insight into the conservation of aromatic cage residues in Kme3 reader domains.
Engineered reader proteins for enhanced detection of methylated lysine on histones
Katherine I Albanese*, Mackenzie W Krone*, Christopher J Petell, Madison M Parker, Brian D Strahl, Eric M Brustad, Marcey L Waters. ACS Chem. Biol. 2020, 15, 1, 103-111. https://doi.org/10.1021/acschembio.9b00651 **
*these authors contributed equally
**featured on the cover of this issue
Histone post-translational modifications (PTMs) are crucial for many cellular processes including mitosis, transcription, and DNA repair. The cellular readout of histone PTMs is dependent on both the chemical modification and histone site, and the array of histone PTMs on chromatin is dynamic throughout the eukaryotic life cycle. Accordingly, methods that report on the presence of PTMs are essential tools for resolving open questions about epigenetic processes and for developing therapeutic diagnostics. Reader domains that recognize histone PTMs have shown potential as advantageous substitutes for anti-PTM antibodies, and engineering efforts aimed at enhancing reader domain affinities would advance their efficacy as antibody alternatives. Here we describe engineered chromodomains from Drosophila melanogaster and humans that bind more tightly to H3K9 methylation (H3K9me) marks and result in the tightest reported reader–H3K9me interaction to date. Point mutations near the binding interface of the HP1 chromodomain were screened in a combinatorial fashion, and a triple mutant was found that binds 20-fold tighter than the native scaffold without any loss in PTM-site selectivity. The beneficial mutations were then translated to a human homologue, CBX1, resulting in an even tighter interaction with H3K9me3. Furthermore, we show that these engineered readers (eReaders) increase detection of H3K9me marks in several biochemical assays and outperform a commercial anti-H3K9me antibody in detecting H3K9me-containing nucleosomes in vitro, demonstrating the utility of eReaders to complement antibodies in epigenetics research.
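As a rough guide to what a 20-fold tighter interaction means energetically, the fold change in dissociation constant can be converted to a change in binding free energy using the standard relation ΔΔG = −RT ln(fold). The sketch below is an illustration only: the function name `ddg_from_fold` and the temperature of 298.15 K are assumptions, since the abstract does not state the measurement temperature.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3

def ddg_from_fold(fold_tighter, temperature_k=298.15):
    """Change in binding free energy (kcal/mol) for a given fold improvement
    in affinity. Negative values mean tighter binding.

    temperature_k is an assumed value, not taken from the paper.
    """
    return -R_KCAL * temperature_k * math.log(fold_tighter)

# 20-fold tighter binding, as reported for the triple mutant.
print(round(ddg_from_fold(20.0), 2))  # -1.77
```

Under these assumptions, a 20-fold gain corresponds to roughly −1.8 kcal/mol of additional binding free energy.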
Investigation of trimethyllysine binding by the HP1 chromodomain via unnatural amino acid mutagenesis
Stefanie A Baril, Amber L Koenig, Mackenzie W Krone, Katherine I Albanese, Cyndi Qixin He, Ga Young Lee, Kendall N Houk, Marcey L Waters, Eric M Brustad. J. Am. Chem. Soc. 2017, 139, 48, 17253-17256. https://doi.org/10.1021/jacs.7b09223