Improving T cell receptor:peptide-MHC interaction predictions

T cells are a critical part of the adaptive immune system: they respond to pathogens and tumors, and they play a key role in autoimmune diseases. A T cell recognizes potential targets through its T cell receptor (TCR), a plasma membrane-anchored protein that detects foreign molecules, referred to as antigens, presented for inspection by specialized immune cells. When an antigen-presenting cell (APC) encounters an antigen, the offending protein is engulfed and broken down into peptides. These peptides are bound and presented by major histocompatibility complex (MHC) proteins on the surface of APCs, and the resulting peptide-MHC complex interacts with the appropriate TCR. The formation of the TCR-peptide-MHC (TCR-pMHC) complex is the first step leading to T cell activation and is vital for an effective immune response. Visualizing the TCR-pMHC interaction is crucial to understanding how it influences T cell activation, but it is extremely challenging because of the diversity of TCRs and MHC-presented peptides and the limited number of experimentally validated TCR-pMHC interactions.

TCRs consist of two membrane-anchored protein chains, alpha and beta, each containing a constant (C) and a variable (V) domain; the V domain directly interacts with the peptide-MHC complex. The V domain is the product of a complex succession of DNA sequence rearrangements during T cell development that confer upon it specificity for a particular antigen. Both alpha and beta chains have complementarity determining regions (CDRs) at the terminus of the V domain, which recognize peptides presented by MHC.

Inspired by recent breakthroughs in protein structure prediction, Dr. Philip Bradley from the Public Health Sciences Division is interested in using 3D-structural modeling to create a generalizable prediction algorithm for TCR-pMHC binding specificity. To do so, Dr. Bradley “modified the deep learning structure prediction method AlphaFold to build improved models of T cell receptor (TCR):peptide-MHC complexes.” His findings were recently published in eLife.

Dr. Bradley first constructed diverse TCR-pMHC models using AlphaFold Multimer, the default AlphaFold version that specializes in predicting structures of multiple interacting proteins. As he suspected, AlphaFold Multimer showed inconsistent performance on TCR-pMHC structures. To improve prediction accuracy, Dr. Bradley developed an “AlphaFold pipeline that utilizes hybrid templates created from existing TCR:pMHC structures to constrain TCR docking orientation to native-like geometries.” Independent AlphaFold simulations are run from these hybrid templates, and the structure predicted with the highest confidence is taken as the final prediction. Importantly, the structures predicted by the improved pipeline showed good agreement with experimentally observed TCR:pMHC structures. Model accuracy can be improved further by fine-tuning AlphaFold's parameters on TCR:pMHC structures. The parameters were fine-tuned using human TCR:pMHC complexes, and the resulting model was evaluated on mouse TCR:pMHC targets, where it produced improved predictions. These results suggest that the AlphaFold pipeline can learn "generalizable features of the TCR:pMHC interaction."
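
The selection step described above, in which one AlphaFold run is carried out per hybrid template and the most confident result is kept, can be sketched in a few lines of Python. This is a minimal illustration, not Dr. Bradley's code: the `ModelRun` record, the template labels, and the confidence values are hypothetical placeholders, and the sketch assumes each run reports a single confidence number where higher means more confident.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRun:
    """One hypothetical AlphaFold run seeded with one hybrid template."""
    template_id: str   # illustrative label for the hybrid template used
    confidence: float  # assumed convention: higher means more confident


def pick_most_confident(runs: list[ModelRun]) -> ModelRun:
    """Return the run with the highest confidence value."""
    if not runs:
        raise ValueError("need at least one model run")
    return max(runs, key=lambda run: run.confidence)


# Made-up values, for illustration only.
runs = [
    ModelRun("hybrid_template_a", 0.71),
    ModelRun("hybrid_template_b", 0.84),
    ModelRun("hybrid_template_c", 0.78),
]
best = pick_most_confident(runs)
print(best.template_id)
```

In a real pipeline the confidence value would come from the structure predictor itself; here it is simply a number attached to each run so that the selection logic is visible.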

Left figure. A representation of the interaction between a TCR heterodimer (blue and yellow) and a peptide-MHC (green), showing the coordinate frames used to build hybrid templates. Middle figure. Cartoon of pMHC specificity prediction test with one wild-type (WT) peptide and two decoys. Right figure. The AlphaFold pipeline generates TCR:pMHC binding scores for each TCR:pMHC pairing, with lower scores indicating stronger predicted binding. Image provided by Dr. Bradley

Next, Dr. Bradley evaluated the performance of the AlphaFold pipeline on TCR specificity prediction using a benchmark set of ~400 experimentally validated TCR:pMHC pairings. The pipeline generated binding scores between each TCR and its cognate pMHC epitope as well as nine decoy epitopes (a lower score indicates stronger predicted binding). The true TCR:pMHC pairings scored lower than most of the decoys, meaning they were predicted to bind more strongly. Dr. Bradley explained that “these structural models can, in some cases, help to discriminate correct from incorrect TCR:pMHC pairings.”
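
The decoy test described above amounts to asking where the cognate epitope ranks among ten candidates when they are sorted by binding score. The following Python sketch shows that ranking under the convention stated in the text (lower score means stronger predicted binding). The function name `cognate_rank` and all scores are hypothetical and chosen only for illustration; they are not values from the study.

```python
def cognate_rank(cognate_score: float, decoy_scores: list[float]) -> int:
    """Rank of the cognate epitope among all candidates (1 = best).

    Lower scores mean stronger predicted binding, so the rank is one plus
    the number of decoys that score strictly lower than the cognate.
    """
    return 1 + sum(score < cognate_score for score in decoy_scores)


# Made-up scores for one TCR: one cognate epitope and nine decoys.
cognate = 2.1
decoys = [3.4, 1.9, 2.8, 4.0, 3.1, 2.6, 3.7, 2.9, 3.3]
rank = cognate_rank(cognate, decoys)
print(f"cognate ranks {rank} of {len(decoys) + 1}")
```

With these made-up numbers only one decoy scores lower than the cognate, so the cognate ranks 2 of 10. Averaging such ranks over many TCRs is one simple way to summarize how often the true pairing is favored over the decoys.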

To investigate whether the accuracy of the structural models correlated with binding prediction, Dr. Bradley compared the structural model of each TCR in complex with its wild-type epitope to all experimentally determined ternary structures in the protein structure database. The resulting comparison value was used as a proxy for the accuracy of the predicted binding mode, with lower values indicating closer agreement with known structures. Well-predicted epitopes indeed showed lower values. Overall, these results indicate that decoy discrimination is correlated with structural accuracy, suggesting that Dr. Bradley’s pipeline is selecting the correct peptide based on molecular specificity.

Going forward, Dr. Bradley hopes to “use structural modeling, augmented by data-driven machine learning, to interpret TCR sequence data, linking TCRs to their cognate viral, tumor, or autoimmune epitopes.” He is very excited about “improving these predictions by fine-tuning AlphaFold's neural network parameters on TCR:peptide-MHC data.” He added “I think this approach could also be useful for screening designed TCR sequences to select ones that are more likely to have the desired epitope specificity.”

The spotlighted research was funded by the National Institutes of Health.

Fred Hutch/University of Washington/Seattle Children's Cancer Consortium member Philip Bradley contributed to this work.

Bradley P. Structure-based prediction of T cell receptor:peptide-MHC interactions. eLife. 2023 Jan 20;12:e82813.