MIT team developed a ChatGPT-like model to accelerate AI drug discovery



Running a drug screen is like hosting a large cocktail party and listening to and recording every conversation. Most exchanges at such a party are small talk, and only a handful are meaningful. The same holds for drug screening, where weak drug-target interactions far outnumber high-affinity binding events.

If you had to listen carefully to every line of dialogue, you can imagine how tedious it would be. Evaluating every drug-target interaction in a classic drug screen is just as tedious, and it exhausts even the most patient listener: the AI.

Traditional artificial intelligence (AI) systems take a long time to screen interactions between drug candidates and protein targets. Most of them compute the three-dimensional structure of each target protein from its amino acid sequence, then use that structure to predict which drug molecules the protein will interact with. This approach is exhaustive but slow.

What innovations can large language model technology, represented by ChatGPT and hugely popular this year, bring to drug screening?

On June 8, 2023, Bonnie Berger, a professor at the Massachusetts Institute of Technology (MIT) and a pioneer in computational molecular biology, and colleagues published a research paper in the Proceedings of the National Academy of Sciences (PNAS) titled "Contrastive learning in protein language space predicts interactions between drugs and protein targets".

The research team designed ConPLex, an AI algorithm based on a large language model. Much as ChatGPT analyzes large amounts of text to find the words most likely to appear together, ConPLex does the same with amino acids, the "words" in this study. This lets it match target proteins with potential drug molecules without the computationally intensive step of computing molecular structures.

ConPLex builds on pre-trained protein language models ("PLex") and uses a protein-anchored contrastive co-embedding ("Con"). By applying contrastive learning in the protein language space to predict interactions between drugs and target proteins, it outperforms current state-of-the-art algorithms.


Drug discovery is fundamental to human health, but the process remains very expensive and time-consuming. It usually takes more than 1 billion US dollars and 10 years for a drug to go from development to approval and marketing.

While experimental results remain the gold standard for validating drug function, in silico predictions of drug-target binding are faster and cheaper, so they will continue to play an important role in the early screening of therapeutic candidates.

To address this critical first step in drug discovery, the research team developed the ConPLex model in an attempt to extend the prediction of drug-target interactions (DTIs) to unseen types of drugs and targets, while still being able to distinguish drug candidate molecules that have highly similar structures but different binding characteristics.

ConPLex addresses these two challenges by combining a protein language model (PLM) with contrastive learning.
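To make the contrastive idea concrete, here is a minimal sketch of a triplet margin loss, one common form of contrastive learning, with the protein as the anchor. The article does not give the exact loss, so the function names, the Euclidean distance, the margin value, and the toy two-dimensional vectors are all assumptions for illustration, not the authors' implementation.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_margin_loss(protein, binder, decoy, margin=1.0):
    """Hinge-style triplet loss with the protein embedding as anchor.

    The loss is zero once the true binder is closer to the protein than
    the decoy by at least `margin` (an assumed hyperparameter).
    """
    return max(0.0, euclidean(protein, binder) - euclidean(protein, decoy) + margin)

# Toy embeddings (hypothetical values, not real model output).
protein = [0.0, 0.0]
binder = [0.0, 1.0]       # true binder, distance 1.0 from the protein
decoy_far = [3.0, 4.0]    # decoy already far away, distance 5.0
decoy_near = [0.0, 1.5]   # decoy that sits close to the binder, distance 1.5

loss_far = triplet_margin_loss(protein, binder, decoy_far)
loss_near = triplet_margin_loss(protein, binder, decoy_near)
print(loss_far, loss_near)
```

A decoy that is already far from the protein contributes no loss, while a decoy that sits almost as close as the true binder is penalized, which is the behavior needed to separate structurally similar molecules with different binding characteristics.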

ConPLex's broad applicability, specificity against decoys, and ability to handle large-scale data should let it support a key step in drug discovery and help substantially improve the efficiency and success rate of the process.

According to the research team, ConPLex achieves high accuracy, broad adaptability to unseen data, and specificity against decoy compounds. It predicts binding from the distance between learned representations, which makes predictions feasible at the scale of massive compound libraries and the human proteome.
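The distance-based prediction described above can be sketched as follows. This is an illustration under stated assumptions: the drug and target vectors are taken to already live in a shared embedding space, cosine distance stands in for whatever distance the model actually uses, and all names and values are hypothetical.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; smaller means the embeddings are closer."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def rank_pairs(drug_vecs, target_vecs, top_k):
    """Score every drug-target pair and return the top_k closest pairs."""
    scored = [
        (cosine_distance(d_vec, t_vec), d_name, t_name)
        for d_name, d_vec in drug_vecs.items()
        for t_name, t_vec in target_vecs.items()
    ]
    scored.sort()  # ascending distance: most likely binders first
    return scored[:top_k]

# Toy co-embeddings (hypothetical values, not real model output).
drugs = {"drug_a": [1.0, 0.0], "drug_b": [0.0, 1.0]}
targets = {"kinase_x": [0.9, 0.1], "kinase_y": [0.2, 0.8]}

top_pairs = rank_pairs(drugs, targets, top_k=2)
for distance, drug, target in top_pairs:
    print(f"{drug} -> {target}: distance {distance:.3f}")
```

Because each drug and each target is embedded once and every pair is then scored with a cheap vector operation, this style of prediction avoids a per-pair structural computation, which is what makes very large screens tractable.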

Figure: Architecture and training framework of the ConPLex model. ConPLex is trained in two phases to optimize generality and specificity.

The research team tested the model by screening a library of approximately 4,700 candidate drug molecules for their ability to bind to 51 protein kinases. The screen relied entirely on the predictions of the ConPLex model, without reference to previous experimental data or literature.

From the top-ranked predictions, the research team selected 19 drug-protein pairs for further experimental testing. The results showed that 12 of them had strong binding affinities, and 4 of those had extremely strong, subnanomolar binding affinities, meaning that a tiny drug concentration, on the order of one part per billion, is sufficient to inhibit the function of the target protein. The model also identified PD-166326, an inhibitor that binds potently to the receptor tyrosine kinase EPH receptor B1 (EPHB1), with KD = 1.3 nM.
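For a sense of scale, the arithmetic behind these figures can be worked through in a few lines. The numbers come from the article (the library size is approximate); the variable names are only illustrative.

```python
library_size = 4700    # approximate number of candidate molecules screened
num_kinases = 51       # protein kinase targets
pairs_scored = library_size * num_kinases   # every drug against every kinase

tested_pairs = 19      # top-ranked pairs chosen for experimental testing
strong_binders = 12    # pairs with strong binding affinity
subnanomolar = 4       # of those, pairs with subnanomolar affinity

hit_rate = strong_binders / tested_pairs
subnanomolar_rate = subnanomolar / tested_pairs

print(f"pairs scored in silico: about {pairs_scored:,}")
print(f"experimental hit rate: {hit_rate:.1%}")
print(f"subnanomolar fraction: {subnanomolar_rate:.1%}")
```

Roughly 240,000 candidate pairs were narrowed to 19 for the bench, and about 63% of those were confirmed as strong binders.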


In this paper, the research team mainly applied ConPLex to screening small-molecule drugs, but they said they are also applying the model to other types of drugs, such as therapeutic antibodies.

In addition, the model can be used for toxicity screening of potential drug compounds, to help ensure that they have no unwanted toxic side effects before they are tested in animal models.

Professor Bonnie Berger, the corresponding author of the paper, said that this work addresses the need for efficient and accurate in silico screening of potential drug candidates.

The ConPLex model can be used in large-scale screens to evaluate off-target effects, identify drug repurposing opportunities, and determine the impact of mutations on drug binding.

One of the reasons why new drug development is so expensive is that its failure rate is too high.

If this AI model can predict in advance which candidate drug molecules are unlikely to be effective, so that they can be screened out early, the failure rate can be greatly reduced, which in turn would greatly reduce the time and cost of developing new drugs.

The research team also made ConPLex open source.

Paper link:
https://www.pnas.org/doi/10.1073/pnas.2220778120

