DeepMind, an artificial intelligence firm owned by Google, has modified its AlphaFold system to predict whether single-letter mutations in DNA are harmful. The adapted system, called AlphaMissense, has analyzed 71 million possible missense mutations in human proteins, and the results have been made publicly available. The tool aims to help clinicians and geneticists determine the cause of genetic diseases by assessing the likely effects of mutations.
The Challenge of Genetic Variation
Every individual is born with numerous mutations that are not found in their parents. This leads to a significant amount of genetic variation between individuals, making it difficult for doctors to identify disease-causing mutations. When a person’s genome is sequenced, there can be thousands of candidate mutations that might be linked to a specific condition.
The Role of AlphaMissense
AlphaMissense focuses on missense mutations, which occur when one of the three DNA letters in a triplet (codon) is changed to another letter, potentially resulting in the wrong amino acid being added to a protein. The effect of such a change on protein function can range from negligible to severe. AlphaMissense compares the sequence of each potential mutated protein to a database of proteins that AlphaFold was trained on. If the mutated protein appears “natural,” it is deemed likely to be harmless. If it appears “unnatural,” it is classified as potentially harmful.
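To make the definition of a missense mutation concrete, here is a minimal Python sketch that classifies a single-letter change within one DNA triplet using the standard genetic code. It illustrates the terminology only and has nothing to do with how AlphaMissense itself works; the function name `classify_substitution` and the labels it returns are hypothetical choices made for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a single-letter change in one DNA triplet (hypothetical helper).

    position is 0, 1 or 2 (the index of the changed letter within the triplet).
    """
    codon = codon.upper()
    new_base = new_base.upper()
    if codon not in CODON_TABLE:
        raise ValueError(f"not a valid DNA triplet: {codon!r}")
    if position not in (0, 1, 2):
        raise ValueError("position must be 0, 1 or 2")
    if new_base not in BASES:
        raise ValueError(f"not a valid DNA letter: {new_base!r}")
    if codon[position] == new_base:
        raise ValueError("the new letter is the same as the original")

    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]

    if after == before:
        return "synonymous"  # same amino acid (or still a stop codon)
    if after == "*":
        return "nonsense"    # amino acid replaced by a premature stop
    if before == "*":
        return "stop-loss"   # stop codon replaced by an amino acid
    return "missense"        # one amino acid swapped for another


if __name__ == "__main__":
    # GAG (glutamate) -> GTG (valine): the change behind sickle cell disease.
    print(classify_substitution("GAG", 1, "T"))
    # GAA -> GAG: both encode glutamate, so the protein is unchanged.
    print(classify_substitution("GAA", 2, "G"))
```

Only the changes labelled "missense" here are the kind AlphaMissense scores; the other categories fall outside its scope.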
Advantages of AlphaMissense
When tested on known variants, AlphaMissense outperformed other computational methods. The system produced remarkable results, according to Joseph Marsh at the University of Edinburgh and Sarah Teichmann at the University of Cambridge. In practice, AlphaMissense helps prioritize which candidate mutations should be investigated further as possible causes of disease.
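Prioritization of this kind can be pictured as filtering and sorting. The sketch below assumes each candidate mutation already carries a harmfulness score between 0 and 1, with higher meaning more likely harmful. The variant names, the scores, and the 0.5 cut-off are all hypothetical values invented for illustration; they are not AlphaMissense output or its published thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredVariant:
    name: str
    score: float  # hypothetical harmfulness score in [0, 1]


def prioritize(variants, threshold=0.5):
    """Keep variants scoring at or above the threshold, most suspicious first.

    The default threshold is an arbitrary illustrative value.
    """
    flagged = [v for v in variants if v.score >= threshold]
    return sorted(flagged, key=lambda v: v.score, reverse=True)


if __name__ == "__main__":
    # Made-up candidates standing in for the variants found in one genome.
    candidates = [
        ScoredVariant("variant_A", 0.12),
        ScoredVariant("variant_B", 0.91),
        ScoredVariant("variant_C", 0.57),
        ScoredVariant("variant_D", 0.03),
    ]
    for v in prioritize(candidates):
        print(f"{v.name}: {v.score:.2f}")
```

A shortlist like this narrows down where to look first; it does not replace clinical or laboratory confirmation.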
Limitations of the System
Systems like AlphaMissense can only assist with diagnosis; they cannot make one on their own. Missense mutations are just one type of mutation, and other types, such as additions, deletions, duplications, and rearrangements, can also occur. Additionally, not all disease-causing mutations alter proteins; some affect nearby sequences involved in gene regulation.