AI Breakthrough Decodes ‘Dark Matter’ of DNA with AlphaGenome


For decades, scientists have struggled to understand the vast stretches of non-coding DNA within our genomes, often called “junk DNA” because it doesn’t directly build proteins. Now, Google DeepMind has unveiled AlphaGenome, a new artificial intelligence model designed to predict how this mysterious genetic material influences health and disease. The model marks a notable step forward in genomics and could help researchers work out how mutations in these regions change gene expression.

The Genome’s Hidden Language

The human genome is overwhelmingly non-coding: more than 98% of it is DNA that doesn’t translate into proteins. Yet this “dark matter” regulates gene activity, helping determine whether cells function correctly or become diseased. Predicting how these non-coding regions work has long been the central challenge.

AlphaGenome analyzes DNA sequences up to one million base pairs long, assessing how mutations alter gene expression. This is a crucial step because genetic diseases often stem from changes in these non-coding regions, not just the protein-coding genes. The model’s creators have made it freely available to the wider research community.

Building on AI Successes

AlphaGenome builds on DeepMind’s earlier breakthroughs in AI-powered biology. First came AlphaFold, which accurately predicts protein structures from amino acid sequences, earning its developers a Nobel Prize in Chemistry in 2024. Then came AlphaMissense, focusing on protein-coding mutations. AlphaGenome extends this power to the vast non-coding space, tackling a previously intractable problem.

“It’s like you have a huge book of three billion characters, and something wrong happened in this book,” explains Pushmeet Kohli, DeepMind’s VP of science. “AlphaGenome can be used to say, ‘If you change these words, what would be the effect?’”

How AlphaGenome Works

The model combines data from multiple sources related to gene expression, identifying patterns and predicting the functional consequences of DNA changes. A key innovation is its ability to handle extremely long DNA sequences without sacrificing accuracy, a trade-off that limited previous tools. This means researchers can study entire regulatory regions at once, rather than piecing together fragmented data.
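The general idea of scoring a DNA change, comparing a prediction for the original sequence with a prediction for the mutated one, can be sketched in a few lines of Python. This is a toy illustration only: `toy_expression`, the `TATAAA` motif, and all function names are hypothetical stand-ins invented for this example, not AlphaGenome's actual model or API.

```python
# Toy sketch of variant-effect scoring: predict(mutated) - predict(reference).
# Everything here is a hypothetical stand-in, NOT the AlphaGenome API.

MOTIF = "TATAAA"  # assumed regulatory motif, chosen purely for illustration


def toy_expression(seq: str) -> float:
    """Fake 'expression' score: the number of times MOTIF appears in seq."""
    width = len(MOTIF)
    return float(sum(1 for i in range(len(seq) - width + 1)
                     if seq[i:i + width] == MOTIF))


def apply_variant(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return seq with the ref allele at 0-based position pos replaced by alt."""
    if seq[pos:pos + len(ref)] != ref:
        raise ValueError("reference allele does not match the sequence")
    return seq[:pos] + alt + seq[pos + len(ref):]


def variant_effect(seq: str, pos: int, ref: str, alt: str,
                   predict=toy_expression) -> float:
    """Predicted change in expression caused by the variant."""
    return predict(apply_variant(seq, pos, ref, alt)) - predict(seq)


reference = "GGCTATAAAGGC"  # made-up sequence containing one motif copy

# A change inside the motif removes it, so the toy score drops.
print(variant_effect(reference, 5, "T", "G"))   # -1.0

# A change outside the motif leaves the toy score unchanged.
print(variant_effect(reference, 10, "G", "A"))  # 0.0
```

A real model replaces the motif counter with a learned predictor and works on far longer sequences, but the comparison between reference and altered sequence follows the same pattern.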

Potential Applications

AlphaGenome isn’t ready for clinical use yet, but its potential research applications are broad:
Understanding Disease: Pinpointing mutations that drive genetic diseases, including cancer.
Gene Therapy: Designing more effective treatments by targeting the correct regulatory regions.
Genome-Wide Studies: Analyzing how genomes regulate genes in different cells and tissues.
Rare Conditions: Helping diagnose rare genetic disorders where the underlying mutations are unknown.

“For all the best evaluations we have, AlphaGenome looks like they pushed [the field] forward a little bit,” says David Kelley of Calico Life Sciences.

Caveats and Future Steps

AlphaGenome has limitations. It was trained on human and mouse genomes only and may miss effects in other species. The model is also imperfect; it might predict no effect when one exists. Researchers are working to improve predictive power and quantify uncertainty.

“Predicting how a disease manifests from the genome is an extremely hard problem, and this model is not able to magically predict that,” says Žiga Avsec, DeepMind’s genomics lead. “But AlphaGenome can narrow down the pool of possible mutations involved in a disease, making it useful for prioritizing research.”

AlphaGenome represents incremental but real progress. It won’t solve the mysteries of the genome overnight, but it offers a powerful new tool for unraveling the complexities of life’s blueprint.
