Genomics is arguably among the most computationally demanding fields of scientific research today. Even with today’s powerful hardware, analyzing the sequencing data for an entire genome can take days, which lengthens genetic research studies and clinical trials.
Now, researchers from Harvard University and Nvidia are working together to speed things up. The team published a study in the journal Nature Communications that details an artificial intelligence (AI) based toolkit called AtacWorks.
Using this tool, researchers are able to run inference on an entire genome in just half an hour. The same task usually takes more than two days using traditional methods.
AI typically excels at analyzing large datasets. Although the data produced by sequencing a genome is complex, it turns out that AI is also very good at cleaning it up.
The researchers designed AtacWorks to work with data from the well-established ATAC-seq method. Scientists use ATAC-seq to find open, accessible areas of a person’s DNA, the areas that activate certain functions in the body. Analyzing these areas can provide clues as to whether the person is susceptible to conditions such as cancer or heart disease.
Typically, ATAC-seq requires tens of thousands of cells to work. When fewer cells are available, the data is “noisier” than it would be for a large sample. With the AI-based AtacWorks toolkit, researchers are able to get clean results with “tens of cells.”
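To see why a small sample looks noisier, consider a toy simulation. This is only an illustrative sketch, not AtacWorks itself or the method from the study: the accessibility profile, the number of reads per cell, and the helper names `simulate_coverage` and `pearson` are all assumptions made up for the example. Each simulated cell contributes a handful of reads, and the pooled read counts are compared against the underlying profile.

```python
import random


def simulate_coverage(true_profile, n_cells, reads_per_cell, rng):
    """Pool reads from n_cells cells into a per-position coverage track.

    Each read lands at a position with probability proportional to the
    (hypothetical) accessibility value in true_profile.
    """
    positions = range(len(true_profile))
    coverage = [0] * len(true_profile)
    total_reads = n_cells * reads_per_cell
    for pos in rng.choices(positions, weights=true_profile, k=total_reads):
        coverage[pos] += 1
    return coverage


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Toy accessibility profile (an assumption): flat background with two peaks.
true_profile = [1.0] * 200
for i in range(40, 60):
    true_profile[i] = 8.0
for i in range(140, 150):
    true_profile[i] = 5.0

rng = random.Random(0)  # fixed seed so the run is repeatable
few = simulate_coverage(true_profile, n_cells=10, reads_per_cell=5, rng=rng)
many = simulate_coverage(true_profile, n_cells=10_000, reads_per_cell=5, rng=rng)

r_few = pearson(few, true_profile)
r_many = pearson(many, true_profile)
print(f"correlation with true profile, 10 cells:     {r_few:.2f}")
print(f"correlation with true profile, 10,000 cells: {r_many:.2f}")
```

With only a few cells, the pooled track follows the underlying profile far less closely than it does with thousands of cells. That gap is what a denoising model such as AtacWorks is trained to close.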
Study co-author Jason Buenrostro of Harvard University says, “With AtacWorks, we’re able to conduct single-cell experiments that would typically require 10 times as many cells.”
“Denoising low-quality sequencing coverage with GPU-accelerated deep learning has the potential to significantly advance our ability to study epigenetic changes associated with rare cell development and diseases,” he adds.
Needing fewer cells allows scientists to search for mutations that occur in rare cell types, for which traditional methods often cannot gather a large enough sample for sequencing.
Opening New Doors
AtacWorks is also helping researchers study entire cell subtypes that couldn’t previously be explored. The team was able to identify areas of DNA that are associated with specific cell types, such as red blood cells or white blood cells.
That’s an important breakthrough for the genomics community because it opens the door to discoveries of previously unknown mutations and biomarkers. These could then help identify certain diseases or aid in the development of new treatments.
Nvidia researcher and lead author Avantika Lal says, “With very rare cell types, it’s not possible to study differences in their DNA using existing methods. AtacWorks can help not only drive down the cost of gathering chromatin accessibility data, but also opens up new possibilities in drug discovery and diagnostics.”