CSIRO researchers crunched one trillion genomic data points in the cloud to help locate parts of the human genome that cause disease.
The CSIRO's bioinformatics group used VariantSpark, its own artificial intelligence (AI) platform, which runs on Amazon Web Services (AWS).
In a new study published in the technical journal GigaScience, the researchers outlined how they analysed a synthetic dataset of 100,000 individuals’ genomes, each made up of over three billion DNA base pairs.
Dr Denis Bauer, head of the bioinformatics group, said no other technology platform has yet been able to process one trillion genomic data points, made up of over 10 million variants across 100,000 samples, at once.
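The one trillion figure follows from the dimensions of the dataset. As a rough sketch, assuming one data point is one variant observed in one individual and treating the rounded counts quoted above as exact (the variable names are illustrative and not taken from the study):

```python
# Rounded figures quoted in the article; the study reports "over" 10 million variants.
variants = 10_000_000
samples = 100_000

# Assumption: one data point = one variant observed in one individual.
data_points = variants * samples

print(f"{data_points:,} data points")  # 1,000,000,000,000 data points
```

Because the variant count is a lower bound, the true total would be at least this large.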
Using AI platforms in this way will be essential for the future of healthcare in Australia, CSIRO’s Australian e-Health Research Centre chief executive Dr David Hansen added.
"Artificial intelligence is a critical component of understanding genomic information," Hansen said.
"Despite recent technology breakthroughs with whole genome sequencing studies, the molecular and genetic origins of complex diseases are still poorly understood which makes prediction, application of appropriate preventive measures and personalised treatment difficult."
This is because many traits and diseases are thought to be polygenic, or influenced by more than one gene, the GigaScience paper states.
VariantSpark was found to identify genomic variants associated with complex genetic expressions better than traditional monogenic genome-wide association studies.
"Our research shows VariantSpark is the only method able to scale to ultra-high dimensional genomic data in a manageable time," Bauer said.
"It was able to process this information in 15 hours while it would take the fastest competitors likely more than 100,000 years to process such a volume of data.
"This is a significant milestone, as it means VariantSpark can be scaled up to analyse population-level datasets and drive better healthcare outcomes."
The paper concluded that VariantSpark is not a replacement for traditional genetic association analysis, but rather a complement.
"The results of traditional GWAS [genome-wide association studies] and VariantSpark should be considered together to gain insights into the full influence of the genome on disease and other phenotypes," the authors wrote.