Genomic Sequences Processed in Minutes, Rather Than Weeks
RICHLAND, WA - A new computational tool developed at the Department of Energy’s Pacific Northwest National Laboratory is speeding up our understanding of the machinery of life, bringing us one step closer to curing diseases, finding safer ways to clean the environment and protecting the country against biological threats.
ScalaBLAST is a sophisticated “sequence alignment tool” that can divide the work of analyzing biological data into manageable fragments so large data sets can run on many processors simultaneously. The technology means large-scale problems, such as the analysis of an organism, can be solved in minutes, rather than weeks.
In the world of high-end computing, researchers assemble systems composed of many processors. For example, PNNL’s supercomputer has 1,960 processors, making it a big machine with lots of memory and the ability to tackle large problems. However, without special modifications, software doesn’t run any faster on it than it would on a personal computer. In order to get answers to complicated biological questions more quickly, PNNL researchers “parallelized” the software using Global Arrays, a powerful programming toolkit, by creating algorithms to divvy up the work.
PNNL researchers say ScalaBLAST may be used to process complex genomic sequences, work that is essential to understanding the building blocks of the genome, or rather, how they work and fit together. A genome represents an organism’s entire DNA, including its genes. When gene sequences are analyzed, they can provide clues to diseases and possible treatments.
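The idea of dividing the work can be sketched in a few lines of Python. This is only an illustration of the general approach, not ScalaBLAST’s actual algorithm, and it does not use Global Arrays. The function names (`split_into_chunks`, `toy_score`, `best_hits`, `parallel_search`) are hypothetical, and the scoring function is a deliberately simple stand-in for real sequence alignment. Threads are used here only to show the structure; a real tool would spread the chunks across separate processors.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous pieces whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def toy_score(query, target):
    """Hypothetical stand-in for alignment: count positions where the strings agree."""
    return sum(a == b for a, b in zip(query, target))


def best_hits(chunk, database):
    """For each query in the chunk, pair it with its best-scoring database sequence."""
    return [(q, max(database, key=lambda t: toy_score(q, t))) for q in chunk]


def parallel_search(queries, database, n_workers):
    """Give each worker one chunk of queries, then merge the results in order."""
    chunks = split_into_chunks(queries, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(lambda c: best_hits(c, database), chunks)
    return [hit for part in results for hit in part]
```

Because each chunk of queries is searched independently, adding workers shortens the wall-clock time without changing the answers, which is the property the parallelized software relies on.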
Using ScalaBLAST, researchers can manage the large influx of data resulting from new questions that arise during human genome research. Prior to this new tool, it took researchers 10 days to analyze one organism. Now, researchers can analyze 13 organisms within nine hours, making the time-to-solution hundreds of times faster.
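The “hundreds of times faster” figure can be checked with simple arithmetic. The short calculation below assumes that each of the 13 organisms would have taken the same 10 days under the earlier approach, which the figures above imply but do not state outright.

```python
HOURS_PER_DAY = 24

old_hours_per_organism = 10 * HOURS_PER_DAY   # 10 days per organism before
organisms = 13
new_total_hours = 9                           # all 13 organisms with ScalaBLAST

# Assumption: the earlier approach would need 10 days for each organism.
old_total_hours = organisms * old_hours_per_organism
speedup = old_total_hours / new_total_hours

print(f"Old approach: {old_total_hours} hours; speedup: about {round(speedup)}x")
```

Under that assumption, the earlier approach would need 3,120 hours for the same workload, so the improvement is roughly 350-fold.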
“Access to and understanding the pieces of genome sequences will allow researchers to understand the body’s cellular machinery and discover clues to some types of cancer. And it will help in developing drugs or detection methods to be used for particular diseases,” said T.P. Straatsma, a PNNL senior research scientist.
“And it likely will help in other areas of human health. It’s fair to say that, in the realm of human health and disease, if you can solve a problem in one area, you can often solve it in others - that’s the nature of human biology,” Straatsma said.
The ability to process large data sets with this computational tool can also provide new insight into how microorganisms break down toxic pollutants through processes like bioremediation. It can also help researchers understand the components of biological systems, leading to better detection methods for homeland security purposes and making it possible to more quickly identify and respond to threats or develop biological countermeasures.
ScalaBLAST is a product of PNNL’s Advanced Computing Technology Laboratory, supporting research projects associated with high-end computing. Development of ScalaBLAST was funded primarily by the Department of Energy’s Office of Advanced Scientific Computing Research as part of the BioPilot project, a larger joint research effort between PNNL and Oak Ridge National Laboratory.
PNNL (www.pnl.gov) is a DOE Office of Science laboratory that solves complex problems in energy, national security, the environment and life sciences by advancing the understanding of physics, chemistry, biology and computation. PNNL employs more than 4,000 staff, has a $650 million annual budget, and has been managed by Ohio-based Battelle since the lab’s inception in 1965.