Genome Folding and POWER8: Accelerating Insight and Discovery in Medical Research

By Richard Talbot, Director – Big Data, Analytics and Cloud Infrastructure

No doubt, the words “surgery” and “human genome” rarely appear in the same sentence. Yet that’s what a team of research scientists in the Texas Medical Center announced recently — a new procedure designed to modify how a human genome is arranged in the nucleus of a cell in three dimensions, with extraordinary precision. Picture folding a genome almost as easily as a piece of paper.

An artist’s interpretation of chromatin folded up inside the nucleus. The artist has rendered an extraordinarily long contour into a small area, in two dimensions, by hand. Credit: Mary Ellen Scherl.

This achievement, which appeared recently in the Proceedings of the National Academy of Sciences, was driven by a team of researchers led by Erez Lieberman Aiden, a geneticist and computer scientist with appointments at the Baylor College of Medicine and Rice University in Houston, and his students Adrian Sanborn and Suhas Rao. The news spread quickly across a broad range of major news sites. Because genome folding is thought to be associated with many life-altering diseases, the implications are profound. Erez said, “This work demonstrates that it is possible to modify how a genome is folded by altering a handful of genetic letters, without disturbing the surrounding DNA.”

Just beneath the surface, this announcement also represents a major computational achievement. Erez and his team have been using IBM’s new POWER8 scale-out systems packed with NVIDIA Tesla K40 GPU accelerators to build a 3-D visualization of the human genome and to model how the genome reacts to this surgical procedure.

The total length of the human genome is over 3 billion base pairs (a typical measure of the size of a human or mammalian genome) and the data required to analyze a single person’s genome can easily exceed a terabyte: enough to fill a stack of CDs that is 40 feet tall. Thus, the computational requirement behind this achievement is a grand challenge of its own.

POWER8 memory bandwidth and the high-octane computational horsepower of the NVIDIA Tesla Accelerated Computing Platform enabled the team to run applications that aren’t feasible on industry-standard systems. Aiden said that the discoveries were possible, in part, because these systems enabled his team to analyze far more 3-D folding data than they could before.

This high-performance cluster of IBM POWER8 systems, codenamed “PowerOmics”, was installed at Rice University in 2014 and made available to Rice faculty, students and collaborative research programs in the Texas Medical Center. The name “PowerOmics” was chosen to reflect the life sciences research mission of this compute and storage resource: the study of large-scale, data-rich fields such as genomics, proteomics and epigenomics. The infrastructure was made possible by a collaboration among OpenPOWER Foundation members Rice University, IBM, NVIDIA and Mellanox.

