IBM is working with the Coriell Institute for Medical Research to help the biobank operate the IT systems for its cryogenic freezers and better manage the 4.5 million samples of personalized genetic data.
Founded in 1953, Coriell is a nonprofit biomedical research institution and the largest biobank of living human cells. It runs the Coriell Personalized Medicine Collaborative research study, which aims to explore genome-informed personalized medicine.
"There’s an awful lot of information we can glean from someone’s genetic makeup to help us determine what drugs will work for that person or what complex conditions they might have a predilection for throughout the course of their life," Scott Megill, Coriell’s CIO, told eWEEK.
The biobank sought IBM’s help to manage its huge volumes of data at a cost a nonprofit could afford, Megill said.
DNA extracted from blood cultures is stored for many years. With one person’s genome equal to 2 million points of data and about 1.5GB of information, storing this mass of data is a challenge.
"The sheer volume of data that’s generated from genetic testing is unlike anything that we’ve seen before," Megill said. "It’s almost a terabyte of information that’s generated for one patient. It’s really incumbent on us to put good tools and infrastructure in place to simply make that actually comprehensible."
Customizing treatments based on an individual’s genes brings great potential for treatment of patients with conditions such as cancer, diabetes and heart disease. The data can better inform the decisions of physicians as they care for patients.
IBM monitoring software allows Coriell to protect its genetic samples from cryogenic freezer failures. By using the IBM XIV Storage System, Coriell has also reduced storage costs by 30 percent.
XIV is a thin-provisioning grid storage platform that allocates space on demand, Megill noted. Moving to thin provisioning and using XIV’s grid mode allowed Coriell to significantly reduce the amount of storage space it consumes.
"Every file is divided into very small blocks and then spread evenly across the entire bank of drives, rather than files that go deep on a couple of drives in the array," Megill explained. "It means you can get the same sort of throughput out of the XIV using much slower spinning disks," he added, referring to standard 7,200-rpm drives.
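The block-spreading idea Megill describes can be illustrated with a toy sketch. This is not IBM's placement algorithm: the round-robin assignment, the 1 MiB block size, the 12-drive count and the function names `stripe_blocks` and `blocks_per_drive` are all assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical block size (1 MiB), chosen only to keep the numbers readable.
BLOCK_SIZE = 1024 * 1024


def stripe_blocks(file_size: int, num_drives: int, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the drive index assigned to each block of a file.

    Uses simple round-robin placement as a stand-in for whatever
    distribution scheme a real grid storage system applies.
    """
    num_blocks = -(-file_size // block_size)  # ceiling division
    return [block % num_drives for block in range(num_blocks)]


def blocks_per_drive(placement: list[int], num_drives: int) -> list[int]:
    """Count how many blocks landed on each drive."""
    counts = Counter(placement)
    return [counts.get(drive, 0) for drive in range(num_drives)]


# A file of 1,500 blocks (roughly the 1.5GB genome mentioned above)
# spread across a hypothetical bank of 12 drives.
placement = stripe_blocks(file_size=1500 * BLOCK_SIZE, num_drives=12)
print(blocks_per_drive(placement, 12))
```

In this sketch every drive ends up holding 125 blocks, so a read of the whole file can draw on all 12 drives at once instead of queuing on one or two. That parallelism is the reason an evenly spread layout can offset slower individual disks.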
Original eWEEK article: IBM Helps Coriell Institute Manage Vast Store of Medical Data