Modern life science research and development technologies will soon be generating petabytes of data; budgeting for the necessary IT resources may slow the pace of discovery.
(Burbank, CA) February 19, 2018—Life sciences research technology is in the midst of a digital arms race. Although research is leading to important discoveries, the pace of data creation is far outstripping the ability to store and analyze it. A heavy contributor to this phenomenon is next-generation sequencing of DNA. Human whole genome data sets are typically hundreds of gigabytes in size. While current figures indicate that sequence data is doubling every seven to nine months, sequencing is still in its infancy. In 2014, an estimated 228,000 genomes were sequenced; that figure is now estimated to be over 1.6 million genomes. Jim D’Arezzo, CEO of Condusiv Technologies, the world leader in I/O reduction solutions for virtual and physical server environments, explains, “This analysis gap threatens to slow the pace of important discoveries by forcing research organizations to allot an increasing portion of budgets to sharing and preserving research data.”
Genomics is only part of the problem. The field of connectomics maps neural connections and pathways in the brain, using nanometer-resolution electron microscopy. Connectomics data sets, the largest of which are in the 100-terabyte range, are soon expected to reach petabyte scales, largely driven by faster, higher-resolution electron microscopes. Dr. Dorit Hanein of the Sanford Burnham Prebys Medical Discovery Institute, for example, says that a soon-to-be-installed new microscope will produce high-resolution images at a rate of 400 frames per second, ten times the speed of her current equipment.[1]
Other large-scale undertakings, such as the Blue Brain Project, the 100,000 Genomes Project, the National Institutes of Health Human Microbiome Project, the BRAIN Initiative and the Cancer Moonshot, will generate a total of hundreds of petabytes of data, and downstream analysis will generate even more. The burden of discovery in the life sciences is shifting from scientific methodologies to analytical frameworks and bioinformatics. To help facilitate this transition, universities such as Harvard have begun to offer data analytics courses, including programming, for career biologists.[2]
Not only foundational research organizations but also pharmaceutical and healthcare systems developers are being affected by the gap between data acquisition and analysis. Life sciences companies are now launching products at a more rapid rate, and in a greater number of therapy areas. As manufacturers focus on these innovative product launches, operating budgets remain strained by the simultaneous investments required, while their IT departments face an ongoing need to do more with limited resources. At the same time, the rapid and continuing evolution of technology puts additional pressure on IT organizations to deliver both innovation and efficiencies to their companies.[3]
In a sense, notes Condusiv’s D’Arezzo, medical research is simply coming up against a classic computing problem—the I/O bound state, in which the time it takes to complete a computation is determined principally by the period spent waiting for input/output operations to be completed.
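The I/O-bound state is straightforward to demonstrate. The short sketch below (an illustrative example only, not Condusiv code; the file size and chunk size are arbitrary) times a trivial computation over a scratch file. Because summing bytes takes far less time than reading them from storage, the elapsed time is dominated by waiting on read() calls, which is exactly the condition the term describes.

```python
import os
import tempfile
import time

def make_scratch_file(size_mb: int) -> str:
    """Write a throwaway file of size_mb megabytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for _ in range(size_mb):
            f.write(os.urandom(1024 * 1024))
    return path

def checksum_file(path: str, chunk_size: int = 64 * 1024) -> int:
    """Sum every byte in the file. The arithmetic is trivial, so nearly
    all of the elapsed time is spent waiting on reads: an I/O-bound task."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            total += sum(chunk)
    return total

if __name__ == "__main__":
    path = make_scratch_file(8)  # 8 MB scratch file
    start = time.perf_counter()
    checksum_file(path)
    print(f"checksum pass took {time.perf_counter() - start:.3f}s")
    os.remove(path)
```

For a task like this, buying a faster CPU changes almost nothing; only reducing or accelerating the I/O itself shortens the run, which is the point D’Arezzo is making about optimizing existing storage rather than adding compute.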
D’Arezzo points out that “data analysis is inherently slower than data acquisition. It can be made a great deal faster, however, not by throwing money at new hardware, but by optimizing the performance of existing servers and storage. We are the world leaders in this area, and we have seen users of our software solutions more than double the I/O capability of storage and servers in their current configurations. For life sciences researchers grappling with rapidly expanding data sets, I/O optimization technology represents a safe, reasonably priced, highly effective way to increase their ability to perform analytics.”
About Condusiv Technologies
Condusiv Technologies is the world leader in software-only storage performance solutions for virtual and physical server environments, enabling systems to process more data in less time for faster application performance. Condusiv guarantees to solve the toughest application performance challenges with faster-than-new performance via V-locity® for virtual servers or Diskeeper® for physical servers and PCs. With over 100 million licenses sold, Condusiv solutions are used by 90% of the Fortune 1000 and almost three-quarters of the Forbes Global 100 to increase business productivity and reduce data center costs while extending the life of existing hardware. Condusiv chief executive officer Jim D’Arezzo has had a long and distinguished career in high technology.
Condusiv was founded in 1981 by Craig Jensen as Executive Software. Jensen authored Diskeeper, which became the best-selling defragmentation software of all time. Over 37 years, he has built on that leadership in file system management and caching to create enterprise software. For more information, visit www.condusiv.com.
1. Hiatt, David, “The Next Digital Arms Race in Life Sciences,” Bio-IT World, August 23, 2017.
2. “Data Analysis for Life Sciences,” edX Courses, Harvard University, 2018.
3. Aitken, Murray, “Why life sciences companies need to tap technology to gain a competitive edge—and what that means for the chief information officer’s role,” Mhealth, March 23, 2016.