
iPlant taming the 'Big Data Beast' - The University of Arizona-headquartered iPlant Collaborative is ready to handle the projected rise in genomic information, turning raw data into scientific breakthroughs


Arizona, USA
July 23, 2015

Eric Lyons, the iPlant Collaborative co-principal investigator, is addressing the challenge presented by the projected rise in scientific data. (Photo: Judy Davis)

In an era of unprecedented scientific discovery, researchers are generating more data than ever before. But do scientists have access to enough technological firepower to turn this mountain of data into tangible results?

Many biologists worry that the future rise in genomic data will strain the computational resources of the discipline beyond its capacity to store, analyze and distribute large datasets. However, University of Arizona assistant professor and iPlant Collaborative co-principal investigator Eric Lyons is much more optimistic.

"We are ready to meet this challenge today," Lyons said.

The UA-headquartered iPlant Collaborative is a National Science Foundation-funded cyberinfrastructure project providing computational support to life science researchers in the form of secure data storage, services for data analysis, and the underlying infrastructure to share datasets among collaborators anywhere in the world with an Internet connection.

"Currently, we are managing 1.1 petabytes of user data, broken into 88 million data objects with associated metadata. More than half of these datasets are shared among two or more users," Lyons said. "People can input huge datasets and easily share and collaborate from anywhere on the planet."

In addition to helping direct the iPlant Collaborative, Lyons is an assistant professor in the UA's College of Agriculture and Life Sciences and a member of the BIO5 Institute.

The authors of a recent study published in PLoS Biology discuss the technologies they predict will be needed to address future computational challenges posed by genomics, including tools for data acquisition, storage, distribution and analysis.

The article's argument is timely, Lyons said.

"We’re generating data on a massive scale because the technology to generate it is becoming faster, cheaper and more available," he said.

The iPlant Collaborative, founded in 2008, has established an infrastructure to handle the projected rise in genomic data and any large datasets inherent to the life sciences, while providing an array of platforms, services, tools and training resources, empowering data scientists in all disciplines.

Originally created for plant science research, iPlant borrowed concepts and technologies from other disciplines, including astronomy and physics. Now, Lyons said, "those groups are looking to see what iPlant has done in terms of broadening inner connections so they can take best practices we've pioneered addressing biological problems back to their communities, thereby completing the cycle of scientific sharing."

When it comes to the problem of data storage, Lyons believes that storing processed data may be more feasible, and more valuable to science in the future, than attempting to store all raw data generated. Michael Schatz, an associate professor of quantitative biology at Cold Spring Harbor Laboratory, adjunct assistant professor of computer science at Stony Brook University and one of the PLoS Biology study authors, agrees with Lyons.

"In genomics, it’s going to be really important to think about aspects such as data compression, and the different analyses that need to take place," Schatz said. "I think that's where iPlant is going to play a role, providing data storage, compute resources and technological interfaces — making massive amounts of computer resources usable to specialists in high performance computing."

And if dataset numbers and sizes, in genomics and other disciplines, skyrocket in the future as predicted?

"We're always keeping our eyes on the future," Lyons said. "We realize that in life sciences, the problem of big data is immense. iPlant has worked, and will continue working, with existing cyberinfrastructure projects regardless of the science discipline and technologies used, to be prepared for the future."





Website: http://www.arizona.edu/

Published: July 24, 2015

