
Building solutions for bioinformatics from the ground up


Norwich, United Kingdom
March 20, 2014

Following Jason Williams's visit from New York to TGAC last month for his seminar “iPlant Collaborative - A Unified Cyberinfrastructure for the New Life Science”, we speak to him about how the virtual organisation is benefiting life scientists.

Could you tell us about iPlant Collaborative and its "unified cyberinfrastructure" for new life science?
The iPlant Collaborative is a virtual organisation created by the U.S.’s National Science Foundation (NSF) to build a cyberinfrastructure for plant sciences. It’s become much more than that, but first I should explain the concept of cyberinfrastructure. Cyberinfrastructure (or CI) is the organising of data storage, high-performance computing, software, and people into a system capable of solving the largest computational problems in biology, especially problems that could not be solved otherwise. Some of the “Big Data” challenges that would come to mind for most biologists include things like genome sequencing and assembly. Those are big challenges, but iPlant’s vision reaches further. We want to enable biologists not only to understand individual pieces of the puzzle (for example, genomic data, environmental data, phenotypes, etc.) but also to have the capacity to merge these data into an understanding of the underlying biology and to predict networks – molecular, metabolic, ecological, and beyond. The capacity we are building comes at a time when the analysis of data (rather than its acquisition) is the bottleneck. What used to take a decade to acquire (for example, the human genome) can now be done in hours. As impressive as this is, there is still a great need not only for the technological capacities of CI, but also for the people capacities – user-friendly interfaces to these analysis platforms, and training on how to use them.

What brought your seminar to the UK?
I had the pleasure of being invited by Vicky Schneider and Rubina Kalra. I am part of iPlant’s Education, Outreach, and Training effort, and it was only natural that we would want to collaborate given our shared passion for helping researchers, students, and the public to understand the exciting changes going on in genomics and other areas of biology.

How did you get involved with the National Science Foundation and iPlant Open Source?
I attended the iPlant kick-off meeting at Cold Spring Harbor Laboratory in 2008. While I did not get involved with iPlant at the time, when I later joined the laboratory’s DNA Learning Center, I became part of the team that was developing iPlant’s strategy for reaching educators and students with tools that would allow them to leverage the CI we’ve built.

As an advocate of cyberinfrastructure being accessible to all life scientists, which future collaborative projects do you anticipate?
One of the themes of my talk is an observation that the transition to a data-driven biology has some parallels with the development of modern physics. By necessity, physicists have to congregate around computation, as well as increasingly large and expensive apparatuses (e.g. the Large Hadron Collider, LHC). Biologists are different in that we are moving away from central equipment resources, and can sequence genomes in our own labs. This means that facilities like genome centres may take on a new role of pushing the boundaries of technologies and developing best practices. Additionally, increasingly demanding computational challenges will still push biologists to merge and share those resources. This is why iPlant exists, and why future collaborations will mean sharing and disseminating lessons about CI with global partners. In this process we will find ways to avoid duplication of effort, and provide better capabilities for users.

How will iPlant's renewal last year benefit the "Big Data" needs of the science community?
iPlant’s renewal is a tremendous “proof of concept” that reflects well on the members of the iPlant team itself, as well as the vision of the NSF. We were able to build a well-adopted platform that has enabled researchers to analyse and share data in ways that were not possible when the project began. The next few years will not only grow our capabilities, but also allow us to learn from, and contribute to, the fostering of uniform community practices. Bioinformatics as a domain of life science has developed organically, and thus in some ways “messily.” Many file formats are mutually incompatible or arcane, software and platforms can become orphaned, and duplicating analysis pipelines can be impossible. I think iPlant provides a context where we can develop in a way that deliberately serves a vision that is rooted in community feedback. I hope one of the objectives iPlant achieves is to make bioinformatics more uniform, more reproducible, and more accessible.

Who do you see as iPlant's new generation of biologists set to address the "Grand Challenges" of plant biology?
As always, the next generation lies just behind us, perhaps in a second-year biology class getting ready for finals. We want these biologists to come to the lab already understanding how to approach biological problems with computational thinking – understanding how, when, and to what end their wet-lab investigations will enter and be manipulated by computation. There is, though, a subtle danger here, I think. We are also going through a point in this generation’s interaction with technology where much of what is going on computationally is completely obscured; the average undergraduate knows how to use Twitter, but has no idea what an API (Application Programming Interface) is. I think this generation will also need a more formal grounding in basic computational theory. They need to learn from the current biologists who have “paid their dues” by building solutions to these bioinformatics problems from the ground up, and who also have a tremendous wealth of knowledge (and instinct) for their organism or system.

What is the biggest achievement of iPlant Collaborative in advancing life science to date?
Categorising iPlant’s achievements has always been an exercise in the “non-traditional” measures for success; cited publications are not the only measure of how we enable science. While there are dozens of projects we have made possible, from my perspective as an educator, the most important success is when one of our workshop attendees is able to achieve their analysis goals. In the end, I don’t really think it has so much to do with anything I demonstrated; rather, I think it’s the quality of the system we’ve built. When a new user feels confident about her or his ability and finally has the tools needed, she or he also gets a feeling of control over the data – no more need for a bioinformatics “black box.”

TGAC in the future: your one wish? Or vision?
TGAC in the future? I hope that TGAC continues to push the development of new analyses for sequence data. We still have a lot to learn about what data is important (and what is noise). While there will always be new biology to discover, the faster that we can take genome (transcriptome, proteome) data and turn that into decisions about medicine and agriculture, the better we will be able to keep up with quickly accelerating challenges in health and climate change.

Jason Williams is iPlant’s Education, Outreach, and Training Lead. Based at Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, he has a background in plant molecular biology. Jason is also an Educator at CSHL’s DNA Learning Center and faculty at Yeshiva University, where he runs a science immersion course at Yeshiva University High School for Girls.

Jason Williams @ TGAC: iPlant Collaborative - A Unified Cyberinfrastructure for the New Life Science





Website: http://www.earlham.ac.uk

Published: March 21, 2014

