SeedQuest - Central information website for the global seed industry

Searching for similarity with Basic Local Alignment Search Tool (BLAST)

Norwich, United Kingdom
December 23, 2014

The Genome Analysis Centre (TGAC) welcomed participants to its one-day training workshop on biological analysis through sequence similarity searching - “Effective Similarity Searching - What BLAST does, Why it works”.

To advance our understanding of genes, how they function and how they have evolved, it can be highly beneficial to search for sequence similarities across diverse DNA and protein samples. Similarities can, for instance, be used to infer evolutionary relationships and help build phylogenetic trees, or to locate genes known to be involved in the development of a desired genetic trait, such as disease resistance. Conducting these comparisons is a complex process and requires bioinformatics tools such as the Basic Local Alignment Search Tool (BLAST).

BLAST threshold

Often the programme of choice for researchers, BLAST runs an algorithm that identifies regions of nucleotides in DNA, or amino acids in proteins, whose similarity exceeds a chosen threshold. As well as altering this threshold, researchers can select which of the available databases of ‘target sequences’ the sequence of interest, or ‘query sequence’, will be compared against. By analysing the results of these searches, researchers can gain insight into an array of biological information.
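To make the idea of a similarity threshold concrete, here is a minimal sketch in Python. It is not BLAST's own method: BLAST uses a faster heuristic, whereas this sketch swaps in the exhaustive Smith-Waterman local alignment score. The function names, the toy database and the match, mismatch and gap values are all assumptions chosen for illustration.

```python
def local_alignment_score(query, target, match=2, mismatch=-3, gap=-5):
    """Best Smith-Waterman local alignment score (linear gap penalty).

    The scoring values are illustrative assumptions, not BLAST defaults.
    """
    prev = [0] * (len(target) + 1)  # previous row of the scoring table
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def hits_above_threshold(query, database, threshold):
    """Return {name: score} for target sequences scoring at or above the threshold."""
    hits = {}
    for name, sequence in database.items():
        score = local_alignment_score(query, sequence)
        if score >= threshold:
            hits[name] = score
    return hits


# Hypothetical toy database of 'target sequences'.
toy_database = {
    "target_with_match": "TTACGTACGTTT",
    "unrelated_target": "GGGGGGGG",
}
print(hits_above_threshold("ACGTACGT", toy_database, threshold=10))
```

Raising or lowering `threshold` changes which target sequences are reported, which is the trade-off between sensitivity and noise that the paragraph above describes.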

Evolutionary context

The workshop, led by Professor William Pearson, University of Virginia, explored the biological and statistical concepts that make similarity searching a valuable tool. The group discussed the nature of homology (shared ancestry of sequences), the powerful insights that it can provide and how similarity scores can be used to infer its existence. Participants further explored strategies that will allow them to apply BLAST in an effective and tailored manner to their future research. This included understanding the effects of modifying the BLAST search parameters and appreciating when it can be beneficial to do so.

“To get a better sense of what BLAST does well and when it might not do things as well, it’s really important to put it in an evolutionary context - to think about the biology that is causing this to work; you need to understand that you’re always moving back and forth on an evolutionary tree,” said Professor William Pearson, Biochemistry and Molecular Genetics, University of Virginia.

Scoring matrices

Attendee Ben White, PhD student at TGAC, commented: “I was interested in learning more about BLAST as I’ll be working on data from unreferenced crop species; looking for markers associated with resistant genes. The course made me rethink a lot of what I’d been previously taught about BLAST, and FASTA; highlighting the importance of doing simple things like selecting appropriate databases and thinking about the scoring matrices being used. In my research, I’ll be sure to follow the key principles from the course of using protein databases and expectation values, not percent identity, for my future similarity searches with BLAST/FASTA.”
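The preference for expectation values can be illustrated with the standard Karlin-Altschul relation, E = K · m · n · e^(−λS), where S is the raw alignment score, m the query length and n the database length. The Python sketch below is a simplified illustration only: it ignores the effective-length corrections that BLAST applies, and the λ and K values are assumed placeholders (roughly those commonly quoted for BLOSUM62 with gap penalties 11/1) that should be checked against real BLAST output before being relied on.

```python
import math

# Assumed statistical parameters, for illustration only.
LAMBDA = 0.267
K = 0.041


def bit_score(raw_score, lam=LAMBDA, k=K):
    """Normalise a raw alignment score into bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance alignments scoring at least raw_score.

    Simplified: uses raw lengths with no effective-length correction.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# The same alignment score becomes less significant as the database grows,
# which percent identity alone cannot express.
raw = 120
for db_len in (1_000_000, 100_000_000):
    e = expect_value(raw, query_len=300, db_len=db_len)
    print(f"database length {db_len:>11,}: bits = {bit_score(raw):.1f}, E = {e:.2e}")
```

Because the expectation value scales with the size of the search space, it reflects the choice of database directly, one reason why selecting an appropriate database matters as much as the score itself.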

Many thanks to all those who attended and facilitated this workshop.

More solutions from: Earlham Institute

Website: http://www.earlham.ac.uk

Published: January 8, 2015