
TGAC scientists release NextClip: a bioinformatics tool for complex post-sequence data analysis


Norwich, United Kingdom
January 7, 2014

Scientists from The Genome Analysis Centre (TGAC) have published a research paper describing a tool for analysing data from Long Mate Pair (LMP) libraries. The software, NextClip, generates a comprehensive quality report and extracts high-quality, trimmed and de-duplicated data.

The tool supports Illumina’s recently released Nextera LMP kit, which enables the production of jumping libraries of up to 12 kb. These Long Mate Pair libraries are an invaluable resource for analysing large regions of the genome, carrying out complex assemblies and performing other downstream bioinformatics analyses. However, LMP libraries are intrinsically noisy, and post-sequencing data analysis is required to maximise their value.

Author Richard Leggett, at TGAC, said: “Regulating laboratory protocols and selection of sequenced data for downstream analysis are vital in making effective use of mate pair libraries. However, quality control of the libraries can require significant bioinformatics analysis. Further processing is also required to extract true mate pair reads, remove fragment junction adaptors and clip reads. For this reason we developed NextClip, a tool for comprehensive quality analysis of Nextera LMP libraries and preparation of reads for scaffolding.”

Mate pair libraries are formed by generating large fragments of DNA (5-12 kb in length for Nextera), which are then sequenced from both ends to produce two reads separated by a known distance.
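The adaptor removal, clipping and de-duplication steps mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not NextClip's actual implementation: the adaptor string is a made-up placeholder rather than the real Nextera junction adaptor, the minimum length threshold is arbitrary, and the function names are hypothetical. It also matches the adaptor exactly, whereas a real tool would likely need to tolerate sequencing errors.

```python
# Placeholder adaptor for illustration only; NOT the real Nextera junction adaptor.
JUNCTION_ADAPTOR = "ACGTTGCA"
# Hypothetical minimum usable read length after clipping.
MIN_LENGTH = 5


def clip_read(read, adaptor=JUNCTION_ADAPTOR):
    """Return the part of the read before the first exact adaptor match.

    Reads that do not contain the adaptor are returned unchanged.
    """
    position = read.find(adaptor)
    return read if position == -1 else read[:position]


def process_pair(read1, read2, adaptor=JUNCTION_ADAPTOR, min_length=MIN_LENGTH):
    """Clip both reads; keep the pair only if both remain long enough."""
    clipped1 = clip_read(read1, adaptor)
    clipped2 = clip_read(read2, adaptor)
    if len(clipped1) < min_length or len(clipped2) < min_length:
        return None
    return clipped1, clipped2


def deduplicate(pairs):
    """Drop exact duplicate pairs, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


if __name__ == "__main__":
    raw_pairs = [
        ("GGGTTTAAACCCACGTTGCATTTT", "CCCAAATTTGGG"),
        ("GGGTTTAAACCCACGTTGCATTTT", "CCCAAATTTGGG"),  # exact duplicate
        ("GGACGTTGCAAAAAAAA", "CCCAAATTTGGG"),  # read 1 too short once clipped
    ]
    processed = [process_pair(r1, r2) for r1, r2 in raw_pairs]
    kept = deduplicate([pair for pair in processed if pair is not None])
    print(kept)
```

In this toy input, the second pair is discarded as a duplicate and the third because its first read is too short after clipping, so a single clipped pair remains.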

Sequence reads from Long Mate Pair libraries are an important tool in the construction of complex genome assemblies because they can span large repeat regions. Using the data generated from mate pair library sequencing together with shorter-insert paired-end reads is a powerful combination, allowing longer DNA sequences to be joined together with higher certainty.

The study was led by Richard Leggett, Project Leader in the Sequencing Informatics group, with Director of TGAC Mario Caccamo, Bioinformatics Assembly Algorithms Development Project Leader Bernardo Clavijo, Library Construction Team Leader Leah Clissold and Plant and Microbial Genomics Group Leader Matthew Clark. The article was published in Oxford Journals’ Bioinformatics.





Website: http://www.earlham.ac.uk


