DNA Sequences Faster

DNA sequences resolved in minutes as opposed to days

Method tames giant bioinformatics database

Summary:
Database searches for DNA sequences that can take biologists and medical researchers days can now be completed in a matter of minutes, thanks to a new search method developed by computer scientists.
FULL STORY

Database searches for DNA sequences that can take biologists and medical researchers days can now be completed in a matter of minutes, thanks to a new search method developed by computer scientists at Carnegie Mellon University.

The method, developed by Carl Kingsford, associate professor of computational biology, and Brad Solomon, a Ph.D. student in the Computational Biology Department, is designed for searching so-called “short reads,” the DNA and RNA sequences generated by high-throughput sequencing techniques. It relies on a new indexing data structure, called Sequence Bloom Trees, or SBTs, that the researchers describe in a report published online by the journal Nature Biotechnology.

The National Institutes of Health maintains a huge database, called the Sequence Read Archive, which contains about three petabases, or sequences totaling three quadrillion base-pairs. The information is useful to a wide range of researchers, from those asking questions about basic biological processes to those studying potential cancer cures.

“The database contains untold numbers of as-yet undiscovered insights and is heavily used,” Kingsford said. “Its main problem is that it’s very difficult to search.” Thousands of hard drives would be needed to store these sequences. Searching through the short reads, which are typically 50 to 200 base-pairs each, to see which ones could be assembled to form a target gene of perhaps 10,000 base-pairs, is cumbersome and can take days in some cases, he noted.

Just as an index can speed searches through a book or catalog, the SBT-based index developed by Kingsford and Solomon can greatly speed up searches of this bioinformatics database. The method represents each short read as a set of fixed-length subsequences, employing data structures called Bloom filters that can efficiently store information in a small space and can test whether an element is part of a set.
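To make the idea concrete, here is a minimal Python sketch of a Bloom filter holding the fixed-length subsequences (often called k-mers) of one short read. It is an illustration only, not the researchers' implementation: the names kmers and BloomFilter, the filter size, the number of hash functions, the use of SHA-256 and the subsequence length of 4 are all assumptions chosen for readability.

```python
import hashlib


def kmers(sequence, k):
    """Return the set of all length-k substrings (k-mers) of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


class BloomFilter:
    """Toy Bloom filter: a bit array plus several hash functions.

    num_bits and num_hashes are arbitrary illustrative values.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # True may occasionally be a false positive;
        # False always means the item was never added.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical short read, far shorter than the 50 to 200 base-pairs of real reads.
read = "ACGTACGTTAGC"
bf = BloomFilter()
for kmer in kmers(read, k=4):
    bf.add(kmer)

print("ACGT" in bf)  # every k-mer that was added is always reported as present
```

The space saving comes from storing only bits rather than the subsequences themselves. The price is that a membership test can sometimes answer yes for a subsequence that was never added, but it never answers no for one that was.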

At the first level of inquiry, the SBTs can tell whether a target DNA sequence is contained in the database at all. If it is, the search proceeds to the next level, where the SBTs indicate whether the sequence is in one half or the other of the database. At each level, the inquiry branches one way or the other until the desired experiments are identified.
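The level-by-level search described above can be sketched as a walk down a binary tree. The following Python example is a simplified illustration under several assumptions: plain Python sets stand in for Bloom filters, the names Node, leaf, merge and search are hypothetical, the experiments and reads are made up, and the threshold parameter (the fraction of query subsequences that must be present) is an illustrative choice rather than a detail taken from the article. In this sketch, a query that matches both halves of the database follows both branches.

```python
from dataclasses import dataclass
from typing import Optional


def kmers(sequence, k):
    """Return the set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


@dataclass
class Node:
    kmer_set: frozenset             # stand-in for a Bloom filter
    name: Optional[str] = None      # set only on leaves (experiments)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def leaf(name, reads, k):
    """Build a leaf holding every k-mer found in one experiment's reads."""
    all_kmers = set()
    for read in reads:
        all_kmers |= kmers(read, k)
    return Node(frozenset(all_kmers), name=name)


def merge(left, right):
    """Build an internal node whose set is the union of its children's sets."""
    return Node(left.kmer_set | right.kmer_set, left=left, right=right)


def search(node, query_kmers, threshold=1.0):
    """Return the names of experiments that contain enough of the query k-mers."""
    hits = sum(1 for km in query_kmers if km in node.kmer_set)
    if not query_kmers or hits < threshold * len(query_kmers):
        return []  # prune: nothing below this node can match
    if node.name is not None:
        return [node.name]
    return (search(node.left, query_kmers, threshold)
            + search(node.right, query_kmers, threshold))


K = 4  # illustrative subsequence length
blood = leaf("blood", ["ACGTACGT", "TTGACCA"], K)
breast = leaf("breast", ["GGGTTTCCC"], K)
brain = leaf("brain", ["ACGTACGA", "CCCAAAT"], K)
root = merge(merge(blood, breast), brain)

matches = search(root, kmers("CGTACG", K))
print(matches)  # ['blood', 'brain']
```

Because an internal node summarizes everything beneath it, a single failed test at that node rules out an entire half of the remaining database, which is where the time saving comes from.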

Kingsford and Solomon tested their technique using a database of 2,652 human blood, breast and brain experiments, each of which often contains over a billion base-pairs of RNA sequences. They found that most searches of that database could be completed in an average of 20 minutes. They estimated that comparable searches using existing techniques, known as SRA-BLAST and STAR, would take 2.2 days and 921 days, respectively. Further speedups are possible because batches of over 200,000 queries can be performed simultaneously, they noted.
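To put those figures on a common scale, the short calculation below converts the estimated baseline times to minutes and divides by the 20-minute average. The ratios are derived here from the numbers quoted above and are not reported in the article itself; the variable names are illustrative.

```python
MINUTES_PER_DAY = 24 * 60

sbt_minutes = 20  # average search time quoted for the SBT index
baseline_days = {"SRA-BLAST": 2.2, "STAR": 921}  # estimated times quoted above

# Ratio of each baseline's estimated time to the SBT average.
speedups = {
    name: days * MINUTES_PER_DAY / sbt_minutes
    for name, days in baseline_days.items()
}

for name, factor in speedups.items():
    print(f"{name}: about {factor:,.0f} times slower than the SBT search")
```

On these numbers the SBT search is roughly 160 times faster than the SRA-BLAST estimate and roughly 66,000 times faster than the STAR estimate.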

