THESES db: The algae 18S rDNA sequence-structure database for inferring phylogenies

Research output: Contribution to journal, article in scientific journal

THESES db: The algae 18S rDNA sequence-structure database for inferring phylogenies. / Rodrigues, Maria Valentina Marin; Müller, Tobias; Buchheim, Mark Alan; Canbäck, Björn; Wolf, Matthias.

In: Phycologia, Vol. 56, No. 2, 2017, pp. 186-192.

RIS

TY - JOUR

T1 - THESES db

T2 - Phycologia

AU - Rodrigues, Maria Valentina Marin

AU - Müller, Tobias

AU - Buchheim, Mark Alan

AU - Canbäck, Björn

AU - Wolf, Matthias

PY - 2017

Y1 - 2017

N2 - The use of 18S rDNA sequences for inferring phylogenies, in particular for higher taxonomic level analysis, has a long tradition in phycology. Similar to ITS2, the 18S rDNA displays a conserved secondary structure that could be used simultaneously with the primary sequence to increase the amount of information used when inferring phylogenetic relationships. Sequence-structure phylogenetics is already established for ITS2 research. Secondary structures no longer simply guide alignments and trees but are used simultaneously by encoding the sequence-structure information into a 12-letter alphabet. We used the knowledge gathered from the extensive body of ITS2 research regarding sequence-structure phylogenetics and applied it to 18S rRNA data; we present THESES db, the Algae 18S rDNA Sequence-Structure Database (http://mbio-serv2.mbioekol.lu.se/THESESdb), which contains sequences and their individual secondary structures for three major groups of algae (Chlorophyta, Bacillariophyta and Rhodophyta). This database was designed to serve as the starting point for future 18S rDNA sequence-structure based phylogenetic analyses that will eventually extend beyond phycology. One hundred phylogenetic trees generated from 18S sequence-only datasets and from parallel 18S sequence-structure datasets were compared for each taxon analyzed in this study (diatoms, green algae and red algae). Half of the comparisons produced trees with different topologies that frequently related to the status of sister genera. Using the lineage information for each species as listed in GenBank, we determined that the sequence-structure approach resolved a genus as monophyletic, while the sequence-only approach failed to do so in comparisons that comprised 3% of the cases examined. The reverse was true for a total of 8.3% of the comparisons that we generated. Future work, both in our labs and among the broader phycological community, will provide additional data to test the accuracy and robustness of a sequence-structure approach at different taxonomic ranks.

KW - 4SALE

KW - Alignment

KW - Bacillariophyta

KW - Chlorophyta

KW - Diatoms

KW - Green algae

KW - Homology modelling

KW - Internal transcribed spacer 2

KW - ITS2

KW - Red algae

KW - Rhodophyta

KW - RNA

KW - Substitution model

UR - http://www.scopus.com/inward/record.url?scp=85010618253&partnerID=8YFLogxK

U2 - 10.2216/16-71.1

DO - 10.2216/16-71.1

M3 - Article

VL - 56

SP - 186

EP - 192

JO - Phycologia

JF - Phycologia

SN - 0031-8884

IS - 2

ER -
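
The abstract mentions encoding sequence-structure information into a 12-letter alphabet. One way to read that number is as 4 nucleotides times 3 structural states (unpaired, opening a base pair, closing a base pair). The sketch below illustrates this idea only; it assumes structures are given in dot-bracket notation, and the letters A to L, the `ENCODE` table and the `encode` function are hypothetical names chosen for illustration, not the symbols or API used by THESES db or any alignment tool.

```python
from itertools import product

NUCLEOTIDES = "ACGU"   # RNA alphabet; T is mapped to U below
STATES = ".()"         # dot-bracket: unpaired, opening pair, closing pair

# Hypothetical symbol table: one arbitrary letter per (nucleotide, state) pair.
SYMBOLS = "ABCDEFGHIJKL"
ENCODE = {pair: SYMBOLS[i] for i, pair in enumerate(product(NUCLEOTIDES, STATES))}


def encode(sequence: str, structure: str) -> str:
    """Merge a sequence and its dot-bracket structure into one 12-letter string."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    rna = sequence.upper().replace("T", "U")
    # Characters outside ACGU or .() (e.g. ambiguity codes) raise KeyError here.
    return "".join(ENCODE[(nt, st)] for nt, st in zip(rna, structure))


if __name__ == "__main__":
    print(len(ENCODE))                   # number of distinct symbols
    print(encode("GCAUGC", "((..))"))    # one symbol per aligned position
```

Under this scheme each position carries both its base and its pairing state, so a substitution model over 12 states can, in principle, treat a paired G differently from an unpaired G.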