THESES db: The algae 18S rDNA sequence-structure database for inferring phylogenies

Research output: Contribution to journal › Article


title = "THESES db: The algae 18S rDNA sequence-structure database for inferring phylogenies",
abstract = "The use of 18S rDNA sequences for inferring phylogenies, in particular for higher taxonomic level analysis, has a long tradition in phycology. Similar to ITS2, the 18S rDNA displays a conserved secondary structure that could be used simultaneously with the primary sequence to increase the amount of information used when inferring phylogenetic relationships. Sequence-structure phylogenetics is already established for ITS2 research. Secondary structures no longer simply guide alignments and trees but are used simultaneously by encoding the sequence-structure information into a 12-letter alphabet. We used the knowledge gathered from the extensive body of ITS2 research regarding sequence-structure phylogenetics and applied it to 18S rRNA data; we present THESES db, the Algae 18S rDNA Sequence-Structure Database, which contains sequences and their individual secondary structures for three major groups of algae (Chlorophyta, Bacillariophyta and Rhodophyta). This database was designed to serve as the starting point for future 18S rDNA sequence-structure-based phylogenetic analyses that will eventually extend beyond phycology. One hundred phylogenetic trees generated from 18S sequence-only datasets and from parallel 18S sequence-structure datasets were compared for each taxon analyzed in this study (diatoms, green algae and red algae). Half of the comparisons produced trees with different topologies that frequently related to the status of sister genera. Using the lineage information for each species as listed in GenBank, we determined that the sequence-structure approach resolved a genus as monophyletic, while the sequence-only approach failed to do so, in comparisons that comprised 3{\%} of the cases examined. The reverse was true for a total of 8.3{\%} of the comparisons that we generated. Future work, both in our labs and among the broader phycological community, will provide additional data to test the accuracy and robustness of a sequence-structure approach at different taxonomic ranks.",
keywords = "4SALE, Alignment, Bacillariophyta, Chlorophyta, Diatoms, Green algae, Homology modelling, Internal transcribed spacer 2, ITS2, Red algae, Rhodophyta, RNA, Substitution model",
author = "Rodrigues, {Maria Valentina Marin} and Tobias M{\"u}ller and Buchheim, {Mark Alan} and Bj{\"o}rn Canb{\"a}ck and Matthias Wolf",
year = "2017",
doi = "10.2216/16-71.1",
language = "English",
volume = "56",
pages = "186--192",
journal = "Phycologia",
issn = "0031-8884",
publisher = "International Phycological Society",
number = "2",