INTRODUCTION
Dobzhansky reminds us in his famous 1973 paper (to which our title pays homage) that evolution is the driving force for all biological phenomena (Dobzhansky, 1973). Evolution can be regarded as a continuous process of change. Similarly, our understanding of evolution continuously changes as a result of rapid progress in research techniques. Typical examples from sponge evolutionary sciences can be seen in the numerous changes in phylogenetic hypotheses. Sponge systematics was revolutionized by the introduction of cladistic analyses (e.g. Van Soest, 1987) and the use of molecular methods, from single gene sequencing (Kelly-Borges et al., 1991) to phylogenomics (e.g. Philippe et al., 2009) (for reviews and case studies see e.g. Boury-Esnault, 2006; Erpenbeck et al., 2012a; Morrow et al., 2013). These molecular tools were soon regarded as the most promising source of phylogenetic characters to further our understanding of sponge evolutionary relationships. Morphological and chemotaxonomic characters displayed shortcomings due to environmental plasticity, homoplasy and lack of complexity, whereas ultrastructure was regarded as too laborious (see e.g. Maldonado et al., 1999; Boury-Esnault, 2006; Erpenbeck et al., 2006; Erpenbeck & Van Soest, 2007; Cárdenas & Rapp, 2013). The subsequent phylogenetic trees based on molecular data resulted in changes to poriferan systematics ranging from few (hexactinellid taxa) to dramatic (the other 92% of sponge species) (see e.g. Erpenbeck & Wörheide, 2007; Wörheide et al., 2012; Redmond et al., 2013).
Phylogenetic trees provide the basis for exploring and understanding the current patterns and processes observed in all fields of sponge biology, and therefore constitute an important reference for the design of future research (including grant applications). However, publications on character evolution, biochemistry, phylogeny and all other aspects of biology have reduced credibility and impact when the underlying taxonomy is erroneous. In turn, the quality of every phylogenetic tree depends on the correct identification of its constituent taxa. While tree reconstruction algorithms advance and facilitate the modelling of molecular evolution scenarios, their underlying data frequently suffer from erroneous taxonomy. For example, DNA sequences submitted to NCBI GenBank (http://www.ncbi.nlm.nih.gov/genbank) are not subject to any taxonomic control and frequently bear incorrect taxon names (see e.g. Ashelford et al., 2005; Shen et al., 2013). This is in strong contrast to the role of public DNA repositories, such as the European Nucleotide Archive (http://www.ebi.ac.uk/ena), NCBI GenBank or the DNA Data Bank of Japan (http://www.ddbj.nig.ac.jp), as arguably the most important sources of sequences for molecular phylogenetic and taxonomic studies. Specialized, taxonomically curated databases that aim to minimize taxonomic ambiguities (e.g. the Sponge Barcoding Database, http://www.spongebarcoding.org; Wörheide et al., 2008) build up a reference backbone, which must rely on taxonomically correct reference material (Wörheide & Erpenbeck, 2007).
The optimal taxonomic reference material for a species is the primary type, or holotype, i.e. the exact specimen used for the species description. The holotype is the single specimen upon which a new nominal species-group taxon is based (International Commission on Zoological Nomenclature (ICZN) 4th Edition, Article 73, 2012); it objectively defines the species concept and fixes the name proposed by the original author in the original publication. The holotype (or a secondary type specimen) is usually consulted in morphological taxonomy or systematics, but holotype examination in molecular studies is rare. In the currently most comprehensive molecular phylogenetic trees for sponges, we find no mention of holotype examination.
Consequently, DNA sequences without unequivocal taxonomic identification are in the majority of cases published in phylogenetic trees and subsequently submitted to public DNA repositories. While the identification of the species from which the DNA was extracted and sequences produced may be subsequently revised and updated, these refinements are not necessarily made globally known, including in GenBank (and other repositories) itself. As such, these taxonomic errors are compounded in subsequent phylogenetic trees that use these sequence databases on the assumption that their original taxonomy was correct.
Conversely, using holotypes for molecular phylogenetic studies confers both taxonomic confidence and more rigour through compliance with the ICZN. Several reasons appear to explain why holotypes are not the primary target for molecular systematic studies. Among the most obvious is the uncertain DNA quality due to age or history of preservation. DNA in the post-mortem cell is subject to several types of deterioration, such as oxidative and hydrolytic damage, DNA crosslinks and degradation by microorganism nucleases (for an overview see Rizzi et al., 2012). These result in DNA fragmentation, amplification inhibition, or base deaminations leading to erroneous genotypes when PCR-amplified under standard protocols (see details in Hofreiter et al., 2001). Destructive processes increase during slow dehydration, in which the nucleases stay active for some time, or when fixatives were used that trigger DNA-protein crosslinks (e.g. formalin; for extraction protocols see De Bruyn et al., 2011). These destructive processes can be reduced by rapid dehydration (such as preservation in ethanol, silica gel or quick air drying), which inhibits nuclease activity. However, fragmentation into templates that cannot be amplified with standard primer sets may prevent the inclusion of holotype sequences in phylogenetic datasets, but it does not hinder a molecular taxonomic comparison. Here, short DNA markers (‘minimalist DNA barcodes’) specifically developed to amplify fragmented DNA templates for molecular taxonomy can facilitate the taxonomic verification of samples by comparison with the holotype prior to publication (Hajibabaei et al., 2006).
Another obstacle to recruiting holotypes for phylogenetic analysis is their accessibility. Discovering where the holotypes are located from historical and foreign-language literature, including subsequent taxonomic revisions, potential synonymy, genus transfers etc., is a complex and confusing process. This obstacle is further exacerbated by the antiquity of the current sponge systematics, whereby most genera presently considered valid were fixed by their type species described in the late 19th century, in a scattered literature, and with rarely cited museum specimen numbers, requiring painstaking detective work decades or centuries later (see introductory discussion in, and online updates of, Hooper & Wiedenmayer, 1994). In addition, accessing the various museums and then gaining permission to subsample holotypes adds to the difficulty of sourcing them. However, for sponges there are presently highly comprehensive sources of taxonomic information, ranging from the Systema Porifera (up to genus level; Hooper & Van Soest, 2002) to dynamic online tools such as the World Porifera Database (up to subspecies level; Van Soest et al., 2011), that provide efficient means to retrieve information on holotypes. Moreover, as DNA can be extracted from minimal amounts of tissue and from every sponge tissue with living cells, DNA sampling in sponges can be considered minimally destructive compared with most other Metazoa.
A crude estimate, made prior to this study, of the number of holotype sequences used for sponge systematics yielded fewer than 80 published sponge holotype sequences (<1% of all described valid sponge species; Van Soest et al., 2012). This is a remarkably low number for a phylum whose species identification is difficult and challenged by high degrees of environmentally induced plasticity. Consequently, there is a clear lack of holotype sequences available for precise identification of taxa in all aspects of sponge research. In this paper we demonstrate the application and advantages of holotype sequences in sponge science based on various case studies, and make a strong argument for their increased use in the future.
MATERIALS AND METHODS
DNA of the following specimens was extracted in the course of several different projects: MOM INV-22285 (04 0348) (holotype Heteroxya corticata Topsent, 1898), BMNH 1881.81.10.21.266 (neotype Xestospongia testudinaria), BMNH 1881.81.10.21.267 (associated specimen of Ridley's Xestospongia testudinaria), MCZ PORa-6449 and MCZ PORa-6450 (syntypes Xestospongia muta (Schmidt, 1870)), ZMB2889 (holotype Neopetrosia chaliniformis (Thiele, 1899)), BMNH 1898.12.20.49 (holotype Neopetrosia exigua (Kirkpatrick, 1900)), AM Z3867 (holotype Narrabeena lamellata (as Smenospongia lamellata Bergquist, 1980)), USNM 1231429 (holotype Stelletta anthastra Lehnert & Stone, 2014). Specimens were either dry or preserved in ethanol, with no further information on other fixatives applied, such as formalin. DNA was extracted using the Qiagen DNeasy Blood & Tissue kit for the recently collected samples (Narrabeena lamellata, Stelletta anthastra) or the Qiagen QIAamp Mini Kit (Heteroxya corticata), in both cases following the manufacturer's protocol, or using a modified CTAB phenol-chloroform method (Porebski et al., 1997) with the phenol-octanol and RNase solution steps skipped (Neopetrosia chaliniformis, Xestospongia testudinaria, Xestospongia muta). No method could be identified as preferable with regard to DNA yield and amplification success. Fragments of the mitochondrial cytochrome oxidase subunit 1 (CO1, standard barcoding fragment) were amplified using degenerate versions of universal barcoding primers, dgLCO1490 (GGT CAA CAA ATC ATA AAG AYA TYG G) and dgHCO2198 (TAA ACT TCA GGG TGA CCA AAR AAY CA) (Meyer et al., 2005), with an annealing temperature of 43°C.
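The degenerate positions in the two CO1 primers can be made explicit with a short sketch. It assumes only the standard IUPAC nucleotide ambiguity codes (Y = C or T, R = A or G); the function names are illustrative and are not part of the original protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as given in the text; spacing is for readability only.
PRIMERS = {
    "dgLCO1490": "GGT CAA CAA ATC ATA AAG AYA TYG G",
    "dgHCO2198": "TAA ACT TCA GGG TGA CCA AAR AAY CA",
}


def clean(primer: str) -> str:
    """Strip whitespace and upper-case the primer sequence."""
    return "".join(primer.split()).upper()


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    n = 1
    for base in clean(primer):
        n *= len(IUPAC[base])
    return n


def expand(primer: str) -> list[str]:
    """List every non-degenerate sequence encoded by the primer."""
    choices = (IUPAC[base] for base in clean(primer))
    return ["".join(combo) for combo in product(*choices)]


for name, seq in PRIMERS.items():
    print(name, "length:", len(clean(seq)), "degeneracy:", degeneracy(seq))
    for variant in expand(seq):
        print("   ", variant)
```

Each primer contains two twofold-degenerate positions, so each represents a pool of four oligonucleotides.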
Fragments of the mitochondrial cytochrome oxidase subunit 2 (cox2) were amplified using the primers CO2F Por (TTT TTC ACG ATC AGA TTA TGT TTA) and CO2R Por (ATA CTC GCA CTG AGT TTG AAT AGG) (Rua et al., 2011) with an annealing temperature of 40°C. Fragments of ATP6 were amplified using an internal primer set modified from Rua et al. (2011) (ATP6_Xt_f1: TAG GGG TAA CTT TGT TAG GG and ATP6_Xt_r1: CCA ATG AAA TAG CAC GAG CC) with an annealing temperature of 44°C. Fragments of the nuclear ribosomal 28S gene (C2-D2) were amplified using the primers 28S-C2-fwd (GAA AAG AAC TTT GRA RAG AGA GT) and 28S-D2-rev (TCC GTG TTT CAA GAC GGG) (Chombard et al., 1998) with an annealing temperature of 50°C. The 25 μL PCR mix consisted of 5 μL 5× green GoTaq® PCR Buffer (Promega Corp, Madison, WI), 4 μL 25 mM MgCl2 (Promega Corp, Madison, WI), 2 μL 10 mM dNTPs, 2 μL BSA (100 μg mL−1), 1 μL each primer (5 μM), 7.8 μL water, 0.2 μL GoTaq® DNA polymerase (5 U μL−1) (Promega Corp, Madison, WI) and 2 μL DNA template. The PCR regime comprised an initial denaturation at 94°C for 3 min, followed by 35 cycles of 30 s denaturation at 94°C, 20 s annealing and 60 s elongation at 72°C, and a final elongation at 72°C for 5 min. Alternatively, a two-step approach with 4 cycles at 45°C annealing temperature prior to 30 cycles at 50°C was applied. The PCR products were purified by standard ammonium acetate-ethanol precipitation before cycle sequencing using the BigDye® Terminator v3.1 (Applied Biosystems) following the manufacturer's protocol. Both strands of the template were sequenced on an ABI 3730 automated sequencer. The poriferan origin of the sequences was checked by a BLAST search against the NCBI GenBank database (http://www.ncbi.nlm.nih.gov).
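The reaction mix and cycling regime described above can be checked arithmetically with a minimal sketch. The component volumes and stock concentrations are taken from the text. The 10% pipetting overage, the function names and the omission of thermocycler ramping time are assumptions made for illustration only.

```python
# Volumes in microlitres for a single 25 uL reaction, as listed in the text.
MIX_UL = {
    "5x green GoTaq PCR buffer": 5.0,
    "25 mM MgCl2": 4.0,
    "10 mM dNTPs": 2.0,
    "BSA (100 ug/mL)": 2.0,
    "forward primer (5 uM)": 1.0,
    "reverse primer (5 uM)": 1.0,
    "water": 7.8,
    "GoTaq DNA polymerase (5 U/uL)": 0.2,
    "DNA template": 2.0,
}


def total_volume(mix: dict[str, float]) -> float:
    """Sum of all component volumes in microlitres."""
    return round(sum(mix.values()), 3)


def final_concentration(stock: float, volume_ul: float,
                        total_ul: float = 25.0) -> float:
    """Concentration after dilution into the reaction (same unit as stock)."""
    return stock * volume_ul / total_ul


def master_mix(mix: dict[str, float], n_reactions: int,
               overage: float = 0.10) -> dict[str, float]:
    """Scale the shared components for n reactions.

    The template is added per tube and is therefore left out.
    The overage value is an assumed pipetting reserve, not from the text.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in mix.items() if name != "DNA template"}


def cycling_seconds(cycles: int = 35, denature_s: int = 30,
                    anneal_s: int = 20, extend_s: int = 60,
                    initial_s: int = 180, final_s: int = 300) -> int:
    """Programmed hold time of the PCR regime, ignoring ramping."""
    return initial_s + cycles * (denature_s + anneal_s + extend_s) + final_s


print("total volume (uL):", total_volume(MIX_UL))
print("added MgCl2 (mM):", final_concentration(25.0, 4.0))
print("each primer (uM):", final_concentration(5.0, 1.0))
print("programmed time (min):", cycling_seconds() / 60)
print("master mix for 8 reactions:", master_mix(MIX_UL, 8))
```

The listed volumes add up to the stated 25 μL, and the 35-cycle programme amounts to 4330 s of programmed hold time.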
Sequences were basecalled, trimmed and assembled in CodonCode Aligner v 3.7.1.1 and subsequently aligned with other representative sequences available from GenBank in MAFFT v7.149b (Katoh & Standley, 2013). All sequences are deposited in the Sponge Barcoding Database (SBD, http://www.spongebarcoding.org; Wörheide et al., 2008) and in NCBI GenBank (see Results and Discussion). Maximum likelihood reconstructions were performed using RAxML 7.2.5 (Stamatakis, 2006) under the GTR model of nucleotide substitution with CAT approximation of rate heterogeneity and 100 fast bootstrap replicates.
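The alignment and tree reconstruction steps could be scripted roughly as follows. This is a sketch under stated assumptions: the exact command-line options used in this study are not reported, so the MAFFT `--auto` strategy, the RAxML binary name `raxmlHPC`, the random seed and all file names are hypothetical placeholders. The RAxML options shown (`-f a`, `-m GTRCAT`, `-x`, `-p`, `-#`) are one common way to request a GTR + CAT analysis with 100 rapid bootstrap replicates.

```python
import shlex


def mafft_command(fasta_in: str) -> list[str]:
    """Build a MAFFT call; MAFFT writes the alignment to standard output.

    The --auto strategy is an assumption, not the documented setting.
    """
    return ["mafft", "--auto", fasta_in]


def raxml_command(alignment: str, run_name: str, bootstraps: int = 100,
                  seed: int = 12345, binary: str = "raxmlHPC") -> list[str]:
    """Build a RAxML call for rapid bootstrapping plus ML search.

    Binary name and seed are hypothetical placeholders.
    """
    return [
        binary,
        "-f", "a",            # rapid bootstrap analysis and best-tree search
        "-m", "GTRCAT",       # GTR with CAT approximation of rate heterogeneity
        "-p", str(seed),      # seed for the parsimony starting trees
        "-x", str(seed),      # seed that switches on rapid bootstrapping
        "-#", str(bootstraps),
        "-s", alignment,
        "-n", run_name,
    ]


# Print the commands instead of running them, so the sketch works
# without MAFFT or RAxML being installed.
print(shlex.join(mafft_command("sequences.fasta")), "> aligned.fasta")
print(shlex.join(raxml_command("aligned.fasta", "holotypes_28S")))
```

To execute the commands, the lists can be passed to `subprocess.run`, with the standard output of MAFFT redirected to the alignment file.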
RESULTS AND DISCUSSION
Age doesn't (always) matter: BLAST the past
DNA was amplified for all of the above-mentioned type material, including specimens collected in the 19th century. For example, CO1 and 28S rDNA sequences were successfully retrieved from Heteroxya corticata collected in 1895, which is the type taxon for the family Heteroxyidae Dendy 1905 (SBD# 1152; NCBI accession number KP939318). Likewise, amplification was successful for the holotypes of Neopetrosia chaliniformis (collected by Sarasin in Sulawesi between 1893 and 1896 and described as Petrosia chaliniformis by Thiele in 1899; SBD# 1153; NCBI: KM030103) and Neopetrosia exigua (collected 1898; SBD# 1154; NCBI: KM030104), for type material of the Caribbean barrel sponge Xestospongia muta Schmidt (described 1870; SBD# 1155, #1156; NCBI: KM014756), and for the neotype and associated material of its Indo-Pacific congener Xestospongia testudinaria, collected in 1881 (see Ridley, 1884) (e.g. SBD# 1157, #1158; NCBI: KM014764; see also publications of Setiawan et al. in this volume).
Consequently, there is no reason to assume a priori that the antiquity of holotypes, their uncertain preservation history, and the likelihood of strong DNA degradation and fragmentation are a hindrance to successful DNA amplification and sequencing. In fact, DNA quality in old samples might be sufficient for amplification of standard phylogenetic markers if the tissue was stored in ethanol immediately or dried quickly. To our knowledge, the first century-old sponge holotype successfully amplified was the holotype of Topsentia halichondrioides (as Trachyopsis halichondrioides Dendy, 1905), collected in 1902 and used for phylogenetic analyses of halichondrid demosponges. Standard phylogenetic markers (28S rDNA, cytochrome oxidase subunit 1 and elongation factor 1-alpha) were successfully amplified (e.g. Erpenbeck et al., 2006) (SBD# 1159; NCBI: e.g. AY625676).
Type material, even if successfully amplified, is also prone to contamination, particularly because it is examined more frequently by taxonomists and is therefore more likely to be exposed to contaminants, including the metabolomics profile of the taxonomist(s) in question. In particular, the use of universal primers such as degenerate CO1 barcoding primers (e.g. Meyer et al., 2005) results in an increased yield of non-sponge sequences. These should be easily detectable by phenetic controls such as BLAST (Altschul et al., 1990), which should be followed by (probabilistic) cladistic tree-based methods to ascertain the poriferan origin of the DNA template (see Erpenbeck et al., 2002).
Setting (chemo)taxonomy straight: Narrabeena IS a black sheep among Verongida
Species of the order Verongida are frequently subject to biochemical research as they produce bromotyrosines (among other biochemical compounds), which possess bioactive properties of major interest for pharmaceutical research (for a recent example see Mani et al., 2012). Bromotyrosines have been discovered in all genera of Verongida since the morphological revision sensu Bergquist & Cook (2002), which suggested an apomorphic nature of this character (Bergquist & Cook, 2002) (see also Van Soest & Braekman, 1999; Erpenbeck & Van Soest, 2007). Narrabeena Cook & Bergquist, 2002 is currently classified in the dictyoceratid family Thorectidae, and was erected for Smenospongia lamellata, which possesses fibres with a high amount of pith, unlike S. aurea, the type species of Smenospongia. Smenospongia has been regarded as the ‘point of closest similarity between Verongida and Dictyoceratida’ (Bergquist, 1980). Nevertheless, despite its verongid morphology, Narrabeena was placed into Dictyoceratida due to the absence of bromotyrosines. Recent CO1 and 28S rDNA reconstructions, however, resolved Narrabeena, investigated in a molecular dataset for the first time, with Verongida (Erpenbeck et al., 2012b). Independent 18S analyses using a different specimen set, in contrast, recovered a Narrabeena sample among Dictyoceratida, and implied the need for reanalysis using a conclusive dataset (Redmond et al., 2013).
Consequently, we analysed the holotype specimen (AM Z3867) from the Australian Museum, Sydney, with molecular methods and obtained fragments of the C1-D2 region of 28S rDNA and of CO1 (SBD# 1160; NCBI: KP939316). The phylogenetic analyses recovered AM Z3867 within the Verongida, in close relationship to Suberea, Aplysina and Porphyria and distant from the dictyoceratid samples of the dataset, with both markers (see 28S rDNA tree in Figure 1). The inclusion of the N. lamellata holotype sequence in this analysis therefore clearly shows the verongid relationships of Narrabeena and justifies its transfer to Verongida. The analysis also confirms that absence/presence patterns of secondary metabolites in chemotaxonomy have to be evaluated carefully. Besides the independent production of bromotyrosines in other lineages (see review in Erpenbeck & Van Soest, 2007), secondary metabolite production can easily be switched off by mutations in the biosynthetic pathway and its regulatory elements.
Young holotypes of an old phylum: new species with molecular registrations
Molecular methods keep advancing throughout all aspects of sponge biology, and molecular taxonomy will likely become the standard for species identification and description in the future. The description of new sponge (and most other metazoan) species will remain predominantly descriptive in the foreseeable future, but barcoding approaches and molecularly supported museum database platforms (e.g. Atlas of Living Australia, see Hooper et al., 2013) also provide various molecular information for the samples. As the costs of DNA barcoding are comparatively low (see e.g. Vargas et al., 2012), sequences can easily be associated with the species descriptions for subsequent analyses, even when the molecular data themselves are not incorporated into the species description (which we do not advocate). Examples are the recent publications on Alaskan (Aleutian) sponges (Lehnert & Stone, 2013, 2014), for which sequences of the mitochondrial CO1 (barcoding fragment) and 28S rDNA (C1-D2 and D3-D5 fragments; SBD# 1161; NCBI: e.g. KP939317) have been submitted to the Sponge Barcoding Database. With this information included, phylogenetic trees can be reconstructed by anyone interested, or retrieved directly from the Sponge Genetree Server (http://www.spongegenetrees.org; Erpenbeck et al., 2008).
For such a procedure it is essential that samples (or a designated fragment of the sample) are preserved immediately and optimally for molecular purposes after collection in order to keep the DNA amplifiable. Immediate placement in ethanol (as highly concentrated as possible), with ethanol exchange after 24 h (as the seawater dilutes the ethanol) followed by cool storage, is among the most practicable and economic methods. Immediate freezing, preservation of small sponge crumbs in silica powder (Alvarez et al., 2000), or storage in high-salt/DMSO buffer (Seutin et al., 1991; Dawson et al., 1998) are also economic and effective.
CONCLUSION
The use of primary type material, and preferably also secondary types, for unequivocal verification of sponge species identification should be considered for all aspects of evolutionary research, to build a more reliable baseline dataset against which all new sponge molecular identifications are compared. Although older type material is traditionally notorious amongst many practitioners of molecular barcoding for the alleged difficulties in obtaining conclusive molecular data, we show here that even standard methods may frequently succeed with antiquated specimens. Optimally, molecular identification should be accompanied by a comparison with DNA from holotypes. The corollary is that sequence data of non-type specimens without corroboration from type material must be more cautiously interpreted in terms of the power of the evidence they present and the impact on higher systematic interpretation.
ACKNOWLEDGEMENTS
We thank Christine Morrow (Queen's University, Belfast), Michèle Bruni (Oceanographic Museum of Monaco), Simone Schätzle and Gabriele Büttner (LMU) for various contributions to this study. Jane Fromont (WA Museum) and an anonymous reviewer improved the quality of this manuscript.
FINANCIAL SUPPORT
We would like to acknowledge funding of the Lehre@LMU programme of the Ludwig-Maximilians University Munich.