Introduction
Complete fungal genomes, those assembled to gapless full chromosomes, are uncommon. This level of completion is mostly restricted to well-studied, model species, such as Aspergillus oryzae (Ahlb.) Cohn, Fusarium spp. and Penicillium chrysogenum Thom (Machida et al. Reference Machida, Asai, Sano, Tanaka, Kumagai, Terai, Kusumoto, Arima, Akita and Kashiwagi2005; Specht et al. Reference Specht, Dahlmann, Zadra, Kürnsteiner and Kück2014; King et al. Reference King, Urban, Hammond-Kosack, Hassani-Pak and Hammond-Kosack2015, Reference King, Brown, Urban and Hammond-Kosack2018), although there are an increasing number of non-model fungi with telomere-to-telomere assemblies available (Chung et al. Reference Chung, Kwon and Yang2021; Crestana et al. Reference Crestana, Taniguti, dos Santos, Benevenuto, Ceresini, Carvalho, Kitajima and Monerito-Vitorello2021; Gan et al. Reference Gan, Hiroyama, Tsushima, Masuda, Shibata, Ueno, Kumakura, Narusaka, Hoat and Narusaka2021). The scarcity of full chromosome assemblies of fungal genomes partially stems from the difficulty in obtaining pure samples. Many fungi are unculturable (Muggia et al. Reference Muggia, Kopun and Grube2017; Tedersoo et al. Reference Tedersoo, Sánchez-Ramírez, Kõljalg, Bahram, Döring, Schigel, May, Ryberg and Abarenkov2018). Furthermore, they are typically involved in complex, intimate symbioses (Smith & Read Reference Smith and Read2008; Stajich et al. Reference Stajich, Berbee, Blackwell, Hibbett, James, Spatafora and Taylor2009) and can be inseparable from their symbionts. Because fungi are often embedded in diverse microbial communities, whole-genome shotgun sequencing of environmental samples yields rich and complex metagenomes that can be challenging to analyze (Grube et al. Reference Grube, Berg, Andréson, Vilhelmsson, Dyer, Miao and Martin2013; Keepers et al. Reference Keepers, Pogoda, White, Stewart CR, Hoffman, Ruiz, McCain, Lendemer, Kane and Tripp2019; Tzovaras et al. Reference Tzovaras, Segers, Bicker, Dal Grande, Otte, Anvar, Hankeln, Schmitt and Ebersberger2020). Despite these challenges, a number of nearly complete lichen genomes have been published (Table 1). In the immediate future, highly complete genomes are poised to become the norm as widespread adoption of long-read sequencing methods expands our data generation capacity (McKenzie et al. Reference McKenzie, Walston and Allen2020; Tedersoo et al. Reference Tedersoo, Albertsent, Anslan and Callahan2021).
Fungi, especially in the phylum Ascomycota, demonstrate some of the most diverse strategies for reproduction of all eukaryotes that span the full spectrum of recombination from obligate outcrossing to clonality (Tripp Reference Tripp2016; Kendrick Reference Kendrick2017). Heterothallism involves the fusion of genetically distinct individuals expressing different mating types (Billiard et al. Reference Billiard, López-Villavicencio, Hood and Girad2012). Heterothallism is exceedingly common among lichen-forming fungi and is most likely the ancestral state for Lecanoromycetes (Pizarro et al. Reference Pizarro, Dal Grande, Leavitt, Dyer, Schmitt, Crespo, Lumbsch and Divakar2019). Homothallism represents a diverse and varied set of strategies utilized by fungi to allow for self-fertility (Wilson et al. Reference Wilson, Wilken, van der Nest, Steenkamp, Wingfield and Wingfield2015b). Fungi exhibiting primary homothallism are capable of expressing both MAT1-1 and MAT1-2 idiomorphs in a single genome. Pseudohomothallic fungi produce dikaryotic spores which contain two nuclei, each expressing a complementary MAT idiomorph. Other homothallic fungi are capable of hermaphroditic mating-type switching, so that individuals can swap which MAT idiomorph is expressed in their genome. Unisexuality represents a unique, perhaps separate kind of homothallism, identified in only four fungal species to date (Wilson et al. Reference Wilson, Gabriel, Singer, Schuerg, Wilken, van der Nest, Wingfield and Wingfield2021). The unisexual fungus Cryptococcus neoformans (San Felice) Vuill., for instance, is capable of sexual meiosis by whole-genome endoreplication or by cellular and nuclear fusion of two individuals of the same MAT identity (Wilson et al. Reference Wilson, Gabriel, Singer, Schuerg, Wilken, van der Nest, Wingfield and Wingfield2021). 
These examples of homothallism all involve sexual meiosis of identical clones and the potential to recombine with oneself and every other individual of the same species (Billiard et al. Reference Billiard, López-Villavicencio, Hood and Girad2012). Strictly asexual organisms reproduce exclusively through clonal spores or vegetative tissue adapted to facilitate fragmentation and dispersal. Throughout domain Eukarya, asexual reproduction is frequently observed in conjunction with facultative sexual outcrossing (Honnay & Bossuyt Reference Honnay and Bossuyt2005). Many species of fungi are called ‘imperfect’ because they have never been observed exhibiting a sexual stage, and as such might be examples of strictly exclusive asexuals (Gräser et al. Reference Gräser, Kuijpers, Presber and Hoog1999; Persinoti et al. Reference Persinoti, Martinez, Li, Dögen, Billmyre, Averette, Goldberg, Shea, Young and Zeng2018). Furthermore, it is important to recognize that the reproductive categories listed here are artificial creations, constructs of scientific theory. Fungal reproduction exists along poorly understood spectra, and species are not restricted to one category. For example, the polymorphic fungus Candida albicans (C.P. Robin) Berkhout is capable of heterothallic and unisexual reproduction, as well as parasexual reproduction, a process of ploidy reduction via concerted chromosome loss rather than meiosis (Bennett & Johnson Reference Bennett and Johnson2003; Thomson et al. Reference Thomson, Hernon, Austriaco, Shapiro, Belenky and Bennett2019; Wilson et al. Reference Wilson, Gabriel, Singer, Schuerg, Wilken, van der Nest, Wingfield and Wingfield2021).
Lepraria s. lat. is a chemically and morphologically diverse genus of c. 60 species of leprose lichens (Lendemer Reference Lendemer2013; Lendemer & Hodkinson Reference Lendemer and Hodkinson2013). Lepraria species have never been observed with sexual reproductive structures (Lendemer Reference Lendemer2013), and Lepraria is thus assumed to be exclusively clonal. Clonal reproduction is presumably accomplished through the leprose growth form, which consists entirely of ecorticate granules that function simultaneously as the lichen thallus and as dispersal propagules (Brodo Reference Brodo, Sharnoff and Sharnoff2001). Lepraria makes effective use of the vegetative mode of reproduction, vertically transmitting the entire holobiome between indiscrete generations. Interestingly, phylogenetic studies of Lepraria based on typical species-level molecular markers result in phylogenies with short branch lengths and poor support for species that are clearly morphologically, chemically and geographically distinct (Lendemer & Hodkinson Reference Lendemer and Hodkinson2013), suggesting evolutionary processes in this genus do not mirror those observed in other lineages of lichenized fungi. While leprose thalli have arisen in other lineages of lichenized fungi, including genera such as Chrysothrix, Lecanora and Herpothallon, no other lineages exhibit the same degree of strictly asexual diversification as Lepraria (Lendemer Reference Lendemer2013). Therefore, Lepraria is an ideal genus for investigating reproductive strategies and their evolutionary outcomes.
In the present study, long-read sequence data from the Oxford Nanopore MinION platform were used to assemble the genome of Lepraria neglecta (Nyl.) Erichsen, a widespread species with a nearly global distribution. It is distinguished within the genus by its unparalleled chemical variation (i.e. there are at least seven distinct chemotypes with overlapping ranges; Lendemer Reference Lendemer2013). The nuclear and mitochondrial genomes of L. neglecta, and the organellar genomes of the photobiont, were assembled from the complex metagenomic data that resulted from a whole thallus extract. A detailed investigation of the genome assembly was conducted to locate telomeres and the mating-type locus, and to infer the number of chromosomes.
Materials and Methods
Sample collection, DNA extraction and sequencing
Lepraria neglecta was collected 330 m south-east of the Cheney Wastewater Treatment Plant and 600 m north-east of the Columbia Plateau Trailhead (47.4823°N, 117.5555°W). All tissue was taken from a single thallus occupying a rock outcrop and the thallus was removed from the substratum with a sterile butter knife. Collectors wore nitrile gloves during the collection process and samples were placed in new paper bags to avoid contamination. Samples were taken back to the laboratory and placed in a −20 °C freezer. The thallus was removed from the freezer 24 h later, cleaned of debris, and divided among sixteen 1.5 ml tubes. The sample voucher specimen was deposited in the Eastern Washington University herbarium (EWU; Allen 5258). Thin-layer chromatography was conducted using a small subset of the sample following standard methods (Culberson & Kristinsson Reference Culberson and Kristinsson1970). Metabolites were extracted with acetone and transferred to an aluminum-backed silica plate with 250 nm fluorescent markers (Merck KGaA, Darmstadt, Germany). The plate was run in solvent C (200 toluene: 30 glacial acetic acid), and fluorescence of metabolites under short- and long-wave ultraviolet light was recorded. The plate was then treated with water and the presence of hydrophobic spots was marked as the plate dried. Once the plate was dry, it was treated with 10% sulphuric acid and heated on a pancake griddle at 200 °F (c. 93 °C) for 15 min before final interpretation.
The Qiagen DNeasy PlantPro Kit was used for DNA extraction following the manufacturer's protocol with one adjustment: tissue disruption using a Mini-BeadBeater Mill 36270-02 was completed before the addition of any solutions (Qiagen, Hombrechtikon, Switzerland). Extractions were eluted in 50 μl of the elution buffer, all 16 extractions were pooled, and total DNA was quantified using a Qubit with ssDNA Assay Kit (Invitrogen, Waltham, MA, USA). A size selection step using MagBind Total Pure NGS (Omega Bio-tek, Inc., Norcross, GA, USA) was then completed with a 0.4:1 bead solution to extract ratio, a series of 70% ethanol washes was carried out, and the sample was eluted in 55 μl of nuclease-free water (McKenzie et al. Reference McKenzie, Walston and Allen2020). Library preparation was completed using the Ligation Sequencing Kit LSK-109 according to the manufacturer's protocol and samples were run on the MinION platform using R9.4.1 flow cells for 52 h (Oxford Nanopore Technologies, Oxford, UK). Basecalling was conducted using Guppy v. 5.0.7 with the SUP model (Oxford Nanopore Technologies, Oxford, UK).
Genome assembly and filtering
Reads were assembled using Flye v. 2.9 with the ‘--nano-hq’ mode, read error set to 0.02 and maximum overhang set to 250 (Kolmogorov et al. Reference Kolmogorov, Yuan, Lin and Pevzner2019). The resulting assembly was polished with medaka v. 1.2 after mapping all reads back to assembled contigs using minimap2 v. 2.17 (Li Reference Li2018; https://github.com/nanoporetech/medaka). Linear contigs with > 60× coverage ascribable to Ascomycota using the metagenomic binning methods in McKenzie et al. (Reference McKenzie, Walston and Allen2020) were retained for downstream analyses as the nuclear genome of Lepraria neglecta. Circular contigs ascribable to Ascomycota were retained as potential mitochondrial genomes and circular contigs ascribable to Chlorophyta were retained as potential photobiont organellar genomes. Quast v. 5.0.2 and BUSCO v. 5 were used to calculate final assembly metrics for L. neglecta and a suite of other lichen species for comparison (Mikheenko et al. Reference Mikheenko, Prjibelski, Saveliev, Antipov and Gurevich2018; Manni et al. Reference Manni, Berkeley, Seppey, Simão and Zdobnov2021; Table 1). Reads were also assembled with Flye v. 2.9 with the ‘--meta’ mode to recover the photobiont mitochondrial genome (Kolmogorov et al. Reference Kolmogorov, Yuan, Lin and Pevzner2019). High-coverage circular contigs were searched against the NCBI nucleotide database using BLASTN to recover the contigs ascribable to the Trebouxiophyceae. Organellar genomes were annotated with GeSeq on the CHLOROBOX platform hosted by the Max Planck Institute of Molecular Plant Physiology (Supplementary Material Fig. S1, available online; Tillich et al. Reference Tillich, Lehwark, Pellizzer, Ulbricht-Jones, Fischer, Bock and Greiner2017).
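The contig retention rules described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the binning workflow of McKenzie et al. (2020): the `Contig` record, its `taxon` label (assumed to come from a separate binning step) and the category names are hypothetical, and only the > 60× coverage threshold and the linear/circular distinction are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    coverage: float   # mean read depth of the contig
    circular: bool    # True if the assembler reported a circular contig
    taxon: str        # hypothetical label assigned by a separate binning step


def classify(contig: Contig, min_cov: float = 60.0) -> str:
    """Apply the retention rules from the text to a single contig."""
    if contig.taxon == "Ascomycota":
        if contig.circular:
            return "mycobiont_mitochondrial_candidate"
        if contig.coverage > min_cov:
            return "mycobiont_nuclear"
    elif contig.taxon == "Chlorophyta" and contig.circular:
        return "photobiont_organellar_candidate"
    return "not_retained"


if __name__ == "__main__":
    # Toy records, not values from the actual assembly.
    examples = [
        Contig("contig_a", 100.0, False, "Ascomycota"),
        Contig("contig_b", 900.0, True, "Ascomycota"),
        Contig("contig_c", 55.0, True, "Chlorophyta"),
        Contig("contig_d", 12.0, False, "Ascomycota"),
    ]
    for c in examples:
        print(c.name, classify(c))
```

Keeping the rules in a single function makes the coverage threshold explicit and easy to vary when checking how sensitive the retained contig set is to it.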
Nuclear genome contigs were annotated with funannotate v. 1.8.7 (https://github.com/nextgenusfs/funannotate). All default ab-initio and evidence-based predictors were used and protein evidence was derived from the following Joint Genome Institute databases: Cladonia grayi Cgr/DA2myc/ss v2.0, Lobaria pulmonaria Scotland reference genome v1.0, Usnea florida ATCC18376 v1.0, Xanthoria parietina 46-1-SA22 v1.1, and complete MAT loci from Letharia vulpina (GenBank: MK521632.1) and Letharia columbiana (GenBank: MK521629.1). BUSCO searches were conducted with Aspergillus fumigatus Fresen. as the seed species. Biosynthetic gene clusters were annotated with antiSMASH v. 6.0 (Blin et al. Reference Blin, Shaw, Kloosterman, Charlop-Powers, van Wezel, Medema and Weber2021).
Telomeres were identified using a custom set of scripts (Supplementary Material File S1, available online). Reads were searched for telomeric repeats (TTAGGG in most fungi) using the Noise Cancelling Repeat Finder (Harris et al. Reference Harris, Cechova and Makova2019). Reads with ≥ 60 bp stretches with ≥ 95% identity to the telomeric repeat motif within 40 bp of the beginning or end of the read (to allow for adapter sequence) were annotated as telomeric. Reads with telomeric sequences were then mapped to the assembly graph using GraphAligner with the following parameters: ard-cigar --seeds-mxm-length 30 --seeds-mem-count 10000 --bandwidth 15 --multimap-score-fraction 0.999 --precise-clipping 0.85 --min-alignment-score 5000 --clip-ambiguous-ends 100 --overlap-incompatible-cutoff 0.15 --max-trace-count 5 (Rautiainen & Marschall Reference Rautiainen and Marschall2020). The assembly graph was then examined along with the results of GraphAligner to determine which graph edges were fully connected to each other with telomeric reads, and these results were used to create a visual representation in Inkscape v. 1.2.1 (Fig. 1; https://inkscape.org/). The assembly graph and results of GraphAligner are available in text form in Supplementary Material File S2 (available online). Contigs with telomeric reads mapping to both ends are considered telomere-to-telomere assembled chromosomes. As a point of comparison, this same workflow was used to map telomeric reads in Bacidia gigantensis Lendemer et al. (Allen et al. Reference Allen, Jones and McMullin2021).
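The read-level telomere criterion (a ≥ 60 bp stretch with ≥ 95% identity to the repeat, starting within 40 bp of either read end) can be illustrated with a simplified Python sketch. It is not the Noise Cancelling Repeat Finder and not the scripts in Supplementary Material File S1: it scores fixed 60 bp windows by position-wise matches only, so insertions and deletions are not modelled, and the function names are hypothetical.

```python
MOTIF = "TTAGGG"


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def best_identity(window: str, motif: str) -> float:
    """Highest fraction of matching positions against the tandem motif, over all phases."""
    best = 0.0
    for phase in range(len(motif)):
        matches = sum(
            1 for i, base in enumerate(window)
            if base == motif[(i + phase) % len(motif)]
        )
        best = max(best, matches / len(window))
    return best


def has_telomere(read: str, motif: str = MOTIF, min_len: int = 60,
                 min_identity: float = 0.95, max_offset: int = 40) -> bool:
    """True if a telomere-like window lies within max_offset bp of either read end."""
    read = read.upper()
    if len(read) < min_len:
        return False
    last_start = len(read) - min_len
    # Windows starting near the 5' end, plus windows ending near the 3' end.
    starts = set(range(0, min(max_offset, last_start) + 1))
    starts.update(range(max(0, last_start - max_offset), last_start + 1))
    motifs = (motif, revcomp(motif))  # the repeat may appear on either strand
    for start in starts:
        window = read[start:start + min_len]
        if any(best_identity(window, m) >= min_identity for m in motifs):
            return True
    return False


if __name__ == "__main__":
    body = "ACGT" * 50                      # toy non-telomeric sequence
    print(has_telomere("ACGTA" + MOTIF * 12 + body))   # repeat near the 5' end
    print(has_telomere(body + MOTIF * 12 + body))      # repeat only in the middle
```

Checking both the motif and its reverse complement matters because a read may come from either strand, so a chromosome end can appear as TTAGGG repeats at one end of a read or as CCCTAA repeats at the other.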
To confirm that this sample is Lepraria neglecta and determine which individuals with available sequences are most closely related to our sample, we built a single-gene phylogeny using the nuclear internal transcribed spacer (ITS). The ITS sequence of this sample was recovered from the genome assembly using a BLASTN search with a L. neglecta isolate (GenBank #KC209167.1, Lendemer 16108 (NY), alectorialic acid chemotype) ITS sequence as the query (Altschul et al. Reference Altschul, Gish, Miller, Myers and Lipman1990). The recovered ITS sequence from the genome presented here was then queried against the entire NCBI nucleotide database using BLASTN; 453 ITS sequences were recovered as the closest hits and downloaded. These sequences were then aligned with MUSCLE v. 5 and the resulting alignment was visualized and cleaned using Jalview v. 2.11.2.0 (Clamp et al. Reference Clamp, Cuff, Searle and Barton2004; Edgar Reference Edgar2021). Sequences that did not align, that were substantially shorter than the majority of sequences in the dataset, or that appeared to be potential misidentifications were removed. Remaining sequences were then realigned using MUSCLE v. 5. That alignment was then used as the input for IQ-TREE v. 2.0.3 to conduct maximum likelihood phylogenetic inference with 1000 bootstrap replicates, implementing the ModelFinder Plus option (Nguyen et al. Reference Nguyen, Schmidt, von Haeseler and Minh2015; Kalyaanamoorthy et al. Reference Kalyaanamoorthy, Minh, Wong, von Haeseler and Jermiin2017). We identified clades for Lepraria atlantica Orange, L. granulata Slav.-Bay., L. humida Slav.-Bay., L. neglecta and L. straminea Vain., using L. elobata Tønsberg as the outgroup. All other sequences were removed from the dataset and the remaining 82 sequences were realigned in MUSCLE v. 5; a second phylogeny was built using these sequences and the settings in IQ-TREE v. 2.0.3 as described above. The phylogeny was visualized in FigTree v. 1.4.4 and the final phylogeny figure was edited in Adobe Illustrator.
Annotating the mating-type locus
To search for MAT genes in the Lepraria neglecta genome, we used BLASTP to search the annotations with previously published MAT locus genes as queries. The final set of amino acid sequence annotations from funannotate were used to create a blast database (Altschul et al. Reference Altschul, Gish, Miller, Myers and Lipman1990). The complete mating-type loci from Letharia vulpina strain U042 (GenBank Accession MK521632.1), which includes APN2, MAT1-1-7, MAT1-1-1, an open reading frame, and SLA2 genes, and Letharia columbiana strain U080 (GenBank Accession MK521629.1), which includes APN2, MAT1-2-1, MAT1-2-14, an open reading frame and SLA2 genes, were then used as queries in BLASTP searches against the database of all annotations from L. neglecta (Ament-Velásquez et al. Reference Ament-Velásquez, Tuovinen, Bergström, Spribille, Vanderpool, Nascimbene, Yamamoto, Thor and Johannesson2021). The complete nucleotide sequence of the L. neglecta genome was converted to a blast database and TBLASTN was used to search the genome along with the same set of Letharia queries as above. The resulting L. neglecta annotations and DNA sequences that were best matches to the Letharia mating-type genes were then used to conduct a reciprocal blast search against the entire NCBI nucleotide database to confirm that there were no other, more similar sequences. The relative location of all recovered annotation and sequence matches were visually examined using IGV (Thorvaldsdóttir et al. Reference Thorvaldsdóttir, Robinson and Mesirov2013).
Results
Eight contigs ascribable to chromosomes were assembled for L. neglecta (Fig. 1, Table 2). Six of the contigs were assembled telomere-to-telomere. Two contigs did not assemble through telomeres on one end but are adjacent to graph edges with telomeric sequences in the assembly graph. The assembly graph for contig 27 shows a potential connection with contig 15, which does contain telomeric sequences, though no telomeric reads connecting the two contigs were recovered. The assembly at the left end of contig 17 was not resolvable through subtelomeric repeats. Three additional small contigs were assembled, two of which were composed of telomeric and subtelomeric repeats that could not be assembled with confidence at the ends of large contigs (contigs 26 and 15; see edges 26 and 15 in Fig. 1), and one of which was a small, lower coverage repetitive element that clearly represents an assembly artifact (contig 12). The assembly size of the mycobiont fell within the previously documented size range of lichenized fungi (41.7 Mb, 100× coverage; Table 1). The assembly was highly contiguous (L. neglecta N50 = 5.202 Mbp, L50 = 4) and highly complete (97.5% of BUSCO genes were recovered). N50 refers to the length of the shortest contig in the group of the longest contigs which together represent 50% of the assembled genome. L50 is the smallest number of contigs that includes 50% of the total sequence assembly length. A total of 14 734 genes were predicted for L. neglecta with funannotate v. 1.8.7 (Table 3). The whole mitochondrial genome was recovered from the mycobiont (Supplementary Material Fig. S1, available online: 38 898 bp, 913× coverage). In addition, a low coverage assembly of a mitochondrion that most closely matches Cladonia in a homology search was also recovered (52 980 bp, 24× coverage, closest match Cladonia stipitata GenBank Accession AVT43930). The chloroplast genome from the photobiont, which most closely matches Asterochloris sp. Armaleo s. n. sequence ID: JN573844.1 (Thüs et al. Reference Thüs, Muggia, Pérez-Ortega, Favero-Longo, Joneson, O'Brien, Nelsen, Duque-Thüs, Grube and Friedl2011), was also assembled and annotated (209 823 bp, 57× coverage), as was the photobiont mitochondrial genome (83 500 bp, 20× coverage). Telomeric read mapping of the Bacidia gigantensis genome resulted in 10 contigs with telomeric reads mapped to both ends, 13 contigs with telomeric reads mapped to one end, and one scaffold with no telomeric reads mapping (Supplementary Material File S2, available online).
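The N50 and L50 definitions given above translate directly into a short calculation. The Python sketch below is a minimal illustration of those definitions and is not the Quast implementation; the function name and the contig lengths in the usage example are toy values, not the lengths of the L. neglecta contigs.

```python
from typing import Sequence, Tuple


def n50_l50(lengths: Sequence[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is the shortest contig in the set reaching 50% of the
            # assembly (N50); 'count' is the size of that set (L50).
            return length, count
    raise AssertionError("unreachable: the cumulative sum always reaches half the total")


if __name__ == "__main__":
    toy_lengths = [10, 8, 6, 4, 2]   # toy values in arbitrary units
    print(n50_l50(toy_lengths))
```

For the toy lengths above, the total is 30, the two longest contigs together reach 18 (at least half of 30), so N50 is 8 and L50 is 2.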
Taxonomic placement of the Lepraria neglecta specimen was confirmed with a maximum likelihood phylogeny of the ITS sequences of closely related Lepraria species (Supplementary Material Fig. S2, available online). Interestingly, the present individual demonstrated 100% ITS sequence identity with a member of the stictic acid chemotype collected from Yosemite National Park, California (GenBank #KC209133, Lendemer 19632 (NY)). The sample sequenced here was determined by TLC to be the roccellic acid chemotype.
A mating-type locus was identified in the Lepraria neglecta genome, characterized by peripheral APN2 and SLA2 genes, oriented inwards (Fig. 2, Table 4). Complete genes were found for MAT1-2-1 and MAT1-2-14 (44% sequence identity, 1179 bp and 51% sequence identity, 630 bp, respectively, to GenBank Accession MK521629.1). A tblastn search of the Letharia vulpina MAT1-1-1 amino acid sequence (GenBank Accession MK521632.1) against the nucleotide sequence of the mating-type region in Lepraria neglecta resulted in a match of 472 bp with 53% sequence identity and e-value of 5e-17. The translation of this nucleotide sequence to amino acids revealed multiple internal stop codons. Based on the low similarity and the truncation of the Lepraria neglecta sequence, we opted to refer to this as a putative MAT1-1-x pseudogene since we cannot confidently assign it to a subtype of MAT1-1 genes. No significant blast hits were recovered for the Letharia vulpina long open reading frames (LORF) or for MAT1-1-7. Reciprocal blastp searches of the entire NCBI nucleotide database resulted in these same genes being recovered as the best matches. Sequence synteny and orientation of the MAT locus were consistent with other species of Ascomycota (Pizarro et al. Reference Pizarro, Dal Grande, Leavitt, Dyer, Schmitt, Crespo, Lumbsch and Divakar2019; Ament-Velásquez et al. Reference Ament-Velásquez, Tuovinen, Bergström, Spribille, Vanderpool, Nascimbene, Yamamoto, Thor and Johannesson2021).
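The check for internal stop codons mentioned above can be illustrated with a small Python sketch that translates a nucleotide sequence in a single reading frame and counts stops before the final codon. It assumes the standard nuclear genetic code and an uppercase or lowercase DNA string; the function names and the example sequence are hypothetical and are not taken from the L. neglecta MAT region.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 0; '*' marks a stop codon, 'X' an ambiguous codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3   # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


def internal_stops(protein: str) -> int:
    """Count stop codons other than one in the final position."""
    return protein[:-1].count("*")


if __name__ == "__main__":
    toy = "ATGTAAGGGTAG"               # toy sequence with one premature stop
    protein = translate(toy)
    print(protein, internal_stops(protein))
```

A sequence whose translation contains stops before the last codon, as in the toy example, cannot encode a full-length protein in that frame, which is the kind of evidence used to call a putative pseudogene.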
Discussion
The Lepraria neglecta genome assembly presented here, being complete and contiguous telomere-to-telomere for nearly every chromosome, represents a major advancement in resources for the study of fungi. Our results suggest that the L. neglecta genome is organized into eight chromosomes, and we successfully mapped telomere sequences to both ends of six of the eight chromosomes and to one end of two of the eight (Fig. 1). One of the contigs for which telomeres were not mappable to one end was the longest contig of the assembly (6 292 759 bp) and the assembly is not fully resolved through the subtelomeric repeats on one end of the contig into the telomere region. The other contig lacking a telomere on one end was also large (4 487 623 bp) and adjacent to a telomeric graph edge, though no reads spanned from the telomere into the contig. Thus, we are confident that this is a near-full chromosome contig that will be resolvable through both telomeres with the acquisition of additional data. By comparison, this same telomere analysis of the Allen et al. (Reference Allen, Jones and McMullin2021) Bacidia gigantensis genome assembly recovered 10 contigs that included telomeres on both ends, 13 that included telomeres on one end and one scaffold that included no telomeres (see assembly graph and mapped telomeric reads in Supplementary Material File S2, available online). An assembly of the Umbilicaria pustulata (L.) Hoffm. genome from long-read data recovered 7 scaffolds, which may represent chromosomes once telomere mapping is conducted (Tzovaras et al. Reference Tzovaras, Segers, Bicker, Dal Grande, Otte, Anvar, Hankeln, Schmitt and Ebersberger2020). Our chromosome count falls well within the range of previously published literature reporting 2–21 chromosomes in diverse species of fungi (Table 5; Supplementary Material File S2). Flow cytometry is the preferred method for measuring genome size and chromosome number due to its accuracy and precision (D'hondt et al. Reference D'hondt, Höfte, Van Bockstaele and Leus2011; Talhinhas et al. Reference Talhinhas, Tavares, Ramos, Goncalves and Loureiro2017), but it is not always feasible to implement (Poma et al. Reference Poma, Pacioni, Ranalli and Miranda1998). Short-read genome assemblies are inaccurate and imprecise for measuring genome size and inferring chromosome numbers in comparison to flow cytometry (Kooij & Pellicer Reference Kooij and Pellicer2020). In contrast, highly complete genome assemblies from long-read sequencing data are one approach to more accurately infer chromosome numbers without flow cytometry. The success of using genome assemblies to infer chromosome numbers may be inconsistent since chromosome numbers among cells in the same fungus can vary, as shown in an early account of meiosis in Neottiella rutilans (Fr.) Dennis (Rossen & Westergaard Reference Rossen and Westergaard1966). However, given a sample with homogeneous chromosome counts, long-read data have proved useful for telomere-to-telomere assemblies.
Mating-type locus
The conservation of a complete mating-type locus, including multiple MAT1-2 genes in a lichenized fungal lineage that has never been observed to produce sexual reproductive structures, raises questions about reproduction in putatively asexual lichens and the functions of MAT genes in fungi generally. The presence of a typical mating-type locus that matches the structure and content of other lichenized fungi suggests that L. neglecta may have the genetic capacity to reproduce sexually (Scherrer et al. Reference Scherrer, Zippler and Honegger2005; Wang et al. Reference Wang, Wang, Xiong, James and Zhang2016; Aylward et al. Reference Aylward, Havnga, Dreyers, Roets, Wingfield and Wingfield2020). If a functional copy of the MAT1-1 gene is truly missing in this individual, the possibility of primary homothallism can be eliminated, leaving the potential for heterothallism, unisexuality and asexuality to be falsified in this taxon. Lepraria neglecta could be heterothallic if a complete MAT1-1 idiomorph is discovered in a different individual, as was found in Letharia spp., Lobaria pulmonaria (L.) Hoffm. and many other Lecanoromycetes (Honegger et al. Reference Honegger, Zippler, Gassner and Scherrer2004; Singh et al. Reference Singh, Dal Grande, Cornejo, Schmitt and Scheidegger2012; Ament-Velásquez et al. Reference Ament-Velásquez, Tuovinen, Bergström, Spribille, Vanderpool, Nascimbene, Yamamoto, Thor and Johannesson2021). The presence of a MAT1-1-x pseudogene has been previously observed in Parmeliaceae, Cladoniaceae and Umbilicariaceae, and may not be an anomaly in heterothallic fungi (Armaleo et al. Reference Armaleo, Müller, Lutzoni, Andrésson, Blanc, Bode, Collart, Dal Grande, Dietrich and Grigoriev2019; Pizarro et al. Reference Pizarro, Dal Grande, Leavitt, Dyer, Schmitt, Crespo, Lumbsch and Divakar2019; Ament-Velásquez et al. Reference Ament-Velásquez, Tuovinen, Bergström, Spribille, Vanderpool, Nascimbene, Yamamoto, Thor and Johannesson2021). 
This phenomenon has also been observed in non-lichenized fungi such as Grosmannia clavigera (Rob.-Jeffr. & R.W. Davidson) Zipfel et al., in which the MAT1-2 idiomorph includes a truncated copy of MAT1-1 (Tsui et al. Reference Tsui, Diguistini, Wang, Feau, Dhillon, Bohlmann and Hamelin2013). Alternatively, if the MAT1-1-x pseudogene is the only copy of MAT1-1 in all individuals of L. neglecta, the only remaining potential reproductive modes are unisexuality and asexuality. Neurospora africana L.H. Huang & Backus, a filamentous ascomycete, was the first described example of unisexuality (Glass & Smith Reference Glass and Smith1994). Similarly, Huntiella moniliformis (Hedgc.) Z.W. de Beer et al., another non-lichenized ascomycete, possesses only the MAT1-2 idiomorph and is capable of unisexual reproduction (Wilson et al. Reference Wilson, Gondlonton, van der Nest, Wilken, Wingfield and Wingfield2015a). It is unclear whether the remaining hypotheses of sexuality for Lepraria spp. are testable and falsifiable. Future research seeking to test these hypotheses may require highly complete genomic resources.
The presence of functional mating-type genes in L. neglecta does not necessarily mean that it is capable of sexual reproduction. MAT1 genes code for transcription factors which can regulate the expression of potentially hundreds of pheromone and pheromone-receptor genes (Pöggeler et al. Reference Pöggeler, Nowrousian, Ringelberg, Loros, Dunlap and Kück2006; Bidard et al. Reference Bidard, Benkhali, Coppin, Imbeaud, Grognet, Delacroix and Debuchy2011; Böhm et al. Reference Böhm, Hoff, O'Gorman, Wolfers, Klix, Binger, Zadra, Kürnsteiner, Pöggeler and Dyer2013; Wilson et al. Reference Wilson, Gabriel, Singer, Schuerg, Wilken, van der Nest, Wingfield and Wingfield2021). These downstream pathways have been well characterized in some model fungi but remain unexplored for most taxa. Fusarium oxysporum Schltdl. is an ascomycete plant symbiont which uses MAT-derived pheromones and autocrine pheromone signalling pathways for density-dependent regulation of conidial sporulation (Vitale et al. Reference Vitale, Pietro and Turrà2019). Saccharomyces cerevisiae MAT-α cells produce α-factor pheromones which induce apoptosis in MAT-a cells under certain physiological conditions (Severin & Hyman Reference Severin and Hyman2002). The apparently universal conservation of the MAT1 locus among dikaryotic fungi implies that the MAT1 genes possess functions essential to species survival. Comparative genomic analysis of more Lepraria species and populations will be necessary to begin elucidating the functions of conserved MAT genes in this genus of presumed asexual lichens.
Conclusion
Our analysis demonstrates the efficacy of long-read sequencing from heterogeneous metagenomic samples of the lichen symbiosis. With limited time and resources, we produced highly contiguous genomic assemblies, including six of eight nuclear mycobiont chromosomes resolved telomere-to-telomere. We additionally confirmed the presence of an intact mating-type locus in a fungal lineage presumed to be exclusively asexual. Without the need for axenic cultures, long-read sequencing can simultaneously generate genomic data for the mycobiont, photobiont and other symbionts, and rapidly advance our knowledge of lichen biology and fungal genetics.
Acknowledgements
We are grateful for the comments and suggestions from two anonymous reviewers that substantially improved the manuscript, and bioinformatics guidance and support from Sean McKenzie. Funding for this research was provided by Eastern Washington University and a National Science Foundation Grant to JLA (DEB #2115191).
Author ORCID
Jessica L. Allen, 0000-0002-6152-003X.
Competing Interests
JLA jointly owns stock options in Oxford Nanopore Technologies.
Data Availability
All data associated with this project are deposited in NCBI under BioProject PRJNA887108. Reads are available in the Sequence Read Archive (SRR21853138). The organellar genome sequence assemblies are associated with the following GenBank Accession numbers: 1) Cladonia sp. mitochondrial GenBank ID OP696659, 2) Asterochloris sp. mitochondrial OP696658, 3) Asterochloris sp. chloroplast OP696657, 4) Lepraria neglecta mitochondrial OP696656.
Supplementary Material
To view Supplementary Material for this article, please visit https://doi.org/10.1017/S002428292200041X.