Phylogenetic relationships of the liverworts (Hepaticae), a basal embryophyte lineage, inferred from nucleotide sequence data of the chloroplast gene rbcL.
Sequence data from the chloroplast-encoded gene rbcL were obtained for 24 liverworts, a basal group of embryophytes. Maximum likelihood and parsimony analyses of these data, along with data from other major green plant lineages, confirm hypotheses based on morphological data, such as the paraphyly of bryophytes and the basal position of liverworts. Molecular data corroborate the deep separation between the complex thalloid and leafy/simple thalloid liverworts implied by morphological data, but the monophyly of liverworts could not be rejected. The effects of accounting for site-to-site rate heterogeneity in these data were examined using maximum likelihood methods. Comparison of trees obtained with and without rate heterogeneity showed that simply allowing for heterogeneity improved the likelihood score more than optimizing the transition/transversion bias did. Incorporation of site-to-site rate heterogeneity in the larger analysis, however, did not necessarily change which topology was favored. Properties of rbcL sequences from the two liverwort groups were compared. Substitution rates differed significantly between leafy/simple thalloid and complex thalloid liverwort taxa: rates of rbcL sequence evolution were higher in leafy/simple thalloid taxa, and more similar to those of vascular plants, and slower in complex thalloid taxa (such as Marchantia). Codon usage in rbcL in complex thalloid liverworts was biased toward NNU and NNA relative to the leafy/simple thalloid liverworts. Although base composition and relative substitution rates differed between the two groups, no significant differences were detected within either group. The signal present in first and second codon sites versus third codon sites was compared. Although the third codon positions in rbcL across this taxon sampling are highly variable (with only 15 constant sites of 439), the trees obtained from them were in general agreement with trees from the entire data set and with trees obtained from independent sources of data. The presence of signal in third codon positions across more than 400 MY of plant evolution means that definitions of saturation based on pair-wise comparisons of sequences inadequately assess phylogenetic signal.
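The third-codon-position summary mentioned in the abstract (number of constant sites, base composition at NNU/NNA-type positions) can be illustrated with a minimal sketch. This is not the authors' method or software: the function names and the toy alignment below are hypothetical, and the sketch assumes in-frame, gap-free, equal-length DNA sequences.

```python
from collections import Counter

def third_positions(seq):
    """Return the third base of every complete codon (assumes seq is in frame)."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return seq[2:usable:3]

def third_position_summary(aligned):
    """Summarize third codon positions of an alignment (hypothetical helper).

    Returns (number of third-position sites, number of constant sites,
    Counter of base composition over all third positions).
    """
    thirds = [third_positions(s.upper()) for s in aligned]
    columns = list(zip(*thirds))  # one tuple of bases per alignment column
    constant = sum(1 for col in columns if len(set(col)) == 1)
    composition = Counter("".join(thirds))
    return len(columns), constant, composition

# Toy alignment, invented for illustration only (not rbcL data).
toy = ["ATGTCACCA", "ATGTCTCCA", "ATGTCACCT"]
n_sites, n_constant, composition = third_position_summary(toy)
print(n_sites, n_constant, dict(composition))
```

Applied to a real alignment, the ratio of constant sites to total third-position sites gives the kind of figure quoted above (15 of 439), and the composition counter gives a first look at third-position base bias.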
Lewis, LA; Mishler, BD; Vilgalys, R