July 20, 2007
Phylogeny of Extant and Fossil Juglandaceae Inferred From the Integration of Molecular and Morphological Data Sets
By Manos, Paul S; Soltis, Pamela S; Soltis, Douglas E; Manchester, Steven R; et al.
Abstract.- It is widely acknowledged that integrating fossils into data sets of extant taxa is imperative for proper placement of fossils, resolution of relationships, and a better understanding of character evolution. The importance of this process has been further magnified because of the crucial role of fossils in dating divergence times. Outstanding issues remain, including appropriate methods to place fossils in phylogenetic trees, the importance of molecules versus morphology in these analyses, as well as the impact of potentially large amounts of missing data for fossil taxa. In this study we used the angiosperm clade Juglandaceae as a model for investigating methods of integrating fossils into a phylogenetic framework of extant taxa. The clade has a rich fossil record relative to low extant diversity, as well as a robust molecular phylogeny and morphological database for extant taxa. After combining fossil organ genera into composite and terminal taxa, our objectives were to (1) compare multiple methods for the integration of the fossils and extant taxa (including total evidence, molecular scaffolds, and molecular matrix representation with parsimony [MRP]); (2) explore the impact of missing data (incomplete taxa and characters) and the evidence for placing fossils on the topology; (3) simulate the phylogenetic effect of missing data by creating "artificial fossils"; and (4) place fossils and compare the impact of single and multiple fossil constraints in estimating the age of clades. Despite large and variable amounts of missing data, each of the methods provided reasonable placement of both fossils and simulated "artificial fossils" in the phylogeny previously inferred only from extant taxa. Our results clearly show that the amount of missing data in any given taxon is not by itself an operational guideline for excluding fossils from analysis. 
Three fossil taxa (Cruciptera simsonii, Paleoplatycarya wingii, and Platycarya americana) were placed within crown clades containing living taxa for which relationships previously had been suggested based on morphology, whereas Polyptera manningii, a mosaic taxon with equivocal affinities, was placed firmly as sister to two modern crown clades. The position of Paleooreomunnea stoneana was ambiguous with total evidence but conclusive with DNA scaffolds and MRP. There was less disturbance of relationships among extant taxa using a total evidence approach, and the DNA scaffold approach did not provide improved resolution or internal support for clades compared to total evidence, whereas weighted MRP retained comparable levels of support but lost crown clade resolution. Multiple internal minimum age constraints generally provided reasonable age estimates, but the use of single constraints provided by extinct genera tended to underestimate clade ages. [Angiosperm; DNA scaffolds; Fagales; fossils; matrix representation parsimony; molecular dating; simulation; total evidence.]
The importance and relevance of fossils when estimating phylogenetic relationships among living organisms have been a subject of controversy (Donoghue et al., 1989; Huelsenbeck, 1991; Novacek, 1992; Nixon, 1996; Brochu, 1997; Gatesy and O'Leary, 2001). The recent debate was apparently sparked by comments initially made two decades ago suggesting that fossils are relatively uninformative in phylogenetic analysis (Paterson, 1981; Ax, 1987). Countering this claim, early studies in two major clades, seed plants and amniotes, demonstrated that fossils are an important contribution to phylogenetic reconstructions because they often possess unique combinations of characters that clarify relationships among living and extinct taxa, potentially leading to new hypotheses that were not recovered by analyses of only living taxa (e.g., Crane, 1985; Doyle and Donoghue, 1986; Gauthier et al., 1988).
Phylogenetic studies that include both extant and fossil representatives are essential for identifying the closest living relatives of fossils (Keller et al., 1996; Magallon-Puebla et al., 1996; Shaffer et al., 1997; Herendeen and Wheeler, 1999; Crepet et al., 2004), for improving phylogenetic resolution and accuracy (Gauthier et al., 1988; Sun et al., 2002; Gatesy et al., 2003; Meyer and Zardoya, 2003; Rothwell and Nixon, 2006), and for clarifying morphological character evolution (e.g., Gatesy and Dial, 1996; Brochu, 1997; Crane et al., 2004; Clarke and Middleton, 2006). The desire to include fossils in such analyses has led to a proliferation of methodological and analytical issues, including how best to code fossil data (Doyle and Donoghue, 1987; Nixon, 1996), the effects of missing data on tree stability and branch support (Donoghue et al., 1989; Huelsenbeck, 1991; Nixon and Davis, 1991; Nixon and Wheeler, 1992; Novacek, 1992; Wilkinson, 1995; Nixon, 1996; Shaffer et al., 1997; Wiens, 1998, 2003a, 2003b, 2005; Anderson, 2001; Kearney, 2002), and predicting the frequency of new character states among fossil taxa (Wagner, 2000).
Most examples of incorporating fossils into phylogenetic analysis with extant species involve parsimony analyses of exclusively morphological data sets in which both fossil and extant taxa are scored for characters that "fossilize" (preserve readily in fossils; e.g., Crane, 1985; Doyle and Donoghue, 1986; Kenrick and Crane, 1997; Stockey et al., 1997; Grande and Bemis, 1998). With the rise of molecular systematics, well-supported phylogenies have been produced for many groups based on DNA sequences, as well as combined morphological and molecular data sets. Because fossil taxa preserve fewer morphological and anatomical characters than are available from living taxa, and generally lack DNA entirely, combined analyses with morphological and molecular data sets that include fossil taxa will inevitably lack enormous amounts of data. Thus far, the few studies taking this combined approach have not addressed how best to deal with this issue and have simply resorted to the total evidence paradigm, scoring fossils in combined analyses with missing values in the DNA data sets.
Under parsimony, missing data do not contribute to minimizing the cost of the reconstruction, but they may introduce uncertainty into the analysis (Nixon, 1996; Kearney, 2002; Wiens, 2003a, 2003b, 2005). Simulation studies have demonstrated that the "missing data problem" diminishes as the number of characters increases (Wiens, 2003a, 2003b); that is, the uncertainty contributed by missing data is ameliorated by the addition of more characters. Further, the problem of placing fossils is more a result of too few characters to link fossils to modern taxa than too many missing data points that destabilize the overall topology. Wiens (2003a, 2003b) therefore suggested that the best strategy for including incomplete taxa, whether fossil or modern, is to increase the number of characters. His simulations showed that with as few as 100 characters, a "fossil" taxon could be placed correctly, even if many characters were missing. Whereas it may be possible to obtain 100 or more characters within several vertebrate clades including fossils (e.g., Brochu, 1997; Shaffer et al., 1997; O'Leary, 1999, 2001; Gatesy et al., 2003; Mayr and Clarke, 2003; Santini and Tyler, 2004; Clarke et al., 2005; Demere et al., 2005), similar sized matrices for modern and related fossil plants are less common (e.g., Nixon et al., 1994; Rothwell, 1999; Eklund et al., 2004), and more typically less than half that many characters have been scored for some of the most notable fossil flowering plant examples (Keller et al., 1996; Magallon-Puebla et al., 1996; Sun et al., 2002; Gandolfo et al., 2004). Because Wiens' simulations typically use a known phylogeny, there is still much to be gained by exploring the behavior of known and artificially derived fossils in phylogenetic analysis.
To date, several botanical studies have included both fossils and DNA data in combined phylogenetic analyses (e.g., Sun et al., 2002; Hermsen et al., 2003, 2006; Gandolfo et al., 2004; Crepet et al., 2005; Xiang et al., 2005; Rothwell and Nixon, 2006; Magallon, 2007). However, controversy surrounds the phylogenetic placement of certain critical plant fossils (e.g., Friis et al., 2003a; Gandolfo et al., 2004), and the inclusion of fossils in analyses with extant taxa has challenged the results of molecular phylogenies based on extant taxa (e.g., Fryer et al., 2001; Rothwell and Nixon, 2006). These examples therefore illustrate crucial issues in the placement of fossils, a major one being that with limited data, fossil placement is often largely influenced by the interpretations of a few relevant fossilizable data points. In addition, employing large molecular data sets appears to stabilize the relationships of modern taxa, despite the scoring of fossils as missing in these large data partitions.
The angiosperm clade Juglandaceae (walnut family) represents an excellent model for investigating the integration of fossils into a phylogenetic framework of extant taxa as it has a rich fossil record relative to low extant diversity, an excellent morphological database, and a well-supported molecular phylogeny for extant taxa. Comprising a well-supported clade of nine genera and 60 species of trees mostly distributed throughout mid and low latitudes of the Northern Hemisphere, Juglandaceae have an excellent fossil record that includes both extant and extinct genera (reviewed by Dilcher et al., 1976; Manchester, 1987, 1991; Elliott et al., 2006). Diagnostic fossil remains of fruits, flowers, pollen, leaves, and foliage (organ genera) facilitate recognition of former geographic ranges of extant and extinct genera and provide a temporal framework for interpretation of the phytogeography and phylogeny of the Juglandaceae (Dilcher et al., 1976; Manchester, 1987; Manchester and Dilcher, 1997). Recent studies have produced a phylogenetic framework for the extant members of the family based on nucleotide regions from two genomes (ca. 2000 base pairs total) and morphology (Manos and Stone, 2001), but a comprehensive synthesis of extant and extinct diversity (Fig. 1) awaits phylogenetic analysis. After justifying our approach to combining organ genera into composite and terminal taxa to be incorporated into phylogenetic analysis, we then compare multiple methods to integrate five fossil taxa with extant Juglandaceae and explore the impact of missing data (incomplete taxa) on the topology. We also simulate the effects of missing data by examining the effects of artificially derived fossils in the context of our combined morphological and molecular data sets. Lastly, with real fossils placed by phylogenetic analysis, we compare the impact of single and multiple fossil constraints in estimating the node ages of clades in Juglandaceae.
MATERIALS AND METHODS
Phylogeny and Sampling of Extant Taxa
Juglandaceae are monophyletic and comprise nine extant genera and approximately 60 species of trees (Manos and Steele, 1997; Li et al., 2004). Recent phylogenetic studies based on morphology and DNA sequences have resolved relationships among 25 species that represent all genera and all major groups within genera (Manos and Stone, 2001). To augment the taxon sampling of Manos and Stone (2001) and set the stage for integrating fossil taxa, we included nine additional accessions (collected by P. S. Manos and D. E. Stone) to produce the revised phylogeny of living Juglandaceae presented here. We added more species and intraspecific samples of the subfamily Engelhardioideae, especially within the taxonomically difficult genus Engelhardia sensu lato. Intraspecific variation also was assessed within Oreomunnea mexicana, and two individuals of the previously unsampled O. pterocarpa were included as well. For the subfamily Juglandoideae, another individual of Juglans regia was examined. Voucher information and GenBank accession numbers for most of the taxon sample are presented in Manos and Stone (2001). The same information for the additional taxa is presented in Appendix 1.
Methods of DNA extraction, PCR, and sequencing of the nuclear ribosomal ITS region and chloroplast DNA intergenic spacers (rbcL/atpB and trnL/trnF) follow Manos and Stone (2001). In total, 34 accessions are included, representing 27 species of the modern family Juglandaceae. The optimal outgroup for the family is its sister group, Rhoiptelea chiliantha, as determined by molecular phylogenetic studies of Fagales (Manos and Steele, 1997; Li et al., 2004).
Choice of Fossil Taxa
Juglandaceae have an extensive fossil record in the Northern Hemisphere that includes fruits, flowers, pollen, leaves, and wood of both extant and extinct genera. We selected five fossil fruit taxa and their associated floral and vegetative organs for inclusion in the analysis (see Figs. 1, 2): Polyptera manningii (Manchester and Dilcher, 1982, 1997), Paleooreomunnea stoneana (Dilcher et al., 1976), Cruciptera simsonii (Manchester, 1991), Paleoplatycarya wingii (Manchester, 1987), and Platycarya americana (Wing and Hickey, 1984). These fossils were selected based on the relatively complete morphological data sets that could be compiled for them from well-preserved and thoroughly investigated suites of specimens. Each of the selected taxa also provides extinct character combinations not seen in any living representative of the family. Thus, they provide additional taxonomic diversity for the morphological character matrix. Discussion of the criteria for the reconstruction of fossil taxa, a summary describing the five fossils, and character-state coding are presented in Appendix 2, available at http://sytematicbiology.org.
Our matrix of 64 floral, vegetative, and fruit characters incorporates the 50-character matrix of Manos and Stone (2001) supplemented by 14 additional characters. The newly added characters, such as fruit wing venation, mean leaflet number, spacing of teeth on the lamina, prevalence of intersecondary veins, and size range of epidermal scales, include features of apparent systematic utility often preservable in fossils. These new characters and any modifications on coding the original 50 characters are presented in Appendix 2 (online), along with justifications for scoring decisions regarding the fossils. Relative amounts of missing data in total, by fossil taxon, and by character sets were quantified. For the fossil taxa, we also record the percentage of new character states and new character-state combinations relative to extant taxa. The morphological and molecular matrix and representative trees are available from TreeBASE under study accession number S1678 and matrix accession number M3036.
We used parsimony to build trees based on the morphological and molecular data sets. Tree space was searched using the following options as implemented in PAUP* (Swofford, 2002): Branch and Bound or heuristic searches with 1000 random addition replicates, tree bisection-reconnection (TBR) branch swapping, MULTREES, and the steepest descent options. Branches with a minimum length of zero were collapsed using the "AMB-" option (Nixon and Carpenter, 1996). Analyses of morphological data used equally weighted characters, and trees were constructed for extant taxa only and for extant + fossil taxa. Trees were evaluated for standard measures of fit and for resolving power, measured as the total number of nodes supported in the strict consensus.
Incongruence between the molecular data sets derived for extant taxa was evaluated using the incongruence length difference (ILD) test of Farris et al. (1994), as implemented in PAUP*. Bootstrap analysis (Felsenstein, 1985) using 1000 replicates and full heuristic searches saving all trees was used to measure relative support.
Missing data were largely restricted to fossil taxa for which molecular character states could not be assessed and for which some morphological traits could not be observed in the fossil specimens so far available. We evaluated the effects of the missing data in phylogenetic analysis in several ways. First, we focused on the effects of particular taxa with greater than 50% missing data by comparing analyses with complete taxon sampling to those with mostly incomplete fossils excluded. Second, we examined the impact of particular character sets on resolution and placement of fossils. Specifically, morphological data were partitioned according to their source (floral, vegetative, and fruit) and by their relative degree of completeness. For example, "fossilizable" characters corresponded to characters that were scored in nearly all fossil taxa, defined here as characters with complete data present in at least four of the five fossil taxa, and largely encompassing flower and fruit characters that are most commonly recovered in fossil Juglandaceae. This type of partition corresponds to the data used in recent analyses of fossil angiosperms (Sun et al., 2002). Although this process is essentially ad hoc, we emphasize the need to explore the data heuristically in order to understand better the importance of particular characters and character sets to phylogenetic reconstruction in Juglandaceae. Finally, the distribution of morphological apomorphies was explored under ACCTRAN and DELTRAN optimization using the reconstructions provided by PAUP*, and these distributions were evaluated with and without fossils to pinpoint the character states supporting the placement of fossils.
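The "fossilizable" partition defined above (characters scored in at least four of the five fossil taxa) can be illustrated with a minimal Python sketch. The function name, the "?" missing-data symbol, and the five-character toy matrix are assumptions made for illustration only; they are not the scores or the software used in the study.

```python
from typing import Dict, List

MISSING = "?"  # assumed symbol for an unscored cell


def fossilizable_characters(fossil_rows: Dict[str, List[str]],
                            min_scored: int = 4) -> List[int]:
    """Return 0-based indices of characters scored in at least
    `min_scored` of the fossil taxa."""
    n_chars = len(next(iter(fossil_rows.values())))
    keep = []
    for i in range(n_chars):
        scored = sum(1 for row in fossil_rows.values() if row[i] != MISSING)
        if scored >= min_scored:
            keep.append(i)
    return keep


# Hypothetical toy matrix: five fossil taxa by five characters.
toy = {
    "Polyptera":       list("01?10"),
    "Paleooreomunnea": list("0??11"),
    "Cruciptera":      list("1???0"),
    "Paleoplatycarya": list("01?00"),
    "Platycarya_am":   list("11001"),
}

fossilizable = fossilizable_characters(toy)
print(fossilizable)
```

In this toy case only characters 0, 3, and 4 pass the four-of-five threshold; the same filter applied to the real 64-character matrix would yield the partition described in the text.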
Combined analysis of fossil and living taxa for morphological and molecular data was performed using three methods. First, the total evidence approach was implemented by scoring fossils as missing for the DNA data (e.g., Sun et al., 2002). Second, a molecular scaffold was used to place the fossils with morphological data (Springer et al., 2001). This approach maintains the general structure of the molecular tree for modern taxa while allowing for the placement of fossil taxa using a smaller morphological data set. A molecular-based tree was used as a "backbone constraint" during analysis of all taxa using morphology. Trees compatible with the molecular constraint were retained, allowing the fossils to be placed on the scaffold. We constructed a molecular scaffold for 34 samples of modern Juglandaceae, plus the outgroup Rhoiptelea. All clades receiving bootstrap support of >80% in a parsimony search of this molecular data set were retained in the scaffold. Heuristic searches were conducted using 1000 replicates of random taxon addition with TBR branch swapping and saving all most parsimonious trees. Bootstrap analyses were performed on the morphological data without enforcing the backbone constraint, using 1000 replicates and full heuristic searches saving all trees. Third, we used the matrix representation parsimony (MRP) function in PAUP* to produce a matrix of node data by taxon for the strict consensus of the DNA trees (sensu Doyle, 1992; Baum, 1992; Ragan, 1992). Sixty-three binary characters formed the matrix, 27 of which were parsimony-informative. We combined these 27 characters (DNA-MRP) with the morphological data, treating the former as both equally weighted and weighted characters, the latter scaled according to bootstrap values. For example, bootstrap values of 90% to 100% received a weight of 5, whereas a weight of 1 was applied to nodes supported in the strict consensus with bootstrap support between 50% and 59%.
The remaining scaling and weighting procedure was as follows: 60-69% = 2; 70-79% = 3; 80-89% = 4. Under the reasonable assumption that the fossil diversity examined here fits into the two extant lineages of Juglandaceae, an additional binary character was used to place each fossil taxon in the DNA tree at the subfamilial level (Engelhardioideae: Paleooreomunnea; Juglandoideae: Cruciptera, Paleoplatycarya, Platycarya, and Polyptera), but otherwise the fossils were coded as missing for the remaining cells of the matrix representing the DNA topology. Using the three methods, analyses to place the fossils included (i) all five fossils simultaneously; (ii) all but Cruciptera, the fossil with the most missing data; (iii) all but Paleooreomunnea, the fossil with the second largest amount of missing data; and (iv) all but Cruciptera and Paleooreomunnea.
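The MRP coding and the bootstrap-scaled weights described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the PAUP* implementation: the function names and the toy clades are hypothetical, taxa absent from the source tree are coded "?" throughout (the additional subfamily-level character used for the fossils is not modeled), and the behavior for support below 50% is an assumption, since the text gives no weight for such nodes.

```python
from typing import Dict, Iterable, List, Set


def bootstrap_weight(support: float) -> int:
    """Map a bootstrap percentage to a character weight:
    50-59 -> 1, 60-69 -> 2, 70-79 -> 3, 80-89 -> 4, 90-100 -> 5."""
    if not 50 <= support <= 100:
        # Assumption: nodes outside 50-100% are not weighted here.
        raise ValueError("support must be between 50 and 100")
    return min(int(support // 10) - 4, 5)


def mrp_matrix(taxa: Iterable[str],
               clades: List[Set[str]],
               unplaced: Iterable[str] = ()) -> Dict[str, str]:
    """One binary character per clade of the source tree: '1' for
    members, '0' for other taxa in the tree, '?' for taxa (e.g.,
    fossils) that are absent from the source tree."""
    unplaced = set(unplaced)
    matrix = {}
    for taxon in taxa:
        if taxon in unplaced:
            matrix[taxon] = "?" * len(clades)
        else:
            matrix[taxon] = "".join("1" if taxon in c else "0" for c in clades)
    return matrix


# Hypothetical toy source tree with two clades and one unplaced fossil.
taxa = ["Rhoiptelea", "Juglans", "Carya", "Engelhardia", "Alfaroa", "fossil_X"]
clades = [{"Juglans", "Carya"}, {"Engelhardia", "Alfaroa"}]
matrix = mrp_matrix(taxa, clades, unplaced=["fossil_X"])
weights = [bootstrap_weight(s) for s in (95, 57)]  # one weight per clade
print(matrix, weights)
```

Each column of the resulting matrix would then be appended to the morphological characters, either equally weighted or carrying the weight computed from its node's bootstrap value.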
Creating Artificial Fossils
We also explored the effects of missing data on the resulting topologies by using "artificial fossils" generated from the combined morphological and molecular matrix. To generate "artificial fossils," we selected a species from the matrix of modern taxa, duplicated it in the matrix to produce a "fossil" for the original species, removed all of the molecular characters in the "fossil," and replaced them with "?". We then randomly selected 25% of the 64 morphological characters scored for the species and retained them, scoring the remaining 75% of the morphological characters, like the molecular data, as "?". This random selection of morphological characters was repeated to generate 100 replicate data sets, each with a single "artificial fossil," each of which lacked all molecular data and contained a random sample of 25% of the morphological characters. This procedure was repeated to create 100 replicate data sets with "fossils" with 50% of the morphological characters and 75% of the morphological characters, respectively. These data sets have 16, 32, and 48 morphological characters, respectively, numbers of characters that are typical of those used in studies of fossil plants (e.g., Sun et al., 2002; Gandolfo et al., 2004).
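The construction of an "artificial fossil" can be sketched in Python as below. The function name, the placement of morphological characters before the DNA sites, the seeded random generator, and the placeholder character states are assumptions for illustration; the 64 morphological characters, the replacement of all molecular data with "?", and the 100 replicates per retention level follow the description above.

```python
import random
from typing import List


def make_artificial_fossil(morph: List[str], n_dna: int,
                           keep_fraction: float,
                           rng: random.Random) -> List[str]:
    """Copy a living species' morphological scores, keep a random
    `keep_fraction` of them, and score everything else (including
    every DNA site) as missing."""
    n_keep = round(len(morph) * keep_fraction)
    kept = set(rng.sample(range(len(morph)), n_keep))
    masked = [state if i in kept else "?" for i, state in enumerate(morph)]
    return masked + ["?"] * n_dna  # assumed layout: morphology, then DNA


# Hypothetical parent species with 64 fully scored characters.
parent_morph = ["0", "1"] * 32
rng = random.Random(0)  # fixed seed so the sketch is repeatable

replicates = {
    fraction: [make_artificial_fossil(parent_morph, 2006, fraction, rng)
               for _ in range(100)]
    for fraction in (0.25, 0.50, 0.75)
}

# Number of scored morphological characters in the first replicate of each level.
scored = {f: sum(s != "?" for s in reps[0][:64]) for f, reps in replicates.items()}
print(scored)
```

Each replicate row would be added to the combined matrix alongside its parent species before the parsimony search.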
In phylogenetic analyses, each artificial fossil should be placed as the sister to the species from which it was generated. We conducted heuristic parsimony searches of all replicate data sets and recorded the percentage of correct placements. We also recorded the effects that inclusion of these incomplete "fossil" taxa had on the remainder of the tree, both locally (the clade/genus in which the parent species was located) and more distantly. Furthermore, because the effects might differ if the "fossil" belongs in a tip clade or is more deeply nested, we repeated the entire analysis for three species from different "depths" in the tree: artificial fossils were generated from Alfaroa manningii, Platycarya strobilacea, and Juglans regia.
Characters available in fossils may not be a random subset of all morphological characters; for example, a fossil leaf will have only those characters associated with a leaf and will lack all characters obtained from flowers and fruits. The random selection of characters used in the analyses described above does not take this issue into account directly (although we argue that all characters ultimately come from a universe of independent characters; see Discussion). To mimic the more realistic situation of a fossil that possesses only a suite of organ-specific characters, we also have examined the effect of using a single class of characters (vegetative, floral, or fruit) on the placement of the artificial fossils. Specifically, for each artificial fossil (Alfaroa manningii "fossil," Juglans regia "fossil," and Platycarya strobilacea "fossil"), we made three additional data sets: one each with only vegetative characters (26 characters), only floral characters (27 characters), and only fruit characters (10 characters). We then analyzed each of these nine matrices (100 random addition replicates, TBR branch swapping, saving all most parsimonious trees) to see the effect of these character partitions.
Estimation of Divergence Times
Divergence times in Juglandaceae were estimated employing the penalized likelihood method implemented in the program "r8s," which allows rates to vary across a phylogeny (Sanderson, 2002). For this analysis, we generated the maximum likelihood (ML) tree from the molecular data with an additional, more distant (but related) outgroup, Carpinus of the Betulaceae, a member of the sister clade to the Juglandaceae/Rhoipteleaceae clade, to accurately place the root node of the Juglandaceae (Manos and Steele, 1997; Li et al., 2004). Homologous sequences of all three regions sampled for Juglandaceae were available in GenBank. Carpinus rankanensis was used for the atpB-rbcL (AY014606) and trnL-trnF (AF200933) regions and C. polyneura was included for the ITS region (AF081517). The ML tree was generated in PAUP* through heuristic searches with 100 replicates of random taxon addition, TBR branch swapping, and MULTREES. The GTR+I+Γ model (Swofford et al., 1996) with six rate parameters, a proportion of invariable sites (0.4380), and the gamma-shape parameter (α = 0.679) was used, as determined by the hierarchical likelihood ratio test using ModelTest 3.06 (Posada and Crandall, 1998).
The molecular clock model for the ML tree was tested using the likelihood ratio test (Felsenstein, 1981). Confidence intervals of the divergence times were estimated by the nonparametric bootstrap procedure (Baldwin and Sanderson, 1998). One hundred bootstrap data matrices were generated in PHYLIP (Felsenstein, 2002), and branch lengths of the original ML topology for each bootstrap data matrix were estimated using PAUP* under the previously described GTR+I+Γ model. These bootstrap trees, with identical topology but differing branch lengths, were included as the source trees in the analysis of divergence times. The most distant outgroup Carpinus was pruned before the analysis. The optimal smoothing parameter of 100, determined by the cross-validation procedure using the Truncated Newton (TN) algorithm (Sanderson, 2002) from the original ML tree, was set for all source trees. The age of the Juglandaceae was fixed at 78 million years before the present (MYBP) based on fossil flowers and fruits of Caryanthus Friis, a likely sister lineage of Juglandaceae (Friis, 1983; Manchester, 1999; Sims et al., 1999). Five different combinations of minimum age constraints (Table 1) based on phylogenetic hypotheses of modern and fossil taxa were analyzed in order to examine whether known fossil ages were consistent with estimates obtained using different constraint schemes.
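Once each of the 100 bootstrap trees has been dated, a confidence interval for a node can be summarized from the resulting list of ages. The text does not state how the bootstrap ages were summarized, so the percentile interval below is only one plausible choice; the function name and the placeholder ages are hypothetical and are not results from the study.

```python
from typing import List, Tuple


def percentile_interval(ages: List[float], level: float = 0.95) -> Tuple[float, float]:
    """Return a simple percentile interval by trimming an equal number
    of the smallest and largest bootstrap age estimates."""
    xs = sorted(ages)
    n = len(xs)
    n_trim = int((1.0 - level) / 2.0 * n)  # values dropped from each tail
    return xs[n_trim], xs[n - 1 - n_trim]


# Placeholder node ages (in MYBP) for 100 bootstrap trees; not real estimates.
placeholder_ages = [float(a) for a in range(1, 101)]
interval = percentile_interval(placeholder_ages)
print(interval)
```

With 100 replicates and a 95% level, this trims the two smallest and two largest estimates; a mean with standard deviation across replicates would be an equally defensible summary.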
The revised morphological matrix constructed for assessing the phylogeny of the 27 extant and five fossil taxa of Juglandaceae consisted of 64 characters. Phylogenetic information was present in 53 characters in analyses without fossils and in 56 characters with fossils. For the fossils, of the possible 280 parsimony-informative cells of the matrix (5 taxa x 56 characters), 143 cells were scored as missing (51%). In considering all 64 characters potentially observable in the fossils or 320 cells of the matrix (5 taxa x 64 characters), roughly 50% of the data were missing. For fossil taxa, the percentage of missing data in the parsimony-informative matrix ranged from 41% in Polyptera to 73% in Cruciptera, and partitioning the data by character sets revealed that vegetative characters were more commonly missing relative to floral and fruit characters (Table 2). The amount of "fossilizable" data per character set was 31% (vegetative), 60% (floral), and 58% (fruit).
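The missing-data percentage quoted above follows directly from the cell counts. The short Python check below uses only the numbers given in the text (143 missing cells, 5 fossil taxa, 56 parsimony-informative characters); the function name is hypothetical.

```python
def percent_missing(n_missing: int, n_taxa: int, n_chars: int) -> float:
    """Percentage of matrix cells scored as missing."""
    return 100.0 * n_missing / (n_taxa * n_chars)


total_cells = 5 * 56                  # 280 parsimony-informative cells
pct = percent_missing(143, 5, 56)     # about 51%
print(total_cells, round(pct, 1))
```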
Sequence data for the nine additional accessions of Juglandaceae were aligned manually with the existing matrices published by Manos and Stone (2001), where the general pattern of sequence divergence and the congruence between data sets were described. The three aligned nucleotide regions were expanded slightly to accommodate new indel regions introduced by the new sequences, for a total of 2006 sites. No regions within the final, aligned sets of sequences were removed, and all resulting gaps were coded as missing data in subsequent analyses.
Extant taxa.-Parsimony analysis of the morphological data set yielded 26 trees of 134 steps (CI = 0.61; RI = 0.92; RC = 0.57), and the strict consensus shows a basal trichotomy (Fig. 3a). The molecular data set accounted for 231 parsimony-informative characters and generated two trees of 492 steps (CI = 0.62; RI = 0.90; RC = 0.55), and the strict consensus shows two major clades (Fig. 3b). Combined analysis based on 284 parsimony-informative characters generated three trees of 633 steps (CI = 0.61; RI = 0.90; RC = 0.55; Fig. 3c), seven steps longer than the additive tree length derived from separate analyses (I_MF = 0.01); the hypothesis of incongruence between the molecular and morphological data sets as measured by the ILD test was rejected (P = 0.19). Tree resolution was lower for the morphological data, partly due to the invariant nature of character states scored for species within certain genera (e.g., Carya, Engelhardia, Oreomunnea); molecular and combined data completely resolved all but one and three nodes, respectively.
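The tree-length comparison above can be checked with a few lines of Python. The sketch assumes that the reported index of 0.01 is the Mickevich-Farris incongruence measure, taken here as the extra steps in the combined analysis divided by the combined tree length; the function name is hypothetical, and the tree lengths (134, 492, and 633 steps) are those given in the text.

```python
from typing import List, Tuple


def incongruence_index(separate_lengths: List[int],
                       combined_length: int) -> Tuple[int, float]:
    """Return the extra steps required by the combined analysis and
    those extra steps as a fraction of the combined tree length."""
    extra_steps = combined_length - sum(separate_lengths)
    return extra_steps, extra_steps / combined_length


extra, index = incongruence_index([134, 492], 633)
print(extra, round(index, 2))
```

The separate analyses sum to 626 steps, so the combined trees are seven steps longer, which is consistent with the value of 0.01 reported above.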
Congruence among trees derived from separate and combined analysis was widespread, and differences between the DNA and morphology trees were not strongly supported (Fig. 3a, b). Similarities among the trees were many, especially the lack of support for the monophyly of species of Oreomunnea. There was moderate support across all three analyses (57%/70%/75%, see Fig. 3a-c) for a paraphyletic Engelhardia sensu lato, with Alfaropsis roxburghiana (formerly Engelhardia roxburghiana) resolved as sister to the Alfaroa + Oreomunnea clade. This provides further evidence for recognizing the genus Alfaropsis (Iljinskaya, 1993), which had been previously placed in the monotypic Engelhardia section Psilocarpeae (Manning, 1978). Although the position of Platycarya based on morphology was equivocal (Fig. 3a), DNA data placed it as sister to a clade of Juglans, Pterocarya, and Cyclocarya (Fig. 3b), whereas in the combined analysis it was sister to the remaining juglandoid taxa (Fig. 3c). Relationships among the lineages within this clade were varied, yet weakly supported. The weak support for the nonmonophyly of Juglans (57%) based on previous DNA analyses appears to be attributable to rate heterogeneity in the ITS region (Manos and Stone, 2001).
Integrating fossil taxa.-Morphological analyses including all five fossils generated 245 trees of 149 steps (CI = 0.57; RI = 0.90; RC = 0.52) distributed in 11 islands. The strict consensus (Fig. 4a) is collapsed except for two nodes that are not fully resolved within the clade corresponding to subfamily Juglandoideae (see Figs. 1, 3c). Fossil taxa, with the exception of Paleooreomunnea, were placed within the Juglandoideae clade, and for the two nodes resolved, the fossils Platycarya americana and Paleoplatycarya (platycaryoids) were placed with the extant species, Platycarya strobilacea. The fossils Polyptera and Cruciptera formed a polytomy with extant species of the Juglandeae clade. Bootstrap support decreased for many clades resolved with high support in the morphological analysis of modern taxa (see Fig. 3a). With the addition of fossils, four deep nodes were lost in the strict consensus tree.
Analyses excluding the fossil genus Cruciptera resulted in 71 trees on five islands (148 steps). Resolution in the strict consensus was improved within the juglandoid clade, and Polyptera was resolved as sister to the clade of modern Juglandeae (results not shown). Excluding Paleooreomunnea yielded one island of 34 trees (146 steps) with resolution as in Figure 4a, but the engelhardioid clade remained intact. Analyses excluding both Cruciptera and Paleooreomunnea resulted in one island of 10 trees (145 steps): the strict consensus showed a similar level of resolution within Juglandeae, including the position of Polyptera (as above), and improved resolution within the engelhardioid clade (results not shown).
Total evidence.-Combined analysis with fossils coded as missing for the DNA data resulted in three trees of 646 steps (CI = 0.60; RI = 0.90; RC = 0.54) and a completely resolved strict consensus except for the relationships within clades of Juglandinae and New World Carya (Fig. 4b). Bootstrap support for the engelhardioid clade decreased with the inclusion of fossils (61% versus 100% without fossils). The placement of Paleooreomunnea was equivocal based on bootstrap support (
Analyses excluding Cruciptera resulted in a single, completely resolved tree, except for relationships within both the New World clade of Carya and the Juglandinae clade, and low bootstrap support for the engelhardioid clade. Cyclocarya + Pterocarya were sister to Juglans with moderate support (results not shown), a relationship not found in the combined analysis of extant taxa (see Fig. 3c) but present in the strict consensus of the DNA trees (Fig. 3b). Excluding Paleooreomunnea yielded three trees and a consensus identical to Fig. 4b. In contrast to the bootstrap analysis including all fossils or without only Cruciptera, removal of Paleooreomunnea resulted in 100% support for the engelhardioid clade. Removal of both Cruciptera and Paleooreomunnea also resulted in a single tree, identical to the result obtained when only Cruciptera was excluded, maintaining Polyptera as sister to the clade of modern Juglandeae with a slight increase in bootstrap support (66%; results not shown). Analyses excluding Paleooreomunnea and Polyptera yielded seven trees, with Cruciptera forming a polytomy within a well-supported Juglandinae clade (84%; results not shown).
DNA scaffolds.-The 80% bootstrap constraint tree derived from the DNA data of extant taxa was similar to the tree shown in Figure 3b, except for the collapse of six nodes. Parsimony analysis of the morphological data including fossil taxa onto this backbone constraint yielded five trees of 158 steps. The strict consensus (Fig. 5a) is well resolved and shows two major clades: engelhardioids, with the fossil Paleooreomunnea placed as sister to the Alfaroa + Oreomunnea clade, and the juglandoid clade, with the four remaining fossil taxa placed exactly as in the total evidence tree (Fig. 4b). The placement of Paleooreomunnea among engelhardioids is the only difference between this topology and that obtained in the total evidence analysis. Bootstrap analyses without the backbone constraint produced values nearly identical to those obtained from runs on morphological data including fossils, whereas those conducted with the backbone constraint were generally higher (results not shown). In either case, support for the two major clades decreased with the addition of fossils. Bootstrap support for the placement of Polyptera and Cruciptera with modern Juglandeae decreased to 69% with the backbone constraint, compared to 78% in the total evidence analysis (Fig. 4b). Without fossils, this node received only 67% support in the combined analysis (Fig. 3c). In general, these analyses provided a stabilized framework for placing bootstrap values on nodes that did not appear in the strict consensus tree derived from only analyzing morphological data with fossils (see Fig. 4a).
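How a backbone constraint of this kind can be derived from a bootstrap tree is illustrated by the following minimal sketch, which is not the authors' procedure or software. It assumes a toy tree stored as nested (support, children) tuples; the taxon labels A to F, the support values, and the function name collapse_weak_nodes are all hypothetical. Internal nodes with support below the threshold (80% here) are collapsed into polytomies, so that only well-supported clades remain to constrain a later search.

```python
def collapse_weak_nodes(node, threshold=80):
    """Return a copy of the tree in which internal nodes with bootstrap
    support below `threshold` are collapsed into their parent.

    A leaf is a string; an internal node is a (support, children) tuple.
    """
    if isinstance(node, str):
        return node
    support, children = node
    new_children = []
    for child in children:
        child = collapse_weak_nodes(child, threshold)
        if not isinstance(child, str) and child[0] < threshold:
            # Weak node: splice its children directly into this node.
            new_children.extend(child[1])
        else:
            new_children.append(child)
    return (support, new_children)


# Hypothetical bootstrap tree; the root is given 100% by convention.
toy_tree = (100, [
    (95, ["A", (60, ["B", "C"])]),
    (85, ["D", (70, ["E", "F"])]),
])

scaffold = collapse_weak_nodes(toy_tree, threshold=80)
print(scaffold)
```

In this toy case the two nodes with 60% and 70% support collapse, leaving the clades (A, B, C) and (D, E, F) as the only constraints.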
Analyses with the DNA scaffold excluding Paleooreomunnea, Cruciptera, and Polyptera separately and in combination were evaluated with strict consensus trees and bootstrap analyses. Removal of Cruciptera alone yielded a similar overall topology, with Cyclocarya + Pterocarya as sister to Juglans, instead of the polytomy obtained when all taxa are included (see Fig. 5a). Furthermore, support for the clade of Cyclocarya, Pterocarya, and Juglans increased substantially (94% versus 73%), as did support for the Juglandeae clade (76% versus 61%); thus, the sister-group placement of Polyptera to this clade was enhanced with removal of Cruciptera (results not shown). Removal of Polyptera alone yielded higher levels of bootstrap support at several nodes within the Juglandeae clade. Support for the Juglandinae clade increased when Polyptera was removed (86% versus 73%), indicating a synergism between the effects of the fossils Cruciptera and Polyptera. Removal of both Cruciptera and Paleooreomunnea did not severely affect the remaining topology; however, their removal did have a profound effect on internal support. With both of these fossils removed, support for the Juglandinae clade increased to 95% (versus 73% when both fossils were included). Furthermore, support for the engelhardioid clade also increased (to 100% versus 66% when both fossils were included).
Matrix representation parsimony.-Combined analysis of the morphological data with 27 equally weighted informative characters derived from MRP of the DNA strict consensus tree (Fig. 4b) generated 27 trees of 178 steps (CI = 0.62; RI = 0.91; RC = 0.57). The consensus tree (Fig. 5b) contained many of the subclades resolved in the separate and combined analyses, including the consistent recovery of the juglandoid clade and placement of fossil platycaryoids, as well as the placement of Cruciptera and Polyptera as in the total evidence and DNA scaffold trees. In contrast, the node corresponding to the engelhardioid clade collapsed, as in the total evidence tree (Fig. 4b), and the position of Paleooreomunnea remained uncertain. Bootstrap support was high for the three subclades of engelhardioids, and all nodes recovered within the juglandoid clade received bootstrap values over 50%, with support notably higher on two of the three branches leading to subclades containing fossils. Results of the weighted analysis generally were similar (Fig. 5c), except that the two major clades within the family were recovered in the strict consensus, and bootstrap support was comparable to or higher than the results obtained using the total evidence approach. One significant deviation from the total evidence result was the high support for the engelhardioid clade including Paleooreomunnea, similar to the results using the DNA scaffolds (Fig. 5a).
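The general idea of matrix representation can be sketched as follows. This is a minimal illustration of standard clade-based binary coding, not the authors' implementation: each clade of a source tree becomes one binary character, scored 1 for taxa inside the clade and 0 for the other taxa in that tree. Scoring taxa absent from the source tree (such as fossils without DNA data) as "?" is an assumption made here for illustration, as are the toy tree, the taxon labels, and the function names. Character weighting is not shown.

```python
def leaves(node):
    """Return the set of leaf names under a nested-tuple tree node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= leaves(child)
    return found


def clades(node, is_root=True):
    """List the leaf sets of all internal nodes except the root."""
    if isinstance(node, str):
        return []
    found = [] if is_root else [leaves(node)]
    for child in node:
        found += clades(child, is_root=False)
    return found


def mrp_matrix(tree, all_taxa):
    """Code each clade of `tree` as one binary character per taxon."""
    in_tree = leaves(tree)
    columns = clades(tree)
    matrix = {}
    for taxon in all_taxa:
        if taxon not in in_tree:
            # Taxon not sampled in the source tree: all cells missing.
            matrix[taxon] = "?" * len(columns)
        else:
            matrix[taxon] = "".join(
                "1" if taxon in clade else "0" for clade in columns
            )
    return matrix


# Hypothetical source tree of five extant taxa, plus one fossil
# that is absent from the tree.
toy_tree = (("A", "B"), ("C", ("D", "E")))
matrix = mrp_matrix(toy_tree, ["A", "B", "C", "D", "E", "Fossil"])
for taxon, row in matrix.items():
    print(taxon, row)
```

The three columns correspond to the clades (A, B), (C, D, E), and (D, E). Rows of this kind can be appended to a morphological matrix so that the tree signal takes part in the parsimony search without adding a full block of missing sequence data for the fossil.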
Overall, weighting the informative characters derived from MRP only slightly improved the resolution and support across the tree. Removal of Cruciptera and Paleooreomunnea individually or in combination resulted in minimal topological changes among extant and remaining extinct taxa. The position of Polyptera as sister to the Juglandinae clade generally was stable, except for analyses excluding Paleooreomunnea that resulted in a trichotomy among Juglandeae, Carya, and Polyptera (results not shown).
Character set partitions.-We describe here the most notable results of parsimony analyses that evaluated the contribution of particular partitions of the morphological data (e.g., floral: 22 characters; vegetative: 22; fruit: 12; and fossilizable: 18) in combination with DNA data, either as primary sequence, a constraint (80% bootstrap scaffold), or secondary matrix (weighted MRP). In all three treatments, floral characters consistently resolved the placement of four of the five fossils within the clades recovered by all morphological data; however, Paleooreomunnea was placed either as sister to the juglandoids (Fig. 6a; primary sequence, scaffold) or nested among engelhardioids (weighted MRP). In the analysis of primary sequence + floral characters, bootstrap support was similar to that obtained in the total evidence analysis (compare Fig. 4b with Fig. 6a), whereas for weighted MRP + floral characters (results not shown), bootstrap values showed a decrease relative to weighted MRP + all characters (Fig. 5c). Analyses with DNA + fossilizable characters alone clearly destabilized the position of most of the fossils but still placed Paleooreomunnea within the engelhardioid clade (Fig. 6b). In all treatments, analyses with only fossilizable characters produced less resolution overall, with Cruciptera, platycaryoids, and Polyptera unresolved within the juglandoid clade and Paleooreomunnea within the engelhardioid clade (Fig. 6b). Data partitions corresponding to fruit characters, the smallest partition, and vegetative characters produced the most poorly resolved consensus trees and further destabilization of the placement of fossil taxa (results not shown).
Artificial fossils.-For the artificial fossils based on both Alfaroa and Juglans, very few replicates placed these fossils with their parent species (Table 3), even when 75% of the morphological characters were included.
However, most replicates at least placed these fossils in the correct local clade, and neither of the two large clades (engelhardioids or juglandoids) was disrupted by the addition of the "fossils." In contrast, the "fossil" Platycarya was sister to its "parent" in 75%, 95%, and 100% of the replicates when 25%, 50%, and 75% of the morphological data were included (Table 3). The addition of this artificial fossil did not impact the remainder of the tree.
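The construction of an artificial fossil replicate can be sketched as below. This is a hedged illustration rather than the authors' script: it assumes a morphological matrix of 64 characters (the number implied by 25% of the characters corresponding to 16), uses randomly generated binary scores in place of a real parent species, and introduces the hypothetical function make_artificial_fossil. A fixed fraction of the parent's characters is retained at random and the rest are recoded as missing.

```python
import random


def make_artificial_fossil(parent_scores, keep_fraction, rng):
    """Copy a parent taxon's scores, keeping a random `keep_fraction`
    of characters and recoding all others as missing ("?")."""
    n_chars = len(parent_scores)
    n_keep = round(n_chars * keep_fraction)
    kept = set(rng.sample(range(n_chars), n_keep))
    return [
        state if index in kept else "?"
        for index, state in enumerate(parent_scores)
    ]


rng = random.Random(0)  # fixed seed so the replicate is reproducible

# Hypothetical parent taxon: 64 binary character scores.
parent = [rng.choice("01") for _ in range(64)]

# One replicate at each retention level used in the text.
replicates = {
    fraction: make_artificial_fossil(parent, fraction, rng)
    for fraction in (0.25, 0.50, 0.75)
}

for fraction, fossil in replicates.items():
    n_scored = sum(state != "?" for state in fossil)
    print(f"{fraction:.0%} retained: {n_scored} scored characters")
```

Each replicate would then be added to the matrix as an extra taxon and analyzed, and the proportion of replicates recovering the artificial fossil as sister to its parent could be tallied across many random draws.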
Removal of suites of organ-specific characters did not show appreciably different results. For the Alfaroa manningii "fossil," with all characters included, there were 49 trees. With vegetative characters only, there were also 49 trees, and the strict consensus was identical to that obtained with all characters. With floral characters, there were 63 trees, with some loss of resolution between Alfaroa and Oreomunnea mexicana in the strict consensus. With fruit characters, there were 73 trees, with a polytomy of all species of Alfaroa and Oreomunnea in the strict consensus. For the Juglans regia artificial fossil, analysis of the vegetative characters only resulted in the same number of trees (14) and topology as obtained in the analysis of all characters. With floral characters, there were 44 trees, and most of Juglans collapsed to a polytomy in the strict consensus. With fruit characters there were 78 trees, and most of the juglandoid clade collapsed to a polytomy. For the Platycarya strobilacea "fossil," there was no change in topology or number of trees (7) with any partition relative to the all-characters analysis. In both the Alfaroa and Juglans "fossils," the vegetative characters contained most of the phylogenetic signal. When these 26 characters were removed, resolution was reduced. This effect is not due merely to the number of characters because the floral partition contained 27 characters.
Estimation of divergence times.-The ML tree used in estimating divergence times is identical to the MP trees (Fig. 3b) when the extra outgroup of Carpinus is included to root the Juglandaceae. Rate constancy across Juglandaceae was rejected based on the likelihood ratio test (-2 log Λ = 134.8; P
The Role of Fossil Taxa in Expanding and Reinterpreting Morphological Data in Juglandaceae
Juglandaceae have been the subject of several morphological analyses derived from the distinct perspectives of researchers working with fossil and modern taxa (e.g., Wing and Hickey, 1984; Manchester, 1987; Manos and Stone, 2001). Although a moderately sized morphological data set was available for modern taxa and thought to be exhaustive (50 characters; Manos and Stone, 2001), this study shows that a paleobotanical perspective can provide additional characters (14 characters) and critical reinterpretations of several previously studied characters that had been evaluated only among extant diversity (see Appendix 2, online). For example, characters of foliar morphology not usually included in the description of extant Juglandaceae, but commonly reported in the paleobotanical literature, such as presence or absence of intersecondary veins, spacing of marginal teeth, and size of the peltate trichomes, proved to be informative. Additional characters of the reproductive organs that were previously neglected, such as the placement of papillae on the stigmas (inner versus outer surface of the arms), the presence or absence of nutshell lacunae, the type of venation in winged fruits, and the position of vascular bundles in relation to ribs of the nut, were observable in fossil as well as extant genera.
Most angiosperm fossil taxa that have been incorporated into phylogenetic analyses are known only from reproductive organs, providing floral and pollen characters and/or fruit and seed characters (see review, Crepet et al., 2004). The endeavor to reconstruct "whole plants," incorporating both vegetative and reproductive characters, is not new to paleobotany, but only in recent years have such reconstructed plants been incorporated into phylogenetic analyses with related extant flowering plants. Usually these have been restricted to morphological characters, and although the rigor of the technique for phylogenetic analyses has varied widely, several other examples also indicate that including composite fossil taxa can provide additional characters, improved resolution among extant taxa, and alternative inferences of biogeographic history (Manchester, 1989; Stockey et al., 1997).
Methods to Integrate Fossils into Phylogeny
Our integration of morphological and DNA data not only reveals the potential for increasing both taxon sampling and the number of characters studied but also demonstrates the benefits that more data can bring to the phylogenetic stability of trees that include fossils, a result that is well substantiated in simulation studies of missing data (Wiens 2003a, 2003b, 2005) as well as with artificially generated fossils presented here. We determined that all three approaches to combining data produced highly effective and generally similar results for integrating fossil Juglandaceae into analyses with extant taxa. Highly incomplete fossil taxa are placed with moderate to high support among extant taxa whose relationships remained intact despite the introduction of large amounts of missing data.
The results of these analyses generally are robust across methods and show that most of the fossil Juglandaceae studied here are resolved in crown clades of extant taxa. Because this result could reflect a bias in sampling, it is important to note that our study did not include Cretaceous flower and fruit fossils, such as Caryanthus, Bedellia, and Endressianthus (Friis, 1983; Sims et al., 1999; Friis et al., 2003b, 2006), that might be representatives of the stem lineages of the subclades of Fagales (Juglandaceae, Rhoipteleaceae, Betulaceae, Myricaceae). Associated foliage is unknown for these Cretaceous flowers, and whether their leaves were compound or simple might be relevant in distinguishing stem Betulaceae and/or Myricaceae from stem Juglandaceae.
A range of methods for integrating fossils should continue to be explored by workers aiming to add fossil diversity to various clades of the Tree of Life. For example, in clades with a rich fossil history, the introduction of numerous fossil taxa relative to modern taxa sampled, even with the addition of DNA data, could increase uncertainty and result in poorly resolved or largely unsupported trees based on combined analysis (e.g., Shaffer et al., 1997; O'Leary, 2001). Uncertainty also could result from indecisive DNA data that are further compromised by the addition of missing DNA data for the fossil taxa. Of particular concern is the behavior of "wild card" fossils and the potential synergism between multiple, highly incomplete fossils and more complete taxa with homoplasy (Nixon, 1996). We know of few studies designed to analyze simultaneously multiple fossil taxa with molecular and morphological data (but see Brochu, 1997; Shaffer et al., 1997; Crepet and Nixon, 1998b; Jordan and Hill, 1999; O'Leary, 2001; Demere et al., 2005), and perhaps for the reasons mentioned above, it could be advantageous to analyze fossils one by one, as is typically the case with isolated new discoveries (e.g., Keller et al., 1996; Magallon-Puebla et al., 1996; Crepet and Nixon, 1998a; Sun et al., 2002), or at best a few at a time to better understand the dynamics of character interaction and missing data (e.g., Nixon, 1996; Shaffer et al., 1997; Hermsen et al., 2003; Magallon, 2007). For studies that aim to include a higher percentage of fossil taxa, DNA scaffolds (sensu Springer et al., 2001) and MRP approaches may better minimize the amount of missing data in DNA matrices. Previous studies have mostly used the total evidence approach (e.g., Brochu, 1997; O'Leary, 2001; Sun et al., 2002; Hermsen et al., 2003, 2006; Crepet et al., 2005; Demere et al., 2005), making it difficult to generalize on the efficacy of these methods in other studies.
Analysis using total evidence is perhaps the most assumption-free approach to the integration of fossils, but more examples are needed to understand the benefits of adding incomplete taxa. A strong case has been made for their inclusion based on simulations (Wiens, 2005); however, the number of characters in which a fossil differs from extant taxa may be of more significance than the number of taxa included in phylogenetic analysis, thus overall generalities are likely to be difficult to establish.
Although there are many examples of the use of morphological data sets to place fossils in phylogenetic analyses of living plus extinct taxa (e.g., Crane, 1985; Donoghue and Doyle, 1989; Rothwell and Serbet, 1994; Kenrick and Crane, 1997; Rothwell, 1999; Hermsen et al., 2003; Rothwell and Nixon, 2006), the indirect role of DNA data in placing fossils remains little explored. Our analyses reveal that the structural framework provided by DNA data can facilitate a robust placement of fossils despite large amounts of missing data (i.e., the DNA data for fossils) in combined analysis. Thus, our empirical results agree with the simulations of Wiens (2003a, 2003b, 2005) in that the analysis of incomplete taxa "is a problem of including too few characters rather than too many missing cells" (Wiens, 2003a:297).
Character Distributions and the Evidence for Placing Fossil Juglandaceae
Although observed synapomorphies in fossil taxa may suggest phylogenetic placement, the combination of missing data and homoplasy makes it difficult to predict the results of phylogenetic analysis. To understand this interplay, especially with multiple fossils, critical inspection of the character states scored among modern taxa relative to those available in fossil taxa and their distribution on a resolved phylogenetic tree can roughly gauge how the inference was obtained under parsimony (Kearney, 2002). Indeed, crown clades consisting of fossil and modern taxa should be scrutinized closely, because some may consist of spurious associations, with most, if not all, of the fossil taxa missing the character states that define the clade (Nixon et al., 1994; Nixon, 1996).
In this study, four out of five fossils, including Cruciptera with its 73% missing data, were placed with varying degrees of confidence using several approaches that integrate morphological and molecular data. Furthermore, the positions of extant taxa relative to one another are not radically changed when the fossils are included. Although the interplay of missing data and homoplasy is to some degree data set dependent, we provide several examples to illustrate why a heuristic investigation of fossil integration may be enlightening. For the fossils studied here, our results clearly show that the amount of missing data in any given taxon is not by itself an operational guideline for excluding fossils from analysis.
Our study also demonstrates that the addition of DNA data for modern taxa can act as a buffer, specifically by stabilizing the relationships among modern taxa, despite the inclusion of even the most incomplete taxa in combined analyses. In the case of the data-poor fossil Cruciptera resolved within the Juglandinae clade across all forms of combined analyses, its placement within this crown clade spans five internal nodes, encompassing minimally 37 character-state changes, 10 of which are scored in the fossil. Within this distribution, three unambiguous synapomorphies were scored for Cruciptera at the level of the juglandoid clade (unlobed pistillate bract; isodiametric sclereids in the nutshell; subparallel fruit wing venation), and one was present at the level of the Juglandinae clade (stigmatic surface papillate on the inner arms). Not surprisingly, floral and fruit characters establish the position of Cruciptera within its crown clade, but without the inclusion of the DNA data in some form, the combination of missing data and homoplasy in the morphological data alone only can support a more generalized position within the Juglandeae clade (Figs. 4, 5). Clearly, Cruciptera has enough phylogenetically informative characters to be placed within a clade of modern taxa, despite its high level of missing data. Among the missing morphological characters, pith condition, pollen pore number, and pollen symmetry would likely be most important in stabilizing the position of Cruciptera.
Inconsistent results across our analyses indicate that the most problematic fossil to integrate is Paleooreomunnea (51% missing data). Both morphological analysis and total evidence analysis render the fossil unresolved or on a short branch, sister to the juglandoid clade (Fig. 4b). In contrast, the results of weighted MRP and the DNA scaffold place Paleooreomunnea with modern engelhardioids, a position that is more consistent with previous non-cladistic studies (Dilcher et al., 1976; Crane and Manchester, 1982). Although modern engelhardioids share six unambiguous synapomorphies that span vegetative, floral, and fruit characters, the fossil bears only two of these: fruit wings formed by a bract and two bracteoles and pinnate fruit wing venation. In the case of total evidence, homoplasy within the other characters scored in the fossil that change along the branch leading to the engelhardioids, in addition to missing data, contributes to a placement outside of the engelhardioid clade, whereas both approaches that minimize the effects of missing DNA data lead to a placement within the clade. It also is clear that the inclusion of this fossil decreases bootstrap support throughout the tree, but most notably on the branch leading to the engelhardioid clade. At play here as well is the general observation that the level of apomorphy is lower within the subtropical to tropical engelhardioid clade (Manos and Stone, 2001). Thus, although homoplasy and missing data are impediments to placing fossils, the inclusion of a fossil with a greater number of plesiomorphous character states or perhaps an actual stem group fossil could also explain lack of resolution.
Highly distinctive fossils such as Platycarya americana and Paleoplatycarya, with their clear affinities to modern Platycarya, provide an instructive example of how morphology alone can have substantial impact toward unifying species-poor modern clades with the breadth of their fossil diversity. In Polyptera, the other highly distinctive fossil examined, no such unification with modern taxa is apparent in the phylogenetic results. Previous studies have interpreted its morphology as a mosaic (Manchester and Dilcher, 1997), sharing character states with particular taxa from the modern sister clades, specifically Cyclocarya of the Juglandinae and Carya of the Caryinae. For example, Polyptera and Cyclocarya share the disk-like wing oriented perpendicular to the nut axis (Fig. 1), and Polyptera and Carya share the condition of clusters of three to five staminate catkins per inflorescence.
Closer inspection of Polyptera demonstrates that overall, it shares none of the diagnostic character states for each potential sister clade, although several relevant characters are missing. For the Caryinae clade, which is unambiguously supported by three character states (subequatorial pollen pores, crystals lacking from the wood idioblasts, and fruit type), Polyptera is plesiomorphic for the first two, and "missing" for the latter. For the Juglandinae clade, which is also unambiguously supported by three character states (septate pith, four or more pollen apertures, and stigmatic surfaces with papillae on the inner surface), Polyptera is plesiomorphic for pollen type and "missing" for the other two characters. Among the missing characters, it would be especially useful to know if the pith of Polyptera is septate as in Cyclocarya, Pterocarya, and Juglans, or solid as in Carya and other Juglandaceae. Finally, although at least 12 state changes support the clade formed by Polyptera, Juglandinae, and Caryinae, only one synapomorphy, unisexual inflorescences, is unambiguous. It unfortunately is "missing" from Polyptera. The character support for Polyptera on this branch comes from five homoplasious state changes, several of which are shared with the fossil Cruciptera and Cyclocarya, especially the four chambers at the base of the fruit, presence of nutshell lacunae, and the perpendicular fruit wing orientation. As depicted in Figure 1, the early evolutionary history of the Juglandeae clade is rich in taxa with perpendicular fruit wing orientation; therefore, the position of Polyptera provides further evidence that this condition is the ancestral state of the clade.
Revisiting the Role of Missing Data with Artificially Generated Fossils
With as few as 25% of the morphological characters (i.e., 16 characters), artificially generated fossils were successfully placed in the correct genus or local clade, even if they were not all placed as sister to their "progenitors." We conclude from these studies that the amount of data per se is not important; rather, the number of characters available to link a fossil to the modern taxa determines the success in placing the fossil. It may therefore be possible to place fossil taxa with their extant sister groups even with relatively small numbers of characters if those characters preserved are synapomorphic. Furthermore, the distinctness of the fossil and its sister group from the remainder of the taxa may also affect the placement of the fossil. For example, Platycarya strobilacea and its artificial fossil are very distinct from all other taxa, most likely leading to their placement as sisters in nearly all phylogenetic analyses. In contrast, patterns of variation among species of Juglans may have contributed to the difficulty in placing the artificial fossil with J. regia. Therefore, a realistic number of characters for fossil taxa may be insufficient for correct placement of fossils unless the fossils are complete or nearly so for all characters scored for modern taxa or unless the fossils belong to a distinct clade. However, analyses in which suites of characters were systematically removed from the artificial fossil produced results that are qualitatively consistent with those obtained in the random removal of characters. In general, the "fossils" were placed close to the species that had been used to generate them. The smallest effect of missing data was observed for the Platycarya "fossil," which is very distinct from all other taxa, and the greatest effect was observed for the Juglans regia "fossil," which lies in a clade with more complicated patterns of morphological variation.
To enhance the likelihood of including synapomorphic characters and of accurate placement of fossils, we recommend increasing the number of characters that are scored for the fossil, perhaps by linking organs to form composite taxa, assuming there are solid indications suggesting that certain isolated plant parts belonged to a single extinct species, as we did here for all the fossil taxa.
Fossil Placement and Minimum Age Constraints: Examples from Juglandaceae
Calibration of a phylogeny using fossil evidence to estimate divergence times requires detailed phylogenetic relationships to properly incorporate a fossil in a phylogeny. Incorporating fossil taxa in our phylogenetic analyses enables us to explicitly infer the minimum ages for several internal nodes of the Juglandaceae. For example, Polyptera dates back to the Paleocene (60 MYBP; Manchester, 1987), and as sister taxon to the Juglandeae (Figs. 4, 5) it provides a minimum age of the juglandoid clade at 60 MYBP. Within this crown clade, Cruciptera is clearly closely associated with the Juglandinae, but relationships among the four are unresolved (Figs. 4, 5). Thus, the age of the fossil Cruciptera (44 MYBP) may provide a minimum age of the Juglandinae only if it is nested within the clade; however, it cannot be used to constrain a minimum age of the Juglandinae if it is sister to the clade. Nonetheless, it does provide a conservative minimum age of the most recent common ancestor (MRCA) of Platycarya and the Juglandinae. The fossils Paleoplatycarya and Platycarya americana, which date back to 53 MYBP (Manchester, 1987), are likely sister to extant Platycarya strobilacea, suggesting that the stem lineage of Platycarya is as old as 53 MYBP. The phylogenetic position of this clade, however, is uncertain; it is sister to either the rest of the Juglandoideae (total evidence, scaffold, MRP) or the Juglandinae (DNA; Fig. 3b). Thus although the platycaryoid clade could provide a minimum age of the juglandoid clade, it is younger than the age derived using the fossil Polyptera.
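The reasoning above, in which each fossil sets a minimum age for the node it can be confidently attached to and the oldest fossil per node prevails, can be expressed as a short sketch. The fossil ages and node assignments are taken from the text; the data layout and the function name minimum_age_constraints are hypothetical and serve only as illustration.

```python
def minimum_age_constraints(calibrations):
    """For each node, keep the oldest fossil age as the minimum age.

    `calibrations` is a list of (fossil, age_mybp, node) tuples.
    Returns {node: (age_mybp, fossil)}.
    """
    constraints = {}
    for fossil, age_mybp, node in calibrations:
        current = constraints.get(node)
        if current is None or age_mybp > current[0]:
            constraints[node] = (age_mybp, fossil)
    return constraints


# Ages (MYBP) and constrained nodes as described in the text.
calibrations = [
    ("Polyptera", 60, "juglandoid clade"),
    ("platycaryoids", 53, "juglandoid clade"),
    ("Cruciptera", 44, "MRCA of Platycarya and Juglandinae"),
]

constraints = minimum_age_constraints(calibrations)
for node, (age, fossil) in constraints.items():
    print(f"{node}: at least {age} MYBP (from {fossil})")
```

For the juglandoid clade the 60 MYBP age from Polyptera supersedes the 53 MYBP age from the platycaryoids, which matches the conclusion drawn in the paragraph.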
Some fossils not included in our analysis also provide minimum ages for certain nodes. For example, Pterocarya fruits are known from the Early Oligocene (32 MYBP; Manchester, 1987), and the fruits of Cyclocarya, a potential sister taxon of Pterocarya, are reported from the Paleocene (58 MYBP; Manchester, 1987). The two groups have very distinctive fruit morphology (Fig. 1), and the fossils fall neatly within these extant genera as they are diagnosed. Although phylogenetic relationships of Pterocarya and Cyclocarya within Juglandinae remain largely unresolved (Fig. 3), a minimum age of the Juglandinae can be conservatively assumed to be 58 MYBP.
Results of age estimation of the Juglandaceae show that the age estimates of certain nodes in the Juglandoideae vary depending on the choice of the constraint (Fig. 7). For example, when we impose a minimum age constraint inferred from Cruciptera on the MRCA of the Platycarya/Juglandinae clade, the 95% confidence interval for the age of the Juglandinae clade is estimated to be merely 26.5 ± 0.9 MYBP. Similarly, when we use information derived from Polyptera to constrain a minimum age of the Juglandoideae we obtain 30.2 ± 1.2 MYBP for the Juglandinae. These estimates, however, are too young for the Juglandinae because the earliest reliable fossil record of Juglans dates back to 44 MYBP and Cyclocarya fruits are known from the Early Tertiary (Manchester, 1987). By contrast, when a minimum age of the Juglandinae is constrained based on Cyclocarya fossils, we obtain much older estimates than those based on Cruciptera and Polyptera for all nodes in Juglandoideae, and they are consistent with fossil data (Fig. 1).
In the engelhardioid clade it is interesting to note that age estimations are not affected by different types of minimum age constraints nor by the number of the constraints; all constraint schemes generate similar estimations (Fig. 7). None of the nodes in the engelhardioid clade is used as a constraint, and all minimum age constraints that we imposed are located on nodes within the juglandoid clade. This implies that the optimization of a global tree-wide rate consistency was obtained by changing rates in the juglandoid clade (Sanderson, 1997, 2002). Nevertheless, the estimated ages of the MRCA of the engelhardioid clade from our molecular phylogeny (Fig. 7) are consistent with the fossil record.
Although the fossil record for the engelhardioid clade is much less rich than for the juglandoid clade, fruits with trilobed bracts, including Paleocarya and Engelhardia, are commonly reported from the Middle Eocene (46 MYBP), but unknown prior to the Eocene (Manchester, 1987). Of all Juglandaceae, Oreomunnea and Alfaroa stand out for their sparse to nonexistent fossil records.