Data Analysis


Reconstructing the vaejovid tree will be computationally challenging because of the time span of divergence and the size of the data set. Analyses will require heuristic searches and the exploration of varied parameters. Besides state-of-the-art desktop computers, we will use the AMNH supercomputer cluster, comprising 560 Pentium III processors, allowing multiple analyses to address parameter sensitivity and explore the analytical space implied by the data.
     This project applies the philosophy of “total evidence” sensu Kluge (1989) or “simultaneous analysis” sensu Nixon & Carpenter (1996) to analysing molecular and morphological data, the advantages and disadvantages of which have been thoroughly reviewed and shall not be elaborated here. Arguments by Nixon & Carpenter (1996) concerning explanatory power, character independence, and the emergence of secondary signals are considered sufficient justification for this approach. Separate analyses of the morphological and molecular data will only be conducted in order to assess character incongruence among the various data partitions (morphology, different loci), and only trees obtained from simultaneous analysis of all evidence will be used for testing a posteriori hypotheses.
     The inclusion of morphological data, in turn, provides justification for the use of parsimony in this project, and simultaneous analysis is a logical extension of the parsimony criterion (Nixon & Carpenter 1996). The use of multiple analytical techniques, predicated on fundamentally different philosophies (“syncretism” sensu Schuh 2000; “pluralism” sensu Giribet et al. 2001a; “methodological concordance” sensu Grant & Kluge 2003), has been criticised elsewhere (Giribet et al. 2001a; Grant & Kluge 2003; Prendini et al. 2003).      
     Searches for most parsimonious trees (Farris 1970, 1983; Kluge 1984) will use POY (Gladstein & Wheeler 1996–2000), NONA and Pee-Wee (Goloboff 1997a, 1997b), TNT (Goloboff et al. 2002), and PAUP* (Swofford 2002), each with parallel versions. POY will be used for analyses involving molecular data, because it is the only program implementing direct optimization (simultaneous alignment and tree-search), regarded as ideal in principle (Wheeler 1994, 1996, 1998a, 1999, 2001a, 2001b; Slowinski 1998; Giribet & Wheeler 1999; Giribet et al. 2000, 2001b; Wahlberg & Zimmermann 2000), albeit computationally demanding. The widely used approach to analysing sequences of unequal length by first aligning and then subjecting the prealigned sequences to a normal parsimony analysis has come under increasing criticism (Wheeler 1994, 1996, 1998b, 1999, 2000, 2001a,b; Slowinski 1998; Edgecombe et al. 1999; Giribet & Wheeler 1999; Giribet & Ribera 2000; Giribet et al. 2000, 2001b, 2002; Wahlberg & Zimmermann 2000; Giribet 2001; Prendini et al. 2003). This approach clearly violates the logic of parsimony because whether or not an indel is postulated depends on the phylogeny in question. As has been cogently argued by Wheeler (1996, 1998b, 1999, 2000, 2001a,b), a phylogeny should be evaluated according to how many substitutions and how many indels it requires postulating, so that analyses should simultaneously consider the indels and substitutions required by alternative phylogenies, instead of taking them as given. Strategies for rapid parsimony analysis, e.g. the parsimony ratchet (Nixon 1999b), tree fusing and tree drift (Goloboff 1999), most of which are implemented in the latest version of POY, will enhance searches throughout tree space.
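The idea of counting indels and substitutions together, rather than fixing an alignment first, can be illustrated with a minimal sketch. The Python function below computes the minimum combined cost of substitutions and indels needed to turn one unaligned sequence into another. It is only the pairwise building block of the argument, not POY's direct optimization algorithm; the function name, the cost parameters and the example sequences are all assumptions made for illustration.

```python
def edit_cost(seq_a, seq_b, substitution_cost=1, indel_cost=1):
    """Minimum weighted cost (substitutions + indels) of turning seq_a into seq_b.

    Hypothetical helper for illustration; costs are arbitrary parameters.
    """
    # previous[j]: cost of converting the bases of seq_a seen so far
    # into the first j bases of seq_b.
    previous = [j * indel_cost for j in range(len(seq_b) + 1)]
    for i, base_a in enumerate(seq_a, start=1):
        current = [i * indel_cost]
        for j, base_b in enumerate(seq_b, start=1):
            match = previous[j - 1] + (0 if base_a == base_b else substitution_cost)
            delete = previous[j] + indel_cost      # base_a has no counterpart in seq_b
            insert = current[j - 1] + indel_cost   # base_b has no counterpart in seq_a
            current.append(min(match, delete, insert))
        previous = current
    return previous[-1]


# Two invented sequences of unequal length: one indel explains the difference.
print(edit_cost("ACGT", "AGT"))                 # 1
print(edit_cost("ACGT", "AGT", indel_cost=3))   # 3
```

Changing `indel_cost` relative to `substitution_cost` changes which explanation is cheapest, which is one reason analyses of this kind are repeated under varied parameters.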
     Partitioned Bremer support (Baker & DeSalle 1997; Baker et al. 1998) will be used to address the relative contributions of different loci and morphological character systems to the simultaneous analysis. Relative support for nodes in trees will be assessed with branch support indices (Bremer 1988, 1994; Donoghue et al. 1992) and bootstrap percentages (Felsenstein 1985; Sanderson 1989).
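As a rough illustration of the arithmetic behind partitioned Bremer support, the sketch below takes the length of each data partition on the most parsimonious tree and on the shortest tree that lacks a given node, and reports the per-partition differences. The partition names and tree lengths are invented for the example; in practice the lengths would come from the tree-search programs, and the function name is hypothetical.

```python
def partitioned_bremer(optimal_lengths, constrained_lengths):
    """Per-partition Bremer support for one node.

    optimal_lengths: partition -> length on the most parsimonious tree.
    constrained_lengths: partition -> length on the shortest tree lacking the node.
    """
    return {
        partition: constrained_lengths[partition] - optimal_lengths[partition]
        for partition in optimal_lengths
    }


# Invented lengths for three hypothetical partitions.
optimal = {"morphology": 120, "locus_1": 340, "locus_2": 510}
constrained = {"morphology": 124, "locus_1": 339, "locus_2": 513}

support = partitioned_bremer(optimal, constrained)
print(support)                 # {'morphology': 4, 'locus_1': -1, 'locus_2': 3}
print(sum(support.values()))   # 6, the total Bremer support for the node
```

A negative value, as for the second partition here, indicates that the partition on its own fits the tree lacking the node better, even though the combined evidence supports the node.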
    Adaptational and biogeographical hypotheses will be tested by optimization on the tree obtained by simultaneous analysis, using WinClada (Nixon 1999a) or MacClade (Maddison & Maddison 1992). Ambiguous optimizations will be resolved with ACCTRAN, maximizing homology by favoring reversals over parallelisms to explain homoplasy (Swofford & Maddison 1987, 1992).
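Optimizing a character on a fixed tree can be sketched with the first (downward) pass of Fitch parsimony for an unordered character, which returns the minimum number of state changes the tree requires. This is a simplified illustration, not the procedure implemented in WinClada or MacClade: the tree, taxon names and states are invented, and the second pass that would resolve ambiguous nodes (for example by ACCTRAN) is not shown.

```python
def fitch_downpass(tree, tip_states):
    """Return (preliminary state set, minimum steps) for a rooted binary tree.

    tree: a tip name (str) or a pair (left_subtree, right_subtree).
    tip_states: tip name -> observed state. All names here are hypothetical.
    """
    if isinstance(tree, str):
        return {tip_states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_downpass(left, tip_states)
    right_set, right_steps = fitch_downpass(right, tip_states)
    shared = left_set & right_set
    if shared:
        # Descendants agree on at least one state: no extra change needed here.
        return shared, left_steps + right_steps
    # Descendants disagree: one change is required below this node.
    return left_set | right_set, left_steps + right_steps + 1


# Invented example: outgroup O, then P, then the clade ((A, B), C).
tree = ("O", ("P", (("A", "B"), "C")))
tip_states = {"O": "absent", "P": "absent", "A": "present", "B": "absent", "C": "present"}

root_set, steps = fitch_downpass(tree, tip_states)
print(root_set, steps)   # {'absent'} 2
```

In this invented case the two steps can be explained either by one gain followed by a reversal in B, or by two parallel gains in A and C; choosing between such equally short explanations is what an optimization rule like ACCTRAN is for.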

Literature Cited

Baker, R.H. & DeSalle, R. 1997. Multiple sources of character information and the phylogeny of Hawaiian Drosophilids. Systematic Biology 46: 654–673.

Baker, R.H., Yu, X.B. & DeSalle, R. 1998. Assessing the relative contribution of molecular and morphological characters in simultaneous analysis trees. Molecular Phylogenetics and Evolution 9: 427–436.

Bremer, K. 1988. The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 42: 795–803.

Bremer, K. 1994. Branch support and tree stability. Cladistics 10: 295–304.

Donoghue, M.J., Olmstead, R.G., Smith, J.F. & Palmer, J.D. 1992. Phylogenetic relationships of Dipsacales based on rbcL sequence data. Annals of the Missouri Botanical Garden 79: 333–345.

Edgecombe, G.D., Giribet, G. & Wheeler, W.C. 1999. Filogenia de Chilopoda: Combinando secuencias de los genes ribosómicos 18S y 28S y morfología [Phylogeny of Chilopoda: Analysis of 18S and 28S rDNA sequences and morphology]. In: Melic, A., De Haro, J.J., Mendez, M. & Ribera, I. (Eds.) Evolución y Filogenia de Arthropoda. Boletín de la Sociedad Entomológica Aragonesa 26: 293–331.

Farris, J.S. 1970. Methods for computing Wagner trees. Systematic Zoology 19: 83–92.

Farris, J.S. 1983. The logical basis of phylogenetic analysis. In: Platnick, N.I. & Funk, V.A. (Eds.) Advances in Cladistics, Vol. 2. Columbia University Press, New York, 7–36.

Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39: 783–791.

Giribet, G. 2001. Exploring the behavior of POY, a program for direct optimization of molecular data. In: Giribet, G., Wheeler, W.C. & Janies, D.A. (Eds.) One day symposium in numerical cladistics. Cladistics 17: S60–S70.

Giribet, G. & Ribera, C. 2000. A review of arthropod phylogeny: New data based on ribosomal DNA sequences and direct character optimization. Cladistics 16: 204–231.

Giribet, G. & Wheeler, W.C. 1999. On gaps. Molecular Phylogenetics and Evolution 13: 132–143.

Giribet, G., DeSalle, R. & Wheeler, W.C. 2001a. ‘Pluralism’ and the aims of phylogenetic research. In: DeSalle, R., Giribet, G. & Wheeler, W.C. (Eds.) Molecular Systematics and Evolution: Theory and Practice. Birkhäuser Verlag, Basel, 141–146.


The material included in this site is based upon work supported by the National Science Foundation under Grant No. 0413453.  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
© Copyright 2005-2006.  All images in this site, even if they do not include an individual statement of copyright, are protected under the U. S. Copyright Act.  They may not be "borrowed" or otherwise used without our express permission or the express permission of the photographer(s),  artist(s), or author(s).  For permission, please submit your request to