The 'geiger' package in R allows one to test many of the evolutionary model=
s for continuous character evolution that we have discussed so far. First, =
download
R and install the geiger package along with its dependencies. =
Next, download the tree for this exercise and a data table of female body w=
eights for species in this tree, which consists of the guenons and papionin=
s with a few colobines for outgroups.

First, load the 'geiger' package:

library("geiger")

Next, load the phylogenetic tree and the data:

tree_primates =3D read.nexus("57.Tree.txt")

data_primates =3D read.table("57.MassData.txt",header=3DT,sep=3D"\t")<= /pre>

We are now ready to estimate the likelihood of t= he data on this tree given a variety of evolutionary models. For example, w= e can test for a directional trend in body size by comparing the likelihood= of the "trend" model to the likelihood of the "BM" (Brownian motion) model= . Brownian motion is the simplest (fewest parameter) model of trait evoluti= on in which variance in a continuous trait accumulates at a constant rate i= n random directions (addition to or subtraction from the trait mean). Becau= se of this constant variance assumption, it is common to logarithmically tr= ansform body size data so that proportional rather than absolute changes in= size are modeled on the tree. You can very easily transform your data in R= as follows (note that lines starting with # are ignored by R).

#convert kilograms to gramsdata_primates$Mass_gm =3D data_primates$Mass_kg*1000#take the natural logarithm of gramsdata_primates$Log_Mass_gm =3D log(data_primates$Mass_gm)

Take a look at the two new columns of data you added to your dataframe:<= /p>

data_primateshist(data_primates$Log_Mass_gm)

You can see that the structure of the data table= is preserved but the weights are all now the natural logarithm of grams. N= ow calculate the likelihood of the data under Brownian motion by typing:

brownian =3D fitContinuous(tree_primates, data_primates$Log_Mass_gm,mo= del=3D"BM")

Then call "brownian". R will print the following=

$Trait1$Trait1$lnl[1] -30.98123$Trait1$beta[1] 0.02371125$Trait1$aic[1] 65.96246$Trait1$aicc[1] 66.27825$Trait1$k[1] 2

R is first saying these are the values from the table for trai= t 1 (there is only one trait in this case). After the second $ sign is says= which parameter or result is printed on the next line. So, the "$Trait1$ln= l" means the next line shows the log likelihood of the data under Brownian = motion, which is -30.98. We can calculate the comparable likelihood under m= odel of Brownian motion with a directional trend by typing:

trend =3D fitContinuous(tree_primates, data_primates$Log_Mass_gm,model= =3D"trend")

If you type "trend," R will return= to the command line:

$Trait1 $Trait1$lnl [1] -30.86872 $Trait1$mean [1] 9.867361 $Trait1$beta [1] 0.02358446 $Trait1$mu [1] -0.05444671 $Trait1$aic [1] 67.73744 $Trait1$aicc [1] 68.38609 $Trait1$k [1] 3

Here R gives the likelihood of the data under the trend model is -30.87, th=
e mean log body size is 9.87, and the directional trend is for lineages to =
evolve larger body sizes over time (mu =3D 0.024). Remember, though, that m=
ore complex models will always have higher (less negative) likelihoods than=
simpler models like Brownian motion. We will use the likelihood ratio test=
to decide if the additional parameter for a trend (mu) is justified for th=
is data set. This is easily done in R by calculating the probability of the=
chi square value under one degree of freedom, where the chi square is 2 ti=
mes the difference in log likelihoods. So, the likelihood ratio statistic i=
s 2(-30.87-(-30.98))=3D0.22. We can find the chance probability of this deg=
ree of improvement in likelihood from adding one parameter by referring to =
the chi square distribution. Type:

pchisq(0.22,1,lower.tail=3DF)

R will print

[1] 0.6390399

So, the p-value associated with this difference in log likelihoods is not s=
ignificant, and we do not have evidence that a Brownian model of evolution =
should be rejected in favor of one with a directional trend.

Let us look at another important model of trait change - Pagel's lambda val=
ue for phylogenetic signal. Lambda is a multiplier of the internal branches=
of a phylogeny. These internal branches directly express the degree of exp=
ected correlation among the traits of species in the study. Thus, when lamb=
da =3D 1, the model is equivalent to Brownian change in the traits values o=
n the untransformed branch lengths. When lambda is greater than zero but le=
ss than one, phylogenetic signal is present, but there is less of a phyloge=
netic effect than we would expect from the original branch lengths. You can=
calculate the likelihood of the data under the lambda transformation by ty=
ping:

lambda =3D fitContinuous(tree_primates, data_primates$Log_Mass_gm,mode= l=3D"lambda")

When you call "lambda" you will see that the likelihood is reported at -30.=
42. When you calculate the difference between this value and the one for Br=
ownian motion 2*(-30.42-(-30.98)) =3D 1.12, and apply the "pchisq" to calcu=
late the probability under the chi square distibution, the reported p-value=
is 0.29. So, the lambda transform in this case does not make the data sign=
ificantly more likely under as compared to the simpler Brownian motion mode=
l on the untransformed branch lengths. When you called "lambdatest", R will=
have printed among the results the following:

$Trait1$lambda [1] 0.811984

The estimate of lambda that makes the data most likely is 0.81. If applying=
the additional lambda parameter had significantly improved the model, then=
a value equal to 0.8 essentially says that there is less covariance due to=
ancestry than the untransformed branch lengths imply under Brownian motion=
(lambda =3D 0 is a star phylogeny with no phylogenetic signal). Although t=
he maximum likelihood estimate of lambda is not 0 in this case, the small i=
mprovement of likelihood does not justify the additional lambda parameter.

Another popular model is the Ornstein-Uhlenbeck model, also known as an OU =
model. The simple Brownian motion model assumes that the trait can continue=
to get larger or smaller indefinitely; that is, it assumes variance can co=
ntinue to accumulate no matter how small or large the trait value becomes. =
This is probably unrealistic for many biological traits, like body mass, th=
at have a necessary minimum value of 0 and likely an upper physical constra=
int as well. These lower and upper 'walls' stop variance from continuing to=
accumulate in one direction at extreme values. The OU model attempts to re=
flect this reality by fitting a central tendency for the trait, and a param=
eter that creates a trend for variance to accumulate toward this central te=
ndency. This model is often described as Brownian motion, but with a rubber=
band that tethers the traits values to a particular point. Like a rubber b=
and, the further values get from this central tendency, the greater the pul=
l back to the center.

You can fit the OU model by typing:

OU =3D fitContinuous(tree_primates, data_primates$Log_Mass_gm,model=3D= "OU")

Type "OU" and you will see the likelihood for this model is -30.7. Although=
the OU model may seem more realistic, it does not significantly improve th=
e likelihood of the data (its improvement in the likelihood is less than fo=
r the lambda parameter). Essentially, the body mass data is such that the v=
ery simple Brownian motion model adequately describes how change accumulate=
d in this character on the phylogeny.

Several other evolutionary models and branch transformations can be applied=
through geiger. For example, you could fit an early burst model (also call=
ed the ACDC model) in which trait evolution is rapid early in a radiation a=
nd slower later on by setting "model=3D"EB"". If you try this one you will =
see that it does not significantly improve the likelihood.

'Geiger' also outputs AIC values for all models. You can also use those to =
assess significance instead of the log likelihoods. This is typically done =
when comparing models that are not nested. Models are nested when the simpl=
er of two models (the one with fewer parameters) can be derived from the mo=
re complex one simply by eliminating some parameters. In such a case, the s=
impler model is 'nested' within the more complex one, because it is just a =
special case of the complex one where some parameters are set to 0. Models =
are not nested when the simpler model has parameters that do not appear in =
the more complex model. The likelihood ratio test is not valid for comparin=
g likelihood scores from non-nested models. It is this situation in which p=
eople use AIC as one criteria for model selection. AIC is essentially the l=
ikelihood after it has been penalized for the number of parameters in a mod=
el. For example, the OU and lambda models have the same number of parameter=
s, but different parameters.

Harmon, L., J. Weir, C. Brock, R. Glor, W. Challenger, and G. Hunt (2009). = geiger: Analysis of evolutionary diversification. R package version 1.3-1.&= nbsp; http://CRAN.R-project.org/package=3Dgeiger

R Development Core Team (2010). R: A language and environment for statistic= al computing. R Foundation for

Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.