Georg Steinert

marine ecology and systematics

MrBayes Block Tutorial

This is a very short example of how to set up a MrBayes block.

Useful links:

MrBayes Homepage

MrBayes Manual

jModelTest Homepage

Phylogenetics: MrBayes Lab

Bodega Phylogenetics Wiki - (1) MrBayes Tutorial and (2) MrBayes Tutorial

Fredrik Ronquist's page of MrBayes resources - great!

The following block was generated for my Mytilus data set. This data set was a concatenated alignment of the two linked loci COI and VD1; both mtDNA. In this short example, I will show how to conduct a partitioned analysis with MrBayes.

The MrBayes block should be placed at the end of your Nexus file under your DATA block:

BEGIN mrbayes;

[ start log file and replace the existing one ]

log start filename=coivd1_haplo.log replace;

[ setting the outgroup and sets ]

outgroup haplo55;

charset COI = 1-399;

charset VD1 = 400-.;

partition by_gene = 2: COI, VD1;

set partition=by_gene;

[ model settings and unlink model parameters ]

lset applyto=(all) nst=6 rates=invgamma;

unlink shape=(all);

prset applyto=(1)

revmatpr=fixed(0.0100,10.4168,0.0100,3.8538,37.1448,1.0000)

statefreqpr=fixed(0.2631,0.1665,0.2120,0.3583)

pinvarpr=fixed(0.3460)

shapepr=exponential(0.7640);

prset applyto=(2)

revmatpr=fixed(0.6604,9.7015,0.6585,2.3585,23.9207,1.0000)

statefreqpr=fixed(0.2822,0.1483,0.2659,0.3036)

pinvarpr=fixed(0.3060)

shapepr=exponential(1.6090);

mcmcp

nchains=4

ngen=2000000

samplefreq=100

savebrlens=yes

printfreq=1000;

mcmc;

sumt;

sump;

log stop;

END;

Some commands are self-explanatory (e.g., log or outgroup), but some steps need a brief explanation:

charset = with this command you can associate names with different sets of characters, in this case COI from position 1 to 399 and VD1 from position 400 to the end.

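partition = defines how the character sets are grouped into subsets; in this case I had 2 sets of characters, COI and VD1, and named the resulting partition by_gene. The subsequent set partition=by_gene command activates it.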

unlink = here I have unlinked the alpha shape parameter of the gamma distribution for all subsets.

lset = is used to define the structure of the model; in this case I have applied the General Time Reversible model with a proportion of invariable sites and a gamma-shaped distribution of rates across sites (nst=6 rates=invgamma) for all subsets.

prset = this command is used to define the prior probability distributions on the parameters of the model. I used jModelTest to carry out the statistical selection of the best-fit models of nucleotide substitution. With applyto=(1) or applyto=(2) I was able to define the different parameters of nucleotide substitution for the two subsets:

revmatpr = for the six substitution rates of the GTR rate matrix

statefreqpr = for the stationary nucleotide frequencies of the GTR rate matrix

pinvarpr = for the proportion of invariable sites

shapepr = for the shape parameter of the gamma distribution of rate variation
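
In the block above, the substitution rates, base frequencies, and proportion of invariable sites are fixed, whereas shapepr=exponential(...) places an exponential prior on the gamma shape, so the shape is still estimated during the run. If the goal were instead to fix the shape as well, shapepr also accepts fixed(). A minimal sketch, assuming the two numbers are shape estimates for the respective subsets (the text above does not say so), followed by showmodel to check which settings were actually applied:

```
[ sketch only: assumes 0.7640 and 1.6090 are gamma shape estimates for subsets 1 and 2 ]
prset applyto=(1) shapepr=fixed(0.7640);
prset applyto=(2) shapepr=fixed(1.6090);

[ print the current model, priors, and parameter linkage for each subset ]
showmodel;
```

Whether to fix or estimate the shape is a modelling choice; the sketch only shows the syntax for the fixed variant.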

Finally, the commands for the execution of the analysis:

mcmcp = sets the parameters of the Markov chain Monte Carlo analysis without starting it; the mcmc command then starts the run. Here four chains are used (nchains=4). Setting nchains is strictly unnecessary, because 4 is the default number of chains (3 heated chains and 1 cold chain).

ngen = number of generations for which the analysis will run; in this case 2,000,000 generations

samplefreq = determines how often the chain is sampled; in this case every 100th generation, which again was the default value (defaults can differ between MrBayes versions)

savebrlens = saves the branch lengths together with each sampled tree topology

printfreq = the frequency with which the state of the chains is printed to screen
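
These settings also determine how many samples each run stores, which matters for the burn-in passed to sump and sumt. A rough worked example as a block fragment; the 25% burn-in is an assumed, commonly used convention and not a value from the original analysis:

```
[ 2000000 generations / samplefreq 100 = 20000 samples per run, plus the starting state ]
[ assumed burn-in: 25% of 20000 = 5000 samples, counted in samples, not generations ]
sump burnin=5000;
sumt burnin=5000;

[ assumption: MrBayes 3.2 can take the same burn-in as a fraction instead ]
[ sump relburnin=yes burninfrac=0.25; ]
[ sumt relburnin=yes burninfrac=0.25; ]
```

Which form applies depends on the MrBayes version, so it is worth checking help sump and help sumt before relying on either.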

At some point I should write a tutorial on how to summarize the MCMC analysis output (i.e., the sump and sumt commands) and interpret the subsequent results. Until then, have fun with your own analysis.
