This is a very short example of how to set up a MrBayes block.
Useful links:
Bodega Phylogenetics Wiki - (1) MrBayes Tutorial and (2) MrBayes Tutorial
Fredrik Ronquist's page of MrBayes resources - great!
The following block was generated for my Mytilus data set, a concatenated alignment of two linked mtDNA loci, COI and VD1. In this short example, I will show how to conduct a partitioned analysis with MrBayes.
The MrBayes block should be placed at the end of your Nexus file under your DATA block:
BEGIN mrbayes;
[ start log file and replace the existing one ]
log start filename=coivd1_haplo.log replace;
[ setting the outgroup and sets ]
outgroup haplo55;
charset COI = 1-399;
charset VD1 = 400-.;
partition by_gene = 2: COI, VD1;
set partition=by_gene;
[ model settings and unlink model parameters ]
lset applyto=(all) nst=6 rates=invgamma;
unlink shape=(all);
prset applyto=(1)
revmatpr=fixed(0.0100,10.4168,0.0100,3.8538,37.1448,1.0000)
statefreqpr=fixed(0.2631,0.1665,0.2120,0.3583)
pinvarpr=fixed(0.3460)
shapepr=exponential(0.7640);
prset applyto=(2)
revmatpr=fixed(0.6604,9.7015,0.6585,2.3585,23.9207,1.0000)
statefreqpr=fixed(0.2822,0.1483,0.2659,0.3036)
pinvarpr=fixed(0.3060)
shapepr=exponential(1.6090);
mcmcp
nchains=4
ngen=2000000
samplefreq=100
savebrlens=yes
printfreq=1000;
mcmc;
sumt;
sump;
log stop;
END;
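Before running a partitioned analysis, it can help to confirm that the charsets cover every site exactly once. The following is a minimal Python sketch of that check, not part of MrBayes itself. The function names and the alignment length of 1000 sites are made up for illustration, and only simple "start-end" ranges are handled.

```python
def resolve_charset(spec, nchar):
    """Turn a simple 'start-end' charset spec into a 1-based inclusive site range.

    A '.' as the end means 'last site of the alignment', as in 'VD1 = 400-.'.
    """
    start_text, end_text = spec.split("-")
    start = int(start_text)
    end = nchar if end_text.strip() == "." else int(end_text)
    return range(start, end + 1)


def check_partition(charsets, nchar):
    """Return the length of each subset; raise if sites overlap or are left out."""
    seen = set()
    lengths = {}
    for name, spec in charsets.items():
        sites = set(resolve_charset(spec, nchar))
        if seen & sites:
            raise ValueError(f"{name} overlaps an earlier charset")
        seen |= sites
        lengths[name] = len(sites)
    if seen != set(range(1, nchar + 1)):
        raise ValueError("some sites are not assigned to any charset")
    return lengths


# nchar=1000 is a hypothetical alignment length, used only for illustration.
lengths = check_partition({"COI": "1-399", "VD1": "400-."}, nchar=1000)
print(lengths)
```

With the real number of characters from your DATA block in place of 1000, the same call reports how many sites each subset receives.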
Some commands are self-explanatory (e.g., log or outgroup), but some steps need a brief explanation:
charset = with this command you can associate names with different sets of characters, in this case COI from position 1 to 399 and VD1 from position 400 to the end.
partition = in this case I had 2 sets of characters, COI and VD1, and grouped them into a partition that I named by_gene; the following command, set partition=by_gene, makes this partition the active one.
unlink = here I have unlinked the alpha shape parameter of the gamma distribution for all subsets.
lset = is used to define the structure of the model; in this case I have applied the General Time Reversible model with a proportion of invariable sites and a gamma-shaped distribution of rates across sites (nst=6 rates=invgamma) for all subsets.
prset = this command is used to define the prior probability distributions on the parameters of the model. I used jModelTest to select the best-fit model of nucleotide substitution for each subset and entered its parameter estimates here. With applyto=(1) or applyto=(2) I was able to define the parameters of nucleotide substitution separately for the two subsets:
revmatpr = for the six substitution rates of the GTR rate matrix
statefreqpr = for the stationary nucleotide frequencies of the GTR rate matrix
pinvarpr = for the proportion of invariable sites
shapepr = for the shape parameter of the gamma distribution of rate variation (here given an exponential prior rather than a fixed value)
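Because the fixed values are typed in by hand, a quick sanity check before starting a long run can catch copy errors. The sketch below is a small Python helper of my own (the function name and the tolerance are assumptions, not anything MrBayes provides). It checks that each subset has six positive substitution rates, four base frequencies that sum to about 1 after rounding, and a proportion of invariable sites between 0 and 1.

```python
import math

# Fixed values copied from the two prset statements in the MrBayes block.
fixed_priors = {
    "COI (subset 1)": {
        "revmat": [0.0100, 10.4168, 0.0100, 3.8538, 37.1448, 1.0000],
        "statefreq": [0.2631, 0.1665, 0.2120, 0.3583],
        "pinvar": 0.3460,
    },
    "VD1 (subset 2)": {
        "revmat": [0.6604, 9.7015, 0.6585, 2.3585, 23.9207, 1.0000],
        "statefreq": [0.2822, 0.1483, 0.2659, 0.3036],
        "pinvar": 0.3060,
    },
}


def check_fixed_priors(values, tol=1e-3):
    """Return a list of problems found in one subset's fixed values.

    tol is an assumed allowance for four-decimal rounding of the frequencies.
    """
    problems = []
    if len(values["revmat"]) != 6 or min(values["revmat"]) <= 0:
        problems.append("revmat needs six positive rates")
    if len(values["statefreq"]) != 4:
        problems.append("statefreq needs four frequencies")
    elif not math.isclose(sum(values["statefreq"]), 1.0, abs_tol=tol):
        problems.append("statefreq does not sum to 1")
    if not 0.0 <= values["pinvar"] < 1.0:
        problems.append("pinvar must lie in [0, 1)")
    return problems


for subset, values in fixed_priors.items():
    problems = check_fixed_priors(values)
    print(subset, "OK" if not problems else problems)
```

An empty problem list only means the numbers are plausible. It does not confirm that they match the jModelTest output.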
Finally, the commands for the execution of the analysis:
mcmcp = sets the parameters of the Markov chain Monte Carlo analysis without starting it; the mcmc command that follows starts the run. Here the analysis uses four chains (nchains=4), although this option is not strictly necessary because 4 is the default number of chains (3 heated chains and 1 cold chain).
ngen = number of generations for which the analysis will run; in this case 2,000,000 generations
samplefreq = determines how often the chain is sampled; in this case every 100th generation, which is also the default in MrBayes 3.1 (defaults can differ between versions, so check "help mcmc" in yours)
savebrlens = saves the branch lengths together with each sampled tree topology
printfreq = the frequency with which the state of the chains is printed to screen
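These settings determine how much output the run produces. As a rough illustration, the Python sketch below works out the number of samples per run for the values used here. The function name is hypothetical, and the 25% burn-in fraction is an assumption chosen for the example, not a setting taken from the block above.

```python
def mcmc_sample_counts(ngen, samplefreq, burnin_fraction=0.25):
    """Return (total samples, samples discarded as burn-in, samples kept).

    burnin_fraction is an illustrative assumption; pick it from your own
    convergence diagnostics.
    """
    if ngen % samplefreq != 0:
        raise ValueError("ngen should be a multiple of samplefreq")
    # One sample per samplefreq generations, plus the starting state.
    n_samples = ngen // samplefreq + 1
    n_burnin = int(n_samples * burnin_fraction)
    return n_samples, n_burnin, n_samples - n_burnin


total, discarded, kept = mcmc_sample_counts(ngen=2_000_000, samplefreq=100)
print(total, discarded, kept)

# Number of status lines printed to screen with printfreq=1000.
screen_lines = 2_000_000 // 1000
print(screen_lines)
```

For this analysis that comes to 20,001 samples per run, of which 5,000 would be discarded under the assumed 25% burn-in, and 2,000 status lines on screen.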
At this point I should write a tutorial on how to summarize the MCMC analysis output (i.e., the sump and sumt commands) and how to interpret the results. Until then, have fun with your own analysis.