
Maximum Likelihood
This short example describes how to use Treeline to optimize maximum likelihood (ML) trees, as described in:ES Wright (2026) "Treeline Provides a Unified Strategy for Optimising Phylogenetic Trees Under Alternative Criteria." Molecular Ecology Resources, 26(5), e70163.
How do I build a maximum likelihood phylogenetic tree?
First it is necessary to install DECIPHER and load the library in R. Next, provide Treeline with a sequence alignment and model of evolution that will be used to optimize the tree.# load the DECIPHER library in R
> library(DECIPHER)
>
> # load the target sequences from a file
> fas <- "<<REPLACE WITH PATH TO FASTA FILE>>"
> seqs <- readDNAStringSet(fas) # use AA, DNA, or RNA
> seqs
DNAStringSet object of length 317:
width seq names
[1] 819 ATGGCTT...AAGAAAA Rickettsia prowaz...
[2] 822 ATGGGAA...GAAAAAG Porphyromonas gin...
[3] 822 ATGGGAA...GAAAAAG Porphyromonas gin...
[4] 822 ATGGGAA...GAAAAAG Porphyromonas gin...
[5] 819 ATGGCTA...TGGTAAA Pasteurella multo...
... ... ...
[313] 819 ATGGCAA...TACTAAA Pectobacterium at...
[314] 822 ATGCCTA...CGTCAAG Acinetobacter sp....
[315] 864 ATGGGCA...TCAGTCT Thermosynechococc...
[316] 831 ATGGCAC...GAAGAAG Bradyrhizobium ja...
[317] 840 ATGGGCA...GCGAGGT Gloeobacter viola...
>
> # align coding sequences
> seqs <- AlignTranslation(seqs,
+ type="DNAStringSet") # choose AA or DNA
Determining distance matrix based on shared 5-mers:
|========================================| 100%
Time difference of 0.32 secs
Clustering into groups by similarity:
|========================================| 100%
Time difference of 0.02 secs
Aligning Sequences:
|========================================| 100%
Time difference of 0.49 secs
Iteration 1 of 2:
Determining distance matrix based on alignment:
|========================================| 100%
Time difference of 0.05 secs
Reclustering into groups by similarity:
|========================================| 100%
Time difference of 0.03 secs
Realigning Sequences:
|========================================| 100%
Time difference of 0.22 secs
Iteration 2 of 2:
Determining distance matrix based on alignment:
|========================================| 100%
Time difference of 0.05 secs
Reclustering into groups by similarity:
|========================================| 100%
Time difference of 0.02 secs
Realigning Sequences:
|========================================| 100%
Time difference of 0.03 secs
>
> # optimize the tree
> tree <- Treeline(seqs,
+ method="ML",
+ model=MODELS, # choose a model or test all
+ showPlot=TRUE,
+ processors=NULL) # use all CPUs
Fitting initial tree to models:
JC69 -ln(L) = 71307, AICc = 147914, BIC = 146854
K80 -ln(L) = 70693, AICc = 146721, BIC = 145633
F81 -ln(L) = 71270, AICc = 147947, BIC = 146800
HKY85 -ln(L) = 70650, AICc = 146745, BIC = 145568
T92 -ln(L) = 70682, AICc = 146736, BIC = 145619
TN93 -ln(L) = 70308, AICc = 146096, BIC = 144890
SYM -ln(L) = 69984, AICc = 145449, BIC = 144242
GTR -ln(L) = 69699, AICc = 144991, BIC = 143692
The selected model was: GTR
Optimizing up to 400 candidate trees:
Tree #133. -ln(L) = 69081.540 (0.000%), 13 Climbs, 0 Grafts of 12
Finalizing the best tree (#90):
-ln(L) = 69081.539 (0.000%), 0 Climbs
Model parameters:
Frequency(A) = 0.253
Frequency(C) = 0.227
Frequency(G) = 0.369
Frequency(T) = 0.151
Rate A <-> C = 1.760
Rate A <-> G = 2.133
Rate A <-> T = 3.914
Rate C <-> G = 1.107
Rate C <-> T = 5.616
Rate G <-> T = 1.000
Time difference of 307.1 secs
>
> # optionally, output a Newick file
> WriteDendrogram(tree, file="")
((((('Thermoanaerobacter tengcongensis MB4':0.2144063,('Clostridium acetobutylicum ATCC 824':0.1143762,...