This contribution (stand-alone projects attached) adds a few simple things to the Bio.Phylogenetics namespace:
- one simple data structure – DistanceMatrix
- 3 classes of algorithms:
  - ISubstitutionModel for computing a DistanceMatrix from an aligned sequence
  - IHierarchicalClusteringAlgorithm for inferring a phylogenetic tree from a distance matrix
  - ITreeEvaluator for calculating a parsimony score for a tree given an aligned sequence
- And the 3 corresponding standard simple implementations of those algorithms: JukesCantorSubstitutionModel, UpgmaClusteringAlgorithm, FitchTreeEvaluator
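For readers unfamiliar with the three algorithms named above, here is a minimal Python sketch of their textbook forms: the Jukes-Cantor distance, UPGMA clustering, and Fitch parsimony scoring. It is an illustration only, not the attached implementation. All function names, the dictionary-based distance matrix, the nested-tuple tree representation, and the gap handling are assumptions made for this sketch, and the UPGMA function returns only the tree topology (no branch lengths).

```python
import math
from itertools import combinations


def jukes_cantor_distance(a, b):
    """Jukes-Cantor distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns containing a gap ('-') are skipped.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(x != y for x, y in pairs) / len(pairs)  # fraction of differing sites
    if p >= 0.75:
        return math.inf  # saturated: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(alignment):
    """Map frozenset({name1, name2}) to the pairwise Jukes-Cantor distance."""
    return {
        frozenset((m, n)): jukes_cantor_distance(alignment[m], alignment[n])
        for m, n in combinations(alignment, 2)
    }


def upgma(names, dist):
    """Cluster by UPGMA; return the topology as nested 2-tuples of names."""
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pr: d[frozenset(pr)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        for c in clusters:
            # Size-weighted average keeps distances equal to leaf-pair means.
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


def fitch_site_score(tree, states):
    """Minimum number of state changes at one site (Fitch's algorithm)."""
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):  # leaf: its observed state
            return {states[node]}
        left, right = visit(node[0]), visit(node[1])
        if left & right:
            return left & right
        changes += 1  # disjoint child sets force one change
        return left | right

    visit(tree)
    return changes


def parsimony_score(tree, alignment):
    """Sum the per-site Fitch scores over all alignment columns."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_site_score(tree, {n: s[i] for n, s in alignment.items()})
        for i in range(length)
    )


if __name__ == "__main__":
    # Hypothetical toy alignment, made up for this example.
    alignment = {"A": "ACGTACGT", "B": "ACGTACGA",
                 "C": "TCGAACCA", "D": "TGGAACCT"}
    tree = upgma(list(alignment), distance_matrix(alignment))
    print(tree, parsimony_score(tree, alignment))
```

The three functions mirror the pipeline described above: a substitution model turns the alignment into a distance matrix, a clustering algorithm turns the matrix into a tree, and an evaluator scores that tree against the alignment.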
Also included is a simple command-line sample that uses these.
I’ve tried to modify my code to conform to your conventions, and to include a lot of comments (with references to the original papers, etc.). If you’d like to include this in MBF, there are a few more things I should probably do:
- Port my unit tests to your test framework for inclusion
  - I’m curious, by the way, why you use NUnit instead of VS unit testing (I’ll have to learn NUnit to port my tests from VS – no big deal, I’ve already started looking at it and it’s similar)
- Localize exception messages (for now English is hard-coded – although I see MBF isn’t completely consistent about localizing them)
- Perhaps move a couple of utilities into existing MBF classes (e.g., see TreeUtils.cs)
See e-mail thread with Michael Zyskowski for more details