clmdist

Name clmdist
Description

clmdist computes the distance between two or more partitions (clusterings).
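
As an illustration of what a distance between two partitions can look like, the following is a minimal Python sketch of the split/join distance, one commonly used measure. It is not clmdist's own code: the function names `projection` and `split_join_distance` are hypothetical, and the sketch assumes both partitions cover exactly the same set of elements.

```python
def projection(a, b):
    """Sum, over the clusters of a, of the largest overlap with any cluster of b."""
    return sum(max(len(x & y) for y in b) for x in a)


def split_join_distance(a, b):
    """Split/join distance between two partitions given as lists of sets.

    Assumption: a and b partition the same element set; otherwise the
    distance is not defined and a ValueError is raised.
    """
    a = [frozenset(c) for c in a]
    b = [frozenset(c) for c in b]
    elements_a = set().union(*a)
    elements_b = set().union(*b)
    if elements_a != elements_b:
        raise ValueError("partitions must cover the same elements")
    n = len(elements_a)
    # Each term (n - projection) counts the elements that would have to be
    # moved to turn one partition into a sub- or super-clustering of the other.
    return 2 * n - projection(a, b) - projection(b, a)


if __name__ == "__main__":
    p = [{1, 2, 3}, {4, 5, 6}]
    q = [{1, 2}, {3, 4}, {5, 6}]
    print(split_join_distance(p, q))
```

A distance of zero means the two clusterings are identical; larger values mean more elements would have to change cluster to reconcile them.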

TribeMCL is a method for clustering proteins into related groups, termed 'protein families'. It analyses similarity patterns between the proteins in a given dataset and uses these patterns to assign proteins to related groups. In many cases, proteins in the same protein family will have similar functional properties.

TribeMCL uses a novel clustering method (Markov Clustering, or MCL) that solves problems which normally hinder protein sequence clustering. These problems include multi-domain proteins, peptide fragments, and proteins with very widespread (promiscuous) domains. The efficiency of the method makes it applicable to the clustering of very large datasets.
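
To give a feel for the Markov Clustering idea named above, here is a small pure-Python sketch. It is an illustrative toy, not the MCL or TribeMCL source: the function names, the fixed iteration count, the added self-loops and the `1e-6` threshold are all assumptions made for this example. It alternates the two steps usually described for MCL, expansion (squaring the column-stochastic matrix) and inflation (raising entries to a power and renormalising each column).

```python
def normalize_columns(m):
    """Rescale every column so that it sums to 1 (all-zero columns stay zero)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / total if total else 0.0
    return out


def expand(m):
    """Expansion step: square the matrix (simulates two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: entry-wise power, then renormalise the columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def mcl_sketch(adjacency, inflation=2.0, iterations=50):
    """Toy MCL on a square similarity matrix; returns a list of frozensets."""
    n = len(adjacency)
    # Assumption: add self-loops before normalising, a common convention.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    clusters = []
    for i in range(n):
        # Rows that keep non-zero mass act as attractors; the columns they
        # cover are read off as one cluster.
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return clusters


if __name__ == "__main__":
    # Two disconnected triangles: nodes 0-2 and nodes 3-5.
    graph = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    print(mcl_sketch(graph))
```

In a protein setting the matrix entries would be pairwise sequence similarity scores rather than 0/1 edges; that mapping is left out of this sketch.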

The software consists of the core MCL algorithm (written by Stijn van Dongen) and the modules for biological sequence clustering (written by Anton Enright). Both are written in C, and the source code is available.


Homepage http://www.ebi.ac.uk/research/cgg/tribe/  
Remote Documentation http://www.ebi.ac.uk/research/cgg/tribe/manual.txt