clmresidue

Name clmresidue
Description

clmresidue extends a clustering of a subgraph to a clustering of the full graph.

TribeMCL is a method for clustering proteins into related groups, termed 'protein families'. It analyses similarity patterns between the proteins in a given dataset and uses these patterns to assign proteins to related groups. In many cases, proteins in the same protein family will have similar functional properties.

TribeMCL uses a novel clustering method, Markov Clustering (MCL), which addresses problems that normally hinder protein sequence clustering: multi-domain proteins, peptide fragments, and proteins containing very widespread (promiscuous) domains. The efficiency of the method makes it applicable to the clustering of very large datasets.

TribeMCL consists of the core MCL algorithm (written by Stijn Van Dongen) and the modules for biological sequence clustering (written by Anton Enright). Both are written in C, and the source code is available.


Homepage http://www.ebi.ac.uk/research/cgg/tribe/  
Remote Documentation http://www.ebi.ac.uk/research/cgg/tribe/manual.txt
 

The cluster file is expected to contain a clustering of a subgraph of the graph G stored in the matrix file, which implies that the row domain of that clustering is a subset of the node domain of the graph. clmresidue computes a simple upwards projection of that clustering, resulting in a clustering of G.
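
The text above does not spell out the projection rule, so the following Python sketch is only an illustration of the general idea, not the clmresidue implementation. It assumes that each node missing from the clustering joins the existing cluster to which it has the largest total edge weight, that this is repeated until no further node can be placed, and that nodes with no path to any cluster become singleton clusters. The function name project_clustering and the dictionary-based graph layout are hypothetical and are not part of MCL.

```python
from collections import defaultdict


def project_clustering(graph, clusters):
    """Extend a clustering of a subgraph to every node of `graph`.

    graph:    dict mapping node -> {neighbour: edge weight}; edges are
              assumed undirected and listed in both directions.
    clusters: list of sets of nodes whose union is a subset of the
              nodes of `graph`. Nodes must be sortable (e.g. strings).
    """
    result = [set(c) for c in clusters]
    owner = {node: i for i, c in enumerate(result) for node in c}
    missing = set(graph) - set(owner)

    placed_any = True
    while missing and placed_any:
        placed_any = False
        # Decide this round's assignments from the current state only,
        # so the outcome does not depend on iteration order.
        assignments = {}
        for node in sorted(missing):
            weight = defaultdict(float)
            for neighbour, w in graph[node].items():
                if neighbour in owner:
                    weight[owner[neighbour]] += w
            if weight:
                # Assumed rule: heaviest connection wins; ties go to
                # the cluster with the lowest index.
                assignments[node] = max(sorted(weight), key=weight.get)
        for node, idx in assignments.items():
            result[idx].add(node)
            owner[node] = idx
            missing.discard(node)
            placed_any = True

    # Assumed fallback: nodes unreachable from any cluster stay alone.
    for node in sorted(missing):
        result.append({node})
    return result


# Hypothetical toy graph: "c" touches both clusters, "f" only touches
# "c", and "g" is isolated.
graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 0.5, "f": 1.0},
    "d": {"c": 0.5, "e": 1.0},
    "e": {"d": 1.0},
    "f": {"c": 1.0},
    "g": {},
}
full = project_clustering(graph, [{"a", "b"}, {"d", "e"}])
print([sorted(c) for c in full])
# prints [['a', 'b', 'c', 'f'], ['d', 'e'], ['g']]
```

In this toy run, "c" joins the first cluster because its link to "b" outweighs its link to "d", "f" follows in the next round through "c", and "g" ends up as a singleton. The real tool may treat ties and unconnected nodes differently.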