
ConsensusCluster doesn't seem to use consensus matrix for clustering #25

@gageblack

Report

While reviewing the ConsensusCluster implementation (src/flowsom/models/consensus_cluster.py, v0.2.2), I noticed that fit_predict() appears to bypass fit(), the method that performs the consensus clustering. Instead, it runs AgglomerativeClustering once on the input data, z-scored if z_score is set (lines 105-109):

    if self.z_score:
        data = self._z_score(data)
    return self.cluster(n_clusters=self.n_clusters, linkage=self.linkage).fit_predict(data)

Based on the consensus clustering paper and the R FlowSOM implementation, it seems that fit_predict() should:

  1. Build consensus matrix using fit()
  2. Convert to distance matrix: distance_matrix = 1 - self.Mk
  3. Cluster using the consensus distances, or at least offer an option to cluster on either the matrix or the data, as in the ConsensusCluster class: AgglomerativeClustering(n_clusters=k, metric='precomputed').fit_predict(distance_matrix)

Versions

v0.2.2

Labels: bug (Something isn't working)
