8.1.7. sklearn.cluster.Ward

class sklearn.cluster.Ward(n_clusters=2, memory=Memory(cachedir=None), connectivity=None, copy=True, n_components=None)

Ward hierarchical clustering: constructs a tree and cuts it.

Parameters :

n_clusters : int or ndarray

The number of clusters to find.

connectivity : sparse matrix

Connectivity matrix. Defines for each sample the neighboring samples following a given structure of the data. Default is None, i.e., the hierarchical clustering algorithm is unstructured.

memory : Instance of joblib.Memory or string

Used to cache the output of the computation of the tree. By default, no caching is done. If a string is given, it is the path to the caching directory.

copy : bool

Whether to copy the connectivity matrix or modify it in place.

n_components : int (optional)

The number of connected components in the graph defined by the connectivity matrix. If not set, it is estimated.
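As a rough illustration of what connectivity and n_components describe, the sketch below builds a neighbor structure for samples arranged in a chain and counts its connected components with a breadth-first search. It uses plain adjacency lists as a stand-in for the sparse matrix the estimator expects; chain_connectivity and count_components are hypothetical helpers written for this example, not part of scikit-learn, and the library's own estimation routine may work differently.

```python
from collections import deque


def chain_connectivity(n_samples):
    """Link each sample to its immediate neighbors in a 1-D chain (hypothetical helper)."""
    return {
        i: [j for j in (i - 1, i + 1) if 0 <= j < n_samples]
        for i in range(n_samples)
    }


def count_components(adjacency):
    """Count connected components of an undirected graph given as adjacency lists."""
    seen = set()
    n_components = 0
    for start in adjacency:
        if start in seen:
            continue
        # Every unseen node starts a new component; explore it fully.
        n_components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return n_components


chain = chain_connectivity(5)
connected = count_components(chain)

# Cutting the link between samples 2 and 3 splits the chain in two.
chain[2].remove(3)
chain[3].remove(2)
split = count_components(chain)
```

With a structure like this, merges would only be allowed between neighboring samples, which is why the number of components matters: samples in different components can never be joined through the graph alone.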


Attributes :

children_ : array-like, shape = [n_nodes, 2]

List of the children of each node. Leaves of the tree do not appear.

labels_ : array [n_points]

Cluster labels for each point.

n_leaves_ : int

Number of leaves in the hierarchical tree.
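To make the "constructs a tree and cuts it" description concrete, here is a minimal pure-Python sketch of unstructured Ward agglomeration. It is a naive illustration under stated assumptions, not the library's implementation: ward_tree and cut_tree are hypothetical names, leaves are assumed to be numbered 0 to n_leaves - 1, and the k-th merge is assumed to create node n_leaves + k. At each step it merges the pair of clusters whose union increases the within-cluster sum of squares the least.

```python
def ward_tree(points):
    """Naive Ward agglomeration; returns one (left, right) child pair per merge."""
    n_leaves = len(points)
    # Active clusters: node id -> (size, centroid).
    clusters = {i: (1, tuple(float(x) for x in p)) for i, p in enumerate(points)}
    children = []
    next_id = n_leaves
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                (na, ca), (nb, cb) = clusters[a], clusters[b]
                dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Ward criterion: increase in within-cluster sum of squares.
                cost = na * nb / (na + nb) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (na, ca), (nb, cb) = clusters.pop(a), clusters.pop(b)
        centroid = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        clusters[next_id] = (na + nb, centroid)
        children.append((a, b))
        next_id += 1
    return children


def cut_tree(children, n_leaves, n_clusters):
    """Label each leaf by replaying only the first n_leaves - n_clusters merges."""
    parent = list(range(n_leaves + len(children)))

    def find(node):
        while parent[node] != node:
            node = parent[node]
        return node

    for k, (a, b) in enumerate(children[: n_leaves - n_clusters]):
        merged = n_leaves + k
        parent[find(a)] = merged
        parent[find(b)] = merged

    roots = {}
    return [roots.setdefault(find(leaf), len(roots)) for leaf in range(n_leaves)]


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
children = ward_tree(points)                  # one row per non-leaf node
labels = cut_tree(children, len(points), 2)   # [0, 0, 1, 1]
```

In this sketch, children plays the role of children_ (only merged nodes appear, never leaves), labels plays the role of labels_, and len(points) corresponds to n_leaves_. The pairwise search is cubic in the number of samples, so it is only suitable for tiny inputs.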


Methods :

fit(X)                  Fit the hierarchical clustering on the data.
get_params([deep])      Get parameters for the estimator.
set_params(**params)    Set the parameters of the estimator.
__init__(n_clusters=2, memory=Memory(cachedir=None), connectivity=None, copy=True, n_components=None)

fit(X)

Fit the hierarchical clustering on the data.

Parameters :

X : array-like, shape = [n_samples, n_features]

The samples, also known as observations.

Returns :

self :


get_params(deep=True)

Get parameters for the estimator.

Parameters :

deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.


set_params(**params)

Set the parameters of the estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns :

self :
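The <component>__<parameter> convention can be illustrated with a toy class. The sketch below is a hypothetical ToyEstimator written only for this example; it is not scikit-learn's base class and only mimics the behavior described above: get_params(deep=True) also reports the parameters of a contained estimator, and set_params routes double-underscore keys to that contained estimator before returning self.

```python
class ToyEstimator:
    """Hypothetical estimator used only to illustrate the naming convention."""

    def __init__(self, n_clusters=2, inner=None):
        self.n_clusters = n_clusters
        self.inner = inner  # optionally another ToyEstimator

    def get_params(self, deep=True):
        params = {"n_clusters": self.n_clusters, "inner": self.inner}
        if deep and hasattr(self.inner, "get_params"):
            # Expose the contained estimator's parameters as inner__<parameter>.
            for key, value in self.inner.get_params(deep=True).items():
                params["inner__" + key] = value
        return params

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            if sep:
                # <component>__<parameter>: delegate to the nested object.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


outer = ToyEstimator(n_clusters=3, inner=ToyEstimator(n_clusters=5))
returned = outer.set_params(n_clusters=4, inner__n_clusters=7)
shallow = outer.get_params(deep=False)
deep = outer.get_params(deep=True)
```

Because set_params returns the estimator itself, calls can be chained, for example outer.set_params(n_clusters=4).get_params().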