8.1.7. sklearn.cluster.Ward
- class sklearn.cluster.Ward(n_clusters=2, memory=Memory(cachedir=None), connectivity=None, copy=True, n_components=None)
Ward hierarchical clustering: constructs a tree and cuts it.
Parameters : n_clusters : int or ndarray
The number of clusters to find.
connectivity : sparse matrix
Connectivity matrix. Defines, for each sample, the neighboring samples following a given structure of the data. Default is None, i.e., the hierarchical clustering algorithm is unstructured.
memory : Instance of joblib.Memory or string
Used to cache the output of the computation of the tree. By default, no caching is done. If a string is given, it is the path to the caching directory.
copy : bool
Whether to copy the connectivity matrix or modify it in place.
n_components : int (optional)
The number of connected components in the graph defined by the connectivity matrix. If not set, it is estimated.
Attributes
children_ : array-like, shape = [n_nodes, 2]
List of the children of each node. Leaves of the tree do not appear.
labels_ : array [n_points]
Cluster labels for each point.
n_leaves_ : int
Number of leaves in the hierarchical tree.
Methods
fit(X) : Fit the hierarchical clustering on the data.
set_params(**params) : Set the parameters of the estimator.
- __init__(n_clusters=2, memory=Memory(cachedir=None), connectivity=None, copy=True, n_components=None)
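As a rough illustration of how the `children_` attribute can be read, the sketch below assumes the usual convention that leaves are numbered 0 to n_leaves - 1 and that row i of `children_` describes internal node n_leaves + i. The `children` list is made up for illustration, and `leaves_under` is a hypothetical helper, not part of the library.

```python
def leaves_under(children, n_leaves):
    """Map each internal node id to the sorted list of leaves below it.

    Assumes row i of `children` describes node `n_leaves + i` and that
    every child id refers to a leaf or to an earlier internal node.
    """
    members = {}
    for i, (left, right) in enumerate(children):
        node_id = n_leaves + i
        leaves = []
        for child in (left, right):
            if child < n_leaves:
                # The child is an original sample (a leaf).
                leaves.append(child)
            else:
                # The child is an internal node built by an earlier merge.
                leaves.extend(members[child])
        members[node_id] = sorted(leaves)
    return members


# Made-up tree over 4 leaves:
# leaves 0 and 1 merge into node 4, leaves 2 and 3 into node 5,
# then nodes 4 and 5 merge into the root, node 6.
children = [(0, 1), (2, 3), (4, 5)]
members = leaves_under(children, n_leaves=4)
```

Under these assumptions, the last entry of `members` is the root and contains every leaf.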
- fit(X)
Fit the hierarchical clustering on the data.
Parameters : X : array-like, shape = [n_samples, n_features]
The samples a.k.a. observations.
Returns : self :
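A minimal usage sketch of `fit` follows, on a tiny made-up data set. The `Ward` class may not be importable in recent scikit-learn releases; the fallback to `AgglomerativeClustering(linkage="ward")` is an assumption about the installed version, and `make_ward` is a hypothetical helper introduced only for this example.

```python
import numpy as np

try:
    # The class documented on this page (older scikit-learn releases).
    from sklearn.cluster import Ward

    def make_ward(n_clusters):
        return Ward(n_clusters=n_clusters)
except ImportError:
    # Assumption: newer releases expose Ward linkage through this estimator.
    from sklearn.cluster import AgglomerativeClustering

    def make_ward(n_clusters):
        return AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")

# Made-up data: two well-separated groups of three points each.
X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])

estimator = make_ward(2)
fitted = estimator.fit(X)   # fit returns the estimator itself
labels = fitted.labels_     # one cluster label per sample
```

Because `fit` returns `self`, the call can be chained, for example `make_ward(2).fit(X).labels_`.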
- set_params(**params)
Set the parameters of the estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that each component of a nested object can be updated.
Returns : self :
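To make the `<component>__<parameter>` naming concrete, here is a toy sketch in plain Python. `ToyEstimator` and `ToyPipeline` are hypothetical classes written for this example, not scikit-learn classes; only the double-underscore naming and the `self` return value mirror the description above.

```python
class ToyEstimator:
    """Hypothetical simple estimator with one parameter."""

    def __init__(self, n_clusters=2):
        self.n_clusters = n_clusters

    def set_params(self, **params):
        # Simple case: each name is a parameter of this object.
        for name, value in params.items():
            setattr(self, name, value)
        return self


class ToyPipeline:
    """Hypothetical nested object holding named components."""

    def __init__(self, **components):
        self.components = components

    def set_params(self, **params):
        # Nested case: split "<component>__<parameter>" at the first "__"
        # and forward the parameter to the named component.
        for name, value in params.items():
            component, _, parameter = name.partition("__")
            self.components[component].set_params(**{parameter: value})
        return self


pipe = ToyPipeline(ward=ToyEstimator())
returned = pipe.set_params(ward__n_clusters=5)
```

After the call, the `ward` component of `pipe` has `n_clusters` set to 5, and `returned` is `pipe` itself.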