This documentation is for scikit-learn version 0.10.

8.16.4.1. sklearn.metrics.pairwise.euclidean_distances

sklearn.metrics.pairwise.euclidean_distances(X, Y=None, Y_norm_squared=None, squared=False)

Considering the rows of X (and of Y, which defaults to X when omitted) as vectors, compute the distance matrix between each pair of vectors.

For efficiency reasons, the Euclidean distance between a pair of row vectors x and y is computed as:

dist(x, y) = sqrt(dot(x, x) - 2 * dot(x, y) + dot(y, y))

This formulation has two main advantages. First, it is computationally efficient when dealing with sparse data. Second, if x varies but y remains unchanged, then the right-most dot-product dot(y, y) can be pre-computed.
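As a rough illustration of the expansion above, the following NumPy sketch computes all pairwise distances at once. The helper name `pairwise_euclidean` is hypothetical and this is not scikit-learn's actual implementation; the clipping step is an added assumption that guards against small negative values caused by floating-point rounding.

```python
import numpy as np


def pairwise_euclidean(X, Y, Y_norm_squared=None, squared=False):
    """Illustrative sketch of the dot-product expansion (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # dot(x, x) for every row of X, kept as a column so it broadcasts over Y
    XX = (X ** 2).sum(axis=1)[:, np.newaxis]

    # dot(y, y) for every row of Y; reuse the caller's values when supplied
    if Y_norm_squared is None:
        YY = (Y ** 2).sum(axis=1)
    else:
        YY = np.asarray(Y_norm_squared, dtype=float)
    YY = YY[np.newaxis, :]

    # dot(x, x) - 2 * dot(x, y) + dot(y, y), evaluated for all pairs
    D = XX - 2.0 * np.dot(X, Y.T) + YY

    # Assumption: clip tiny negative values from rounding before the sqrt
    np.maximum(D, 0, out=D)
    return D if squared else np.sqrt(D)


X = [[0, 1], [1, 1]]
D = pairwise_euclidean(X, X)                 # distances between rows of X
D_origin = pairwise_euclidean(X, [[0, 0]])   # distances to the origin
```

The only term that couples x and y is the matrix product `np.dot(X, Y.T)`, which is why the `dot(y, y)` term can be computed once and reused whenever Y stays fixed.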

Parameters :

X : {array-like, sparse matrix}, shape = [n_samples_1, n_features]

Y : {array-like, sparse matrix}, shape = [n_samples_2, n_features]

Y_norm_squared : array-like, shape = [n_samples_2], optional

Pre-computed dot-products of vectors in Y (e.g., (Y**2).sum(axis=1))

squared : boolean, optional

If True, return squared Euclidean distances instead of the distances themselves.

Returns :

distances : {array, sparse matrix}, shape = [n_samples_1, n_samples_2]

Examples

>>> from sklearn.metrics.pairwise import euclidean_distances
>>> X = [[0, 1], [1, 1]]
>>> # distance between rows of X
>>> euclidean_distances(X, X)
array([[ 0.,  1.],
       [ 1.,  0.]])
>>> # get distance to origin
>>> euclidean_distances(X, [[0, 0]])
array([[ 1.        ],
       [ 1.41421356]])
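The optional parameters can be combined when the same Y is compared against several batches of X. The sketch below assumes `Y_norm_squared` and `squared` behave as described in the parameter list; the batch arrays and variable names are made up for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

# Hypothetical data: two batches of query vectors, one fixed set Y
X_batches = [np.array([[0.0, 1.0], [1.0, 1.0]]), np.array([[2.0, 0.0]])]
Y = np.array([[0.0, 0.0], [1.0, 2.0]])

# dot(y, y) for each row of Y, computed once and reused for every batch
Y_norm_squared = (Y ** 2).sum(axis=1)

# squared=True skips the final square root
results = [
    euclidean_distances(Xb, Y, Y_norm_squared=Y_norm_squared, squared=True)
    for Xb in X_batches
]
```

Each entry of `results` has shape [n_samples_1, n_samples_2] for its batch. Squared distances preserve the ordering of neighbors, so they are often enough when only a ranking is needed.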