This documentation is for scikit-learn version 0.10.

8.3.3. sklearn.cross_validation.KFold

class sklearn.cross_validation.KFold(n, k, indices=True)

K-Folds cross-validation iterator.

Provides train/test indices to split data into train and test sets. The dataset is split into k consecutive folds (without shuffling).

Each fold is then used as a validation set once, while the k - 1 remaining folds form the training set.
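
The splitting scheme described above can be sketched in plain NumPy. This is an illustration only, not the library's implementation: the helper name `kfold_indices` is hypothetical, and it assumes the fold-size rule given in the Notes (truncated division, with the last fold taking whatever is left).

```python
import numpy as np


def kfold_indices(n, k):
    """Yield (train, test) index arrays for k consecutive folds of range(n).

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    fold_size = n // k  # truncated division for positive integers
    all_indices = np.arange(n)
    for i in range(k):
        start = i * fold_size
        # The last fold absorbs any leftover elements.
        stop = n if i == k - 1 else start + fold_size
        test = all_indices[start:stop]
        # Everything outside the test slice forms the training set.
        train = np.concatenate([all_indices[:start], all_indices[stop:]])
        yield train, test


splits = list(kfold_indices(4, 2))
for train, test in splits:
    print("TRAIN: %s TEST: %s" % (train, test))
```

Every index appears in exactly one test fold, so each sample is used for validation once and for training k - 1 times.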

Parameters :

n: int :

Total number of elements

k: int :

Number of folds

indices: boolean, optional (default True) :

Return train/test split as arrays of indices, rather than a boolean mask array. Integer indices are required when dealing with sparse matrices, since those cannot be indexed by boolean masks.
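
To make the two return formats concrete, the sketch below builds a boolean test mask by hand and converts it to integer indices with `np.flatnonzero`. The array values are invented for the example, and only NumPy calls are used rather than `KFold` itself.

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

# Boolean mask form: one True/False entry per sample.
test_mask = np.array([True, True, False, False])
train_mask = ~test_mask

# Integer index form: positions of the True entries.
test_index = np.flatnonzero(test_mask)
train_index = np.flatnonzero(train_mask)

# On a dense array both forms select the same rows.
same_rows = np.array_equal(X[test_mask], X[test_index])
print(test_index, train_index, same_rows)
```

The integer form is the one to pass on when the data container does not accept boolean masks.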

See also

StratifiedKFold
takes label information into account to avoid building folds with imbalanced class distributions (for classification tasks).

Notes

All folds have size trunc(n / k), except the last one, which contains the remaining elements.
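
As a quick check of this arithmetic, the following sketch computes the fold sizes implied by the rule. The helper `fold_sizes` is hypothetical and exists only for this illustration.

```python
def fold_sizes(n, k):
    """Return the fold sizes implied by the rule above (hypothetical helper)."""
    base = n // k  # truncated division for positive integers
    # k - 1 folds of the base size; the last fold takes what is left.
    return [base] * (k - 1) + [n - base * (k - 1)]


print(fold_sizes(4, 2))   # [2, 2]
print(fold_sizes(10, 3))  # [3, 3, 4]
print(fold_sizes(11, 4))  # [2, 2, 2, 5]
```

When n is not a multiple of k, the last fold can be noticeably larger than the others, as the third case shows.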

Examples

>>> import numpy as np
>>> from sklearn import cross_validation
>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([1, 2, 3, 4])
>>> kf = cross_validation.KFold(4, k=2)
>>> len(kf)
2
>>> print(kf)
sklearn.cross_validation.KFold(n=4, k=2)
>>> for train_index, test_index in kf:
...     print("TRAIN: %s TEST: %s" % (train_index, test_index))
...     X_train, X_test = X[train_index], X[test_index]
...     y_train, y_test = y[train_index], y[test_index]
TRAIN: [2 3] TEST: [0 1]
TRAIN: [0 1] TEST: [2 3]

__init__(n, k, indices=True)