9.13.3. sklearn.cross_validation.KFold

class sklearn.cross_validation.KFold(n, k, indices=False)

K-Folds cross validation iterator

Provides train/test indices to split data into train and test sets. Splits the dataset into k consecutive folds (without shuffling).

Each fold is then used as a validation set once, while the k - 1 remaining folds form the training set.
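
The splitting scheme described above can be sketched with plain NumPy. This is only an illustration, not scikit-learn's implementation: the helper name kfold_masks is hypothetical, and the sketch assumes every fold but the last has n // k elements, with the last fold taking whatever remains.

```python
import numpy as np

def kfold_masks(n, k):
    """Yield (train_mask, test_mask) pairs for k consecutive folds over n items.

    Hypothetical helper for illustration; not scikit-learn's own code.
    """
    fold_size = n // k  # assumed size of every fold except the last
    for i in range(k):
        test_mask = np.zeros(n, dtype=bool)
        start = i * fold_size
        # The last fold runs to the end so that no element is left out.
        stop = (i + 1) * fold_size if i < k - 1 else n
        test_mask[start:stop] = True
        # The training set is everything outside the current test fold.
        yield ~test_mask, test_mask

splits = list(kfold_masks(4, 2))
```

Each element lands in exactly one test fold, so iterating over all k splits validates on every sample once.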

Parameters :

n: int :

Total number of elements

k: int :

Number of folds

indices: boolean, optional (default False) :

Whether to return the train/test split as arrays of integer indices (True) or as boolean masks (False). Integer indices are useful when dealing with sparse matrices that cannot be indexed by boolean masks.
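
To make the two return forms concrete, the following sketch converts a boolean mask into the equivalent integer indices with NumPy. It does not call KFold itself; the variable names are chosen for illustration only.

```python
import numpy as np

# A boolean test mask over four samples, like the ones KFold yields by default.
test_mask = np.array([True, True, False, False])

# Integer positions of the selected samples.
test_index = np.flatnonzero(test_mask)
train_index = np.flatnonzero(~test_mask)

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])

# On a dense array, both forms select the same rows.
same_rows = np.array_equal(X[test_mask], X[test_index])
```

For a dense array the two forms are interchangeable; the integer form is the one to reach for when the container does not accept boolean masks.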

See also

StratifiedKFold
takes label information into account to avoid building folds with imbalanced class distributions (for classification tasks)

Notes

All folds have size trunc(n / k), except the last one, which contains the remaining elements.
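
A small arithmetic sketch of that rule, for n elements and k folds. The function name fold_sizes is hypothetical and the code only restates the note above; it is not taken from the library.

```python
def fold_sizes(n, k):
    """Return the fold sizes implied by the rule above (illustrative only)."""
    base = n // k  # same as trunc(n / k) for non-negative n
    # The first k - 1 folds have the base size; the last takes the remainder.
    return [base] * (k - 1) + [n - base * (k - 1)]

sizes_even = fold_sizes(4, 2)     # [2, 2]
sizes_uneven = fold_sizes(11, 4)  # [2, 2, 2, 5]
```

When n is not a multiple of k, the last fold can be noticeably larger than the others, as the second call shows.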

Examples

>>> import numpy as np
>>> from sklearn import cross_validation
>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([1, 2, 3, 4])
>>> kf = cross_validation.KFold(4, k=2)
>>> len(kf)
2
>>> print kf
sklearn.cross_validation.KFold(n=4, k=2)
>>> for train_index, test_index in kf:
...    print "TRAIN:", train_index, "TEST:", test_index
...    X_train, X_test = X[train_index], X[test_index]
...    y_train, y_test = y[train_index], y[test_index]
TRAIN: [False False  True  True] TEST: [ True  True False False]
TRAIN: [ True  True False False] TEST: [False False  True  True]

__init__(n, k, indices=False)