9.13.8. sklearn.cross_validation.ShuffleSplit

class sklearn.cross_validation.ShuffleSplit(n, n_iterations=10, test_fraction=0.1, indices=False, random_state=None)

Random permutation cross-validation iterator.

Yields indices to split data into training and test sets.

Note: contrary to other cross-validation strategies, random splits do not guarantee that all folds will be different, although this is still very likely for sizeable datasets.
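
The note above can be illustrated with a minimal sketch in plain Python. The helper `shuffle_split` is hypothetical and is not the library's implementation; it assumes that each iteration draws a fresh permutation of `range(n)` and that the test set size is `ceil(test_fraction * n)`. With only four samples and a single test element there are just four possible splits, so ten iterations necessarily repeat some of them.

```python
import math
import random


def shuffle_split(n, n_iterations=10, test_fraction=0.1, seed=None):
    """Yield (train, test) lists of integer indices from fresh random permutations.

    Hypothetical helper for illustration only. Assumes the test set size
    is ceil(test_fraction * n).
    """
    rng = random.Random(seed)
    n_test = int(math.ceil(test_fraction * n))
    for _ in range(n_iterations):
        permutation = list(range(n))
        rng.shuffle(permutation)
        # The first n_test shuffled positions form the test set, the rest train.
        yield sorted(permutation[n_test:]), sorted(permutation[:n_test])


# With n=4 and one test element only 4 different splits exist,
# so 10 independent draws must contain repeats.
splits = list(shuffle_split(4, n_iterations=10, test_fraction=0.25, seed=0))
distinct_test_sets = set(tuple(test) for _, test in splits)
print("%d splits, %d distinct test sets" % (len(splits), len(distinct_test_sets)))
```

For larger datasets the number of possible splits grows very quickly, which is why repeated splits become unlikely in practice.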

Parameters :

n : int

Total number of elements in the dataset.

n_iterations : int (default 10)

Number of re-shuffling & splitting iterations.

test_fraction : float (default 0.1)

Should be between 0.0 and 1.0; it represents the proportion of the dataset to include in the test split.

indices : boolean, optional (default False)

If True, return the train/test split as arrays of integer indices; if False, return boolean masks. Integer indices are useful when dealing with sparse matrices that cannot be indexed by boolean masks.

random_state : int or RandomState, optional (default None)

Pseudo-random number generator state used for random sampling.
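
To make the `indices` and `random_state` parameters more concrete, here is a minimal sketch in plain Python. All three helpers (`indices_to_mask`, `mask_to_indices`, `draw_test_indices`) are hypothetical names introduced for illustration and are not part of scikit-learn. The sketch shows that a boolean mask and a list of integer indices describe the same split, and that reusing a seed reproduces the same draw.

```python
import random


def indices_to_mask(indices, n):
    """Return a list of n booleans that is True at each position in indices."""
    selected = set(indices)
    return [i in selected for i in range(n)]


def mask_to_indices(mask):
    """Return the sorted integer positions where mask is True."""
    return [i for i, flag in enumerate(mask) if flag]


def draw_test_indices(n, n_test, seed):
    """Draw n_test distinct positions out of n using a seeded generator."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n), n_test))


n = 4
test_indices = draw_test_indices(n, 1, seed=0)
test_mask = indices_to_mask(test_indices, n)
train_mask = [not flag for flag in test_mask]
train_indices = mask_to_indices(train_mask)

# Both representations describe the same split.
print("test indices: %s, test mask: %s" % (test_indices, test_mask))
print("train indices: %s, train mask: %s" % (train_indices, train_mask))

# Reusing the seed reproduces the draw exactly.
same_draw = draw_test_indices(n, 1, seed=0) == test_indices
print("reproducible: %s" % same_draw)
```

A mask always has length `n`, while an index list has one entry per selected sample, which is one reason integer indices can be more convenient for containers that do not support boolean indexing.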

See also

Bootstrap
cross-validation using re-sampling with replacement.

Examples

>>> from sklearn import cross_validation
>>> rs = cross_validation.ShuffleSplit(4, n_iterations=3, test_fraction=.25,
...                             random_state=0)
>>> len(rs)
3
>>> print rs
ShuffleSplit(4, n_iterations=3, test_fraction=0.25, indices=False, ...)
>>> for train_index, test_index in rs:
...    print "TRAIN:", train_index, "TEST:", test_index
...
TRAIN: [False  True  True  True] TEST: [ True False False False]
TRAIN: [ True  True  True False] TEST: [False False False  True]
TRAIN: [ True False  True  True] TEST: [False  True False False]
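
As a sketch of how such masks might be used, the snippet below applies the first split printed above to a small dataset. The dataset `X` is hypothetical, and plain Python lists with `itertools.compress` are used here only to keep the example dependency-free.

```python
from itertools import compress

# Hypothetical 4-sample dataset; the masks are copied from the first
# split printed in the example above.
X = ["a", "b", "c", "d"]
train_mask = [False, True, True, True]
test_mask = [True, False, False, False]

# Keep only the samples whose mask entry is True.
X_train = list(compress(X, train_mask))
X_test = list(compress(X, test_mask))
print("train: %s, test: %s" % (X_train, X_test))
```

With NumPy arrays the same selection can be written as `X[train_mask]` and `X[test_mask]`, provided the masks are boolean arrays.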

__init__(n, n_iterations=10, test_fraction=0.1, indices=False, random_state=None)