8.3.7. sklearn.cross_validation.Bootstrap
- class sklearn.cross_validation.Bootstrap(n, n_bootstraps=3, n_train=0.5, n_test=None, random_state=None)
Random sampling with replacement cross-validation iterator
Provides train/test indices to split data into train and test sets while resampling the input n_bootstraps times: each time, a new random split of the data is performed, and then samples are drawn (with replacement) on each side of the split to build the training and test sets.
Note: contrary to other cross-validation strategies, bootstrapping allows some samples to occur several times in each split. However, a sample that occurs in the train split will never occur in the test split, and vice versa.
If you want each sample to occur at most once you should probably use ShuffleSplit cross validation instead.
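The split-then-resample procedure described above can be sketched with NumPy alone. This is a minimal illustration, not the library's implementation: the helper name bootstrap_split is hypothetical, and it assumes the train and test sizes have already been resolved to integers.

```python
import numpy as np


def bootstrap_split(n, n_train, n_test, rng):
    """Hypothetical helper: build one bootstrap train/test split over range(n)."""
    # Randomly partition the indices into two disjoint pools.
    permutation = rng.permutation(n)
    train_pool = permutation[:n_train]
    test_pool = permutation[n_train:n_train + n_test]
    # Resample each pool with replacement, keeping the pool sizes.
    # Duplicates can appear within a side, but never across sides.
    train = train_pool[rng.randint(0, n_train, size=n_train)]
    test = test_pool[rng.randint(0, n_test, size=n_test)]
    return train, test


rng = np.random.RandomState(0)
train, test = bootstrap_split(9, n_train=5, n_test=4, rng=rng)
print("TRAIN:", train, "TEST:", test)
```

Because each side is resampled only from its own pool, the two index arrays never share a sample, which matches the guarantee stated in the note above.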
Parameters :
n : int
Total number of elements in the dataset.
n_bootstraps : int (default is 3)
Number of bootstrapping iterations
n_train : int or float (default is 0.5)
If int, number of samples to include in the training split (should be smaller than the total number of samples passed in the dataset).
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split.
n_test : int or float or None (default is None)
If int, number of samples to include in the test split (should be smaller than the total number of samples passed in the dataset).
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split.
If None, n_test is set as the complement of n_train.
random_state : int or RandomState
Pseudo-random number generator state used for random sampling.
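To make the int, float, and None cases for n_train and n_test concrete, here is a rough sketch of how the sizes might be resolved. The helper resolve_sizes is hypothetical and not part of the library. Rounding float proportions up is an assumption, chosen because it is consistent with the example on this page, where n=9 with the defaults is reported as n_train=5 and n_test=4.

```python
import math


def resolve_sizes(n, n_train=0.5, n_test=None):
    """Hypothetical helper: turn int/float/None size arguments into sample counts."""
    # A float is read as a proportion of the dataset (rounding up is an assumption).
    if isinstance(n_train, float):
        n_train = int(math.ceil(n_train * n))
    if n_test is None:
        # None means "the complement of the train split".
        n_test = n - n_train
    elif isinstance(n_test, float):
        n_test = int(math.ceil(n_test * n))
    # The two splits are drawn from disjoint pools, so they cannot exceed n together.
    if n_train + n_test > n:
        raise ValueError("n_train + n_test must not exceed n")
    return n_train, n_test


print(resolve_sizes(9))  # (5, 4)
```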
See also
- ShuffleSplit: cross validation using random permutations.
Examples
>>> from sklearn import cross_validation
>>> bs = cross_validation.Bootstrap(9, random_state=0)
>>> len(bs)
3
>>> print bs
Bootstrap(9, n_bootstraps=3, n_train=5, n_test=4, random_state=0)
>>> for train_index, test_index in bs:
...     print "TRAIN:", train_index, "TEST:", test_index
...
TRAIN: [1 8 7 7 8] TEST: [0 3 0 5]
TRAIN: [5 4 2 4 2] TEST: [6 7 1 0]
TRAIN: [4 7 0 1 1] TEST: [5 3 6 5]
- __init__(n, n_bootstraps=3, n_train=0.5, n_test=None, random_state=None)