Model Selection

KFold

class naplib.model_selection.KFold(n_splits, shuffle=False, random_state=None)

K-fold splitter that works on a naplib.Data object or a list-like sequence.

Parameters:
  • n_splits (int) -- Number of folds. Must be at least 2.

  • shuffle (bool, default=False) -- Whether to shuffle the data before splitting it into folds. Note that the samples within each split will not be shuffled.

  • random_state (int, RandomState instance or None, default=None) -- When shuffle is True, random_state affects the ordering of the indices, which controls the randomness of each fold. Otherwise, this parameter has no effect. Pass an int for reproducible output across multiple function calls.
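The interaction between shuffle and random_state can be illustrated with a standalone sketch. It does not call naplib and is not naplib's implementation: the helper name fold_test_indices and its seed argument are hypothetical, Python's random module stands in for a RandomState, and the fold sizes follow the common convention that the first n_samples % n_splits folds receive one extra sample. Sorting each fold is one reading of the note that samples within a split are not shuffled.

```python
import random


def fold_test_indices(n_samples, n_splits, shuffle=False, seed=None):
    """Illustrative only: return the test indices of each fold."""
    indices = list(range(n_samples))
    if shuffle:
        # A fixed seed gives the same permutation on every call.
        random.Random(seed).shuffle(indices)

    # The first `extra` folds get one more sample than the rest.
    base, extra = divmod(n_samples, n_splits)
    folds, start = [], 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        # Sort so samples inside a fold keep their original relative order.
        folds.append(sorted(indices[start:stop]))
        start = stop
    return folds


unshuffled = fold_test_indices(6, 3)
seeded_a = fold_test_indices(6, 3, shuffle=True, seed=0)
seeded_b = fold_test_indices(6, 3, shuffle=True, seed=0)

print(unshuffled)            # [[0, 1], [2, 3], [4, 5]]
print(seeded_a == seeded_b)  # True
```

Without shuffling, the folds are contiguous blocks in the original order; with shuffling and a fixed seed, fold membership changes but is identical from run to run.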

Examples

>>> from naplib.model_selection import KFold
>>> list1 = [1, 2, 3]  # this could be a field of a Data object, like data['resp']
>>> list2 = [5, 6, 7]  # this could be another field of a Data object, like data['aud']
>>> kfold = KFold(3)
>>> for train_data, test_data, train_data2, test_data2 in kfold.split(list1, list2):
...     print(train_data, test_data, train_data2, test_data2)
[2, 3] [1] [6, 7] [5]
[1, 3] [2] [5, 7] [6]
[1, 2] [3] [5, 6] [7]

split(*args)

Generate splits of the data.

Parameters:
  • *args (Data or list-like objects) -- Sets of data, each of which will be split into train and test groups.

Yields:
  • train (Data or list-like objects) -- The training set for that split.

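  • test (Data or list-like objects) -- The testing set for that split.

When several inputs are passed, each split yields one train/test pair per input, in the order the inputs were given (train, test, train2, test2, ...).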
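The order of the yielded values can be sketched in plain Python. This is a minimal stand-in for KFold(n_splits).split(*args) on lists without shuffling, not naplib's code: the function name kfold_split_sketch is hypothetical, and the equal-length check is an assumption rather than documented behavior.

```python
def kfold_split_sketch(*sequences, n_splits):
    """Hypothetical stand-in for KFold(n_splits).split(*sequences) on lists."""
    n_samples = len(sequences[0])
    # Assumption: every input must contain the same number of samples.
    if any(len(seq) != n_samples for seq in sequences):
        raise ValueError("all inputs must have the same length")
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and the number of samples")

    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        result = []
        for seq in sequences:
            train = list(seq[:start]) + list(seq[stop:])
            test = list(seq[start:stop])
            # One (train, test) pair per input, in input order.
            result.extend([train, test])
        yield tuple(result)
        start = stop


resp = [1, 2, 3]
aud = [5, 6, 7]
splits = list(kfold_split_sketch(resp, aud, n_splits=3))
for train_resp, test_resp, train_aud, test_aud in splits:
    print(train_resp, test_resp, train_aud, test_aud)
# [2, 3] [1] [6, 7] [5]
# [1, 3] [2] [5, 7] [6]
# [1, 2] [3] [5, 6] [7]
```

The same indices are held out of every input in a given split, so paired fields such as responses and stimuli stay aligned across the train and test sets.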