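1. Basics
StratifiedKFold: K-fold splitting with stratified sampling, so each fold keeps roughly the same class proportions as the full dataset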
sklearn.model_selection.StratifiedKFold(n_splits=,random_state=,shuffle=)
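y: label sequence of the samples (in sklearn.model_selection it is passed to split(), not to the constructor)
n: integer, dataset size (a constructor argument only in the legacy sklearn.cross_validation API; not needed here)
n_splits: integer k, the number of folds; must be at least 2
shuffle: boolean, whether to shuffle the data before splitting
random_state: an integer used as the random seed, or otherwise a random number generator instance; only has an effect when shuffle=True
split(X,y[,groups])
X: training dataset, shape (n_samples,n_features)
y: labels, shape (n_samples,); used for stratification
Splits the dataset into training and test sets, yielding one (train indices, test indices) pair per fold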
2. Code
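import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Sample data: 8 samples with 4 features each
X=np.array([[1,2,3,4],
            [11,12,13,14],
            [21,22,23,24],
            [31,32,33,34],
            [41,42,43,44],
            [51,52,53,54],
            [61,62,63,64],
            [71,72,73,74]])
# Labels: four samples of class 1 and four of class 0
y=np.array([1,1,0,0,1,1,0,0])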
# Plain K-fold split (no stratification)
folder=KFold(n_splits=4,shuffle=False)
for train_index,test_index in folder.split(X,y):
    print("Train Index:",train_index)
    print("Test Index:",test_index)
    print("y_train:",y[train_index])
    print("y