Asked by: Biche Asked: 3/9/2023 Last edited by: JosefZBiche Updated: 3/9/2023 Views: 37
data shape changes after applying one hot encoding using to_categorical
Q:
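I defined a function get_data that randomly selects two digits from the MNIST dataset, with a specified data size, and then applies to_categorical for one-hot encoding. However, the data shape changes every time I run the function, and I don't understand why. I assumed the shapes should be (100, 2) and (20, 2), since there are only two classes, but I get different values. Please give a detailed explanation, as I am new to machine learning.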
# Assumed imports (the post omits them)
import numpy as np
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from sklearn.model_selection import train_test_split

def get_data(train_size, test_size):
    # Load the MNIST dataset
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    # Pick two distinct random digits between 0 and 9
    generate_digits = np.random.choice(np.arange(10), size=2, replace=False)
    # Get a sub dataset only with the generated digits
    train_digits = np.isin(y_train, generate_digits)
    test_digits = np.isin(y_test, generate_digits)
    x_train_sub, y_train_sub = x_train[train_digits], y_train[train_digits]
    x_test_sub, y_test_sub = x_test[test_digits], y_test[test_digits]
    # Split the dataset into train and test
    # (this overwrites the x_test_sub / y_test_sub taken from the MNIST test split above)
    x_train_sub, x_test_sub, y_train_sub, y_test_sub = train_test_split(
        x_train_sub, y_train_sub, train_size=train_size, test_size=test_size,
        random_state=0, stratify=y_train_sub)
    # One-hot encode the labels (num_classes is not passed here)
    y_train_sub = keras.utils.to_categorical(y_train_sub)
    y_test_sub = keras.utils.to_categorical(y_test_sub)
    return x_train_sub, y_train_sub, x_test_sub, y_test_sub

train_size = 100
test_size = 20
x_train_sub, y_train_sub, x_test_sub, y_test_sub = get_data(train_size, test_size)
print(y_train_sub.shape)
print(y_test_sub.shape)
One sample result:
(100, 10)
(20, 10)
Another sample result:
(100, 6)
(20, 6)
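The widths 10 and 6 are consistent with how to_categorical picks the number of columns when num_classes is not passed: it uses the largest label value plus one, not the count of distinct labels. The sketch below mimics that rule with plain NumPy, assuming non-negative integer labels. one_hot_infer is a hypothetical helper for illustration, not a Keras function.

```python
import numpy as np

def one_hot_infer(labels):
    """Mimic the default width inference: columns = largest label + 1."""
    labels = np.asarray(labels, dtype=int).ravel()
    num_classes = labels.max() + 1
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes)[labels]

# Only two classes each time, but the width follows the largest digit drawn.
print(one_hot_infer([3, 9, 9, 3]).shape)  # (4, 10)
print(one_hot_infer([2, 5, 5, 2]).shape)  # (4, 6)
```

Under this reading, a width of 10 would mean the digit 9 was one of the two drawn, and a width of 6 would mean the larger digit was 5.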
I tried many things, but without success.
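One possible way to get two columns no matter which digits are drawn is to remap the two labels to 0 and 1 before encoding. The sketch below does this with np.unique(..., return_inverse=True) in plain NumPy. encode_compact is a hypothetical name, and this is only one approach under the assumption that both digits appear in the labels being encoded.

```python
import numpy as np

def encode_compact(labels):
    """One-hot encode using only the classes actually present in labels."""
    labels = np.asarray(labels).ravel()
    # classes: sorted distinct labels; idx: position of each label in classes
    classes, idx = np.unique(labels, return_inverse=True)
    return np.eye(len(classes))[idx], classes

encoded, classes = encode_compact([3, 9, 9, 3])
print(encoded.shape)  # (4, 2)
print(classes)        # [3 9]
```

With Keras, the remapped indices could instead be passed to to_categorical together with num_classes=2. The train and test labels should share one mapping, so it is safer to derive classes once and reuse it for both splits.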
A: No answers yet