Incompatible shapes when training Keras with custom loss function

Asked by: this_is_david Asked: 11/17/2023 Last edited by: toyota Supra, this_is_david Updated: 11/17/2023 Views: 23

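Q:

When I run the code below, I get an "Incompatible shapes" error from Keras. I have seen several similar questions about custom loss functions, but none of them involve incompatible shapes. Is the problem caused by my custom loss itself, or by something deeper inside Keras?

tensorflow==2.13.0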

import numpy as np
import pandas as pd
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

N = 1000
df = pd.DataFrame({
    'Feature1': np.random.normal(loc=0, scale=1, size=N),
    'Feature2': np.random.normal(loc=1, scale=2, size=N),
    'Label': np.random.choice([0, 1], size=N)
})

df_train = df.sample(frac = 0.80, random_state = 42)
df_test = df[~df.index.isin(df_train.index)]
print(f"df_train.shape = {df_train.shape}")
print(f"df_test.shape = {df_test.shape}")

X_train, y_train = df_train[['Feature1', 'Feature2']], df_train['Label']
X_test, y_test = df_test[['Feature1', 'Feature2']], df_test['Label']

def my_loss(data, y_pred):
    y_true = data[:, 0]
    amount = data[:, 1]
    amount_true = amount * y_true
    amount_pred = amount * y_pred
    error = amount_pred - amount_true
    return sum(error)

y_train_plus_amt = np.append(y_train.values.reshape(-1, 1),
    X_train['Feature1'].values.reshape(-1, 1), axis = 1)

M = Sequential()
M.add(Dense(16, input_shape=(X_train.shape[1],), activation = 'relu'))
M.compile(optimizer='adam', loss = my_loss, run_eagerly = True)
M.fit(X_train, y_train_plus_amt, epochs=10, batch_size=64)


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/venv/lib/python3.9/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "<stdin>", line 5, in my_loss
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __wrapped__Mul_device_/job:localhost/replica:0/task:0/device:CPU:0}} Incompatible shapes: [64] vs. [64,16] [Op:Mul] name: 
Tags: python, tensorflow, machine-learning, keras, deep-learning


A:

0 votes BlueOyster 11/17/2023 #1

This part of your loss function

amount_pred = amount * y_pred

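is trying to multiply tensors of shape (64,) and (64, 16), as the traceback reports ([64] vs. [64,16]). That is not possible.

Note that `*` is element-wise multiplication with broadcasting, not matrix multiplication (which would instead require the two matrices to have sizes (m, n) and (n, q) for some m, n, q). Broadcasting compares shapes starting from the last dimension, so the 64 entries of amount are matched against the 16 columns of y_pred. Those sizes differ and neither is 1, so the product is simply not defined for these shapes. The 16 comes from the last layer of your model, Dense(16), which produces 16 outputs per sample.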
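
The shape rule can be checked without Keras. The following is a minimal sketch in NumPy, which follows the same broadcasting rules as TensorFlow for element-wise multiplication. The helper name try_multiply is hypothetical and exists only for this illustration; the shapes are the ones reported in the traceback.

```python
import numpy as np

def try_multiply(shape_a, shape_b):
    """Return the shape of a * b, or None if the shapes cannot be broadcast."""
    a = np.ones(shape_a)
    b = np.ones(shape_b)
    try:
        return (a * b).shape
    except ValueError:
        # NumPy raises ValueError when the operands cannot be broadcast together
        return None

# Shapes from the traceback: `amount` is (64,), `y_pred` is (64, 16).
# Trailing dimensions 64 and 16 differ, so this is incompatible.
print(try_multiply((64,), (64, 16)))

# With an explicit trailing axis of size 1, `amount` broadcasts across the 16 columns.
print(try_multiply((64, 1), (64, 16)))
```

The second call succeeds, but it only silences the error; it does not make a 16-column prediction meaningful for a single binary label.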

Comments

0 votes this_is_david 11/17/2023
Thanks. I simply forgot to define my output layer: M.add(Dense(1, activation = 'sigmoid'))
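
One thing that may be worth checking after adding the Dense(1) output layer: the predictions then have shape (batch, 1) while amount still has shape (batch,), and multiplying those broadcasts to (batch, batch) without raising any error. The sketch below uses NumPy as a stand-in to show the effect and one possible fix, flattening the predictions first. The name my_loss_np and the sample values are assumptions made for illustration; inside the real Keras loss the analogous step would be tf.reshape(y_pred, [-1]).

```python
import numpy as np

def my_loss_np(data, y_pred):
    """NumPy stand-in for the loss: data[:, 0] is the label, data[:, 1] the amount."""
    y_true = data[:, 0]
    amount = data[:, 1]
    y_pred = y_pred.reshape(-1)          # (batch, 1) -> (batch,)
    error = amount * y_pred - amount * y_true
    return error.sum()

batch = 64
labels = np.ones(batch)                  # illustrative labels, all 1
amounts = np.ones(batch)                 # illustrative amounts, all 1
data = np.column_stack([labels, amounts])
y_pred = np.full((batch, 1), 0.5)        # shape a Dense(1) layer would produce

# Without flattening, (64,) * (64, 1) silently broadcasts to a 64 x 64 matrix
print((amounts * y_pred).shape)

# With flattening, the product stays one value per sample
print((amounts * y_pred.reshape(-1)).shape)
print(my_loss_np(data, y_pred))
```

Flattening is only one option; slicing y_pred[:, 0] or keeping amount and y_true as (batch, 1) columns would give the same per-sample result.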