High accuracy on both training and validation sets, but extremely poor classification report

Question:

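I am developing a new computer vision neural network and recently tried training it for the first time. I was very happy with the training results (~99% accuracy on the training set and ~94% on the validation set), but when I tried to predict labels for the test set, I got a terrible classification report: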

              precision    recall  f1-score   support

           0       0.02      0.01      0.02        71
           1       0.29      0.29      0.29       483
           2       0.19      0.17      0.18       273
           3       1.00      0.00      0.00         6
           4       0.24      0.25      0.25       401
           5       0.05      0.07      0.06        57
           6       0.00      0.00      0.00        14
           7       0.17      0.19      0.18       253
           8       0.00      0.00      0.00         8

    accuracy                           0.22      1566    
   macro avg       0.22      0.11      0.11      1566 
weighted avg       0.22      0.22      0.22      1566

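I also tried generating a classification report for the training set. Since the network was trained on that data and reached very high accuracy, I expected good results there. I did not get them: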

              precision    recall  f1-score   support

           0       0.03      0.03      0.03       504
           1       0.32      0.32      0.32      3377
           2       0.17      0.16      0.17      1911
           3       0.00      0.00      0.00        44
           4       0.26      0.26      0.26      2808
           5       0.05      0.06      0.06       397
           6       0.01      0.01      0.01        98
           7       0.17      0.19      0.18      1768
           8       0.00      0.00      0.00        56

    accuracy                           0.23     10963
   macro avg       0.11      0.11      0.11     10963
weighted avg       0.23      0.23      0.23     10963
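
One thing worth ruling out is a mismatch between the order of the labels and the order of the predictions. The sketch below is a simulation, not the original pipeline: it takes the class supports from the training-set report, assumes a hypothetical perfect classifier whose predictions come back in shuffled order, and compares them against labels kept in directory order (as `generator.classes` is). All variable names are illustrative. The posted code sets `shuffle=False` for the test generator and does not show how the training-set report was produced, so this is only a possible explanation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Class supports copied from the training-set report above
supports = [504, 3377, 1911, 44, 2808, 397, 98, 1768, 56]

# Labels in directory order, the way `generator.classes` stores them
true_labels = np.repeat(np.arange(len(supports)), supports)

# Hypothetical perfect classifier whose outputs arrive in shuffled order
shuffled_order = rng.permutation(len(true_labels))
predictions = true_labels[shuffled_order]

# Correct pairing: compare each prediction with the label of the same sample
aligned_accuracy = float(np.mean(predictions == true_labels[shuffled_order]))

# Wrong pairing: compare shuffled predictions with directory-ordered labels
misaligned_accuracy = float(np.mean(predictions == true_labels))

# Accuracy expected under random pairing: sum of squared class frequencies
frequencies = np.array(supports) / len(true_labels)
chance_accuracy = float(np.sum(frequencies ** 2))

print(f"aligned:    {aligned_accuracy:.2f}")
print(f"misaligned: {misaligned_accuracy:.2f}")
print(f"chance:     {chance_accuracy:.2f}")
```

With these supports the random-pairing accuracy works out to roughly 0.22, which is close to the 0.23 in the report, and the per-class precision and recall in the report are also close to each class's share of the data. That pattern is consistent with misaligned labels, though it does not prove it.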

Here is my code:

import tensorflow as tf
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Flatten, Dropout
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import classification_report
import numpy as np
from keras.optimizers import Adam

# Defining paths to folders with train, test, and validation data
train_data = 'Train'
validation_data = 'Validation'
test_data = 'Test'

# Setting image width and height
img_width, img_height = 224, 224

# Creating ImageDataGenerator for training data (no augmentation options are set)
train_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
    train_data,
    target_size=(img_width, img_height),
    batch_size=32,
    class_mode='categorical',
    shuffle=True
)
# Creating ImageDataGenerator for validation data
validation_datagen = ImageDataGenerator()
validation_generator = validation_datagen.flow_from_directory(
    validation_data,
    target_size=(img_width, img_height),
    batch_size=32,
    class_mode='categorical',
    shuffle=True
)

test_datagen = ImageDataGenerator()
test_generator = test_datagen.flow_from_directory(
    test_data,
    target_size=(img_width, img_height),
    batch_size=32,
    class_mode='categorical',
    shuffle=False
)

# Creating a Sequential model
model = Sequential()

# Using a pretrained ResNet50 model with weights from 'imagenet'
pretrained_model = tf.keras.applications.resnet.ResNet50(
                    include_top=False,
                    weights='imagenet',
                    input_shape=(img_width, img_height, 3),
                    pooling='avg',
                    classes=9
                    )

#Freezing the layers of the pretrained model so they won't be trained
for layer in pretrained_model.layers:
    layer.trainable = False

# Adding the pretrained ResNet50 model to our sequential model
model.add(pretrained_model)

# Flattening the output of the pretrained model
model.add(Flatten())

# Adding two dense layers with 512 units each, ReLU activation, and dropout
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))

# Adding the output layer with 9 units (one for each class) and softmax activation
model.add(Dense(9, activation='softmax'))

custom_adam = Adam(learning_rate=0.001)
# Compiling the model with the Adam optimizer and categorical_crossentropy loss
model.compile(optimizer=custom_adam, loss='categorical_crossentropy', metrics=['accuracy'])

# Setting up callbacks for early stopping and model checkpointing
early_stopping = EarlyStopping(monitor='val_accuracy', patience=10)
model_checkpoint = ModelCheckpoint('weightsResNet50test.hdf5', monitor='val_loss', save_best_only=True)

# Training the model using the ImageDataGenerators and saving the training history
history = model.fit(train_generator, validation_data=validation_generator, epochs=1000)

# Plotting the training and validation accuracy over epochs
fig1 = plt.gcf()
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.axis(ymin=0.4, ymax=1)
plt.grid()
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epochs')
plt.legend(['train', 'validation'])
plt.show()

predictions = model.predict(test_generator)
test_labels = test_generator.classes
class_labels = np.argmax(predictions, axis=1)
report = classification_report(test_labels, class_labels, zero_division=1)
print(report)
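
A minimal sketch of one way to build a report that does not depend on the generator's `shuffle` setting: read the true labels from the same batches that are fed to the model, instead of from `generator.classes`. The helper name `collect_labels_and_predictions` and its parameters are hypothetical and not part of the code above. It assumes `batches` supports `len()` and integer indexing and yields `(images, one_hot_labels)` pairs, which is how a `flow_from_directory` iterator with `class_mode='categorical'` behaves.

```python
import numpy as np


def collect_labels_and_predictions(batches, predict_fn):
    """Return (true_ids, predicted_ids) gathered batch by batch.

    `batches` is an indexable sequence of (images, one_hot_labels) pairs,
    and `predict_fn` maps an image batch to per-class scores.
    Both names are illustrative assumptions.
    """
    true_ids, predicted_ids = [], []
    for i in range(len(batches)):
        images, one_hot = batches[i]
        scores = predict_fn(images)
        # Labels and scores come from the same batch, so they stay paired
        true_ids.append(np.argmax(one_hot, axis=1))
        predicted_ids.append(np.argmax(scores, axis=1))
    return np.concatenate(true_ids), np.concatenate(predicted_ids)
```

Assuming `model` and `train_generator` from the code above, the call would look like `y_true, y_pred = collect_labels_and_predictions(train_generator, model.predict_on_batch)`, and the two arrays can then be passed to `classification_report(y_true, y_pred)`.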

Python, machine learning, Keras, computer vision
