CUDA_OUT_OF_MEMORY in PyTorch head2head model

Asked by: Mayank Tiwari  Asked: 3/5/2021  Updated: 3/5/2021  Views: 55

Q:

I am running the head2head model from its GitHub repository. When I run the code with the following command:

./scripts/train/train_on_target.sh Obama head2headDataset

The contents of train_on_target.sh are:

target_name=$1
dataset_name=$2

python train.py --checkpoints_dir checkpoints/$dataset_name \
                --target_name $target_name \
                --name head2head_$target_name \
                --dataroot datasets/$dataset_name/dataset \
                --serial_batches

I then get the following error:

Traceback (most recent call last):
  File "train.py", line 108, in <module>
    flow_ref, conf_ref, t_scales, n_frames_D)
  File "/home/nitin/head2head/util/util.py", line 48, in get_skipped_flows
    flow_ref_skipped[s], conf_ref_skipped[s] = flowNet(real_B[s][:,1:], real_B[s][:,:-1])
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/head2head/models/flownet.py", line 38, in forward
    flow, conf = self.compute_flow_and_conf(input_A, input_B)
  File "/home/nitin/head2head/models/flownet.py", line 55, in compute_flow_and_conf
    flow1 = self.flowNet(data1)
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/head2head/models/flownet2_pytorch/models.py", line 156, in forward
    flownetfusion_flow = self.flownetfusion(concat3)
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/head2head/models/flownet2_pytorch/networks/FlowNetFusion.py", line 62, in forward
    concat0 = torch.cat((out_conv0,out_deconv0,flow1_up),1)
RuntimeError: CUDA out of memory. Tried to allocate 82.00 MiB (GPU 0; 5.80 GiB total capacity; 4.77 GiB already allocated; 73.56 MiB free; 4.88 GiB reserved in total by PyTorch)

I have checked the batch size in options/base_options.py, and it is already set to 1. How can I resolve this error? My system has an NVIDIA GTX 1660 Super GPU with 6 GB of memory.

python pytorch gpu deepfake

A:

1 upvote  Leonardo Kanashiro Felizardo  3/5/2021  #1

Data management:

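You can try training on a smaller subset of the dataset to check whether you are hitting a hardware limit. Also, if it is an image dataset, you can shrink the images by lowering their resolution (pixel dimensions).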
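
To make the effect of image size concrete, here is a rough back-of-the-envelope sketch of how much memory a single float32 feature map takes at two resolutions. The helper name `feature_map_mib` and the shapes (64 channels, 256x256) are assumptions chosen for illustration, not values taken from head2head.

```python
def feature_map_mib(batch, channels, height, width, bytes_per_elem=4):
    """Approximate size in MiB of one dense tensor of shape (batch, channels, height, width)."""
    return batch * channels * height * width * bytes_per_elem / (1024 ** 2)

# Hypothetical shapes, chosen only to show the scaling.
full = feature_map_mib(1, 64, 256, 256)  # feature map at the original resolution
half = feature_map_mib(1, 64, 128, 128)  # same map after halving height and width

print(f"full: {full:.1f} MiB, half: {half:.1f} MiB, ratio: {full / half:.0f}x")
# prints: full: 16.0 MiB, half: 4.0 MiB, ratio: 4x
```

Halving both sides cuts each activation to a quarter of its size, so lowering the resolution can help even when the batch size is already 1.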

Model parameter management:

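Another approach is to reduce the number of parameters in the model. A first suggestion is to change the size of the dense layers, and then the other neural network hyperparameters.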
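
As a rough sketch of what shrinking a dense layer saves, the snippet below counts the parameters of one fully connected layer and converts that to float32 memory. The helper names `dense_params` and `params_mib` and the layer sizes (4096 and 1024) are hypothetical, not taken from the head2head code.

```python
def dense_params(in_features, out_features, bias=True):
    """Number of trainable parameters in one fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)

def params_mib(n_params, bytes_per_param=4):
    """Memory in MiB needed to store n_params values (float32 by default)."""
    return n_params * bytes_per_param / (1024 ** 2)

# Hypothetical layer sizes, for illustration only.
wide = dense_params(4096, 4096)    # 4096 inputs -> 4096 outputs
narrow = dense_params(4096, 1024)  # same inputs, a quarter of the outputs

print(f"wide: {wide} params, {params_mib(wide):.1f} MiB")
print(f"narrow: {narrow} params, {params_mib(narrow):.1f} MiB")
```

These figures cover the weights only. During training, gradients and optimizer state take additional memory of a similar order per parameter, so the real saving is usually larger than the weight count alone suggests.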