In the field of deep learning quantitative investment, how can we standardize the input features?

Asked by Yaohua Guo · Asked 11/10/2023 · Modified 11/10/2023 · Viewed 18 times

Q:

When using a deep learning model for quantitative investment, the input features can have very different scales across dimensions. How should we standardize these features so that training is more stable and we avoid problems such as exploding gradients or nan/inf values during computation?

For example, our input features are [low, open, high, close, quantity, volume]. The quantity and volume are usually several orders of magnitude larger than the other features. If we feed them into the model directly as a vector, we risk value overflow. What is a scientifically sound way to standardize the input features? We train the model one sample at a time, not in batches.

If one input sample is 20 days of historical data, it is a 2-D tensor of shape (20, 6). We do not train in batches; we feed the model one sample at a time. If we use Z-Score to standardize each feature dimension, will it erase the differences between samples? For example, after normalizing the volume of two samples, the values are close, even though before normalization they differ greatly. Is there a better way to standardize the features?
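To make the concern concrete, here is a minimal sketch (my own illustration, with made-up numbers) of per-feature Z-Score normalization applied to a single (20, 6) sample:

import torch

# One made-up sample: 20 days x [low, open, high, close, quantity, volume].
# Prices are on the order of hundreds, quantity/volume on the order of millions.
sample = torch.rand(20, 6)
sample[:, :4] = sample[:, :4] * 10 + 100    # price-like columns
sample[:, 4:] = sample[:, 4:] * 1e6 + 5e6   # quantity/volume columns

# Per-sample Z-Score: normalize each feature column over the 20 time steps.
mean = sample.mean(dim=0, keepdim=True)     # shape (1, 6)
std = sample.std(dim=0, keepdim=True)       # shape (1, 6)
normalized = (sample - mean) / (std + 1e-8) # eps avoids division by zero

print(normalized.mean(dim=0))  # ~0 for every feature
print(normalized.std(dim=0))   # ~1 for every feature

Because each sample is normalized against its own statistics, two samples whose raw volumes differ by orders of magnitude look almost identical after this step, which is exactly the loss of cross-sample information I am worried about.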

Thanks!

I use BatchNorm2d for normalization.

import torch
import torch.nn as nn
import torch.nn.functional as F

# PositionalEncodingLayer is a custom positional-encoding module defined elsewhere in my code.


class TransformerNet(nn.Module):
    def __init__(
        self, input_size, d_model, output_size, activation: str = "leaky_relu"
    ):
        super(TransformerNet, self).__init__()
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.input_size = input_size
        self.output_size = output_size
        self.activation = activation
        self.negative_slope = 0.01

        # BatchNorm Layer
        self.batch_norm = torch.nn.BatchNorm2d(
            num_features=self.input_size, affine=False
        )
        self.encoder_layer = nn.Linear(self.input_size, d_model)
        self.pos_encoding = PositionalEncodingLayer(d_model)

        # Transformer Encoder Layer
        self.transformer_encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=8,
            dim_feedforward=4 * d_model,
            dropout=0.1,
            activation="gelu",
            layer_norm_eps=1e-05,
            batch_first=True,
        )

        self.fc_layer = nn.Linear(d_model, output_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _shape = x.shape  # (batch_size, seq_len, feature_dim)
        x = x.contiguous().view(-1, _shape[-1])  # (batch_size * seq_len, feature_dim)
        x = x.unsqueeze(-1).unsqueeze(-1)  # (batch_size * seq_len, feature_dim, 1, 1)
        
        # ------------------------------------------------------------------
        # FIXME: Sometimes this batch_norm layer outputs a tensor with nan
        # values, even though its input x contains no nan/inf values.
        # I do not know why this happens.
        x = (
            self.batch_norm(x).squeeze(-1).squeeze(-1)
        )  # shape: (batch_size * seq_len, feature_dim)
        # ------------------------------------------------------------------
        
        x = x.contiguous().view(_shape)  # (batch_size, seq_len, feature_dim)
        x = self.encoder_layer(x)  # (batch_size, seq_len, d_model)
        x = self.pos_encoding(x)  # (batch_size, seq_len, d_model)
        seq_len = x.shape[1]
        attn_mask = nn.Transformer.generate_square_subsequent_mask(
            seq_len, self.device
        ).bool()
        # Transformer Encoder Layer
        x = self.transformer_encoder_layer(
            x, src_mask=attn_mask, is_causal=True
        )  # (batch_size, seq_len, d_model)
        x = x[:, -1, :]  # (batch_size, d_model)

        # FC layer
        if self.activation == "relu":
            output = F.relu(self.fc_layer(x))
        elif self.activation == "leaky_relu":
            output = F.leaky_relu(self.fc_layer(x), self.negative_slope)
        else:
            raise ValueError("Unknown activation function " + str(self.activation))

        return output
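To narrow down where the nan values first appear in the FIXME block above, a small check like the following can be dropped around the suspicious call (this is only a debugging sketch; the helper name assert_finite is something I made up for illustration):

import torch

def assert_finite(name: str, tensor: torch.Tensor) -> None:
    """Raise immediately if a tensor contains nan or inf values."""
    if not torch.isfinite(tensor).all():
        bad = (~torch.isfinite(tensor)).sum().item()
        raise RuntimeError(f"{name} contains {bad} non-finite values")

# Example usage inside forward(), around the suspicious layer:
#   assert_finite("batch_norm input", x)
#   x = self.batch_norm(x).squeeze(-1).squeeze(-1)
#   assert_finite("batch_norm output", x)

# PyTorch's anomaly detection can also help localize where non-finite
# gradients first appear during the backward pass:
#   torch.autograd.set_detect_anomaly(True)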
deep-learning pytorch quantitative-finance

Comments


A: No answers yet