Asked by: Yaohua Guo   Asked: 11/10/2023   Updated: 11/10/2023   Views: 18
In the field of deep learning quantitative investment, how can we standardize the input features?
Q:
When using a deep learning model for quantitative investment, the input features may have very different scales across dimensions. How should we standardize these features so that training is more stable and we avoid problems such as exploding gradients or nan/inf values during computation?
For example, our input features are [low price, open price, high price, close price, quantity, trading volume]. Quantity and trading volume are usually much larger than the other features. If we feed them into the model directly as a vector, we risk numeric overflow. What is a principled way to standardize the input features? We train the model one sample at a time rather than in batches.
If an input sample is 20 days of historical data, it is a 2-D tensor of shape (20, 6). We do not train in batches; we feed the model one sample at a time. If we use a Z-score to standardize each feature dimension, will it reduce the differences between samples? For example, after normalizing the volume of two samples, the values end up close together even though the raw values differed greatly. Is there a better way to standardize the features?
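To make the question concrete, this is the kind of per-sample, per-feature Z-score I am asking about (a minimal sketch; the 20-day window and the six feature columns follow the example above, and the magnitudes are only illustrative):

import torch

# one sample: 20 days x 6 features [low, open, high, close, quantity, volume]
sample = torch.rand(20, 6) * torch.tensor([10.0, 10.0, 10.0, 10.0, 1e6, 1e8])

# Z-score each feature column within this single sample
mean = sample.mean(dim=0, keepdim=True)        # (1, 6)
std = sample.std(dim=0, keepdim=True) + 1e-8   # epsilon avoids division by zero
normalized = (sample - mean) / std             # (20, 6), roughly zero mean / unit std per column

After this step, the quantity/volume columns of two very different samples can end up with similar values, which is exactly the loss of information between samples that I am worried about.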
Thanks!
I use BatchNorm2d for normalization.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TransformerNet(nn.Module):
    def __init__(
        self, input_size, d_model, output_size, activation: str = "leaky_relu"
    ):
        super(TransformerNet, self).__init__()
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.input_size = input_size
        self.output_size = output_size
        self.activation = activation
        self.negative_slope = 0.01
        # BatchNorm layer (affine=False: normalize only, no learnable scale/shift)
        self.batch_norm = torch.nn.BatchNorm2d(
            num_features=self.input_size, affine=False
        )
        self.encoder_layer = nn.Linear(self.input_size, d_model)
        # PositionalEncodingLayer is defined elsewhere in my code
        self.pos_encoding = PositionalEncodingLayer(d_model)
        # Transformer encoder layer
        self.transformer_encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=8,
            dim_feedforward=4 * d_model,
            dropout=0.1,
            activation="gelu",
            layer_norm_eps=1e-05,
            batch_first=True,
        )
        self.fc_layer = nn.Linear(d_model, output_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _shape = x.shape  # (batch_size, seq_len, feature_dim)
        x = x.contiguous().view(-1, _shape[-1])  # (batch_size * seq_len, feature_dim)
        x = x.unsqueeze(-1).unsqueeze(-1)  # (batch_size * seq_len, feature_dim, 1, 1)
        # ---------------------
        # FIXME: Sometimes this batch_norm layer outputs a tensor with nan values,
        # even though the input x of the layer does not contain nan/inf values.
        # I do not know why this happens.
        x = (
            self.batch_norm(x).squeeze(-1).squeeze(-1)
        )  # (batch_size * seq_len, feature_dim)
        # ---------------------
        x = x.contiguous().view(_shape)  # (batch_size, seq_len, feature_dim)
        x = self.encoder_layer(x)  # (batch_size, seq_len, d_model)
        x = self.pos_encoding(x)  # (batch_size, seq_len, d_model)
        seq_len = x.shape[1]
        # causal mask so each time step only attends to earlier steps
        attn_mask = nn.Transformer.generate_square_subsequent_mask(
            seq_len, self.device
        ).bool()
        # Transformer encoder layer
        x = self.transformer_encoder_layer(
            x, src_mask=attn_mask, is_causal=True
        )  # (batch_size, seq_len, d_model)
        x = x[:, -1, :]  # keep the last time step: (batch_size, d_model)
        # FC layer
        if self.activation == "relu":
            output = F.relu(self.fc_layer(x))
        elif self.activation == "leaky_relu":
            output = F.leaky_relu(self.fc_layer(x), self.negative_slope)
        else:
            raise ValueError("Unknown activation function " + str(self.activation))
        return output
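For completeness, this is roughly how I build and call the model with a single sample (a minimal sketch; input_size=6 matches the six features above, while d_model=64 and output_size=1 are placeholder values, not my real hyperparameters):

model = TransformerNet(input_size=6, d_model=64, output_size=1)
model = model.to(model.device)

x = torch.rand(1, 20, 6, device=model.device)  # one sample: (batch_size=1, seq_len=20, feature_dim=6)
y = model(x)                                   # (1, output_size)
print(y.shape)                                 # torch.Size([1, 1])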
A: No answers yet