Asked by: markedperformance  Asked: 10/22/2023  Last edited by: Rawley Fowler markedperformance  Updated: 10/25/2023  Views: 48
How to create documents with a nested schema in Elasticsearch using the Perl client
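Q:
In my case, I am pulling data from a SQL database and trying to create those records in Elasticsearch with the Search::Elasticsearch module, where the Elasticsearch schema is full of nested objects. For example, I query the SQL database, merge the information to build the document format, and then try to write it to Elasticsearch.
This works: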
$bulk->create(
    {
        id     => 1,
        source => {
            applications => [
                { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen },
                { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 },
            ],
        },
    },
    {
        id     => 2,
        source => {
            applications => [
                { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen },
                { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 },
            ],
        },
    },
);
This does not work:
my $query = "{ id => 1, source => { applications => [ { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }, { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 }]}},{ id => 2, source => { applications => [ { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }, { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 }]}}";
$bulk->create($query);
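It looks like the information needs to be a hash or a list of hashes rather than a string. I am not sure whether there is a way to convert the string, or whether there is a better approach.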
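The difference between the two cases can be seen in a minimal sketch. The variable names and values below are illustrative assumptions, not taken from the code above: a string that merely looks like Perl data is still a plain scalar, while the literal is a hash reference with keys a client can read.

```perl
use strict;
use warnings;

# Illustrative values only; these names are not from the original code.
my $as_string = "{ id => 1, source => { applications => [] } }";
my $as_hash   = { id => 1, source => { applications => [] } };

# ref() returns an empty string for a plain scalar
# and 'HASH' for a hash reference.
printf "string: ref is '%s'\n", ref $as_string;
printf "hash:   ref is '%s'\n", ref $as_hash;

# Only the hash reference has keys that can be looked up.
print "id from hash: $as_hash->{id}\n";
```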
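If the schema were flat, I could push the information straight into an array and write it out. But because it is nested, each device can have multiple applications that have to be looped over.
Here is a sketch of what I am trying to do. It is not finished (for example, the hard-coded "13"), but it shows the main problem I am trying to solve: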
my $e = Search::Elasticsearch->new( nodes => '192.168.1.11:9200' );
my $index_exists = $e->indices->exists( index => 'sql_software' );
if ($index_exists) { $e->indices->delete( index => 'sql_software' ); }
$e->indices->create( index => 'sql_software' );
my $bulk = $e->bulk_helper( index => 'sql_software');
my $dbh = DBI->connect("dbi:ODBC:Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1;Server=192.168.1.11;Database=Software;UID=sa;PWD=xxxx");
my $sth = $dbh->prepare("SELECT FD.DeviceId AS 'idlist' FROM Software.Device FD");
$sth->execute();
my $list = $sth->fetchall_arrayref({});
foreach my $res (@{ $list }) {
    $sth = $dbh->prepare("SELECT FCR.DeviceId, RT.ResourceTypeName AS 'source', DC.CpeDesc AS 'cpe', FCR.InstallDate AS 'firstSeen', FCR.LDate AS 'lastSeen' FROM [Software].[DimCpe] DC JOIN [Software].[CpeResults] FCR ON FCR.CpeId = DC.CpeId JOIN Software.DimResource DR ON DR.ResourceId = FCR.ResourceId JOIN Software.ResourceType RT ON RT.ResourceTypeId = DR.ResourceTypeId WHERE FCR.DeviceId = ? order by FCR.DeviceId");
    $sth->execute($res->{idlist}) or die $dbh->errstr;
    my $saveval = $res->{idlist};
    my $savesoftware = "";
    my $i = 0;
    while ( my @row = $sth->fetchrow_array ) {
        my $deviceid  = $row[0];
        my $source    = $row[1];
        my $cpe       = $row[2];
        my $firstseen = $row[3];
        my $lastseen  = $row[4];
        if ($i == 0) {
            $savesoftware = "{ id => $deviceid, source => { applications => [ ";
        } elsif ($i == 13) {
            $savesoftware = $savesoftware . "{ source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }]}}";
        } else {
            $savesoftware = $savesoftware . "{ source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen },";
        }
        $i++;
    }
    $bulk->create( $savesoftware );
}
$sth->finish;
$bulk->flush;
$dbh->disconnect;
A:
2 upvotes
Rawley Fowler
10/22/2023
#1
You can convert the string into a hash with eval:
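# q{...} is single-quoted, so the variables are resolved when the code is
# evaluated instead of being interpolated into the string beforehand.
# The parentheses and the array keep both documents, not only the last one.
my @docs = eval q{(
    { id => 1, source => { applications => [ { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }, { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 } ] } },
    { id => 2, source => { applications => [ { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }, { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 } ] } },
)};
die $@ if $@;    # report a syntax error in the evaluated string
$bulk->create(@docs);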
But I would probably just refactor your code to work with hashes instead of building strings. The more idiomatic Perl way looks like this:
my %item;
$item{applications} = [];
my $i = 0;
while ( my @row = $sth->fetchrow_array ) {
    my ( $deviceid, $source, $cpe, $firstseen, $lastseen ) = @row;
    # Record the device id once, from the first row.
    $item{deviceid} = $deviceid if $i == 0;
    # Every row, including the first, describes one application.
    push @{ $item{applications} }, { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen };
    $i++;
}
$bulk->create(\%item);
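To check the shape of the document without a database or an Elasticsearch node, the same idea can be tried on mock rows. This is only a sketch under stated assumptions: the rows are made-up data, build_doc is a hypothetical helper (not part of Search::Elasticsearch), and the id/source envelope is copied from the working create call in the question.

```perl
use strict;
use warnings;
use JSON::PP;

# Mock rows standing in for $sth->fetchrow_array (hypothetical data):
# DeviceId, source, cpe, firstSeen, lastSeen
my @rows = (
    [ 1, 'scanner-a', 'cpe:/a:vendor:app_one:1.0', '2023-01-01', '2023-10-01' ],
    [ 1, 'scanner-b', 'cpe:/a:vendor:app_two:2.3', '2023-02-15', '2023-10-20' ],
);

# build_doc is a hypothetical helper, not part of Search::Elasticsearch.
sub build_doc {
    my ($rows) = @_;
    my %doc;
    for my $row (@$rows) {
        my ( $deviceid, $source, $cpe, $firstseen, $lastseen ) = @$row;
        $doc{id} //= $deviceid;    # set once, from the first row
        # Autovivification creates source and applications on first push.
        push @{ $doc{source}{applications} }, {
            source    => $source,
            cpe       => $cpe,
            firstseen => $firstseen,
            lastseen  => $lastseen,
        };
    }
    return \%doc;
}

my $doc = build_doc( \@rows );

# Print the structure as JSON to inspect it before sending it anywhere.
print JSON::PP->new->canonical->pretty->encode($doc);
```

With a bulk helper like the $bulk object in the question, the resulting reference would then be passed as $bulk->create($doc).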
Comments
1 upvote
markedperformance
10/22/2023
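Thanks! The second solution worked with a slight modification. In Elasticsearch, there is a main source key that encapsulates the rest of the data for each document. We also named another field source, so I can see how that would be confusing. Anyway, for anyone looking for the final result, I used the second solution above with these changes:
#$item{applications} = [];
push @{$item{source}{applications}}, { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen };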
0 upvotes
Rawley Fowler
10/25/2023
@markedperformance If you would like to submit an edit with the final result, I would be happy to accept it!
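For readers looking for the overall shape discussed in these comments, here is one possible sketch that produces one document per device. It makes assumptions that differ from the original code: all rows arrive in a single result set instead of one query per device, the rows are made-up data, and group_by_device is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical rows: DeviceId, source, cpe, firstSeen, lastSeen
my @rows = (
    [ 1, 'scanner-a', 'cpe:/a:vendor:app_one:1.0', '2023-01-01', '2023-10-01' ],
    [ 1, 'scanner-b', 'cpe:/a:vendor:app_two:2.3', '2023-02-15', '2023-10-20' ],
    [ 2, 'scanner-a', 'cpe:/a:vendor:app_one:1.0', '2023-03-05', '2023-09-30' ],
);

# group_by_device is a hypothetical helper, not part of any library.
sub group_by_device {
    my ($rows) = @_;
    my ( %by_id, @order );
    for my $row (@$rows) {
        my ( $deviceid, $source, $cpe, $firstseen, $lastseen ) = @$row;
        if ( !exists $by_id{$deviceid} ) {
            push @order, $deviceid;    # remember first-seen order
            $by_id{$deviceid} = { id => $deviceid, source => { applications => [] } };
        }
        push @{ $by_id{$deviceid}{source}{applications} }, {
            source    => $source,
            cpe       => $cpe,
            firstseen => $firstseen,
            lastseen  => $lastseen,
        };
    }
    return map { $by_id{$_} } @order;
}

my @docs = group_by_device( \@rows );

printf "device %s: %d application(s)\n",
    $_->{id}, scalar @{ $_->{source}{applications} }
    for @docs;
```

Since the working call in the question passes two hash references to create at once, the whole list could presumably be handed over as $bulk->create(@docs), followed by $bulk->flush.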