How to create documents with a nested schema in Elasticsearch using the Perl client

Asked by: markedperformance  Asked: 10/22/2023  Last edited by: Rawley Fowler, markedperformance  Updated: 10/25/2023  Views: 48

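Q:

I am pulling data from a SQL database and trying to create those records in Elasticsearch with the Search::Elasticsearch module, where the Elasticsearch schema is full of nested objects. For each device I query the SQL database, combine the rows into the document format, and try to write the result to Elasticsearch.

This works: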

$bulk->create({ id => 1, source => { applications => [ { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }, { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 }]}},{ id => 2, source => { applications => [ { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }, { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 }]}});

This does not work:

my $query = "{ id => 1, source => { applications => [ { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }, { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 }]}},{ id => 2, source => { applications => [ { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }, { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 }]}}";

$bulk->create($query);

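It looks like the data needs to be a hash or a list of hashes rather than a string. I am not sure whether there is a way to convert the string, or whether there is a better approach.

If the schema were flat, I could push the data straight into an array and write it out. Because it is nested, though, each device can have several applications, and I have to loop over them.

Here is a sketch of what I am trying to do. It is unfinished (the hard-coded "13", for example), but it shows the main problem: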

my $e = Search::Elasticsearch->new( nodes => '192.168.1.11:9200' );
my $index_exists = $e->indices->exists( index => 'sql_software' );
if ($index_exists) { $e->indices->delete( index => 'sql_software' ); }
$e->indices->create( index => 'sql_software' );
my $bulk = $e->bulk_helper( index => 'sql_software');

my $dbh = DBI->connect("dbi:ODBC:Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1;Server=192.168.1.11;Database=Software;UID=sa;PWD=xxxx");

        my $sth = $dbh->prepare("SELECT FD.DeviceId AS 'idlist' FROM Software.Device FD");
        $sth->execute();
        my $list = $sth->fetchall_arrayref({});

        foreach my $res (@{ $list }) {
                $sth = $dbh->prepare("SELECT FCR.DeviceId, RT.ResourceTypeName AS 'source', DC.CpeDesc AS 'cpe', FCR.InstallDate AS 'firstSeen', FCR.LDate AS 'lastSeen' FROM [Software].[DimCpe] DC JOIN [Software].[CpeResults] FCR ON FCR.CpeId = DC.CpeId JOIN Software.DimResource DR ON DR.ResourceId = FCR.ResourceId JOIN Software.ResourceType RT ON RT.ResourceTypeId = DR.ResourceTypeId WHERE FCR.DeviceId = ? order by FCR.DeviceId");

                $sth->execute($res->{idlist}) or die $dbh->errstr;

                my $saveval = $res->{idlist};

                my $savesoftware="";
                my $i=0;
                while ( my @row = $sth->fetchrow_array ) {

                        my $deviceid = @row[0];
                        my $source = @row[1];
                        my $cpe = @row[2];
                        my $firstseen = @row[3];
                        my $lastseen = @row[4];

                        if ($i==0) {
                                $savesoftware = "{ id => $deviceid, source => { applications => [ ";
                        } elsif ($i==13) {
                                $savesoftware = $savesoftware . "{ source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }]}}";
                        } else {
                                $savesoftware = $savesoftware . "{ source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen },";
                        }
                        $i++;
                }
                $bulk->create( $savesoftware );
        }

$sth->finish;
$bulk->flush;
$dbh->disconnect;
Tags: perl, elasticsearch, nested

A:

2 upvotes  Rawley Fowler  10/22/2023  #1

You can convert the string into a hash with eval:

# q{} does not interpolate, so $source etc. are looked up as variables when eval runs.
# The outer parentheses make eval return both hashrefs as a list.
my @docs = eval q{( { id => 1, source => { applications => [ { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }, { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 }]}},{ id => 2, source => { applications => [ { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen }, { source => $source2, cpe => $cpe2, firstseen => $firstseen2, lastseen => $lastseen2 }]}} )};
die $@ if $@;

$bulk->create(@docs);
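
To illustrate why a string assembled by interpolation is fragile, here is a small sketch. The value 'Acme Tool' is made up; the point is only that text coming out of SQL ends up being parsed as Perl code, whereas a hash built directly never is.

```perl
use strict;
use warnings;

# Made-up value; real data from SQL can contain spaces, quotes, and so on.
my $source = 'Acme Tool';

# Double quotes interpolate the value into the code, so eval is handed
# the text { source => Acme Tool } and cannot evaluate it.
my $from_string = eval "{ source => $source }";
my $eval_error  = $@;

# Building the structure directly never re-parses the value as code.
my $from_hash = { source => $source };

print "eval produced a hash: ", ( defined $from_string ? "yes" : "no" ), "\n";
print "hash value: $from_hash->{source}\n";
```

Quoting every value inside the string would work around this, but at that point building the hash directly is simpler and safer.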

That said, I would probably just refactor your code to build hashes instead of strings. A more idiomatic Perl approach looks like this:

my %item = ( applications => [] );

while ( my @row = $sth->fetchrow_array ) {
    my ( $deviceid, $source, $cpe, $firstseen, $lastseen ) = @row;

    # Every row of this query carries the same device id, so set it once.
    $item{deviceid} //= $deviceid;

    # Each row describes one application, including the first row.
    push @{ $item{applications} }, { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen };
}

$bulk->create(\%item);

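Comments

1 upvote  markedperformance  10/22/2023
Thanks! The second solution worked with slight modifications. In Elasticsearch there is a top-level source key that wraps the rest of each document's data. We also happen to have another field called source, so I can see how that would be confusing. Anyway, for anyone looking for the final result, I used the second approach above with these changes: commented out #$item{applications} = []; and used push @{$item{source}{applications}}, { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen };
0 upvotes  Rawley Fowler  10/25/2023
@markedperformance If you would like to submit an edit with the final result, I would be happy to accept it!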
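
Putting the answer and the comment above together, the final shape might look like the sketch below. It is not the poster's actual code: build_device_doc is a hypothetical helper name, and the sample rows are made up to stand in for what the per-device query returns. It only builds the { id => ..., source => { applications => [...] } } structure that the working $bulk->create call in the question uses.

```perl
use strict;
use warnings;

# Hypothetical helper: turns the rows fetched for one device into the
# { id => ..., source => { applications => [...] } } shape.
sub build_device_doc {
    my ($rows) = @_;    # arrayref of [deviceid, source, cpe, firstseen, lastseen]
    return unless @$rows;

    # The device id is the same on every row, so take it from the first one.
    my %item = ( id => $rows->[0][0], source => { applications => [] } );

    for my $row (@$rows) {
        my ( undef, $source, $cpe, $firstseen, $lastseen ) = @$row;
        push @{ $item{source}{applications} },
            { source => $source, cpe => $cpe, firstseen => $firstseen, lastseen => $lastseen };
    }
    return \%item;
}

# Made-up sample rows standing in for the result of the per-device query.
my $rows = [
    [ 7, 'scanner', 'cpe:/a:example:app_one', '2023-01-01', '2023-10-01' ],
    [ 7, 'agent',   'cpe:/a:example:app_two', '2023-02-01', '2023-10-02' ],
];
my $doc = build_device_doc($rows);

# In the real loop this would be followed by: $bulk->create($doc);
```

Because the helper takes all rows for a device at once, it no longer needs a row counter or the hard-coded "13" to know when the list of applications ends.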