Wildfly/JGroups DNS_Ping discovery mechanism seems to leak threads

Asked by: Marc Dätwyler · Asked: 11/6/2023 · Updated: 11/6/2023 · Views: 39

Q:

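We are currently facing an issue with Wildfly/JGroups clustering in a Kubernetes environment. We have a varying number of Wildfly (30.0.0) nodes that need to communicate with each other and form a cluster to process ArtemisMQ JMS messages. We use dns.DNS_PING for discovery within the cluster and TCP as the main JGroups protocol.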

We use the following Wildfly CLI commands to set up the JGroups cluster:

    echo "Kubernetes interface and bindings"
    /interface=kubernetes:add(nic=eth0)
    /interface=private:add(inet-address="${jboss.bind.address.private:127.0.0.1}")
    /interface=dns:add(site-local-address=true)
    /socket-binding-group=standard-sockets/socket-binding=jgroups-tcp:add(interface=dns, port=7800)
    /socket-binding-group=standard-sockets/socket-binding=jgroups-tcp-fd:add(interface=dns, port=57800)
    /socket-binding-group=standard-sockets/socket-binding=http:write-attribute(name=interface,value=dns)
    /socket-binding-group=standard-sockets/socket-binding=https:write-attribute(name=interface,value=dns)

    echo "JGroups"
    /extension=org.jboss.as.clustering.jgroups:add()
    /subsystem=jgroups:add()
    #/subsystem=jgroups:write-attribute(name=default-stack,value=tcp)

    echo "TCP stack"
    batch
    /subsystem=jgroups/stack=tcp:add()
    #/subsystem=jgroups/stack=tcp:add
    /subsystem=jgroups/stack=tcp/transport=TCP:add(socket-binding=jgroups-tcp)
    /subsystem=jgroups/stack=tcp/protocol=MERGE3:add
    /subsystem=jgroups/stack=tcp/protocol=FD_SOCK:add(socket-binding=jgroups-tcp-fd)
    /subsystem=jgroups/stack=tcp/protocol=VERIFY_SUSPECT:add
    /subsystem=jgroups/stack=tcp/protocol=pbcast.NAKACK2:add
    /subsystem=jgroups/stack=tcp/protocol=UNICAST3:add
    /subsystem=jgroups/stack=tcp/protocol=pbcast.STABLE:add
    /subsystem=jgroups/stack=tcp/protocol=pbcast.GMS:add
    /subsystem=jgroups/stack=tcp/protocol=MFC:add
    /subsystem=jgroups/stack=tcp/protocol=FRAG3:add
    run-batch

    echo "JGroups Channel"
    /subsystem=jgroups/channel=ee:add(stack=tcp)
    /subsystem=jgroups/channel=ee:write-attribute(name=stack,value=tcp)
    #/subsystem=jgroups/channel=ee:write-attribute(name=cluster,value=kubernetes)
    /subsystem=jgroups:write-attribute(name=default-channel,value=ee)

    echo "DNS_PING Protocol"
    /subsystem=jgroups/stack=tcp/protocol=dns.DNS_PING:add(add-index=0,properties={dns_query="_ping._tcp.avaloq-wb-sync-manager-ping.namespace001.svc.cluster.local.",dns_record_type=SRV})

The DNS_PING query points to a Kubernetes service that exposes the nodes we want to have in the cluster.
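
As a rough illustration of what such an SRV query returns, here is a minimal Java sketch that resolves a name through the JDK's built-in JNDI DNS provider and parses each answer of the form `priority weight port target`. The class `SrvLookupSketch` and its methods are hypothetical names for this example and are not part of JGroups or Wildfly. Whether the service actually publishes SRV records depends on how it is defined (typically a headless service with a named port), which is an assumption here.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

/** Hypothetical helper for illustration only; not a JGroups or Wildfly class. */
final class SrvLookupSketch {

    /** One SRV answer in its parsed form. */
    static final class SrvRecord {
        final int priority;
        final int weight;
        final int port;
        final String target;

        SrvRecord(int priority, int weight, int port, String target) {
            this.priority = priority;
            this.weight = weight;
            this.port = port;
            this.target = target;
        }
    }

    /** Parses the textual SRV form "priority weight port target". */
    static SrvRecord parse(String raw) {
        String[] parts = raw.trim().split("\\s+");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an SRV record: " + raw);
        }
        return new SrvRecord(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]),
                parts[3]);
    }

    /** Asks the platform's default DNS servers for the SRV records of dnsQuery. */
    static List<SrvRecord> lookup(String dnsQuery) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        env.put("java.naming.provider.url", "dns:");
        DirContext ctx = new InitialDirContext(env);
        try {
            Attribute srv = ctx.getAttributes(dnsQuery, new String[] {"SRV"}).get("SRV");
            List<SrvRecord> records = new ArrayList<>();
            if (srv != null) {
                NamingEnumeration<?> values = srv.getAll();
                while (values.hasMore()) {
                    records.add(parse(values.next().toString()));
                }
            }
            return records;
        } finally {
            ctx.close();
        }
    }
}
```

Calling `SrvLookupSketch.lookup(...)` with the `dns_query` value from inside a pod would show which targets and ports discovery has to contact; each target that no longer accepts connections is a candidate for the kind of blocked connect attempt described in this question.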

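In our production deployment, we now see a large number of threads created by DNS_PING. We also see one thread blocking the others while it hangs in the "PlainSocketImpl.socketConnect" method. We have set the JGroups sock_conn_timeout to 300 ms, so this kind of wait should not really happen.
Eventually, Wildfly can no longer start any threads (no more OS-level threads can be created). We are still not sure what exactly causes this, but we assume the file descriptor limit may have been reached. At that point we had about 4000 threads, of which roughly 75% were related to DNS_PING.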
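
To check how a thread count like this breaks down, one option is to group thread names by their prefix. The sketch below is a hypothetical helper (`ThreadCensusSketch` is not a Wildfly or JGroups API): it strips the numeric suffix and the `,cluster,node` part from names such as `Timer temp thread-20460,ee,avaloq-wb-sync-manager-0` and counts each family. Its `main` method only sees the JVM it runs in, so for a running server the names would have to be taken from a thread dump instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration only. */
final class ThreadCensusSketch {

    /** Reduces a thread name to its family, e.g. "Timer temp thread-20460,ee,node" -> "Timer temp thread". */
    static String family(String threadName) {
        String head = threadName.split(",", 2)[0]; // drop ",cluster,node" if present
        return head.replaceAll("-\\d+$", "");      // drop the trailing "-<number>"
    }

    /** Counts how many thread names fall into each family. */
    static Map<String, Integer> census(Iterable<String> threadNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : threadNames) {
            counts.merge(family(name), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Collect the names of all live threads in the current JVM.
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        census(names).forEach((name, count) -> System.out.println(count + "\t" + name));
    }
}
```

A family whose count keeps growing between two snapshots is a reasonable first suspect for a leak.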

The hanging thread looks like this:

        {
            "thread-id" => 109424945L,
            "thread-name" => "Timer temp thread-20460,ee,avaloq-wb-sync-manager-0",
            "thread-state" => "RUNNABLE",
            "blocked-time" => -1L,
            "blocked-count" => 1L,
            "waited-time" => -1L,
            "waited-count" => 1L,
            "lock-info" => undefined,
            "lock-name" => undefined,
            "lock-owner-id" => -1L,
            "lock-owner-name" => undefined,
            "stack-trace" => [
                {
                    "file-name" => "PlainSocketImpl.java",
                    "line-number" => -2,
                    "class-name" => "java.net.PlainSocketImpl",
                    "method-name" => "socketConnect",
                    "native-method" => true
                },
                {
                    "file-name" => "AbstractPlainSocketImpl.java",
                    "line-number" => 412,
                    "class-name" => "java.net.AbstractPlainSocketImpl",
                    "method-name" => "doConnect",
                    "native-method" => false
                },
                {
                    "file-name" => "AbstractPlainSocketImpl.java",
                    "line-number" => 255,
                    "class-name" => "java.net.AbstractPlainSocketImpl",
                    "method-name" => "connectToAddress",
                    "native-method" => false
                },
                {
                    "file-name" => "AbstractPlainSocketImpl.java",
                    "line-number" => 237,
                    "class-name" => "java.net.AbstractPlainSocketImpl",
                    "method-name" => "connect",
                    "native-method" => false
                },
                {
                    "file-name" => "SocksSocketImpl.java",
                    "line-number" => 392,
                    "class-name" => "java.net.SocksSocketImpl",
                    "method-name" => "connect",
                    "native-method" => false
                },
                {
                    "file-name" => "Socket.java",
                    "line-number" => 609,
                    "class-name" => "java.net.Socket",
                    "method-name" => "connect",
                    "native-method" => false
                },
                {
                    "file-name" => "Util.java",
                    "line-number" => 461,
                    "class-name" => "org.jgroups.util.Util",
                    "method-name" => "connect",
                    "native-method" => false
                },
                {
                    "file-name" => "TcpConnection.java",
                    "line-number" => 96,
                    "class-name" => "org.jgroups.blocks.cs.TcpConnection",
                    "method-name" => "connect",
                    "native-method" => false
                },
                {
                    "file-name" => "TcpConnection.java",
                    "line-number" => 88,
                    "class-name" => "org.jgroups.blocks.cs.TcpConnection",
                    "method-name" => "connect",
                    "native-method" => false
                },
                {
                    "file-name" => "BaseServer.java",
                    "line-number" => 295,
                    "class-name" => "org.jgroups.blocks.cs.BaseServer",
                    "method-name" => "getConnection",
                    "native-method" => false
                },
                {
                    "file-name" => "BaseServer.java",
                    "line-number" => 208,
                    "class-name" => "org.jgroups.blocks.cs.BaseServer",
                    "method-name" => "send",
                    "native-method" => false
                },
                {
                    "file-name" => "TCP.java",
                    "line-number" => 91,
                    "class-name" => "org.jgroups.protocols.TCP",
                    "method-name" => "send",
                    "native-method" => false
                },
                {
                    "file-name" => "BasicTCP.java",
                    "line-number" => 146,
                    "class-name" => "org.jgroups.protocols.BasicTCP",
                    "method-name" => "sendUnicast",
                    "native-method" => false
                },
                {
                    "file-name" => "TP.java",
                    "line-number" => 1638,
                    "class-name" => "org.jgroups.protocols.TP",
                    "method-name" => "sendToSingleMember",
                    "native-method" => false
                },
                {
                    "file-name" => "TP.java",
                    "line-number" => 1632,
                    "class-name" => "org.jgroups.protocols.TP",
                    "method-name" => "doSend",
                    "native-method" => false
                },
                {
                    "file-name" => "NoBundler.java",
                    "line-number" => 38,
                    "class-name" => "org.jgroups.protocols.NoBundler",
                    "method-name" => "sendSingleMessage",
                    "native-method" => false
                },
                {
                    "file-name" => "NoBundler.java",
                    "line-number" => 30,
                    "class-name" => "org.jgroups.protocols.NoBundler",
                    "method-name" => "send",
                    "native-method" => false
                },
                {
                    "file-name" => "TP.java",
                    "line-number" => 1620,
                    "class-name" => "org.jgroups.protocols.TP",
                    "method-name" => "send",
                    "native-method" => false
                },
                {
                    "file-name" => "TP.java",
                    "line-number" => 1353,
                    "class-name" => "org.jgroups.protocols.TP",
                    "method-name" => "_send",
                    "native-method" => false
                },
                {
                    "file-name" => "TP.java",
                    "line-number" => 1262,
                    "class-name" => "org.jgroups.protocols.TP",
                    "method-name" => "down",
                    "native-method" => false
                },
                {
                    "file-name" => "DNS_PING.java",
                    "line-number" => 189,
                    "class-name" => "org.jgroups.protocols.dns.DNS_PING",
                    "method-name" => "sendDiscoveryRequest",
                    "native-method" => false
                },
                {
                    "file-name" => "DNS_PING.java",
                    "line-number" => 182,
                    "class-name" => "org.jgroups.protocols.dns.DNS_PING",
                    "method-name" => "findMembers",
                    "native-method" => false
                },
                {
                    "file-name" => "Discovery.java",
                    "line-number" => 217,
                    "class-name" => "org.jgroups.protocols.Discovery",
                    "method-name" => "invokeFindMembers",
                    "native-method" => false
                },
                {
                    "file-name" => "Discovery.java",
                    "line-number" => 228,
                    "class-name" => "org.jgroups.protocols.Discovery",
                    "method-name" => "lambda$findMembers$0",
                    "native-method" => false
                },
                {
                    "file-name" => undefined,
                    "line-number" => -1,
                    "class-name" => "org.jgroups.protocols.Discovery$$Lambda$968/0x0000000840b0bc40",
                    "method-name" => "run",
                    "native-method" => false
                },
                {
                    "file-name" => "TimeScheduler3.java",
                    "line-number" => 324,
                    "class-name" => "org.jgroups.util.TimeScheduler3$Task",
                    "method-name" => "run",
                    "native-method" => false
                },
                {
                    "file-name" => "ContextReferenceExecutor.java",
                    "line-number" => 49,
                    "class-name" => "org.jboss.as.clustering.context.ContextReferenceExecutor",
                    "method-name" => "execute",
                    "native-method" => false
                },
                {
                    "file-name" => "ContextualExecutor.java",
                    "line-number" => 70,
                    "class-name" => "org.jboss.as.clustering.context.ContextualExecutor$1",
                    "method-name" => "run",
                    "native-method" => false
                },
                {
                    "file-name" => "Thread.java",
                    "line-number" => 829,
                    "class-name" => "java.lang.Thread",
                    "method-name" => "run",
                    "native-method" => false
                }
            ],
            "suspended" => false,
            "in-native" => false,
            "locked-monitors" => [{
                "class-name" => "java.net.SocksSocketImpl",
                "identity-hash-code" => 139076230,
                "locked-stack-depth" => 1,
                "locked-stack-frame" => {
                    "file-name" => "AbstractPlainSocketImpl.java",
                    "line-number" => 412,
                    "class-name" => "java.net.AbstractPlainSocketImpl",
                    "method-name" => "doConnect",
                    "native-method" => false
                }
            }],
            "locked-synchronizers" => [{
                "class-name" => "java.util.concurrent.locks.ReentrantLock$FairSync",
                "identity-hash-code" => 740591308
            }]
        },

And a typical waiting thread:

        {
            "thread-id" => 109424946L,
            "thread-name" => "Timer temp thread-20461,ee,avaloq-wb-sync-manager-0",
            "thread-state" => "WAITING",
            "blocked-time" => -1L,
            "blocked-count" => 1L,
            "waited-time" => -1L,
            "waited-count" => 1L,
            "lock-info" => {
                "class-name" => "java.util.concurrent.locks.ReentrantLock$FairSync",
                "identity-hash-code" => 740591308
            },
            "lock-name" => "java.util.concurrent.locks.ReentrantLock$FairSync@2c2486cc",
            "lock-owner-id" => 109424945L,
            "lock-owner-name" => "Timer temp thread-20460,ee,avaloq-wb-sync-manager-0",
            "stack-trace" => [
                {
                    "file-name" => "Unsafe.java",
                    "line-number" => -2,
                    "class-name" => "jdk.internal.misc.Unsafe",
                    "method-name" => "park",
                    "native-method" => true
                },
                {
                    "file-name" => "LockSupport.java",
                    "line-number" => 194,
                    "class-name" => "java.util.concurrent.locks.LockSupport",
                    "method-name" => "park",
                    "native-method" => false
                },
                {
                    "file-name" => "AbstractQueuedSynchronizer.java",
                    "line-number" => 885,
                    "class-name" => "java.util.concurrent.locks.AbstractQueuedSynchronizer",
                    "method-name" => "parkAndCheckInterrupt",
                    "native-method" => false
                },
                {
                    "file-name" => "AbstractQueuedSynchronizer.java",
                    "line-number" => 943,
                    "class-name" => "java.util.concurrent.locks.AbstractQueuedSynchronizer",
                    "method-name" => "doAcquireInterruptibly",
                    "native-method" => false
                },
                {
                    "file-name" => "AbstractQueuedSynchronizer.java",
                    "line-number" => 1263,
                    "class-name" => "java.util.concurrent.locks.AbstractQueuedSynchronizer",
                    "method-name" => "acquireInterruptibly",
                    "native-method" => false
                },
                {
                    "file-name" => "ReentrantLock.java",
                    "line-number" => 317,
                    "class-name" => "java.util.concurrent.locks.ReentrantLock",
                    "method-name" => "lockInterruptibly",
                    "native-method" => false
                },
                {
                    "file-name" => "BaseServer.java",
                    "line-number" => 277,
                    "class-name" => "org.jgroups.blocks.cs.BaseServer",
                    "method-name" => "getConnection",
                    "native-method" => false
                },
                {
                    "file-name" => "BaseServer.java",
                    "line-number" => 208,
                    "class-name" => "org.jgroups.blocks.cs.BaseServer",
                    "method-name" => "send",
                    "native-method" => false
                },
                {
                    "file-name" => "TCP.java",
                    "line-number" => 91,
                    "class-name" => "org.jgroups.protocols.TCP",
                    "method-name" => "send",
                    "native-method" => false
                },
                {
                    "file-name" => "BasicTCP.java",
                    "line-number" => 146,
                    "class-name" => "org.jgroups.protocols.BasicTCP",
                    "method-name" => "sendUnicast",
                    "native-method" => false
                },
                {
                    "file-name" => "TP.java",
                    "line-number" => 1638,
                    "class-name" => "org.jgroups.protocols.TP",
                    "method-name" => "sendToSingleMember",
                    "native-method" => false
                },
                {
                    "file-name" => "TP.java",
                    "line-number" => 1632,
                    "class-name" => "org.jgroups.protocols.TP",
                    "method-name" => "doSend",
                    "native-method" => false
                },
                {
                    "file-name" => "NoBundler.java",
                    "line-number" => 38,
                    "class-name" => "org.jgroups.protocols.NoBundler",
                    "method-name" => "sendSingleMessage",
                    "native-method" => false
                },
                {
                    "file-name" => "NoBundler.java",
                    "line-number" => 30,
                    "class-name" => "org.jgroups.protocols.NoBundler",
                    "method-name" => "send",
                    "native-method" => false
                },
                {
                    "file-name" => "TP.java",
                    "line-number" => 1620,
                    "class-name" => "org.jgroups.protocols.TP",
                    "method-name" => "send",
                    "native-method" => false
                },
                {
                    "file-name" => "TP.java",
                    "line-number" => 1353,
                    "class-name" => "org.jgroups.protocols.TP",
                    "method-name" => "_send",
                    "native-method" => false
                },
                {
                    "file-name" => "TP.java",
                    "line-number" => 1262,
                    "class-name" => "org.jgroups.protocols.TP",
                    "method-name" => "down",
                    "native-method" => false
                },
                {
                    "file-name" => "DNS_PING.java",
                    "line-number" => 189,
                    "class-name" => "org.jgroups.protocols.dns.DNS_PING",
                    "method-name" => "sendDiscoveryRequest",
                    "native-method" => false
                },
                {
                    "file-name" => "DNS_PING.java",
                    "line-number" => 182,
                    "class-name" => "org.jgroups.protocols.dns.DNS_PING",
                    "method-name" => "findMembers",
                    "native-method" => false
                },
                {
                    "file-name" => "Discovery.java",
                    "line-number" => 217,
                    "class-name" => "org.jgroups.protocols.Discovery",
                    "method-name" => "invokeFindMembers",
                    "native-method" => false
                },
                {
                    "file-name" => "Discovery.java",
                    "line-number" => 228,
                    "class-name" => "org.jgroups.protocols.Discovery",
                    "method-name" => "lambda$findMembers$0",
                    "native-method" => false
                },
                {
                    "file-name" => undefined,
                    "line-number" => -1,
                    "class-name" => "org.jgroups.protocols.Discovery$$Lambda$968/0x0000000840b0bc40",
                    "method-name" => "run",
                    "native-method" => false
                },
                {
                    "file-name" => "TimeScheduler3.java",
                    "line-number" => 324,
                    "class-name" => "org.jgroups.util.TimeScheduler3$Task",
                    "method-name" => "run",
                    "native-method" => false
                },
                {
                    "file-name" => "ContextReferenceExecutor.java",
                    "line-number" => 49,
                    "class-name" => "org.jboss.as.clustering.context.ContextReferenceExecutor",
                    "method-name" => "execute",
                    "native-method" => false
                },
                {
                    "file-name" => "ContextualExecutor.java",
                    "line-number" => 70,
                    "class-name" => "org.jboss.as.clustering.context.ContextualExecutor$1",
                    "method-name" => "run",
                    "native-method" => false
                },
                {
                    "file-name" => "Thread.java",
                    "line-number" => 829,
                    "class-name" => "java.lang.Thread",
                    "method-name" => "run",
                    "native-method" => false
                }
            ],
            "suspended" => false,
            "in-native" => false,
            "locked-monitors" => [],
            "locked-synchronizers" => []
        },
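
The two dumps suggest a simple pattern: one thread holds a fair `ReentrantLock` while it is stuck in a blocking call, and every later discovery task parks in `lockInterruptibly` behind it. The following sketch reproduces only that pattern in isolation, assuming nothing about JGroups internals beyond what the stack traces show. `LockPileUpSketch` is a hypothetical name, and a `CountDownLatch` stands in for the blocking `socketConnect` call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical reproduction of the lock pile-up; not JGroups code. */
final class LockPileUpSketch {

    /** Starts one lock holder plus `waiters` contenders and returns how many contenders end up parked. */
    static int run(int waiters) {
        ReentrantLock lock = new ReentrantLock(true); // fair, like the FairSync in the dumps
        CountDownLatch holderHasLock = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                holderHasLock.countDown();
                release.await(); // stand-in for the connect call that does not return
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }, "holder");

        List<Thread> contenders = new ArrayList<>();
        for (int i = 0; i < waiters; i++) {
            contenders.add(new Thread(() -> {
                try {
                    lock.lockInterruptibly(); // same call as in the WAITING stack trace
                    lock.unlock();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "contender-" + i));
        }

        try {
            holder.start();
            holderHasLock.await();
            for (Thread t : contenders) {
                t.start();
            }

            // Poll for up to five seconds until every contender is parked behind the holder.
            long deadline = System.nanoTime() + 5_000_000_000L;
            int parked = 0;
            while (System.nanoTime() < deadline) {
                parked = 0;
                for (Thread t : contenders) {
                    if (t.getState() == Thread.State.WAITING) {
                        parked++;
                    }
                }
                if (parked == waiters) {
                    break;
                }
                Thread.sleep(10);
            }

            release.countDown(); // let the holder finish so all threads can exit
            holder.join();
            for (Thread t : contenders) {
                t.join();
            }
            return parked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("parked contenders: " + run(5));
    }
}
```

In this sketch the contenders are released as soon as the holder lets go of the lock. If each discovery run starts a new thread while the holder stays blocked, the number of parked threads would keep growing, which matches the symptom described above.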

Has anyone experienced a similar issue?

Kubernetes Wildfly JGroups

A: No answers yet