Asked by: Kashyap · Asked: 3/10/2023 · Updated: 6/27/2023 · Views: 268
pyspark log4j2: How to log full exception stack trace?
Q:
I tried:
logger.error('err', e)
logger.error('err', exc_info=e) # syntax for python's logging
>>>
>>> logger = spark.sparkContext._jvm.org.apache.log4j.LogManager.getLogger('my-logger')
>>>
>>> try: 1/0
... except Exception as e: logger.error('err', e)
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/home/kash/project1/.venv/lib/python3.9/site-packages/pyspark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1313, in __call__
File "/home/kash/project1/.venv/lib/python3.9/site-packages/pyspark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1283, in _build_args
File "/home/kash/project1/.venv/lib/python3.9/site-packages/pyspark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1283, in <listcomp>
File "/home/kash/project1/.venv/lib/python3.9/site-packages/pyspark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py", line 298, in get_command_part
AttributeError: 'ZeroDivisionError' object has no attribute '_get_object_id'
>>>
>>>
>>>
>>> try: 1/0
... except Exception as e: logger.error('err', exc_info=e)
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
TypeError: __call__() got an unexpected keyword argument 'exc_info'
>>>
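Of course, I can convert the stack trace myself and pass it to log4j as a string instead of the exception object. But I'd rather not do all that if I can avoid it.
>>> import traceback
>>> try: 1/0
... except Exception as e: logger.error(f'err {"".join(traceback.TracebackException.from_exception(e).format())}')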
...
23/03/09 11:38:47 ERROR my-logger: err Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
>>>
A:
1 upvote
Kashyap
6/27/2023
#1
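TL;DR: We can't pass a pure Python object in a py4j function call, because that object doesn't exist in the target JVM.
The reason for the error `AttributeError: 'ZeroDivisionError' object has no attribute '_get_object_id'` is that `org.apache.logging.log4j.Logger.info/error()` expects a Java exception object to be passed to it.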
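For that to work:
- The exception should originate in the JVM (or see the note below).
- Your Python code should hold a handle to that Java exception object as a Python wrapper object (`py4j.java_gateway.JavaObject`).
- Pass that wrapper object as an argument to `Logger.info/error()`.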
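The three steps above might look like the following sketch. It assumes the failure reaches Python as a `py4j.protocol.Py4JJavaError`, whose `java_exception` attribute holds the `JavaObject` handle to the JVM-side throwable. The helper name `log_jvm_exception` is hypothetical, and PySpark may convert some JVM errors into its own exception classes before your code sees them, in which case the fallback branch is taken.

```python
import traceback


def log_jvm_exception(logger, message, exc):
    """Log exc through a log4j logger obtained via py4j (hypothetical helper).

    If exc carries a handle to a JVM throwable (as py4j's Py4JJavaError does
    in its java_exception attribute), pass that handle so log4j can print the
    Java stack trace. Otherwise fall back to a formatted string.
    """
    java_exc = getattr(exc, "java_exception", None)
    if java_exc is not None:
        # java_exc is a py4j JavaObject, so py4j can refer to it in the JVM.
        logger.error(message, java_exc)
    else:
        # Pure Python exception: there is no JVM object to point at.
        text = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
        logger.error(f"{message}\n{text}")
```

In a Spark session this could be called as `log_jvm_exception(logger, 'err', e)` inside an `except Exception as e:` block.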
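Note: if you have a pure Python Error/Exception object that originated in the Python runtime, you would have to create a corresponding Java Exception object in the JVM and then pass a handle to that object (a `py4j.java_gateway.JavaObject` wrapper) to `Logger.info/error()`.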
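A minimal sketch of what that note describes, under a few assumptions: `jvm` is `spark.sparkContext._jvm`, `logger` is a log4j logger obtained through it whose `error` accepts a message plus a `Throwable`, and the helper name `log_python_exception` is hypothetical. The Java stack trace attached to the new exception would describe where the object was constructed inside the JVM, not the Python frames, so the Python traceback is carried in the exception message instead.

```python
import traceback


def log_python_exception(jvm, logger, message, exc):
    """Wrap a Python exception in a JVM-side RuntimeException and log it.

    jvm is assumed to be spark.sparkContext._jvm; logger is assumed to be a
    log4j logger obtained through that same JVM view.
    """
    py_trace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    # Calling a Java class through the py4j JVM view constructs the object
    # in the JVM and returns a JavaObject handle to it.
    java_exc = jvm.java.lang.RuntimeException(py_trace)
    # The handle can be passed back to the JVM, unlike the Python exception.
    logger.error(message, java_exc)
    return java_exc
```

This adds a JVM round trip and a mostly uninformative Java stack trace, which is part of why passing the traceback as a string is the simpler choice.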
We ended up just passing it as a string, as part of the log message.
>>>
>>> import traceback
>>> def e2s(ex: Exception):
...     return ''.join(traceback.TracebackException.from_exception(ex).format())
...
>>>
>>> l = spark.sparkContext._jvm.org.apache.log4j.LogManager.getLogger('my_logger')
>>>
>>> try: 1/0
... except Exception as e: l.error(f'Some exception: {e2s(e)}')
...
ERROR Some exception: Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
>>>
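If the goal is to keep the `exc_info=` call style from Python's `logging` module, one option is a small handler that does the string conversion once and forwards the result to log4j. This is a sketch, not something from the original answer: the class name `Log4jHandler` is hypothetical, and it assumes the log4j logger exposes `error`, `warn`, `info` and `debug` methods that accept a single string.

```python
import logging


class Log4jHandler(logging.Handler):
    """Forward Python logging records to a log4j logger as plain strings."""

    def __init__(self, log4j_logger):
        super().__init__()
        self._log4j = log4j_logger

    def emit(self, record):
        # format() appends the formatted traceback when exc_info is set,
        # so the JVM only ever receives a string.
        text = self.format(record)
        if record.levelno >= logging.ERROR:
            self._log4j.error(text)
        elif record.levelno >= logging.WARNING:
            self._log4j.warn(text)
        elif record.levelno >= logging.INFO:
            self._log4j.info(text)
        else:
            self._log4j.debug(text)
```

With the log4j logger `l` from the session above, `logging.getLogger('my_logger').addHandler(Log4jHandler(l))` would let `logging.getLogger('my_logger').error('err', exc_info=e)` reach log4j with the full Python traceback in the message.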
Comments
0 upvotes
Panda
6/28/2023
Thanks for your answer. I posted a question; did you use something like this?
0 upvotes
Panda
6/28/2023
stackoverflow.com/questions/76567823/......