Pandas `read_sas` error when loading sas7bdat file with a column with dates

Asked by: Pasamonte. Asked: 10/10/2023. Last edited by: Pasamonte. Updated: 10/10/2023. Views: 29

Q:

I am getting an error from `read_sas` in Pandas:


Traceback (most recent call last):
  File "pandas/_libs/tslibs/timedeltas.pyx", line 372, in pandas._libs.tslibs.timedeltas._maybe_cast_from_unit
  File "pandas/_libs/tslibs/conversion.pyx", line 126, in pandas._libs.tslibs.conversion.cast_from_unit
OverflowError: Python int too large to convert to C long

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/u/TSODCGM/workspaces/nordea/pythononzos/src/python/sas_reader_pandas_1.py", line 25, in <module>
  File "/u/TSODCGM/workspaces/nordea/nrd_venv/lib/python3.11/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/u/TSODCGM/workspaces/nordea/nrd_venv/lib/python3.11/site-packages/pandas/io/sas/sasreader.py", line 161, in read_sas
    reader = SAS7BDATReader(
             ^^^^^^^^^^^^^^^
  File "/u/TSODCGM/workspaces/nordea/nrd_venv/lib/python3.11/site-packages/pandas/io/sas/sas7bdat.py", line 207, in __init__
    self._get_properties()
  File "/u/TSODCGM/workspaces/nordea/nrd_venv/lib/python3.11/site-packages/pandas/io/sas/sas7bdat.py", line 294, in _get_properties
    self.date_created = epoch + pd.to_timedelta(x, unit="s")
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/u/TSODCGM/workspaces/nordea/nrd_venv/lib/python3.11/site-packages/pandas/core/tools/timedeltas.py", line 211, in to_timedelta
    return _coerce_scalar_to_timedelta_type(arg, unit=unit, errors=errors)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/u/TSODCGM/workspaces/nordea/nrd_venv/lib/python3.11/site-packages/pandas/core/tools/timedeltas.py", line 219, in _coerce_scalar_to_timedelta_type
    result = Timedelta(r, unit)
             ^^^^^^^^^^^^^^^^^^
  File "pandas/_libs/tslibs/timedeltas.pyx", line 1695, in pandas._libs.tslibs.timedeltas.Timedelta.__new__
  File "pandas/_libs/tslibs/timedeltas.pyx", line 351, in pandas._libs.tslibs.timedeltas.convert_to_timedelta64
  File "pandas/_libs/tslibs/timedeltas.pyx", line 374, in pandas._libs.tslibs.timedeltas._maybe_cast_from_unit
pandas._libs.tslibs.np_datetime.OutOfBoundsTimedelta: Cannot cast 1.3033002481878809e+41 from s to 'ns' without overflow.
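
For context on the last line of the traceback: at nanosecond resolution, a pandas `Timedelta` is held as a signed 64-bit count of nanoseconds, so it can span only a few hundred years. The sketch below works through that arithmetic in plain Python and compares the limit with the value from the error message. It makes no pandas call, and the constant and variable names are illustrative, not taken from pandas.

```python
# Largest value a signed 64-bit integer can hold.
INT64_MAX = 2**63 - 1
NS_PER_SECOND = 10**9

# Longest span, in whole seconds, that fits in an int64 count of nanoseconds.
max_seconds = INT64_MAX // NS_PER_SECOND
max_years = max_seconds / (365.25 * 24 * 3600)

# Value reported in the OutOfBoundsTimedelta message above.
header_seconds = 1.3033002481878809e+41

print(f"limit: {max_seconds} s (about {max_years:.0f} years)")
print(f"value from traceback fits: {header_seconds <= max_seconds}")
```

The failing call sits in `_get_properties`, where `date_created` is computed, so the oversized number appears to come from the file's creation timestamp in the header and not from the date column itself.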

Output from `pip list`:


pandas                 1.5.1.post0
pandas-datareader      0.10.0

The code is as follows:


import pandas as pd
df = pd.read_sas(url, format='sas7bdat')

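where `url` contains the path to the file.

The problem is that I cannot avoid the cast. I have added an image of the table I want to read.

Is there a way to set things up in Pandas so that I can read the table first and assign the data types afterwards? I cannot avoid the cast inside `read_sas`.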

Another option would be to give the timedelta a unit of days or months instead of `ns`, so that it does not overflow.
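
On the coarser-unit idea, a rough check in plain Python suggests it would not help for this particular value: 1.3033002481878809e+41 seconds is still far beyond a 64-bit integer when counted in days or months. This is a sketch of the arithmetic only. `fits_in_int64` is a hypothetical helper, and the 30-day month is an approximation chosen for illustration.

```python
INT64_MAX = 2**63 - 1

# Seconds per unit; the 30-day month is an approximation for illustration.
SECONDS_PER_UNIT = {
    "ns": 1e-9,
    "s": 1.0,
    "D": 86_400.0,
    "30-day month": 30 * 86_400.0,
}


def fits_in_int64(seconds: float, seconds_per_unit: float) -> bool:
    """Return True if the span, counted in the given unit, fits in an int64."""
    return abs(seconds / seconds_per_unit) <= INT64_MAX


# Value reported in the OutOfBoundsTimedelta message in the traceback.
header_seconds = 1.3033002481878809e+41

results = {
    unit: fits_in_int64(header_seconds, size)
    for unit, size in SECONDS_PER_UNIT.items()
}
for unit, ok in results.items():
    print(f"{unit}: fits = {ok}")
```

A span of this size is on the order of 10^33 years, which reads more like a misinterpreted header field than a real date, so changing the unit alone is unlikely to be enough.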

Any help is welcome.

Thanks

I tried several readers, but this is a SAS sas7bdat file, so I need to use `pandas.read_sas` or something similar.

I also tried the `sas7bdat` package:


from sas7bdat import SAS7BDAT

with SAS7BDAT(url) as reader:
    for row in reader:
        print(row)

But it did not work.

python-3.x pandas casting sas


A: No answers yet