High-speed and low-memory way to store a float number inside a Numpy array

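Q:

I have this number: 19576.4125. I want to store it in a NumPy array, and I assume that the fewer bits it uses, the better. Is that right?

I tried storing it as half and as single, but I don't understand why that changes the number.

My number: 19576.4125
half: 19580.0
single: 19576.412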
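The change in value can be checked directly. The sketch below is not from the original post; it uses `np.spacing` to show the gap between neighbouring representable numbers at this magnitude, and the variable names are illustrative. Note that `19580.0` in the output above is NumPy's shortest printed form of the stored half-precision value 19584.

```python
import numpy as np

value = 19576.4125

# Round-trip the value through each dtype to see what is actually stored.
as_half = np.float16(value)
as_single = np.float32(value)
as_double = np.float64(value)

# np.spacing gives the gap to the next representable number at this magnitude.
half_step = float(np.spacing(as_half))      # 16.0
single_step = float(np.spacing(as_single))  # 2**-9, about 0.00195
double_step = float(np.spacing(as_double))  # 2**-38, about 3.6e-12

one_minute = 1 / 1440  # one minute expressed in days, about 0.000694

print(float(as_half), half_step)
print(float(as_single), single_step)
print(float(as_double), double_step)
print(single_step > one_minute)
```

Since the single-precision step near 19576 is larger than one minute expressed in days, float32 cannot distinguish adjacent minutes for this encoding, while float64 has ample headroom.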

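This number is generated by a method I wrote to get a float representation of a `datetime`. I could use a timestamp, but I don't need seconds and milliseconds, so I wrote my own method that keeps only the date, hour, and minutes. (My database doesn't accept `datetime`s and `timedelta`s.)

Here is my generator method: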

from datetime import datetime


def get_timestamp() -> float:
    now = datetime.now()
    now = now.replace(microsecond=0, second=0)  # replace() returns a new datetime; it does not modify in place
    _1970 = datetime(1970, 1, 1, 0, 0, 0)
    td = now - _1970
    days = td.days
    hours, remainder = divmod(td.seconds, 3600)
    minutes, second = divmod(remainder, 60)
    timestamp = days + hours / 24 + minutes / 1440
    return round(timestamp, 4)

How I create the array:

from numpy import array, half, single


__td = get_timestamp()
print(__td)
__array = array([__td], dtype=half)
print(type(__array[0]))
print(__array[0])
__array = array([__td], dtype=single)
print(type(__array[0]))
print(__array[0])

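Edited 08/07 11:02 AM

Hello. As the comments point out, I understand that this number can't be stored exactly as a half or single type. So how should I store it for better performance? Is it better to store it as an int after multiplying by 10000, as a float64, or as a string?

I'm not looking for an alternative to `datetime`; I'm looking for a better-performing way to store this float. But thank you for the other replies.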
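As a rough illustration of the "int multiplied by 10000" option, here is a minimal sketch. It is not from the original post; the sample values and variable names are hypothetical, and it assumes the timestamps always have at most four decimal places, as produced by `get_timestamp()`.

```python
import numpy as np

# Hypothetical sample values in the same "days since 1970" format.
timestamps = [19576.4125, 19576.4132, 19576.4139]

# Scale to an integer count of 1/10000 day and store in 4 bytes per element.
scaled = np.array([round(t * 10_000) for t in timestamps], dtype=np.int32)

# Dividing back yields float64 values equal to the originals.
restored = scaled / 10_000

# Plain float64 storage for comparison: exact enough, but 8 bytes per element.
as_float64 = np.array(timestamps, dtype=np.float64)

print(scaled, scaled.nbytes)
print(restored.tolist() == timestamps)
print(as_float64.nbytes)
```

With this scaling, int32 covers roughly 214,748 days after 1970, so overflow is not a practical concern for current dates. A string would be the largest and slowest of the three options to compare or sort.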

python arrays numpy performance floating-point

pandas

In [64]: import pandas as pd

In [65]: df = pd.DataFrame({'a':arr, 'b':barr})

In [66]: df
Out[66]: 
              a                   b
0    19576.3799 2023-08-07 09:07:00
1    19576.3799 2023-08-07 09:07:00
2    19576.3799 2023-08-07 09:07:00
3    19576.3799 2023-08-07 09:07:00
4    19576.3799 2023-08-07 09:07:00
..          ...                 ...
995  19576.3799 2023-08-07 09:07:00
996  19576.3799 2023-08-07 09:07:00
997  19576.3799 2023-08-07 09:07:00
998  19576.3799 2023-08-07 09:07:00
999  19576.3799 2023-08-07 09:07:00

[1000 rows x 2 columns]

In [67]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype        
---  ------  --------------  -----        
 0   a       1000 non-null   float64      
 1   b       1000 non-null   datetime64[s]
dtypes: datetime64[s](1), float64(1)
memory usage: 15.8 KB

Interestingly, if I save the list of timestamps directly into a DataFrame, it is faster:

In [81]: df = pd.DataFrame({'c':alist})

In [82]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   c       1000 non-null   datetime64[ns]
dtypes: datetime64[ns](1)
memory usage: 7.9 KB

In [83]: timeit df = pd.DataFrame({'c':alist})
5.29 ms ± 22.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
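The memory figures above are consistent with 8 bytes per element per column, since float64 and datetime64 are both 8 bytes wide. As a NumPy-only sketch (not part of the original answer; the names are illustrative and the sample value is taken from the output above), minute-resolution timestamps can also be held as integer minutes since 1970, which fit in int32:

```python
from datetime import datetime

import numpy as np

# 1000 copies of the timestamp shown in the DataFrame output above.
stamps = [datetime(2023, 8, 7, 9, 7)] * 1000

# Convert to datetime64 at minute resolution; still 8 bytes per element.
arr_minutes = np.array(stamps, dtype="datetime64[us]").astype("datetime64[m]")

# The underlying integers are whole minutes since 1970-01-01.
minutes_i64 = arr_minutes.astype(np.int64)

# These values fit comfortably in int32, which halves the memory.
minutes_i32 = minutes_i64.astype(np.int32)

# Converting back recovers the original minute-resolution datetimes.
back = minutes_i32.astype(np.int64).astype("datetime64[m]")

print(arr_minutes.nbytes, minutes_i32.nbytes)
print(back[0])
```

Unlike the float encoding, this representation is exact at one-minute resolution. An int32 minute counter would overflow only a few thousand years after 1970.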
