提问人:nropgrammer 提问时间:12/13/2022 更新时间:12/13/2022 访问量:79
给定整数的存储桶预测?
Bucket prediction given an integer?
问:
如果我有一个整数字典,其中值是排序的两个整数的数组,其中两个值是最大的值,是桶 n
中存在的 k
个值中的最小值,我如何预测/估计给定数字会出现哪个桶。
该词典的一个例子是:
{
1: [
541247,
14771
],
15: [
9789,
9621
],
22: [
8984,
8923
],
64: [
7278,
7263
],
93: [
6846,
6804
],
126: [
5789,
5765
],
139: [
5484,
5462
],
208: [
4326,
4314
],
285: [
3788,
3783
],
322: [
3634,
3631
],
337: [
3586,
3582
],
356: [
3517,
3513
],
448: [
3232,
3229
],
459: [
3199,
3196
],
539: [
2990,
2989
],
581: [
2891,
2890
],
596: [
2858,
2856
],
640: [
2770,
2768
],
675: [
2704,
2702
],
679: [
2697,
2695
],
686: [
2684,
2682
],
733: [
2602,
2600
],
879: [
2387,
2386
],
961: [
2283,
2282
],
980: [
2261,
2260
],
995: [
2244,
2243
],
1000: [
2238,
2237
],
1169: [
2064,
2063
],
1175: [
2058,
2057
],
1236: [
2005,
2004
],
1259: [
1985,
1984
],
1281: [
1967,
1966
],
1348: [
1913,
1912
],
1424: [
1855,
1855
],
1429: [
1852,
1851
],
1514: [
1796,
1795
],
1564: [
1764,
1764
],
1596: [
1744,
1744
],
1625: [
1726,
1726
],
1661: [
1704,
1703
],
1695: [
1683,
1682
],
1710: [
1675,
1674
],
1713: [
1673,
1673
],
1720: [
1669,
1669
],
1835: [
1609,
1608
],
1846: [
1603,
1602
],
1861: [
1594,
1594
],
1864: [
1593,
1592
],
1916: [
1562,
1561
],
2093: [
1462,
1462
],
2159: [
1424,
1423
],
2175: [
1414,
1413
],
2189: [
1405,
1405
],
2209: [
1394,
1393
],
2266: [
1359,
1358
],
2319: [
1327,
1326
],
2352: [
1306,
1306
],
2429: [
1260,
1260
],
2512: [
1214,
1214
],
2629: [
1141,
1141
],
2704: [
1088,
1088
],
2736: [
1064,
1063
],
2745: [
1057,
1057
],
2766: [
1042,
1042
],
2779: [
1033,
1032
],
2790: [
1025,
1024
],
2798: [
1019,
1018
],
2880: [
964,
964
],
2945: [
928,
927
],
3032: [
885,
884
],
3115: [
838,
837
],
3120: [
835,
834
],
3140: [
823,
822
],
3169: [
806,
805
],
3235: [
770,
770
],
3252: [
762,
762
],
3258: [
759,
758
],
3366: [
715,
714
],
3414: [
690,
690
],
3444: [
675,
674
],
3507: [
645,
644
],
3543: [
630,
629
],
3758: [
516,
515
],
3764: [
512,
511
],
3770: [
509,
508
],
3931: [
446,
446
],
4013: [
416,
415
],
4055: [
390,
390
],
4116: [
378,
377
],
4255: [
370,
370
],
4286: [
362,
361
],
4325: [
360,
360
],
4337: [
355,
355
],
4349: [
347,
347
],
4355: [
344,
343
],
4366: [
339,
339
],
4379: [
333,
333
],
4460: [
312,
311
],
4474: [
309,
309
],
4487: [
308,
308
],
4545: [
259,
257
],
4555: [
242,
241
]
}
如果我正在搜索包含以下内容的存储桶,我将如何执行此操作?存储桶数据是从网络抓取中收集的,所以我知道存储桶包含 ,但我正在寻找一个更通用的解决方案,允许我只需要抓取 x
个存储桶,然后我就有足够的数据来预测哪个存储桶将保留任何价值。我要补充一点,字典中的第一个和最后一个条目保证同时包含最大和最小的可搜索值。2672
692
2672
这已经是一个已知问题了吗?如果是这样,它叫什么?
我目前的解决方案如下:
def bucket_search(buckets, n):
start_idx = None
end_idx = None
bucket_keys = list(buckets.keys())
bucket_values = list(buckets.values())
for idx, bv in enumerate(bucket_values):
high,low = bv
if n > low and n < high:
start_idx = idx
end_idx = idx
break
elif high < n:
end_idx = idx
break
if (end_idx != start_idx):
continue_search = True
tmp_start_idx = end_idx - 1
while continue_search:
high, low = bucket_values[tmp_start_idx]
if (low > n):
continue_search = False
elif (low < n and high < n):
tmp_start_idx -= 1
start_idx = tmp_start_idx
start_bucket_value = bucket_keys[start_idx]
end_bucket_value = bucket_keys[end_idx]
start_value = bucket_values[start_idx][1]
end_value = bucket_values[end_idx][0]
bucket_value_difference = end_bucket_value - start_bucket_value
value_difference = start_value - end_value
increment_amount = bucket_value_difference / value_difference
tmp_value = start_value
tmp_bucket_value = start_bucket_value
for i in range(bucket_value_difference):
tmp_value -= increment_amount
tmp_bucket_value += 1
if (n > tmp_value):
break
return tmp_bucket_value
else:
return bucket_keys[start_idx]
目前(其中 data
是上面提供的示例数据)返回 。这很接近,但我认为有一个更好、更准确/精确的解决方案。bucket_search(data, 2672)
704
692
虽然我使用了 python,但只要答案可以用另一种语言重写,任何语言的答案都可以,即请不要使用任何利基库。
如果已经有一个回答的 SO 问题可以解决这个问题,我深表歉意,因为我不确定该怎么称呼这种问题。
答: 暂无答案
评论