Asked by: Victoria | Asked: 7/19/2023 | Last edited by: ReinderienVictoria | Updated: 7/19/2023 | Views: 22
Trouble generating randomly distributed points within bin bounds from 2D histogram
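Question:
My goal is to generate a scatter plot from a 2D histogram: if a bin has a count of n, then n points are generated at random within that bin's boundaries. Something like this:
However, I am having trouble generating the points within the bin bounds so that they are spread more uniformly. For example, the following heatmap does not produce the scatter plot it should.
2D histogram:
Scatter plot that does not reflect the 2D histogram:
I have added my code and function calls below. How can I fix my code so that it generates the points correctly?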
from random import uniform

import matplotlib.pyplot as plt
import numpy as np

def repopulateScatterHelper(x, y, m):
    """
    generate a random point within the bounds of bin (x, y)
    uses the global xedges and yedges and assumes m equal-width bins per axis
    """
    # compute x and y axis min and max
    maxX = max(xedges)  # the max x value from edges
    maxY = max(yedges)
    minX = min(xedges)
    minY = min(yedges)
    # compute bin boundaries
    x1 = float(x)/m * (maxX-minX) + minX
    x2 = float(x+1)/m * (maxX-minX) + minX
    y1 = float(y)/m * (maxY-minY) + minY
    y2 = float(y+1)/m * (maxY-minY) + minY
    # generate random point within bin boundaries
    the_x = uniform(x1, x2)
    the_y = uniform(y1, y2)
    return the_x, the_y
# enddef
def repopulateScatter(H, m):
    """
    @params
        H - 2D array of counts
        m - number of bins along each axis
    @returns
        new_x, new_y - Generated corresponding x and y coordinates of points
    """
    new_x = []
    new_y = []
    for i in range(0, m):  # rows of H (x bins)
        for j in range(0, m):  # columns of H (y bins)
            if H[i][j] > 0:  # if count is greater than zero, generate points
                for point in range(0, int(H[i][j])):
                    x_i, y_i = repopulateScatterHelper(i, j, m)
                    new_x.append(x_i)
                    new_y.append(y_i)
                #endfor
            #endif
        #endfor
    #endfor
    return new_x, new_y
#enddef
def plotHistToScatter(new_x, new_y):
    """
    new_x, new_y - x,y coordinates to plot
    uses the global xAxisLabel, yAxisLabel and epsilon for labels
    """
    new_x = np.array(new_x)
    new_y = np.array(new_y)
    # plot data points
    fig, ax = plt.subplots()
    ax.scatter(new_x, new_y)
    # add LOBF to plot - https://www.statology.org/line-of-best-fit-python/
    a, b = np.polyfit(new_x, new_y, 1)
    a = float(a)
    plt.plot(new_x, a*new_x+b, color="red")
    print("DP LOBF:", a, "*(x) +", b)
    # label the plot
    plt.xlabel(xAxisLabel)
    plt.ylabel(yAxisLabel)
    plt.title("heatmap to scatterplot for " + xAxisLabel + ' vs ' + yAxisLabel + " epsilon = " + str(epsilon))
    plt.show()
#enddef
My function calls are:
H, xedges, yedges = np.histogram2d(df[xAxisLabel],df[yAxisLabel], bins=(m, m)) # compute 2D histogram
new_x,new_y = repopulateScatter(H,m)
plotHistToScatter(new_x, new_y)
I tried changing the repopulateScatterHelper() function to fix it, but without success.
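One possible variation, offered only as a sketch and not taken from the post: rather than recomputing bin boundaries from the axis minimum and maximum, the boundaries can be read directly from the `xedges` and `yedges` arrays that `np.histogram2d` returns, which also covers unequal bin widths. The name `repopulate_from_edges` and the `rng` parameter are hypothetical. The sketch assumes `H[i, j]` counts points whose x value falls in bin `i` and whose y value falls in bin `j`, which is how NumPy documents `histogram2d`.

```python
import numpy as np

def repopulate_from_edges(H, xedges, yedges, rng=None):
    """Draw H[i, j] uniform points inside bin (i, j), using the edge arrays.

    Hypothetical helper: the name and the rng parameter are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(H).astype(int)
    xedges = np.asarray(xedges)
    yedges = np.asarray(yedges)
    # Bin index along x (axis 0) and y (axis 1) for every cell of H.
    i_idx, j_idx = np.indices(counts.shape)
    # Repeat each bin index once per point counted in that bin.
    i_rep = np.repeat(i_idx.ravel(), counts.ravel())
    j_rep = np.repeat(j_idx.ravel(), counts.ravel())
    # One uniform draw per point, between the lower and upper edge of its bin.
    new_x = rng.uniform(xedges[i_rep], xedges[i_rep + 1])
    new_y = rng.uniform(yedges[j_rep], yedges[j_rep + 1])
    return new_x, new_y
```

Re-binning the returned points with `np.histogram2d(new_x, new_y, bins=[xedges, yedges])` should reproduce `H`, which gives a quick way to confirm that every point landed in its intended bin.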
Answers: No answers yet
Comments
m = 70
randx = np.random.randint(0,20, m)
randy = np.random.randint(0,20, m)
xAxisLabel = 'xAxis'
yAxisLabel = 'yAxis'
epsilon = '0.01'
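With test data like this, one thing that may be worth checking is the orientation of `H`. NumPy documents that `histogram2d` bins x along the first axis of `H` and y along the second, so a heatmap drawn from `H` without transposing can look rotated relative to a scatter plot of the same data. The post does not show how the heatmap was drawn, so whether this applies is a guess. The sketch below only confirms the axis convention; the fixed seed and the names `rows_are_x` and `cols_are_y` are assumptions added for illustration.

```python
import numpy as np

np.random.seed(0)  # seed added here only to make the sketch repeatable
m = 70
randx = np.random.randint(0, 20, m)
randy = np.random.randint(0, 20, m)

H, xedges, yedges = np.histogram2d(randx, randy, bins=(m, m))

# 1D histograms over the same edges, for comparison with the margins of H.
x_counts, _ = np.histogram(randx, bins=xedges)
y_counts, _ = np.histogram(randy, bins=yedges)

# Summing over axis 1 collapses the y bins and leaves counts per x bin.
rows_are_x = np.array_equal(H.sum(axis=1), x_counts)
# Summing over axis 0 collapses the x bins and leaves counts per y bin.
cols_are_y = np.array_equal(H.sum(axis=0), y_counts)
print("rows of H follow x:", rows_are_x)
print("columns of H follow y:", cols_are_y)

# For a heatmap with x on the horizontal axis, the transpose is what is
# usually passed, e.g. plt.pcolormesh(xedges, yedges, H.T).
```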