Unexpected results while applying curve fitting to a triangular PDF using lmfit

Asked by: Abdulrahman Sheikho  Asked: 6/20/2023  Last edited by: Reinderien, Abdulrahman Sheikho  Updated: 6/21/2023  Views: 51

Q:

I have a set of points with x and y coordinates:

x = np.array([0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330, 360, 390])
y = np.array([0, 1.2, 2.8, 1.7, 1.4, 1.2, 1.1, 0.91, 0.74, 0.61, 0.5, 0.28, 0.17, 0])

These points roughly trace out a triangle, so I am using the lmfit library to do a three-point estimate.

Note: I know that scipy.stats.triang could help, but it is somewhat limited because x and c must lie in [0, 1], and I already use lmfit heavily in my project.
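For reference, the [0, 1] restriction in scipy.stats.triang applies to the shape parameter c; the distribution itself can be shifted and stretched with loc and scale, so that it covers [loc, loc + scale] with its mode at loc + c*scale. A minimal sketch, assuming illustrative values for start, shape and end (they are placeholders, not fitted values):

```python
import numpy as np
from scipy import stats

# Illustrative (assumed) triangle corners, not values fitted to the data
start, shape, end = 0.0, 60.0, 390.0

# Map the mode position into the [0, 1] shape parameter c
c = (shape - start) / (end - start)
dist = stats.triang(c, loc=start, scale=end - start)

x = np.array([30.0, 60.0, 225.0])
pdf = dist.pdf(x)
print(pdf)  # the peak value at x = shape is 2 / (end - start)
```

This only evaluates a normalized density, so an amplitude factor would still have to be applied separately to match data like the points above.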

Code:

import numpy as np
import matplotlib.pylab as plt
import lmfit

# defining triangular function
def triangular_pdf(x, amplitude, start, shape, end):
    model = np.zeros_like(x) # initialize model array with zeros
    mask1 = (x < start) | (x > end)
    mask2 = (x >= start) & (x < shape)
    mask3 = (x == shape)
    mask4 = (x >= shape) & (x < end)
    
    model[mask1] = 0
    model[mask2] = 2*amplitude*(x[mask2]-start) / ((end-start)*(shape-start))
    model[mask3] = 2*amplitude / (end - start)
    model[mask4] = 2*amplitude*(end-x[mask4]) / ((end-start)*(end-shape))
    model = np.where((x>=start)&(x<=end), model, 0) 
    return model

model_triangular = lmfit.Model(triangular_pdf)

params_init_triangular    = model_triangular.make_params(amplitude={'value':y.max(), 'vary':True},
                                        start={'value':x[0], 'vary':False},
                                        end={'value':x[-1], 'vary':False},
                                        shape={'value':(x[-1]+x[0])/2, 'max':x[-1], 'min':x[0]})

result     = model_triangular.fit(y, params_init_triangular, x=x)  #, nan_policy='propagate')
params_opt = result.params
print(result.fit_report())

x_samp = np.linspace(x.min(), x.max(), 300)
y_samp_model = model_triangular.eval(params_opt, x=x_samp)

plt.plot(x, y, 'o', label='points')
plt.plot(x_samp, y_samp_model, '-', label='triang')
plt.legend()
plt.show()

Now, when I run this, the result is:

[[Model]]
    Model(triangular_pdf)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 3
    # data points      = 14
    # variables        = 2
    chi-square         = 18.8851000
    reduced chi-square = 1.57375833
    Akaike info crit   = 8.19042290
    Bayesian info crit = 9.46853756
    R-squared          = -1.50895005
##  Warning: uncertainties could not be estimated:
    amplitude:  at initial value
    shape:      at initial value
[[Variables]]
    amplitude:  2.80000000 (init = 2.8)
    start:      0 (fixed)
    shape:      195.000000 (init = 195)
    end:        390 (fixed)

[Figure: points & wrong result]

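This is clearly wrong: the parameters (amplitude and shape) should end up different from their initial values, but they did not change.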
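One detail worth checking, offered as a sketch rather than a confirmed diagnosis: x here is an integer array, so np.zeros_like(x) creates an integer model array, and the small fractional values assigned into it are truncated to 0. The condensed copy of triangular_pdf below (the mask logic is merged into two branches for brevity, which is an editorial simplification) evaluates the model at the initial parameters with integer and with float x. The sum of squared residuals against an all-zero model reproduces the chi-square of 18.8851 shown in the report.

```python
import numpy as np

x = np.array([0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330, 360, 390])
y = np.array([0, 1.2, 2.8, 1.7, 1.4, 1.2, 1.1, 0.91, 0.74, 0.61, 0.5, 0.28, 0.17, 0])

def triangular_pdf(x, amplitude, start, shape, end):
    # Condensed version of the hard-edged model from the question
    model = np.zeros_like(x)  # inherits the dtype of x (integer for the data above)
    rising = (x >= start) & (x < shape)
    falling = (x >= shape) & (x <= end)
    model[rising] = 2*amplitude*(x[rising]-start) / ((end-start)*(shape-start))
    model[falling] = 2*amplitude*(end-x[falling]) / ((end-start)*(end-shape))
    return model

# Evaluate at the initial parameter values used in the fit
as_int = triangular_pdf(x, 2.8, 0, 195, 390)
as_float = triangular_pdf(x.astype(float), 2.8, 0, 195, 390)

print(as_int.dtype, as_int)   # integer dtype, every entry truncated to 0
print(np.sum((y - as_int)**2))  # matches the reported chi-square
print(as_float.max())           # small but non-zero with float input
```

If this is indeed the cause, creating the array with np.zeros_like(x, dtype=float), or passing x.astype(float) to fit, would let changes in the parameters reach the model output.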

To check whether triangular_pdf and model_triangular work properly, I tried the following:

x_test = np.linspace(0,400,50)
y_test = triangular_pdf(x_test, 1000, 0, 100*np.random.rand(), 400) + np.random.rand(1,len(x_test))[0]

result_test     = model_triangular.fit(y_test, params_init_triangular, x=x_test)#, nan_policy='propagate')
params_opt_test = result_test.params
print(result_test.fit_report())

x_samp = np.linspace(x_test.min(), x_test.max(), 300)
y_samp_model = model_triangular.eval(params_opt_test, x=x_samp)

plt.plot(x_test, y_test, 'o', label='points')
plt.plot(x_samp, y_samp_model, '-', label='triang')
plt.legend()
plt.show()

The result is:

[[Model]]
    Model(triangular_pdf)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 34
    # data points      = 50
    # variables        = 2
    chi-square         = 9.11427030
    reduced chi-square = 0.18988063
    Akaike info crit   = -81.1090828
    Bayesian info crit = -77.2850368
    R-squared          = 0.92041334
[[Variables]]
    amplitude:  1125.37941 +/- 21.3058857 (1.89%) (init = 2.8)
    start:      0 (fixed)
    shape:      92.1393307 +/- 3.07739313 (3.34%) (init = 195)
    end:        390 (fixed)

[Figure: points with valid result]

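Now the parameters (amplitude and shape) have changed and the result is actually good, which suggests the method itself should work.

I am confused about where the problem lies.

Can anyone help me?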

Tags: python, curve-fitting, lmfit, triangle

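Comments

0 votes R. C. 6/20/2023
If your test case works and your real case does not, then the initial parameters were luckier for the test case than for the real one. I would try different initial parameters or a different method (the method argument of fit; see lmfit.github.io/lmfit-py/model.html#lmfit.model.Model.fit).
0 votes M Newville 6/20/2023
Initial values for parameters always matter. You might want to read (or re-read) the FAQ entry on trying to vary parameters that are used as discrete variables: lmfit.github.io/lmfit-py/...
0 votes Abdulrahman Sheikho 6/20/2023
I did try changing the initial parameter values, bringing them to within 5% of the likely correct answer, but it did not work at all. It is as if I am telling the algorithm "hey, your parameters are right there" and it still does not move them. I do not think this is about proper parameter initialization, because you can see from the test that the amplitude changed to a different value. I am afraid there is a problem in triangular_pdf that I cannot spot.
0 votes Abdulrahman Sheikho 6/20/2023
@MNewville, thanks for pointing out the FAQ; it had slipped my mind. I will re-review my code accordingly.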
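To illustrate the kind of situation the FAQ comment refers to, as I understand it: a least-squares solver estimates derivatives from very small changes in each parameter, so a parameter that enters the model only through a discrete step (an array index, a rounded value, an integer cast) may produce no change in the output at all. A minimal sketch with a hypothetical step_model, which is not part of lmfit or of the question's code:

```python
import numpy as np

def step_model(x, edge):
    # Hypothetical model that uses `edge` as a discrete array index
    out = np.zeros(len(x), dtype=float)
    out[int(edge):] = 1.0
    return out

x = np.arange(10.0)
base = step_model(x, 4.3)
nudged = step_model(x, 4.3 * (1 + 1e-7))  # a tiny, solver-sized change

# The output is identical, so a solver would see no effect from moving `edge`
print(np.array_equal(base, nudged))
```

When the model does not respond to such small steps, a fit report with parameters stuck "at initial value" is a plausible outcome.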

A:

0 votes Abdulrahman Sheikho 6/21/2023 #1

As @MNewville suggested, the problem was in the basis function. The solution is as follows:

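import numpy as np
from scipy import special

def triangular_pdf(x, amplitude, start, shape, end):
    model1 = np.zeros_like(x)            # zero baseline (adds nothing to the sum below)
    model2 = (x-start) / (shape-start)   # rising edge, reaches 1 at x = shape
    model3 = (end-x) / (end-shape)       # falling edge, equals 1 at x = shape
    
    # smooth switches around x = shape built from erf, replacing the hard boolean masks
    mask2 = (1-special.erf(x - shape)) / 2   # ~1 left of shape, ~0 right of it
    mask3 = (special.erf(x - shape)+1) / 2   # ~0 left of shape, ~1 right of it
    model = 2*amplitude* (model1 + model2 * mask2 + model3 * mask3) / (end - start)
    return model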
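As a quick check of this formulation, the sketch below evaluates the same erf-based function on the integer x array from the question. The parameter values (amplitude=390, shape=60) are assumptions picked only to give round numbers, not fitted results, and the local variable names are editorial.

```python
import numpy as np
from scipy import special

def triangular_pdf(x, amplitude, start, shape, end):
    # Same smooth formulation as in the answer above
    rising = (x - start) / (shape - start)
    falling = (end - x) / (end - shape)
    left = (1 - special.erf(x - shape)) / 2
    right = (1 + special.erf(x - shape)) / 2
    return 2 * amplitude * (rising * left + falling * right) / (end - start)

x = np.array([0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330, 360, 390])

# Assumed, illustrative parameter values
values = triangular_pdf(x, amplitude=390.0, start=0, shape=60, end=390)
print(values.dtype)  # floating point, even though x is an integer array
print(values[:4])

# Beyond `end` the falling edge keeps going, so the output turns negative
outside = triangular_pdf(np.array([420.0]), 390.0, 0, 60, 390)
print(outside)
```

Because every operation here is a true division or an erf call, the result is a float array regardless of the dtype of x. Unlike the original version, this function is not clipped to zero outside [start, end], which matters only if it is evaluated beyond the range of the data.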