提问人:Pedro Guidolim 提问时间:9/6/2023 最后编辑:Pedro Guidolim 更新时间:9/7/2023 访问量:66
python 和 lmfit:拟合多个不同长度的数据集
python and lmfit: Fit Multiple Data Sets with different lengths
问:
在 lmfit 文档中拟合多个数据集的示例中,数据变量具有相等长度的行。 我尝试调整示例以使用不同长度的数据集运行拟合。我所做的唯一更改是更改了五个不同数据集中每个数据集的 x 变量的长度,但没有成功。
我得到的错误是:
data = np.array(data)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (5,) + inhomogeneous part.
这是我尝试运行的代码:
import numpy as np
import matplotlib.pyplot as plt
from lmfit import minimize, Parameters, report_fit
def gauss(x, amp, cen, sigma):
"basic gaussian"
return amp*np.exp(-(x-cen)**2/(2.*sigma**2))
def gauss_dataset(params, i, x):
"""calc gaussian from params for data set i
using simple, hardwired naming convention"""
amp = params['amp_%i' % (i+1)].value
cen = params['cen_%i' % (i+1)].value
sig = params['sig_%i' % (i+1)].value
return gauss(x, amp, cen, sig)
def objective(params, x, data):
""" calculate total residual for fits to several data sets held
in a 2-D array, and modeled by Gaussian functions"""
ndata, nx = data.shape
resid = 0.0*data[:]
# make residual per data set
for i in range(ndata):
resid[i, :] = data[i, :] - gauss_dataset(params, i, x)
# now flatten this to a 1D array, as minimize() needs
return resid.flatten()
# create 5 datasets
data = []
for i in np.arange(5):
#Now x has diferrent legth each iterarion
x = np.linspace( -1, 2, 151+i)
params = Parameters()
amp = 0.60 + 9.50*np.random.rand()
cen = -0.20 + 1.20*np.random.rand()
sig = 0.25 + 0.03*np.random.rand()
dat = gauss(x, amp, cen, sig) + np.random.normal(size=len(x), scale=0.1)
data.append(dat)
# data has shape (5, 151)
data = np.array(data)
assert(data.shape) == (5, 151)
# create 5 sets of parameters, one per data set
fit_params = Parameters()
for iy, y in enumerate(data):
fit_params.add( 'amp_%i' % (iy+1), value=0.5, min=0.0, max=200)
fit_params.add( 'cen_%i' % (iy+1), value=0.4, min=-2.0, max=2.0)
fit_params.add( 'sig_%i' % (iy+1), value=0.3, min=0.01, max=3.0)
# but now constrain all values of sigma to have the same value
# by assigning sig_2, sig_3, .. sig_5 to be equal to sig_1
for iy in (2, 3, 4, 5):
fit_params['sig_%i' % iy].expr='sig_1'
# run the global fit to all the data sets
result = minimize(objective, fit_params, args=(x, data))
report_fit(result)
# plot the data sets and fits
plt.figure()
for i in range(5):
y_fit = gauss_dataset(fit_params, i, x)
plt.plot(x, data[i, :], 'o', x, y_fit, '-')
plt.show()
答:
0赞
M Newville
9/7/2023
#1
以下是同时拟合多个不同大小和形式的数据集的示例:
数据集 1:高斯与振幅、中心、西格玛 数据集 2:具有放大、频率、衰减、相移的衰减正弦波
我们假设高斯中心等于频率,高斯西格玛平方是正弦波的衰减参数。
import matplotlib.pyplot as plt
import numpy as np
from lmfit import Parameters, minimize, report_fit
from lmfit.lineshapes import gaussian
# parameters for mock data
gaus_amp = 200
gaus_cen = 4.5
gaus_sig = 7.0
sine_amp = 17
sine_shift = -0.75
sine_freq = gaus_cen
sine_decay = gaus_sig**2
# dataset 1: gaussian
x1 = np.linspace(-30, 30, 121)
y1 = gaussian(x1, gaus_amp, gaus_cen, gaus_sig)
y1 += np.random.normal(scale=0.25, size=len(y1))
# dataset 2: decaying sine wave
x2 = np.linspace(0, 150, 601)
y2 = sine_amp * np.sin(sine_shift +x2/sine_freq) * np.exp(-x2**2/sine_decay**2)
y2 += np.random.normal(scale=0.5, size=len(y2))
def objective(params, x1, x2, y1, y2):
gamp = params['g_amp']
gcen = params['g_cen']
gsig = params['g_sig']
gresid = y1 - gaussian(x1, gamp, gcen, gsig)
samp = params['s_amp']
sshift = params['s_shift']
sfreq = gcen
sdecay = gsig**2
sresid = y2 - samp * np.sin(sshift +x2/sfreq) * np.exp(-x2**2/sdecay**2)
# now concatenate the residuals of the data sets
return np.concatenate((gresid, sresid))
params = Parameters()
params.add('g_amp', value=150, min=0)
params.add('g_cen', value=5, min=0)
params.add('g_sig', value=5, min=0)
params.add('s_amp', value=10, min=0)
params.add('s_shift', value=-0.5, min=-3, max=3)
# Run the fit and show the fitting result
out = minimize(objective, params, args=(x1, x2, y1, y2))
report_fit(out.params)
这里的关键是从各个数据集的残差中制作一个一维残差数组。np.concatenate()
此示例将运行并打印出
[[Variables]]
g_amp: 201.092389 +/- 1.74123974 (0.87%) (init = 150)
g_cen: 4.50385672 +/- 0.00477268 (0.11%) (init = 5)
g_sig: 7.00094469 +/- 0.01821519 (0.26%) (init = 5)
s_amp: 16.8828293 +/- 0.08010449 (0.47%) (init = 10)
s_shift: -0.74422074 +/- 0.00555315 (0.75%) (init = -0.5)
[[Correlations]] (unreported correlations are < 0.100)
C(g_cen, s_shift) = +0.7679
C(g_sig, s_amp) = -0.5904
C(g_amp, g_sig) = +0.1502
我将把它留给读者一个练习,绘制与两个数据集的拟合,并考虑将其扩展到 30 个数据集;)。
评论