提问人:tpetzoldt 提问时间:3/8/2023 最后编辑:tpetzoldt 更新时间:3/8/2023 访问量:81
从 mlr3 中拟合的 regr.nnet 对象中提取权重
Extract weights from fitted regr.nnet object in mlr3
问:
这个问题与@Sebastian为上一个问题提供的解决方案有关。它展示了如何使用自定义(=固定)重采样策略和克隆学习者对学习者进行重复训练。regr.nnet
library(mlr3learners)
library(dplyr)
library(ggplot2)
set.seed(4123)
x <- 1:20
obs <- data.frame(
x = rep(x, 3),
f = factor(rep(c("a", "b", "c"), each = 20)),
y = c(3 * dnorm(x, 10, 3), 5 * dlnorm(x, 2, 0.5), dexp(20 - x, .5)) +
rnorm(60, sd = 0.02)
)
x_test <- seq(0, 20, length.out = 100)
test <- expand.grid(
x = x_test,
f = c("a", "b", "c"),
y = c(3 * dnorm(x_test, 10, 3), 5 * dlnorm(x_test, 2, 0.5),
dexp(20 - x_test, .5)) + rnorm(60, sd = 0.02)
)
dat <- rbind(obs, test)
task <- as_task_regr(dat, target = "y")
resampling <- rsmp("custom")
resampling$instantiate(task, list(train = 1:60), test = list(61:90060))
learner = lrn("regr.nnet", size=5, trace=FALSE)
learners <- replicate(10, learner$clone())
design <- benchmark_grid(
tasks = task,
learners = learners,
resampling
)
bmr <- benchmark(design)
现在的下一部分是进一步评估基准,并使用该模型在 mlr3 内部和外部进行进一步评估。在下文中,我尝试评估模型性能并绘制测试数据的预测图:
## evaluate quality criteria
bmr$aggregate()[learner_id == "regr.nnet"] # ok
bmr$aggregate(msr("time_train")) # works
# bmr$aggregate(msr("regr.rmse"), msr("regr.rsq"), msr("regr.bias")) # not possible
## select the best fit
i_best <- which.min(bmr$aggregate()$regr.mse)
best <- bmr$resample_result(i_best)
## do prediction
pr <- as.data.table(best$predictions()[[1]])$response
## visualization
pred_test <- test |> mutate(y = pr)
ggplot(obs, aes(x, y)) + geom_point() +
geom_line(data = pred_test, mapping = aes(x, y)) +
facet_wrap(~f)
R6 风格当然有其优点,我自己也参与了 R6 之前的原型包的开发,但有时要找到访问内部数据的最佳方式并不容易。mlr3 这本书非常有帮助,但问题仍然存在:
- 是否可以很容易地提取其他措施,例如 从基准对象?
msr("regr.rmse")
- 我对我的代码行不满意,但还没有找到更好的方法。
pr <- as.data.table() .....
- 最后,我想访问拟合的内部数据结构,以提取原始权重,以便在 R 之外“离线”使用神经网络。
nnet
答:
(我自己添加)如果您使用 mlr3 创建教材并想分享它,您可以在 mlr-org/mlr3website 存储库中创建问题或 PR。在 mlr-org 网站上,我们有一个资源选项卡,我们可以在其中链接类似的东西:)https://mlr-org.com/resources.html
是否可以很容易地提取其他措施,例如 从基准对象?
msr("regr.rmse")
该方法采用一系列度量值(例如,可由 构造)。(见下面的代码)$aggregate()
msrs()
- 我对我的代码行.....不满意,但还没有找到更好的方法。
pr <- as.data.table()
您可以不进行转换。best$predictions()[[1]]$response
- 最后,我想访问拟合 nnet 的内部数据结构,以提取原始权重,以便在 R 之外“离线”使用神经网络。
我们不干涉拟合物体的内部结构。它们可以通过训练有素的学习者的插槽访问(请参阅下面的代码)。$model
library(mlr3)
library(mlr3learners)
learner = lrns(c("classif.rpart", "classif.nnet"))
task = tsk("iris")
resampling = rsmp("holdout")
design = benchmark_grid(
tasks = task,
learners = learner,
resamplings = resampling
)
bmr = benchmark(design, store_models = TRUE)
#> INFO [07:13:52.802] [mlr3] Running benchmark with 2 resampling iterations
#> INFO [07:13:52.889] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 1/1)
#> INFO [07:13:52.921] [mlr3] Applying learner 'classif.nnet' on task 'iris' (iter 1/1)
#> # weights: 27
#> initial value 118.756408
#> iter 10 value 58.639749
#> iter 20 value 45.676852
#> iter 30 value 21.336083
#> iter 40 value 8.646964
#> iter 50 value 6.041813
#> iter 60 value 5.906140
#> iter 70 value 5.902865
#> iter 80 value 5.898339
#> final value 5.898161
#> converged
#> INFO [07:13:52.946] [mlr3] Finished benchmark
bmr$aggregate(msrs(c("classif.acc", "time_train")))
#> nr resample_result task_id learner_id resampling_id iters
#> 1: 1 <ResampleResult[21]> iris classif.rpart holdout 1
#> 2: 2 <ResampleResult[21]> iris classif.nnet holdout 1
#> classif.acc time_train
#> 1: 0.98 0.007
#> 2: 1.00 0.005
# get the first resample result
rr1 = bmr$resample_result(1)
# Get the model from the first resampling iteration of this ResampleResult
rr1$learners[[1]]$model
#> n= 100
#>
#> node), split, n, loss, yval, (yprob)
#> * denotes terminal node
#>
#> 1) root 100 66 versicolor (0.32000000 0.34000000 0.34000000)
#> 2) Petal.Length< 2.45 32 0 setosa (1.00000000 0.00000000 0.00000000) *
#> 3) Petal.Length>=2.45 68 34 versicolor (0.00000000 0.50000000 0.50000000)
#> 6) Petal.Width< 1.75 37 4 versicolor (0.00000000 0.89189189 0.10810811) *
#> 7) Petal.Width>=1.75 31 1 virginica (0.00000000 0.03225806 0.96774194) *
创建于 2023-03-08 with reprex v2.0.2
评论
根据 @Sebastian 的回答,我已经能够解决一些剩余的小细节,尤其是如何将其与上一个线程相结合。我采用了与 @Sebastian 的答案类似的示例,但将其修改为作为输入和目标的回归任务。然后,我集成了“蛮力”方法,以使用相同的设置训练多个网络,并将相同的细分应用于训练和测试数据。iris
Species
Petal.width
library(mlr3learners)
#> Loading required package: mlr3
set.seed(123)
data(iris) # to show that we use this data set
## half of indices for training and other for test subset
id_train <- sample(1:nrow(iris), nrow(iris) %/% 2)
id_test <- which(!((1:nrow(iris)) %in% id_train))
## create task, learner and custom resampling strategy
task <- as_task_regr(iris, target = "Petal.Width")
learner <- lrn("regr.nnet", size = 5, trace = FALSE)
resampling <- rsmp("custom")
resampling$instantiate(task, train = list(id_train), test = list(id_test))
## replicate the learners, create and run a benchmark design
learners <- replicate(10, learner$clone())
design <- benchmark_grid(
tasks = task,
learners = learners,
resamplings = resampling
)
bmr <- benchmark(design, store_models = TRUE)
## summary of results and "best" result according to mse
bmr$aggregate(msrs(c("regr.mse", "regr.rmse", "regr.rsq", "time_train")))
(i_best <- which.min(bmr$aggregate()$regr.mse))
#> [1] 3
best <- bmr$resample_result(i_best)
## prediction for the test set
best$predictions()[[1]]$response
## or for both subsets
best$learners[[1]]$predict(task, row_ids = id_train)$response
best$learners[[1]]$predict(task, row_ids = id_test)$response
## extract raw model and the weights with coef() method from package "nnet"
best$learners[[1]]$model
#> a 5-5-1 network with 36 weights
#> inputs: Petal.Length Sepal.Length Sepal.Width Speciesversicolor Speciesvirginica
#> output(s): Petal.Width
#> options were - linear output units
coef(best$learners[[1]]$model)
#> b->h1 i1->h1 i2->h1 i3->h1 i4->h1 i5->h1
#> 0.33696598 5.87087353 5.86170711 2.01521854 -0.40507201 0.59342370
#> b->h2 i1->h2 i2->h2 i3->h2 i4->h2 i5->h2
#> -0.28521282 -0.66194804 -2.53390441 -1.83519953 -0.36281388 0.08545735
#> b->h3 i1->h3 i2->h3 i3->h3 i4->h3 i5->h3
#> -0.73454761 -0.57450360 0.15459231 -0.92357099 -1.44787867 3.87019807
#> b->h4 i1->h4 i2->h4 i3->h4 i4->h4 i5->h4
#> 4.15247439 -0.31732640 -0.07613404 -0.33243948 -1.59663476 -5.46270442
#> b->h5 i1->h5 i2->h5 i3->h5 i4->h5 i5->h5
#> 1.31088094 -0.51583287 3.79240044 3.43102418 -0.56412793 -0.46786541
#> b->o h1->o h2->o h3->o h4->o h5->o
#> 0.30750849 12.55643667 3.18489708 -2.57517235 -2.36851810 -10.44863295
创建于 2023-03-08 with reprex v2.0.2
评论
as.data.table()
Predictions
data
?Predictions
best$predictions()[[1]][["data"]][["response"]]
?Predictions