Asked by: Vittoria Roatti · Asked: 4/28/2023 · Last edited by: Gilles Quénot, Vittoria Roatti · Updated: 4/28/2023 · Views: 35
Calculate duration through while loop with conditional break
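Q:
I am trying to calculate the time spent out of sight during behavioural observations. I have a df (link) in which an "Out_Sight" period stops when there is an "In_Sight" or when the observation ends ("End_Focal"). I want to calculate the out-of-sight time for each focal observation. Each observation has a unique "ObsID", and the duration should appear in the matching row of "foc_summary", as shown in the figure below.
I created a loop in which the records belonging to each ObsID should be processed separately. Whenever the for loop encounters an Out_Sight row, it adds time to the new Time_OutSight variable until it reaches an "In_Sight" or "End_Focal" row (the "stop" list).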
df$HMS <- lubridate::hms(df$Time, roll = F)
library(dplyr)
foc_summary <- df %>%
  mutate(ObsID=factor(ObsID)) %>%
  group_by(ObsID) %>%
  slice(1)
detach('package:dplyr')
foc_summary <- as.data.frame(foc_summary)
df <- df[order(df$ObsID, df$HMS),] # Order focal data to have the right sequence
stop <- list("In_Sight","End_Focal")
foc_summary$Time_OutSight <- NULL
for (i in 1:nrow(foc_summary)){
  I <- foc_summary$ObsID[i]
  print(I)
  tabI <- droplevels(subset(df, df$ObsID == I))
  Time_OutSight <- c()
  for(j in 1:nrow(tabI)){
    if (tabI$Setting[j] %in% "Out_Sight"){
      k=j+1
      while(k < nrow(tabI) &&
            tabI$Setting[k] %!in% stop){k=k+1}
      print(tabI$Setting[k]); print(tabI$HMS[k])
      Time_OutSight <- c(Time_OutSight, as.numeric(tabI$HMS[k])-as.numeric(tabI$HMS[j]))
    }
  }
  foc_summary$Time_OutSight[i] <- sum(Time_OutSight)
}
The code relies on a "not in" function:
'%!in%' <- function(x,y)!('%in%'(x,y))
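The code only partly works: it does not seem to stop at "End_Focal", which results in overly long out-of-sight times.
Can someone help me understand why the code doesn't work? Thanks in advance for your help!
A:
1 upvote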
r2evans
4/28/2023
#1
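I suggest that for/while loops are neither the best nor the most efficient way to do a summarizing operation like this. I think we can simply summarize by group, as shown below.
I'll demonstrate in dplyr, though this can also be done in base R or data.table with a little more work.
First, let's find all of the rows we need to work with.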
library(dplyr)
focal %>%
  mutate(HMS = lubridate::hms(HMS)) %>%
  filter(Setting %in% c("Out_Sight", "In_Sight", "End_Focal")) %>%
  group_by(ObsID) %>%
  mutate(outofsight = cumsum(Setting == "Out_Sight")) %>%
  select(ObsID, outofsight, Time, HMS, Setting)
# # A tibble: 37 × 5
# # Groups:   ObsID [6]
#    ObsID outofsight Time     HMS         Setting
#    <int>      <int> <chr>    <Period>    <chr>
#  1   548          1 13:00:26 13H 0M 26S  Out_Sight
#  2   548          1 15:08:59 15H 8M 59S  End_Focal
#  3  1535          0 11:12:35 11H 12M 35S End_Focal
#  4  1731          1 8:15:20  8H 15M 20S  Out_Sight
#  5  1731          1 8:15:31  8H 15M 31S  In_Sight
#  6  1731          2 8:15:47  8H 15M 47S  Out_Sight
#  7  1731          2 8:22:59  8H 22M 59S  In_Sight
#  8  1731          3 8:30:31  8H 30M 31S  Out_Sight
#  9  1731          3 8:31:11  8H 31M 11S  In_Sight
# 10  1731          4 8:32:27  8H 32M 27S  Out_Sight
# # … with 27 more rows
# # ℹ Use `print(n = ...)` to see more rows
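To see how the cumsum() counter labels each out-of-sight bout, here is a minimal base R sketch on a made-up event sequence (the setting vector is illustrative and not taken from the linked data):

```r
# Made-up event sequence for a single observation
setting <- c("Out_Sight", "In_Sight", "Out_Sight", "End_Focal")

# The comparison is TRUE at each "Out_Sight"; cumsum() turns that into a
# counter that increases by one at the start of every bout
outofsight <- cumsum(setting == "Out_Sight")

print(data.frame(setting, outofsight))
```

Rows that share a counter value belong to the same bout, which is what makes the counter usable as a second grouping variable.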
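The outofsight variable is a second grouping variable. Within each (ObsID, outofsight) group, only the first row can be "Out_Sight", so:
- if the first row is "Out_Sight" and there are two or more rows, then we need the time span between the first and last rows;
- if the first row is "Out_Sight" and there is only one row, then we never see an ending setting, so we will discard that group;
- if the first row is not "Out_Sight", then the ObsID has not gone out of sight in that group, so we will discard that group.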
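The three cases can be sketched in base R on a few made-up rows. Everything here (toy, secs as seconds since midnight, the keep helper) is hypothetical and only illustrates the rule; it assumes the rows are already filtered to the three settings and sorted by time.

```r
# Made-up rows for one observation; secs is seconds since midnight
toy <- data.frame(
  Setting = c("In_Sight",              # group 0: not out of sight yet
              "Out_Sight", "In_Sight", # group 1: complete bout
              "Out_Sight"),            # group 2: no stop row recorded
  secs = c(10, 20, 35, 50)
)
toy$outofsight <- cumsum(toy$Setting == "Out_Sight")

# Keep a group only if it starts with "Out_Sight" and has a second row;
# any later row in such a group is necessarily a stop row
keep <- function(g) g$Setting[1] == "Out_Sight" && nrow(g) >= 2
groups <- Filter(keep, split(toy, toy$outofsight))

# Span between the first and last row of each kept group
spans <- sapply(groups, function(g) g$secs[nrow(g)] - g$secs[1])
print(spans)
```

Only group 1 survives the filter: group 0 does not start with "Out_Sight" and group 2 has no ending row.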
With this, we can add some summarizing:
focal %>%
  mutate(HMS = lubridate::hms(HMS)) %>%
  filter(Setting %in% c("Out_Sight", "In_Sight", "End_Focal")) %>%
  group_by(ObsID) %>%
  mutate(outofsight = cumsum(Setting == "Out_Sight")) %>%
  select(ObsID, outofsight, Time, HMS, Setting) %>%
  group_by(ObsID, outofsight) %>%
  filter(first(Setting) == "Out_Sight", any(Setting %in% c("In_Sight", "End_Focal"))) %>%
  summarize(time = last(HMS) - first(HMS)) %>%
  ungroup()
# # A tibble: 16 × 3
#    ObsID outofsight time
#    <int>      <int> <Period>
#  1   548          1 2H 8M 33S
#  2  1731          1 11S
#  3  1731          2 7M 12S
#  4  1731          3 1M -20S
#  5  1731          4 22S
#  6  1731          5 31S
#  7  1731          6 21S
#  8  1731          7 1M -15S
#  9  1731          8 1M 13S
# 10  1731          9 2M 28S
# 11  3097          1 3M -29S
# 12  3097          2 12M 36S
# 13  3100          1 9M 43S
# 14  3440          1 27S
# 15  3440          2 29S
# 16  3440          3 11M -1S
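Values such as "1M -20S" in the output above look odd because subtracting two lubridate Period objects works field by field and does not normalize the result. A small sketch using the two times of the third bout of ObsID 1731 (the variable names start, end and span are only for illustration):

```r
library(lubridate)

# Start and end of the third out-of-sight bout of ObsID 1731
start <- hms("8:30:31")
end   <- hms("8:31:11")

# Field-by-field subtraction: 31M - 30M and 11S - 31S
span <- end - start
print(span)

# Converting to seconds collapses the fields: 1 * 60 + (-20)
print(period_to_seconds(span))
```

The span is still correct (40 seconds); it is just displayed with a negative seconds field.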
Personally, I prefer seconds over Periods, so change the last line to:
... %>%
  summarize(secs = lubridate::period_to_seconds(last(HMS) - first(HMS)))
# # A tibble: 16 × 3
# # Groups:   ObsID [5]
#    ObsID outofsight  secs
#    <int>      <int> <dbl>
#  1   548          1  7713
#  2  1731          1    11
#  3  1731          2   432
#  4  1731          3    40
#  5  1731          4    22
#  6  1731          5    31
#  7  1731          6    21
#  8  1731          7    45
#  9  1731          8    73
# 10  1731          9   148
# 11  3097          1   151
# 12  3097          2   756
# 13  3100          1   583
# 14  3440          1    27
# 15  3440          2    29
# 16  3440          3   659
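Here we get the length of each out-of-sight period (one or more) per ObsID; an ObsID that has no out-of-sight period does not appear at all. You can summarize these a little further: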
focal %>%
  mutate(HMS = lubridate::hms(HMS)) %>%
  filter(Setting %in% c("Out_Sight", "In_Sight", "End_Focal")) %>%
  group_by(ObsID) %>%
  mutate(outofsight = cumsum(Setting == "Out_Sight")) %>%
  select(ObsID, outofsight, Time, HMS, Setting) %>%
  group_by(ObsID, outofsight) %>%
  filter(first(Setting) == "Out_Sight", any(Setting %in% c("In_Sight", "End_Focal"))) %>%
  summarize(secs = lubridate::period_to_seconds(last(HMS) - first(HMS))) %>%
  group_by(ObsID) %>%
  summarize(secs = sum(secs))
# # A tibble: 5 × 2
#   ObsID  secs
#   <int> <dbl>
# 1   548  7713
# 2  1731   823
# 3  3097   907
# 4  3100   583
# 5  3440   715
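One thing worth checking against your own data: last() measures to the final row of each group. If a group can contain an "In_Sight" followed later by "End_Focal" (an assumption here, not something confirmed for the linked data), the span would run up to "End_Focal". The sketch below uses made-up data and plain seconds to compare that with measuring to the first stop row; toy, stops, to_last_row and to_first_stop are illustrative names.

```r
library(dplyr)

# Made-up data: bout 2 has an In_Sight and, later, the End_Focal row
toy <- tibble(
  ObsID   = 1,
  secs    = c(0, 10, 30, 45, 100),
  Setting = c("Out_Sight", "In_Sight", "Out_Sight", "In_Sight", "End_Focal")
)

stops <- c("In_Sight", "End_Focal")

res <- toy %>%
  group_by(ObsID) %>%
  mutate(outofsight = cumsum(Setting == "Out_Sight")) %>%
  group_by(ObsID, outofsight) %>%
  filter(first(Setting) == "Out_Sight", any(Setting %in% stops)) %>%
  summarize(
    # span up to the last row of the group
    to_last_row   = last(secs) - first(secs),
    # span up to the first stop row after Out_Sight
    to_first_stop = secs[Setting %in% stops][1] - first(secs),
    .groups = "drop"
  )
print(res)
```

If the two columns differ for your data, the first-stop variant is likely the one that matches "stop at In_Sight or End_Focal, whichever comes first".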
Comments
1 upvote
Vittoria Roatti
4/28/2023
Thank you r2evans, this is a very effective solution! :)