Calculate duration through while loop with conditional break

Asked by: Vittoria Roatti  Asked: 4/28/2023  Last edited by: Gilles Quénot, Vittoria Roatti  Updated: 4/28/2023  Views: 35

Q:

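I'm trying to calculate the time spent out of sight during behavioural observations. I have a df (link) in which an "Out_Sight" period stops when there is an "In_Sight" or when the observation ends ("End_Focal"). I would like to calculate the out-of-sight time for each focal observation. Each observation has a unique "ObsID", and the duration should appear in the corresponding row of "foc_summary", as shown in the image below.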

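I created a loop in which the records belonging to each ObsID should be processed separately. When the inner for loop encounters an "Out_Sight", it scans forward until it finds an "In_Sight" or "End_Focal" (the "stop" values) and adds the elapsed time to the new Time_OutSight variable.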

df$HMS <- lubridate::hms(df$Time, roll = F)

library(dplyr)
foc_summary <- df %>% 
  mutate(ObsID=factor(ObsID)) %>% 
  group_by(ObsID) %>% 
  slice(1) 
detach('package:dplyr')

foc_summary <- as.data.frame(foc_summary)

df <- df[order(df$ObsID, df$HMS),] # Order focal data to have the right sequence

stop <- list("In_Sight","End_Focal")

foc_summary$Time_OutSight <- NULL

for (i in 1:nrow(foc_summary)){
  I <- foc_summary$ObsID[i]
  print(I)
  
  tabI <- droplevels(subset(df, df$ObsID == I))
  Time_OutSight <- c()
  
  for(j in 1:nrow(tabI)){
    if (tabI$Setting[j] %in% "Out_Sight"){
      
      k=j+1
      while(k < nrow(tabI) &&
            tabI$Setting[k] %!in% stop){k=k+1}
      print(tabI$Setting[k]); print(tabI$HMS[k])
      
      Time_OutSight <- c(Time_OutSight, as.numeric(tabI$HMS[k])-as.numeric(tabI$HMS[j]))
    }
  }
  foc_summary$Time_OutSight[i] <- sum(Time_OutSight)
}

The code uses a "not in" function:

'%!in%' <- function(x,y)!('%in%'(x,y))
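A quick check of how this helper behaves with a list like the one stored in `stop` suggests that the list itself is not what prevents the loop from stopping. This is a minimal sketch; `stop_values` is a hypothetical name chosen here so that it does not share a name with base R's `stop()` function.

```r
# Negated %in%, equivalent to the definition above.
`%!in%` <- function(x, y) !(x %in% y)

# Hypothetical name; the question stores the same list in `stop`.
stop_values <- list("In_Sight", "End_Focal")

# %in% converts the list to a character vector before matching,
# so both stop values are recognised.
"End_Focal" %!in% stop_values   # FALSE
"Out_Sight" %!in% stop_values   # TRUE

# A missing Setting (for example, an index past the last row) never
# matches, so the negated test is TRUE.
NA %!in% stop_values            # TRUE
```

Since `NA` counts as "not a stop value", an index that runs past the last row of a group will not end the scan by itself; only the `k < nrow(tabI)` guard does.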

The code only partially works: it does not seem to stop at "End_Focal", which results in far too long out-of-sight times.

Can someone help me understand why the code isn't working? Thanks in advance for your help!

r for-loop while-loop duration

Comments

1 upvote Gilles Quénot 4/28/2023
Next time, add your language tag. Done for you.

A:

1 upvote r2evans 4/28/2023 #1

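I suggest that for/while loops are neither optimal nor efficient for summarizing an operation like this. I think we can simply summarize by group, as shown below.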

I'll demonstrate in dplyr, though this can also be done in base R or data.table with a little more work.

First, let's find all of the rows that we need to work with.

library(dplyr)
focal %>%
  mutate(HMS = lubridate::hms(HMS)) %>%
  filter(Setting %in% c("Out_Sight", "In_Sight", "End_Focal")) %>%
  group_by(ObsID) %>%
  mutate(outofsight = cumsum(Setting == "Out_Sight")) %>%
  select(ObsID, outofsight, Time, HMS, Setting)
# # A tibble: 37 × 5
# # Groups:   ObsID [6]
#    ObsID outofsight Time     HMS         Setting  
#    <int>      <int> <chr>    <Period>    <chr>    
#  1   548          1 13:00:26 13H 0M 26S  Out_Sight
#  2   548          1 15:08:59 15H 8M 59S  End_Focal
#  3  1535          0 11:12:35 11H 12M 35S End_Focal
#  4  1731          1 8:15:20  8H 15M 20S  Out_Sight
#  5  1731          1 8:15:31  8H 15M 31S  In_Sight 
#  6  1731          2 8:15:47  8H 15M 47S  Out_Sight
#  7  1731          2 8:22:59  8H 22M 59S  In_Sight 
#  8  1731          3 8:30:31  8H 30M 31S  Out_Sight
#  9  1731          3 8:31:11  8H 31M 11S  In_Sight 
# 10  1731          4 8:32:27  8H 32M 27S  Out_Sight
# # … with 27 more rows
# # ℹ Use `print(n = ...)` to see more rows
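The cumsum() step is what labels each out-of-sight episode. A minimal base R sketch on a made-up Setting vector (not taken from the data) shows the counter increasing at every "Out_Sight" and carrying over to the rows that follow it:

```r
# Made-up settings for a single observation, already in time order.
setting <- c("In_Sight", "Out_Sight", "In_Sight",
             "Out_Sight", "Out_Sight", "End_Focal")

# TRUE counts as 1, so the running sum goes up by one at each "Out_Sight".
episode <- cumsum(setting == "Out_Sight")

print(data.frame(setting, episode))
```

Rows that come before the first "Out_Sight" get episode 0, and an "Out_Sight" that is immediately followed by another "Out_Sight" ends up alone in its own episode.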

The outofsight variable is a second grouping variable. Looking at the first row of each (ObsID, outofsight) group:

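  • if the first row is "Out_Sight" and there are two or more rows, then we need the timespan between the first and last rows;
  • if the first row is "Out_Sight" and there is only one row, then we never see an ending setting, so we discard that group;
  • if the first row is not "Out_Sight", then the ObsID has not gone out of sight at that point, so we discard that group.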
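The three cases can be sketched as a small predicate. `keep_episode` is a hypothetical helper name, and its input is assumed to be the Setting values of one (ObsID, outofsight) group in time order:

```r
# Keep a group only if it starts with "Out_Sight" and has at least one
# later row (an "In_Sight" or "End_Focal") that closes the episode.
keep_episode <- function(setting) {
  setting[1] == "Out_Sight" && length(setting) >= 2
}

keep_episode(c("Out_Sight", "In_Sight"))   # TRUE: start and end are both known
keep_episode("Out_Sight")                  # FALSE: no ending setting seen
keep_episode("End_Focal")                  # FALSE: never went out of sight
```

Because the counter only increases on "Out_Sight", every row after the first one in a group is necessarily an "In_Sight" or "End_Focal", so checking the row count is enough in this sketch.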

With this, we can add some summarizing:

focal %>%
  mutate(HMS = lubridate::hms(HMS)) %>%
  filter(Setting %in% c("Out_Sight", "In_Sight", "End_Focal")) %>%
  group_by(ObsID) %>%
  mutate(outofsight = cumsum(Setting == "Out_Sight")) %>%
  select(ObsID, outofsight, Time, HMS, Setting) %>%
  group_by(ObsID, outofsight) %>%
  filter(first(Setting) == "Out_Sight", any(Setting %in% c("In_Sight", "End_Focal"))) %>%
  summarize(time = last(HMS) - first(HMS)) %>%
  ungroup()
# # A tibble: 16 × 3
#    ObsID outofsight time     
#    <int>      <int> <Period> 
#  1   548          1 2H 8M 33S
#  2  1731          1 11S      
#  3  1731          2 7M 12S   
#  4  1731          3 1M -20S  
#  5  1731          4 22S      
#  6  1731          5 31S      
#  7  1731          6 21S      
#  8  1731          7 1M -15S  
#  9  1731          8 1M 13S   
# 10  1731          9 2M 28S   
# 11  3097          1 3M -29S  
# 12  3097          2 12M 36S  
# 13  3100          1 9M 43S   
# 14  3440          1 27S      
# 15  3440          2 29S      
# 16  3440          3 11M -1S  

Personally, I prefer seconds to Periods, so change the summarize line to:

... %>%
  summarize(secs = lubridate::period_to_seconds(last(HMS) - first(HMS)))
# # A tibble: 16 × 3
# # Groups:   ObsID [5]
#    ObsID outofsight  secs
#    <int>      <int> <dbl>
#  1   548          1  7713
#  2  1731          1    11
#  3  1731          2   432
#  4  1731          3    40
#  5  1731          4    22
#  6  1731          5    31
#  7  1731          6    21
#  8  1731          7    45
#  9  1731          8    73
# 10  1731          9   148
# 11  3097          1   151
# 12  3097          2   756
# 13  3100          1   583
# 14  3440          1    27
# 15  3440          2    29
# 16  3440          3   659
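The arithmetic behind these values can be checked by hand without lubridate. The sketch below uses a hypothetical helper, `to_seconds`, which assumes times are "H:M:S" strings within a single day:

```r
# Hypothetical helper: convert "H:M:S" strings to seconds since midnight.
to_seconds <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE),
         function(p) sum(as.numeric(p) * c(3600, 60, 1)),
         numeric(1))
}

# ObsID 1731, third episode: 8:30:31 to 8:31:11.
to_seconds("8:31:11") - to_seconds("8:30:31")     # 40

# ObsID 548: 13:00:26 to 15:08:59.
to_seconds("15:08:59") - to_seconds("13:00:26")   # 7713
```

The first case is the one printed as 1M -20S in the Period output: one minute minus twenty seconds is 40 seconds.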

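The secs column gives the length of each out-of-sight period (one or more) for each ObsID; any ObsID without an out-of-sight period is dropped. You can summarize these further with: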

focal %>%
  mutate(HMS = lubridate::hms(HMS)) %>%
  filter(Setting %in% c("Out_Sight", "In_Sight", "End_Focal")) %>%
  group_by(ObsID) %>%
  mutate(outofsight = cumsum(Setting == "Out_Sight")) %>%
  select(ObsID, outofsight, Time, HMS, Setting) %>%
  group_by(ObsID, outofsight) %>%
  filter(first(Setting) == "Out_Sight", any(Setting %in% c("In_Sight", "End_Focal"))) %>%
  summarize(secs = lubridate::period_to_seconds(last(HMS) - first(HMS))) %>%
  group_by(ObsID) %>%
  summarize(secs = sum(secs))
# # A tibble: 5 × 2
#   ObsID  secs
#   <int> <dbl>
# 1   548  7713
# 2  1731   823
# 3  3097   907
# 4  3100   583
# 5  3440   715
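For comparison, the same idea can be sketched in base R. This is an illustrative sketch rather than the answer's code: the toy `focal` data frame contains only the first nine rows shown in the earlier output, `to_seconds` is a hypothetical helper, and rows are assumed to be in time order within each ObsID. Because ObsID 1731 is truncated to three episodes here, its total differs from the full result.

```r
# Toy data: the first nine rows printed in the answer's first output.
focal <- data.frame(
  ObsID = c(548, 548, 1535, 1731, 1731, 1731, 1731, 1731, 1731),
  Time = c("13:00:26", "15:08:59", "11:12:35", "8:15:20", "8:15:31",
           "8:15:47", "8:22:59", "8:30:31", "8:31:11"),
  Setting = c("Out_Sight", "End_Focal", "End_Focal", "Out_Sight", "In_Sight",
              "Out_Sight", "In_Sight", "Out_Sight", "In_Sight"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "H:M:S" strings to seconds since midnight.
to_seconds <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE),
         function(p) sum(as.numeric(p) * c(3600, 60, 1)),
         numeric(1))
}
focal$secs <- to_seconds(focal$Time)

# Keep only the settings that start or end an out-of-sight episode.
focal <- focal[focal$Setting %in% c("Out_Sight", "In_Sight", "End_Focal"), ]

# Episode counter within each ObsID: +1 at every "Out_Sight".
focal$outofsight <- ave(as.integer(focal$Setting == "Out_Sight"),
                        focal$ObsID, FUN = cumsum)

# One duration per episode that starts with "Out_Sight" and has an end row.
episodes <- split(focal, list(focal$ObsID, focal$outofsight), drop = TRUE)
durations <- lapply(episodes, function(g) {
  if (g$Setting[1] != "Out_Sight" || nrow(g) < 2) return(NULL)
  data.frame(ObsID = g$ObsID[1], secs = g$secs[nrow(g)] - g$secs[1])
})
per_episode <- do.call(rbind, Filter(Negate(is.null), durations))

# Total out-of-sight seconds per ObsID.
totals <- aggregate(secs ~ ObsID, data = per_episode, FUN = sum)
print(totals)
```

On this toy subset the totals are 7713 seconds for ObsID 548 and 483 seconds (11 + 432 + 40) for ObsID 1731, while ObsID 1535 drops out because it never goes out of sight.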

Comments

1 upvote Vittoria Roatti 4/28/2023
Thanks r2evans, this is a very efficient solution! :)