R switches from PDT to PST at 1:36:14 and not at 2:00:00 - Lubridate assigns time zone before switch

Asked by: Colton Stephens  Asked: 11/9/2023  Last edited by: Colton Stephens  Updated: 11/21/2023  Views: 106

Q:

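When looking at datetime values that overlap the time zone change from PDT to PST, R appears to switch time zones at 1:36:14 rather than at 2:00:00 as expected. Specifically, R assigns the PST time zone to every datetime after 2021-11-07 01:36:14 (shown below):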

x <- c(
    "2021-11-07 1:00:00",
    "2021-11-07 1:00:01",
    "2021-11-07 1:35:00",
    "2021-11-07 1:36:00",
    "2021-11-07 1:36:10",
    "2021-11-07 1:36:14",
    "2021-11-07 1:36:15",
    "2021-11-07 1:36:30",
    "2021-11-07 1:36:59",
    "2021-11-07 1:45:00",
    "2021-11-07 1:59:59",
    "2021-11-07 2:00:00",
    "2021-11-07 2:30:00"
    )
x_pst <- as.POSIXct(x, tz = "PST8PDT")
> x_pst
# ...
[5] "2021-11-07 01:36:10 PDT" "2021-11-07 01:36:14 PDT"
[7] "2021-11-07 01:36:15 PST" "2021-11-07 01:36:30 PST"
# ...

Beyond that, lubridate appears to assign PST to all of the datetimes, including those before the switch (using the same data):

x_pst <- lubridate::as_datetime(x, tz = "PST8PDT")
> x_pst
[1] "2021-11-07 01:00:00 PST" "2021-11-07 01:00:01 PST"
[3] "2021-11-07 01:35:00 PST" "2021-11-07 01:36:00 PST"
[5] "2021-11-07 01:36:10 PST" "2021-11-07 01:36:14 PST"
[7] "2021-11-07 01:36:15 PST" "2021-11-07 01:36:30 PST"
[9] "2021-11-07 01:36:59 PST" "2021-11-07 01:45:00 PST"
[11] "2021-11-07 01:59:59 PST" "2021-11-07 02:00:00 PST"
[13] "2021-11-07 02:30:00 PST"

x_pst <- lubridate::ymd_hms(x, tz = "PST8PDT")
> x_pst
# same output as above

So, why does the time zone switch at such a specific time, and what is lubridate doing by assigning PST to all of the datetimes before the change?

Session info:

> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.0

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: US/Pacific
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets 
[6] methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.3.1   generics_0.1.3   tools_4.3.1     
[4] lubridate_1.9.3  timechange_0.2.0
r datetime lubridate

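Comments

2 upvotes thelatemail 11/9/2023
I can reproduce this, going back at least as far as R 3.6. I also get the same result with "US/Pacific".
5 upvotes Ben Bolker 11/9/2023
Hmm. On my system, all(sapply(as.POSIXlt(x_pst), \(x) x$zone) == "PST") is TRUE for the first example. (Pop!_OS 22.04, locale is en_CA.UTF-8, local time zone is America/Toronto.) Can you edit your question to include the results of sessionInfo()?
3 upvotes thelatemail 11/9/2023
@BenBolker - confirming that this is definitely FALSE for me on R 4.3.1 on Windows 10. tzcode source: internal.
3 upvotes Ben Bolker 11/9/2023
FWIW I have "tzcode source: system (glibc)", using R-devel.
3 upvotes IRTFM 11/10/2023
I don't think lubridate is the cause, since you also see it with as.POSIXct. I can confirm this behavior on a Mac running 4.2.3.
4 upvotes Joseph Wood 11/23/2023
When I have time, I plan to write a pure C function to confirm this. If you read the documentation on mktime, there doesn't seem to be any real indication of the cause.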

A:

7 upvotes Joseph Wood 11/24/2023 #1

This is not a complete answer, but I hope someone with more expertise can build on it.

as.POSIXct

Before getting into the code, I first want to provide some context. We start with as.POSIXct, an S3 generic function with several methods defined.

as.POSIXct
#> function (x, tz = "", ...)
#> UseMethod("as.POSIXct")

methods(as.POSIXct)
#> [1] as.POSIXct.Date    as.POSIXct.default as.POSIXct.numeric as.POSIXct.POSIXlt
#> see '?methods' for accessing help and source code

For the example given by the OP, we are dealing with character data, so the default method is dispatched:

as.POSIXct.default
#> function (x, tz = "", ...)
#> {
#>     if (inherits(x, "POSIXct"))
#>         return(if (missing(tz)) x else .POSIXct(x, tz))
#>     if (is.null(x))
#>         return(.POSIXct(numeric(), tz))
#>     if (is.character(x) || is.factor(x))
#>         return(as.POSIXct(as.POSIXlt(x, tz, ...), tz, ...))
#>     if (is.logical(x) && all(is.na(x)))
#>         return(.POSIXct(as.numeric(x), tz))
#>     stop(gettextf("do not know how to convert '%s' to class %s",
#>         deparse1(substitute(x)), dQuote("POSIXct")), domain = NA)
#> }

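This leads us to call (third condition above) as.POSIXlt, an S3 generic function that happens to have a character method: as.POSIXlt.character. I won't paste the source code, but the heart of that function is strptime.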

strptime
#> function (x, format, tz = "")
#> .Internal(strptime(if (is.character(x)) x else if (is.object(x)) `names<-`(as.character(x),
#>     names(x)) else `storage.mode<-`(x, "character"), format, tz))

You can view the C code here. I initially tried to follow the code logically, but that proved very difficult.

RApiDatetime

Fortunately, there is a package, RApiDatetime (thanks, Dirk!), that provides the function we need: RApiDatetime::rapistrptime. Calling it on the values provided by the OP, we have:

RApiDatetime::rapistrptime(x, fmt = "%Y-%m-%d %H:%M:%OS", "PST8PDT")
#> $sec
#>  [1]  0  1  0  0 10 14 15 30 59  0 59  0  0
#>
#> $min
#>  [1]  0  0 35 36 36 36 36 36 36 45 59  0 30
#>
#> $hour
#>  [1] 1 1 1 1 1 1 1 1 1 1 1 2 2
#>
#> $mday
#>  [1] 7 7 7 7 7 7 7 7 7 7 7 7 7
#>
#> $mon
#>  [1] 10 10 10 10 10 10 10 10 10 10 10 10 10
#>
#> $year
#>  [1] 121 121 121 121 121 121 121 121 121 121 121 121 121
#>
#> $wday
#>  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0
#>
#> $yday
#>  [1] 310 310 310 310 310 310 310 310 310 310 310 310 310
#>
#> $isdst
#>  [1] 1 1 1 1 1 1 0 0 0 0 0 0 0
#>
#> $zone
#>  [1] "PDT" "PDT" "PDT" "PDT" "PDT" "PDT" "PST" "PST" "PST" "PST" "PST" "PST" "PST"
#>
#> $gmtoff
#>  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA
#>
#> attr(,"class")
#> [1] "POSIXlt" "POSIXt"
#> attr(,"tzone")
#> [1] "PST8PDT" "PST"     "PDT"

The isdst field looks worth investigating. After cloning the repo and making crude use of printf, I found the path easier to follow. The real action behind isdst happens here:

.
.
    OK = tm->tm_year < 138 && tm->tm_year >= (have_broken_mktime() ? 70 : 02);
    if(OK) {
    res = (double) mktime(tm);
    if (res == -1.) return res;
.
.

mktime

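Finally, we come to mktime, the subject of my claim in the comments.

I wrote this very simple C++ function to see what happens to our struct after calling mktime: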

#include <Rcpp.h>
using namespace Rcpp;

#include <time.h>
#include <stdio.h>

// [[Rcpp::export]]
void CheckMkTime(int tm_sec) {
    struct tm info;

    info.tm_sec = tm_sec;
    info.tm_min = 36;
    info.tm_hour = 1;
    info.tm_mday = 7;
    info.tm_mon = 10;
    info.tm_year = 121;
    info.tm_wday = 0;
    info.tm_yday = 310;
    info.tm_isdst = -1;

    time_t val = mktime(&info);
    printf("mktime_res: %jd,\n tm_zone: %s,\n tm_gmtoff: %ld,\n tm_sec: %d,\n "
           "tm_min: %d,\n tm_hour: %d,\n tm_mday: %d,\n tm_mon: %d,\n "
           "tm_year: %d,\n tm_wday: %d,\n tm_yday: %d,\n tm_isdst: %d,\n",
           val,
           info.tm_zone,
           info.tm_gmtoff,
           info.tm_sec,
           info.tm_min,
           info.tm_hour,
           info.tm_mday,
           info.tm_mon,
           info.tm_year,
           info.tm_wday,
           info.tm_yday,
           info.tm_isdst);
}

Calling it with tm_sec = 14, we have:

CheckMkTime(14)
#> mktime_res: 1636274174,
#>  tm_zone: PDT,
#>  tm_gmtoff: -25200,
#>  tm_sec: 14,
#>  tm_min: 36,
#>  tm_hour: 1,
#>  tm_mday: 7,
#>  tm_mon: 10,
#>  tm_year: 121,
#>  tm_wday: 0,
#>  tm_yday: 310,
#>  tm_isdst: 1,

With tm_sec = 15, we see:

CheckMkTime(15)
#> mktime_res: 1636277775,
#>  tm_zone: PST,
#>  tm_gmtoff: -28800,
#>  tm_sec: 15,
#>  tm_min: 36,
#>  tm_hour: 1,
#>  tm_mday: 7,
#>  tm_mon: 10,
#>  tm_year: 121,
#>  tm_wday: 0,
#>  tm_yday: 310,
#>  tm_isdst: 0,

So the problem is mktime, right?

I'm not so sure...

I wrote pure C code:

#include <time.h>
#include <stdio.h>

int main(void) {
    struct tm info;

    info.tm_sec = 14;
    info.tm_min = 36;
    info.tm_hour = 1;
    info.tm_mday = 7;
    info.tm_mon = 10;
    info.tm_year = 121;
    info.tm_wday = 0;
    info.tm_yday = 310;
    info.tm_isdst = -1;

    time_t val = mktime(&info);
    printf("mktime_res: %jd,\n tm_zone: %s,\n tm_gmtoff: %ld,\n tm_sec: %d,\n "
               "tm_min: %d,\n tm_hour: %d,\n tm_mday: %d,\n tm_mon: %d,\n "
               "tm_year: %d,\n tm_wday: %d,\n tm_yday: %d,\n tm_isdst: %d\n",
               val,
               info.tm_zone,
               info.tm_gmtoff,
               info.tm_sec,
               info.tm_min,
               info.tm_hour,
               info.tm_mday,
               info.tm_mon,
               info.tm_year,
               info.tm_wday,
               info.tm_yday,
               info.tm_isdst);


    struct tm info2;
    info2.tm_sec = 15;
    info2.tm_min = 36;
    info2.tm_hour = 1;
    info2.tm_mday = 7;
    info2.tm_mon = 10;
    info2.tm_year = 121;
    info2.tm_wday = 0;
    info2.tm_yday = 310;
    info2.tm_isdst = -1;

    val = mktime(&info2);
    printf("\n\nmktime_res: %jd,\n tm_zone: %s,\n tm_gmtoff: %ld,\n tm_sec: %d,\n "
               "tm_min: %d,\n tm_hour: %d,\n tm_mday: %d,\n tm_mon: %d,\n "
               "tm_year: %d,\n tm_wday: %d,\n tm_yday: %d,\n tm_isdst: %d\n",
               val,
               info2.tm_zone,
               info2.tm_gmtoff,
               info2.tm_sec,
               info2.tm_min,
               info2.tm_hour,
               info2.tm_mday,
               info2.tm_mon,
               info2.tm_year,
               info2.tm_wday,
               info2.tm_yday,
               info2.tm_isdst);

    return 0;
}

I compiled it and ran it in the terminal:

% clang time_shift.c -o time_shift
% ./time_shift
#> mktime_res: 1636266974,
#>  tm_zone: EST,
#>  tm_gmtoff: -18000,
#>  tm_sec: 14,
#>  tm_min: 36,
#>  tm_hour: 1,
#>  tm_mday: 7,
#>  tm_mon: 10,
#>  tm_year: 121,
#>  tm_wday: 0,
#>  tm_yday: 310,
#>  tm_isdst: 0
#>
#>
#> mktime_res: 1636266975,
#>  tm_zone: EST,
#>  tm_gmtoff: -18000,
#>  tm_sec: 15,
#>  tm_min: 36,
#>  tm_hour: 1,
#>  tm_mday: 7,
#>  tm_mon: 10,
#>  tm_year: 121,
#>  tm_wday: 0,
#>  tm_yday: 310,
#>  tm_isdst: 0

We don't see the problem here. However, we do notice that tm_zone is EST in both cases, whereas when we ran it in R after running RApiDatetime::rapistrptime(x, fmt = "%Y-%m-%d %H:%M:%OS", "PST8PDT"), we obtained PDT for 14 and PST for 15.

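With this in mind, I reran the examples in a fresh R session and obtained the same results as in the pure C implementation.

After calling base R's strptime, we do not see this behavior in a fresh R session.

I tried looking at the mktime source code, but it is way over my head.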

Session info

sessionInfo()
#> R version 4.3.1 (2023-06-16)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Ventura 13.4.1
#>
#> Matrix products: default
#> BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: PST8PDT
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#>
#> other attached packages:
#> [1] RApiDatetime_0.0.8
#>
#> loaded via a namespace (and not attached):
#> [1] compiler_4.3.1 tools_4.3.1    Rcpp_1.0.11

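Comments

2 upvotes Ben Bolker 11/24/2023
Great! Does stackoverflow.com/questions/8558919/mktime-and-tm-isdst help?
0 upvotes Joseph Wood 11/24/2023
Super helpful! I'm still confused about why the pure C code doesn't give the same results. I don't fully understand the time locale settings, but I think it has something to do with them.
0 upvotes Joseph Wood 11/24/2023
This one also looks interesting: stackoverflow.com/q/13804095/4408538
1 upvote nicola 11/24/2023
It looks like the behavior of mktime is inconsistent for ambiguous dates; there are some more examples here: cygwin.com/bugzilla/show_bug.cgi?id=17646. However, I don't think we can call it a bug, either on the R side or on the mktime side.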