What's the idiomatic way to make replayable sequences in F#?

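Asked by: Kenneth Allen  Asked: 12/1/2021  Last edited by: Kenneth Allen  Updated: 12/4/2021  Views: 81

Q:

I've just started learning F# with this year's Advent of Code, and I immediately tripped up trying to reuse the IEnumerable from File.ReadLines.

Here are all the ways I've seen to work around this: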

// Read all lines immediately into array/list
let linesAll     = File.ReadAllLines "file.txt"
let linesArray   = File.ReadLines "file.txt" |> Array.ofSeq
let linesList    = File.ReadLines "file.txt" |> List.ofSeq

// Lazily load and cache for replays
let linesCache   = File.ReadLines "file.txt" |> Seq.cache

// Start new filesystem read for every replay
let linesDelay   = (fun () -> File.ReadLines "file.txt") |> Seq.delay
let linesSeqExpr = seq { yield! File.ReadLines "file.txt" }
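  • Are these semantically the same (for a read-only file)?
  • Are linesDelay and linesSeqExpr the only ones that don't read the entire file into memory?
  • Is linesList slowed down by having to assemble the list backwards?
  • Are any of these considered more or less idiomatic?

Edit

Here is code that reproduces my problem: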

let lines = System.IO.File.ReadLines("alphabet.txt")
for i = 0 to 5 do
  let arr = Seq.zip lines (Seq.skip 1 lines) |> Array.ofSeq
  printfn "%A %A" i arr

It gives this output:

0 [|("A", "C"); ("D", "E"); ("F", "G"); ("H", "I"); ("J", "K"); ("L", "M");
  ("N", "O"); ("P", "Q"); ("R", "S"); ("T", "U"); ("V", "W"); ("X", "Y")|]
1 [|("A", "B"); ("B", "C"); ("C", "D"); ("D", "E"); ("E", "F"); ("F", "G");
  ("G", "H"); ("H", "I"); ("I", "J"); ("J", "K"); ("K", "L"); ("L", "M");
  ("M", "N"); ("N", "O"); ("O", "P"); ("P", "Q"); ("Q", "R"); ("R", "S");
  ("S", "T"); ("T", "U"); ("U", "V"); ("V", "W"); ("W", "X"); ("X", "Y");
  ("Y", "Z")|]
2 [|("A", "B"); ("B", "C"); ("C", "D"); ("D", "E"); ("E", "F"); ("F", "G");
  ("G", "H"); ("H", "I"); ("I", "J"); ("J", "K"); ("K", "L"); ("L", "M");
  ("M", "N"); ("N", "O"); ("O", "P"); ("P", "Q"); ("Q", "R"); ("R", "S");
  ("S", "T"); ("T", "U"); ("U", "V"); ("V", "W"); ("W", "X"); ("X", "Y");
  ("Y", "Z")|]
3 [|("A", "B"); ("B", "C"); ("C", "D"); ("D", "E"); ("E", "F"); ("F", "G");
  ("G", "H"); ("H", "I"); ("I", "J"); ("J", "K"); ("K", "L"); ("L", "M");
  ("M", "N"); ("N", "O"); ("O", "P"); ("P", "Q"); ("Q", "R"); ("R", "S");
  ("S", "T"); ("T", "U"); ("U", "V"); ("V", "W"); ("W", "X"); ("X", "Y");
  ("Y", "Z")|]
4 [|("A", "B"); ("B", "C"); ("C", "D"); ("D", "E"); ("E", "F"); ("F", "G");
  ("G", "H"); ("H", "I"); ("I", "J"); ("J", "K"); ("K", "L"); ("L", "M");
  ("M", "N"); ("N", "O"); ("O", "P"); ("P", "Q"); ("Q", "R"); ("R", "S");
  ("S", "T"); ("T", "U"); ("U", "V"); ("V", "W"); ("W", "X"); ("X", "Y");
  ("Y", "Z")|]
5 [|("A", "B"); ("B", "C"); ("C", "D"); ("D", "E"); ("E", "F"); ("F", "G");
  ("G", "H"); ("H", "I"); ("I", "J"); ("J", "K"); ("K", "L"); ("L", "M");
  ("M", "N"); ("N", "O"); ("O", "P"); ("P", "Q"); ("Q", "R"); ("R", "S");
  ("S", "T"); ("T", "U"); ("U", "V"); ("V", "W"); ("W", "X"); ("X", "Y");
  ("Y", "Z")|]

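It looks like the expression Seq.zip lines (Seq.skip 1 lines) triggers the bug by running two enumerations of lines at the same time. Only the first iteration is affected: it pairs up alternating lines and returns 12 pairs instead of 25.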
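As a minimal sketch of two ways to avoid the concurrent enumeration described above (not necessarily the only or best fix): Seq.pairwise needs only a single enumerator, and Seq.cache places one shared buffer in front of the reader. The temp file is a hypothetical stand-in for alphabet.txt, and the names pairsSingle and pairsCached are illustrative.

```fsharp
open System.IO

// Hypothetical stand-in for alphabet.txt: one letter per line in a temp file.
let path = Path.GetTempFileName()
File.WriteAllLines(path, [| for c in 'A' .. 'Z' -> string c |])

// Option 1: Seq.pairwise walks the source with a single enumerator,
// so the underlying reader is never shared.
let pairsSingle = File.ReadLines path |> Seq.pairwise |> Array.ofSeq

// Option 2: Seq.cache pulls from the reader through one enumerator and
// buffers the lines, so both sides of the zip read from the same buffer.
let cached = File.ReadLines path |> Seq.cache
let pairsCached = Seq.zip cached (Seq.skip 1 cached) |> Array.ofSeq

printfn "%d pairs via pairwise, %d pairs via cache" pairsSingle.Length pairsCached.Length

File.Delete path
```

Both arrays should hold the 25 adjacent pairs ("A", "B") through ("Y", "Z") on the first pass as well as on later ones.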

Edit 2

Reproduced in C#. The pairing is slightly different because I didn't skip one element on the right-hand side.

var lines = File.ReadLines("alphabet.txt");
for (int i = 0; i < 5; i++)
{
    var zipped = new List<(string, string)>();
    var enum1 = lines.GetEnumerator();
    var enum2 = lines.GetEnumerator();
    while (enum1.MoveNext() && enum2.MoveNext())
    {
        zipped.Add((enum1.Current, enum2.Current));
    }
    Console.WriteLine($"{i} [{string.Join(',', zipped)}]");
}
0 [(A, B),(C, D),(E, F),(G, H),(I, J),(K, L),(M, N),(O, P),(Q, R),(S, T),(U, V),(W, X),(Y, Z)]
1 [(A, A),(B, B),(C, C),(D, D),(E, E),(F, F),(G, G),(H, H),(I, I),(J, J),(K, K),(L, L),(M, M),(N, N),(O, O),(P, P),(Q, Q),(R, R),(S, S),(T, T),(U, U),(V, V),(W, W),(X, X),(Y, Y),(Z, Z)]
2 [(A, A),(B, B),(C, C),(D, D),(E, E),(F, F),(G, G),(H, H),(I, I),(J, J),(K, K),(L, L),(M, M),(N, N),(O, O),(P, P),(Q, Q),(R, R),(S, S),(T, T),(U, U),(V, V),(W, W),(X, X),(Y, Y),(Z, Z)]
3 [(A, A),(B, B),(C, C),(D, D),(E, E),(F, F),(G, G),(H, H),(I, I),(J, J),(K, K),(L, L),(M, M),(N, N),(O, O),(P, P),(Q, Q),(R, R),(S, S),(T, T),(U, U),(V, V),(W, W),(X, X),(Y, Y),(Z, Z)]
4 [(A, A),(B, B),(C, C),(D, D),(E, E),(F, F),(G, G),(H, H),(I, I),(J, J),(K, K),(L, L),(M, M),(N, N),(O, O),(P, P),(Q, Q),(R, R),(S, S),(T, T),(U, U),(V, V),(W, W),(X, X),(Y, Y),(Z, Z)]

Edit 3

This is a known issue that won't be fixed, in order to preserve compatibility. From the source comments:

    //  - IEnumerator<T> instances from the same IEnumerable<T> party on the same underlying
    //    reader.
Tags: f# lazy-evaluation sequence


A:

2 upvotes  Brian Berns  12/1/2021  #1

What's the problem with reusing the sequence from File.ReadLines? The following code works fine for me:

let lines = File.ReadLines "file.txt"
for line in lines do printfn "%s" line
for line in lines do printfn "%s" line

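Anyway, here's my take on your questions:

  • Are these semantically the same (for a read-only file)?

They're similar, but not exactly the same, because they have different types. For example, arrays and lists don't have exactly the same semantics. (Also, keep in mind that even a read-only file can be deleted, which would affect the lazy versions.)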
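To illustrate the point about lazy versions depending on the file's current state, here is a small sketch under stated assumptions: a temp file stands in for "file.txt", and it is rewritten (rather than deleted) between enumerations. The names eager, lazyLines, lazyBefore, and lazyAfter are illustrative.

```fsharp
open System.IO

// Hypothetical temp file standing in for "file.txt".
let path = Path.GetTempFileName()
File.WriteAllLines(path, [| "one"; "two" |])

// Eager: the array is a snapshot of the file as it is right now.
let eager = File.ReadAllLines path

// Lazy: Seq.delay calls File.ReadLines again for every enumeration.
let lazyLines = Seq.delay (fun () -> File.ReadLines path)

let lazyBefore = List.ofSeq lazyLines

// Change the file after both values were created.
File.WriteAllLines(path, [| "one"; "two"; "three" |])

let lazyAfter = List.ofSeq lazyLines

printfn "eager: %d, lazy before: %d, lazy after: %d"
    eager.Length lazyBefore.Length lazyAfter.Length

File.Delete path
```

The eager array keeps its two lines, while the delayed sequence picks up the third line on its second enumeration. Had the file been deleted instead, the second enumeration would fail rather than return stale data.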

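  • Are linesDelay and linesSeqExpr the only ones that don't read the entire file into memory?

No, linesCache should also read only as many lines as are needed. It does keep every line it has read so far in memory, though.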
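A rough way to see this behaviour of Seq.cache without touching the filesystem is to count how many elements the underlying sequence actually produces. This is only a sketch: the counter produced and the synthetic 1000-element source are assumptions standing in for the lines of a file.

```fsharp
// Counts how many elements the underlying sequence has produced so far.
let produced = ref 0

// Synthetic source standing in for the lines of a file.
let source =
    seq {
        for i in 1 .. 1000 do
            produced.Value <- produced.Value + 1
            yield i
    }

let cached = Seq.cache source

// First pass: only the requested prefix is pulled from the source.
let firstThree = cached |> Seq.take 3 |> List.ofSeq
let producedAfterFirst = produced.Value

// Second pass over the same prefix is served from the cache.
let firstThreeAgain = cached |> Seq.take 3 |> List.ofSeq
let producedAfterSecond = produced.Value

printfn "produced after first pass: %d, after second pass: %d"
    producedAfterFirst producedAfterSecond
```

The counter stays far below 1000 and does not grow on the second pass, which is the replay-without-rereading behaviour the question is after.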

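  • Is linesList slowed down by having to assemble the list backwards?

I don't think so. See the source code for List.ofSeq.

  • Are any of these considered more or less idiomatic?

I think they're all fine, depending on the situation. Personally, I usually use File.ReadAllLines unless I have reason to believe the file is large.

Comments

0 upvotes  Kenneth Allen  12/2/2021
Thanks for the info! Inconsistently, it will (without throwing) give me only part of the file. I found this, which suggests I'm not the first person to run into the problem. I'll see if I can get a consistent repro.
1 upvote  Kenneth Allen  12/4/2021
Added code that reproduces the issue I'm seeing.
0 upvotes  Kenneth Allen  12/4/2021
Apparently this is a known issue that won't be fixed.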