Asked by: Thick_propheT Asked: 8/13/2023 Last edited by: Theodor Zoulias, Thick_propheT Updated: 8/15/2023 Views: 66
Repeat LINQ query without restarting enumeration
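Q:
I want to use LINQ (that is, the Skip and Take families of methods) to parse a string with a repeating pattern. Here is an example of the string's contents: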
"p0:{foo:bar}\r\np1:1234\r\np2:abcd"
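As you can see, it is almost parsable as json. I'm pre-parsing it so that the later JSON deserialization stage doesn't choke on it.
I have an idea for a nice way to do this with LINQ, but I can't seem to find a way to implement it. Here is an example of the desired implementation: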
public JToken[] GetContentOfEachP(string text)
{
    return text
        .Repeat(enumerable => // <- this is the method I'd like to write
            enumerable // enumerable == text @ the enumerator.Current where we left off at the last iteration of Repeat
                .SkipWhile(c => c != ':') // skip to the good part
                .Skip(1) // skip ':'
                .TakeWhile(c => c != '\r') // take the content between ':' & "\r\n"
                .ToArray() // 'Select' {foo:bar} as char[] but without disrupting the current enumeration
        )
        .Select(charArray => JToken.Parse(new string(charArray)))
        .ToArray();
}
foreach (var p in GetContentOfEachP("p0:{foo:bar}\r\np1:1234\r\np2:abcd"))
{
    Console.WriteLine(p.ToString());
}
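So the idea is that the enumeration proceeds in chunks. It Skips & Takes the content of p0 and returns that chunk, then continues the enumeration, Skipping & Takeing the content of p1, and so on.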
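For simplicity's sake, I realize I could, and probably will, end up using foreach (var p in text.Split("\r\n")) in production code (in my current implementation I'm manually IEnumerator<char>.MoveNext()-ing and strategically yield return-ing, so at this point anything would be more readable). Out of curiosity, though, I'd like to see whether anyone knows how to accomplish this with a LINQ method like the one above.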
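A:
I don't think it's possible to have a Repeat method that can be called the way you showed. You need to split the "consuming" part and the "taking" part into two parameters.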
public static IEnumerable<TResult> Repeat<T, TResult>(
    this IEnumerable<T> source,
    Func<IEnumerable<T>, IEnumerable<T>> consumer,
    Func<IEnumerable<T>, TResult> taker
) where TResult : IEnumerable<T> {
    while (source.Any()) {
        source = consumer(source); // consume first,
        var taken = taker(source); // then take
        source = source.Skip(taken.Count()); // also consume the part that is taken
        yield return taken;
    }
}
Note that this only works for sources that can be enumerated more than once: strings, arrays, lists, and so on.
Usage:
// omitted the JSON parsing part for simplicity
public static IEnumerable<char[]> GetContentOfEachP(string text)
{
    return text
        .Repeat(
            enumerable =>
                enumerable
                    .SkipWhile(c => c != ':')
                    .Skip(1),
            enumerable =>
                enumerable
                    .TakeWhile(c => c != '\r')
                    .ToArray()
        );
}
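To illustrate the caveat about sources that cannot be enumerated twice, here is a minimal sketch. ReadChars is a hypothetical helper, not part of the answer; it exposes a TextReader as a sequence, so every new enumeration resumes wherever the reader currently stands instead of starting over. Under that assumption, even a single Any() check silently eats one element, which is what the while (source.Any()) loop above would do on each pass.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical one-shot source: the reader's position is shared by all enumerations.
static IEnumerable<char> ReadChars(TextReader reader)
{
    int c;
    while ((c = reader.Read()) != -1)
        yield return (char)c;
}

var oneShot = ReadChars(new StringReader("p0:{foo:bar}"));

// Any() starts an enumeration and reads one char ('p') from the reader.
Console.WriteLine(oneShot.Any());                  // True

// A second enumeration continues after the consumed char instead of restarting.
Console.WriteLine(new string(oneShot.ToArray()));  // 0:{foo:bar}
```

With a string, array, or list as the source, each enumeration starts from the first element again, so the Any() check does not lose data.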
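Thanks @Sweeper for the answer; getting some feedback and seeing someone else's approach helped me come up with the following idea, which does achieve what I originally had in mind.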
using System;
using System.Collections;
using System.Collections.Generic;
public static class ParsingExtensions
{
    public static IEnumerable<char[]> Repeat(this string source, Func<IEnumerable<char>, char[]> scope)
    {
        // wrap the source enumerable so we can control how it gets enumerated.
        // that's important for...
        using var e = new ContinuousEnumerable(source);
        // checking if there are any elements in the collection without skipping ahead...
        while (e.CanMoveNext())
        {
            // and running the "repeated" linq without restarting enumeration every time.
            yield return scope(e);
        }
    }
    private class ContinuousEnumerator : IEnumerator<char>
    {
        private readonly IEnumerator<char> _inner;
        // 1 when PeekNext has already advanced the inner enumerator, 0 otherwise.
        private int _state;
        public char Current => _inner.Current;
        object IEnumerator.Current => Current;
        public ContinuousEnumerator(IEnumerator<char> inner)
        {
            _inner = inner;
        }
        public bool PeekNext()
        {
            var result = _inner.MoveNext();
            if (result)
            {
                // now that WE'VE checked if there are more elements,
                // we need to pretend that we didn't for when the "repeated" linq asks.
                _state = 1;
            }
            return result;
        }
        public bool MoveNext()
        {
            switch (_state)
            {
                // if we've peeked above, we already know the answer.
                case 1:
                    _state = 0;
                    return true;
                // otherwise, ask the inner enumerator as normal.
                default:
                    return _inner.MoveNext();
            }
        }
        public void Reset() => _inner.Reset();
        // don't dispose the enumerator when asked, because we're reusing it.
        void IDisposable.Dispose() { }
        // our own dispose method to call when the enumerable is disposed.
        public void Dispose() => _inner.Dispose();
    }
    private class ContinuousEnumerable : IEnumerable<char>, IDisposable
    {
        private readonly IEnumerable<char> _inner;
        private ContinuousEnumerator? _enumerator;
        public ContinuousEnumerable(IEnumerable<char> inner)
        {
            _inner = inner;
        }
        public IEnumerator<char> GetEnumerator() => GetEnumeratorImpl();
        IEnumerator IEnumerable.GetEnumerator() => GetEnumeratorImpl();
        // always reuse the enumerator so we don't lose our place in the enumeration.
        private ContinuousEnumerator GetEnumeratorImpl()
            => _enumerator ??= new ContinuousEnumerator(_inner.GetEnumerator());
        // use our PeekNext method to check if enumeration will continue.
        public bool CanMoveNext()
            => GetEnumeratorImpl().PeekNext();
        // the enumerable is disposable here, so this is where the reused enumerator really gets disposed.
        public void Dispose() => _enumerator?.Dispose();
    }
}
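There are comments in the code above, but the general idea is to wrap the source enumerable in a custom enumerable (ContinuousEnumerable). That way the enumeration keeps its place even after the "repeated" linq calls ToArray. Because ContinuousEnumerable always returns the first enumerator it created, and that enumerator (ContinuousEnumerator) ignores Dispose calls, enumeration over the enumerable resumes where it left off at the end of each repetition.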
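Wrapping the source enumerable also lets you check for remaining elements without the dreaded multiple enumeration. Because we also wrap the source IEnumerator, ContinuousEnumerable.CanMoveNext can ask the source enumerator to MoveNext and then pretend it didn't, so that when the "repeated" linq asks, we don't end up MoveNext-ing twice.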
From there, we just need to make sure there is an alternative way to dispose the enumerator.
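The peek-and-replay idea described above can also be condensed into one generic class. The following is only a sketch under stated assumptions, not the code from this answer: SharedCursor<T>, HasMore, and Close are hypothetical names, and the class acts as both the enumerable and its single shared enumerator.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

var text = "p0:{foo:bar}\r\np1:1234\r\np2:abcd";
var cursor = new SharedCursor<char>(text);
var chunks = new List<string>();
try
{
    // HasMore() peeks one element ahead; the LINQ chain then resumes from that element.
    while (cursor.HasMore())
    {
        var chunk = cursor
            .SkipWhile(c => c != ':')
            .Skip(1)
            .TakeWhile(c => c != '\r')
            .ToArray();
        chunks.Add(new string(chunk));
    }
}
finally
{
    cursor.Close();
}

foreach (var s in chunks)
    Console.WriteLine(s);   // {foo:bar}, then 1234, then abcd

// Hypothetical helper: an enumerable that always hands out the same enumerator (itself).
sealed class SharedCursor<T> : IEnumerable<T>, IEnumerator<T>
{
    private readonly IEnumerator<T> _inner;
    private bool _peeked;

    public SharedCursor(IEnumerable<T> source) => _inner = source.GetEnumerator();

    // True if another element exists; that element is held back for the next MoveNext().
    public bool HasMore() => _peeked || (_peeked = _inner.MoveNext());

    public T Current => _inner.Current;
    object? IEnumerator.Current => Current;

    public bool MoveNext()
    {
        if (_peeked)
        {
            _peeked = false;   // replay the element HasMore() already advanced to
            return true;
        }
        return _inner.MoveNext();
    }

    public void Reset() => throw new NotSupportedException();

    // Every caller gets the same enumerator, so the position is never lost.
    public IEnumerator<T> GetEnumerator() => this;
    IEnumerator IEnumerable.GetEnumerator() => this;

    // LINQ disposes the enumerators it obtains; ignore that and release in Close() instead.
    void IDisposable.Dispose() { }
    public void Close() => _inner.Dispose();
}
```

Merging the two roles keeps the sketch short, at the cost of a type that only makes sense for a single forward pass; the two-class version above keeps the enumerable and enumerator roles separate.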
Comments
text.Split('\r', '\n').Select(JToken.Parse)
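A slightly fuller sketch of the Split-based route, for comparison. It makes two assumptions that the comment does not state: empty entries are dropped (splitting "\r\n" on both characters separately leaves an empty string between them), and the "pN:" prefix is cut at the first ':' before any JSON parsing. The parsing step itself is left out so the example needs no external package.

```csharp
using System;
using System.Linq;

var text = "p0:{foo:bar}\r\np1:1234\r\np2:abcd";

var contents = text
    // Split on either line-break character and drop the empty pieces in between.
    .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
    // Keep everything after the first ':' on each line.
    .Select(line => line.Substring(line.IndexOf(':') + 1))
    .ToArray();

foreach (var content in contents)
    Console.WriteLine(content);   // {foo:bar}, then 1234, then abcd
```

Each element of contents could then be handed to whatever parser the later stage uses.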