Asked by: Jacob Birkett  Asked: 10/26/2023  Last edited by: Jacob Birkett  Updated: 10/29/2023  Views: 80
Call a function that uses `&mut self.0` in a loop (E0499)
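Question:
I am looking for a way to work around the lack of Polonius in this specific situation. As far as I can tell, the other answers do not seem to apply.
I have two structs, `SourceBytes<S>` and `SourceChars`. The former is decoupled, but the latter is tightly coupled to the former. `SourceBytes<S>` should be constructible from any `S: Iterator<Item = u8>`, and `SourceChars` should be constructible from the same `S: Iterator<Item = u8>`.
Here is what the definition of each looks like: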
#[derive(Clone, Debug)]
pub struct SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    iter: S,
    buffer: Vec<S::Item>,
}

#[derive(Clone, Debug)]
pub struct SourceChars<S>(S)
where
    S: Iterator<Item = u8>;
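The purpose of `SourceBytes<S>` is to abstract over `S` so that each `S::Item` can be buffered and read immutably, without taking or popping the item from the iterator. It looks like this: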
impl<S> Iterator for SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    type Item = S::Item;

    fn next(&mut self) -> Option<Self::Item> {
        self.buffer.pop().or_else(|| self.iter.next())
    }
}
This works fine, and the buffer is handled like so:
impl<S> SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    // pub fn new<I>(iter: I) -> Self
    // where
    //     I: IntoIterator<Item = S::Item, IntoIter = S>,
    // {
    //     Self {
    //         iter: iter.into_iter(),
    //         buffer: Vec::new(),
    //     }
    // }

    fn buffer(&mut self, count: usize) -> Option<&[u8]> {
        if self.buffer.len() < count {
            self.buffer
                .extend(self.iter.by_ref().take(count - self.buffer.len()));
        }
        self.buffer.get(0..count)
    }
}
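So each time `SourceBytes<S>::buffer` is called, items are taken from `S` and pushed onto `buffer`. Each time `<SourceBytes as Iterator>::next` is called, it takes first from `self.buffer` and then from `self.iter`, the latter field being of type `S`.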
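A minimal, self-contained sketch of the buffering behaviour described above. One caveat about the listing: `Vec::pop` removes the last element, so after `buffer(3)` the `next` shown there would yield the third buffered byte first. The variant below swaps the `Vec` for a `VecDeque` so that buffered bytes come back out in their original order. That swap, and the simplified `new`, are assumptions of this sketch and not the asker's code.

```rust
use std::collections::VecDeque;

/// Wraps a byte iterator and lets callers look ahead without consuming.
#[derive(Clone, Debug)]
pub struct SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    iter: S,
    // Assumption of this sketch: a `VecDeque` instead of the question's `Vec`,
    // so that `next` can take buffered bytes from the front.
    buffer: VecDeque<u8>,
}

impl<S> SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    pub fn new(iter: S) -> Self {
        Self {
            iter,
            buffer: VecDeque::new(),
        }
    }

    /// Ensures at least `count` bytes are buffered and returns the first
    /// `count` of them, or `None` if the source ran out first.
    pub fn buffer(&mut self, count: usize) -> Option<&[u8]> {
        if self.buffer.len() < count {
            let missing = count - self.buffer.len();
            self.buffer.extend(self.iter.by_ref().take(missing));
        }
        // Rearrange the ring buffer so that all items sit in the first slice.
        self.buffer.make_contiguous();
        self.buffer.as_slices().0.get(..count)
    }
}

impl<S> Iterator for SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    type Item = u8;

    fn next(&mut self) -> Option<Self::Item> {
        // Drain buffered bytes first (oldest first), then the source.
        self.buffer.pop_front().or_else(|| self.iter.next())
    }
}
```

With this variant, calling `buffer(3)` on the bytes of `"abcdefg"` returns `b"abc"`, and a following `next()` returns `b'a'`. Bytes pulled in by a `buffer` call that ends up returning `None` stay in the buffer and are still yielded by `next`.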
Now, the purpose of `SourceChars<S>` is to provide an `Iterator` interface that reads bytes from `self.0` (which is `S`) until it finds a valid UTF-8 `char`, and then returns it:
impl<S> Iterator for SourceChars<S>
where
    S: Iterator<Item = u8>,
{
    type Item = char;

    fn next(&mut self) -> Option<Self::Item> {
        let mut buf = [0; 4];
        // A single character can be at most 4 bytes.
        for (i, byte) in self.0.by_ref().take(4).enumerate() {
            buf[i] = byte;
            if let Ok(slice) = std::str::from_utf8(&buf[..=i]) {
                return slice.chars().next();
            }
        }
        None
    }
}
This also works fine.
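To see the decoding loop on multi-byte input, here is a self-contained sketch that repeats the `SourceChars` iterator from the question and adds a small helper. The helper `decode_all` is hypothetical and exists only for this illustration.

```rust
#[derive(Clone, Debug)]
pub struct SourceChars<S>(S)
where
    S: Iterator<Item = u8>;

impl<S> Iterator for SourceChars<S>
where
    S: Iterator<Item = u8>,
{
    type Item = char;

    fn next(&mut self) -> Option<Self::Item> {
        let mut buf = [0u8; 4];
        // A single character can be at most 4 bytes, so read up to 4 and
        // stop as soon as the bytes read so far form valid UTF-8.
        for (i, byte) in self.0.by_ref().take(4).enumerate() {
            buf[i] = byte;
            if let Ok(slice) = std::str::from_utf8(&buf[..=i]) {
                return slice.chars().next();
            }
        }
        None
    }
}

/// Hypothetical helper: decodes a byte slice through `SourceChars`.
pub fn decode_all(bytes: &[u8]) -> String {
    SourceChars(bytes.iter().copied()).collect()
}
```

Valid UTF-8 with one-, two-, three- and four-byte characters round-trips through `decode_all`. On invalid input, `next` consumes up to four bytes and returns `None`, which ends the iteration at that point.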
Now, I also want to provide an `impl` for `SourceChars<&mut SourceBytes<S>>`, so that `SourceChars` can rely on the buffer provided by `self.0` (which in this case is `&mut SourceBytes<S>`).
impl<S> SourceChars<&mut SourceBytes<S>>
where
    S: Iterator<Item = u8>,
{
    fn buffer(&mut self, count: usize) -> Option<&str> {
        // let mut src = self.0.by_ref();
        for byte_count in 0.. {
            let Some(buf) = self.0.buffer(byte_count) else {
                return None;
            };
            if let Ok(slice) = std::str::from_utf8(buf) {
                if slice.chars().count() >= count {
                    return Some(slice);
                }
            }
        }
        unreachable!()
    }
}
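This `SourceChars<&mut SourceBytes<S>>::buffer` relies on `SourceBytes<S>::buffer` to actually buffer the bytes; `SourceChars` instead acts as a wrapper that changes the interpretation of the iterator `S` from bytes to `char`s.
The problem is that `self.0` cannot be mutably borrowed more than once, and inside the loop the compiler does not seem to drop the `&mut self.0` reference.
How can I implement this for `SourceChars` in a way that relies on `SourceBytes::buffer` without running into this compiler error?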
error[E0499]: cannot borrow `*self.0` as mutable more than once at a time
   --> src/parser/iter.rs:122:29
    |
119 |     fn buffer(&mut self, count: usize) -> Option<&str> {
    |               - let's call the lifetime of this reference `'1`
...
122 |             let Some(buf) = self.0.buffer(byte_count) else {
    |                             ^^^^^^ `*self.0` was mutably borrowed here in the previous iteration of the loop
...
127 |                     return Some(slice);
    |                            ----------- returning this value requires that `*self.0` is borrowed for `'1`
Answer:
The workaround, as with all Polonius problems, is to repeat the computation. It is less efficient, but it works.
impl<S> SourceChars<&mut SourceBytes<S>>
where
    S: Iterator<Item = u8>,
{
    fn buffer(&mut self, count: usize) -> Option<&str> {
        // let mut src = self.0.by_ref();
        for byte_count in 0.. {
            let Some(buf) = self.0.buffer(byte_count) else {
                return None;
            };
            if let Ok(slice) = std::str::from_utf8(buf) {
                if slice.chars().count() >= count {
                    // Borrow and validate again instead of returning `slice`,
                    // so the borrow taken at the top of the loop is not returned.
                    return Some(
                        std::str::from_utf8(self.0.buffer(byte_count).unwrap()).unwrap(),
                    );
                }
            }
        }
        unreachable!()
    }
}
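The recomputation can also be moved out of the loop: the loop only searches for a byte count that works, and a single borrow after the loop produces the returned slice. The sketch below is a self-contained variant of the code above under that assumption. `SourceBytes` is reduced to what the example needs, with a simplified `new`, and the `Iterator` bound on `SourceChars` is dropped for brevity. These simplifications belong to this sketch, not to the answer.

```rust
/// Reduced stand-in for the question's `SourceBytes`: buffering only.
pub struct SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    iter: S,
    buffer: Vec<u8>,
}

impl<S> SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    pub fn new(iter: S) -> Self {
        Self {
            iter,
            buffer: Vec::new(),
        }
    }

    pub fn buffer(&mut self, count: usize) -> Option<&[u8]> {
        if self.buffer.len() < count {
            let missing = count - self.buffer.len();
            self.buffer.extend(self.iter.by_ref().take(missing));
        }
        self.buffer.get(..count)
    }
}

/// Simplified wrapper; the question's version also bounds `S` by `Iterator`.
pub struct SourceChars<S>(pub S);

impl<S> SourceChars<&mut SourceBytes<S>>
where
    S: Iterator<Item = u8>,
{
    pub fn buffer(&mut self, count: usize) -> Option<&str> {
        let mut byte_count = 0;
        // Phase 1: search. Each borrow from `self.0.buffer` ends within its
        // own iteration, because nothing borrowed here is returned.
        loop {
            let buf = self.0.buffer(byte_count)?;
            if let Ok(slice) = std::str::from_utf8(buf) {
                if slice.chars().count() >= count {
                    break;
                }
            }
            byte_count += 1;
        }
        // Phase 2: borrow once more, outside the loop, and return that.
        let buf = self.0.buffer(byte_count)?;
        std::str::from_utf8(buf).ok()
    }
}
```

Like the answer's version, this validates the final slice twice. The difference is only structural: the returned borrow is created after the loop, so no borrow from an earlier iteration has to outlive the function.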
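One option I tried earlier was the crate `polonius-the-crab`, but that ended up causing more problems with how the API is used, and it also made the trait bounds difficult to get right.
Because of that inconvenience, I ended up using an unsafe pointer coercion to shorten the lifetime of `buf` so that it no longer depends on `&mut SourceBytes`.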
impl<S> Buffered for SourceChars<&mut S>
where
    for<'a> S: Iterator<Item = u8> + Buffered<ItemSlice<'a> = &'a [u8]> + 'a,
{
    type ItemSlice<'items> = &'items str where Self: 'items;

    // Allowed specifically here because the borrow checker is incorrect.
    #[allow(unsafe_code)]
    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>> {
        for byte_count in 0.. {
            let buf = self.0.buffer(byte_count)?;

            // SAFETY:
            //
            // This unsafe pointer coercion is here because of a limitation
            // in the borrow checker. In the future, when Polonius is merged as
            // the de-facto borrow checker, this unsafe code can be removed.
            //
            // The lifetime of the byte slice is shortened to the lifetime of
            // the return value, which lives as long as `self` does.
            //
            // This is referred to as the "polonius problem",
            // or more accurately, the "lack-of-polonius problem".
            //
            // <https://github.com/rust-lang/rust/issues/54663>
            let buf: *const [u8] = buf;
            let buf: &[u8] = unsafe { &*buf };

            if let Ok(slice) = std::str::from_utf8(buf) {
                if slice.chars().count() >= count {
                    return Some(slice);
                }
            }
        }
        unreachable!()
    }
}
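In addition, here are the tests that show how the API is used. Using the `polonius-the-crab` crate did not solve some of the lifetime problems I ran into while implementing these tests.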
#[cfg(test)]
mod tests {
    use super::{Buffered, SourceBytes, SourceChars};

    #[test]
    fn test_source_chars() {
        let source = "abcdefg";
        let chars = SourceChars::new(source.bytes());
        assert_eq!(source, chars.collect::<String>());
    }

    #[test]
    fn test_source_chars_buffer() {
        let source = "abcdefg";
        let mut bytes = SourceBytes::new(source.bytes());
        let mut chars = SourceChars::new(&mut bytes);
        // Ensure that the `buffer` function works.
        assert_eq!(&source[0..3], chars.buffer(3).unwrap());
        // Ensure that the characters are taken from the buffer,
        // and that `buffer` correctly preserves them.
        assert_eq!(&source[0..4], chars.by_ref().take(4).collect::<String>());
        // Ensure that the iterator has been advanced.
        assert_eq!(&source[4..7], chars.buffer(3).unwrap());
    }
}