How are curly braces being escaped within the quote! macro?

Question:

I am currently trying to write a derive macro for a custom trait. This is what I have so far:

use proc_macro2::TokenStream;
use quote::{quote, quote_spanned};
use syn::spanned::Spanned;
use syn::{
    parse_macro_input, parse_quote, Data, DeriveInput, Fields, GenericParam, Generics, Index,
};

#[proc_macro_derive(HeapSize)]
pub fn derive_heap_size(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
    // Parse the input tokens into a syntax tree.
    let input = parse_macro_input!(input as DeriveInput);

    // Used in the quasi-quotation below as `#name`.
    let name = input.ident;

    // Add a bound `T: HeapSize` to every type parameter T.
    let generics = add_trait_bounds(input.generics);
    let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();

    // Generate an expression to sum up the heap size of each field.
    let sum = heap_size_sum(&input.data);

    let expanded = quote! {
        // The generated impl.
        impl #impl_generics lestream::FromLeBytes for #name #ty_generics #where_clause {
            fn heap_size_of_children(&self) -> usize {
                #sum
            }
        }
    };

    // Hand the output tokens back to the compiler.
    proc_macro::TokenStream::from(expanded)
}

// Add a bound `T: HeapSize` to every type parameter T.
fn add_trait_bounds(mut generics: Generics) -> Generics {
    for param in &mut generics.params {
        if let GenericParam::Type(ref mut type_param) = *param {
            type_param.bounds.push(parse_quote!(lestream::FromLeBytes));
        }
    }
    generics
}

// Generate an expression to sum up the heap size of each field.
fn heap_size_sum(data: &Data) -> TokenStream {
    match *data {
        Data::Struct(ref data) => {
            match data.fields {
                Fields::Named(ref fields) => {
                    // Expands to an expression like
                    //
                    //     0 + self.x.heap_size() + self.y.heap_size() + self.z.heap_size()
                    //
                    // but using fully qualified function call syntax.
                    //
                    // We take some care to use the span of each `syn::Field` as
                    // the span of the corresponding `heap_size_of_children`
                    // call. This way if one of the field types does not
                    // implement `HeapSize` then the compiler's error message
                    // underlines which field it is. An example is shown in the
                    // readme of the parent directory.
                    let q = quote! {
                        Self {
                    };

                    for field in fields.named {
                        let item_name = field.ident.expect("macro only works with named fields");
                        let item_type = field.ty;

                        quote! {
                            let #item_name = #item_type::from_le_bytes()
                        }
                    }
                }
                _ => panic!("The FromLeBytes derive can only be applied to structs"),
            }
        }
        Data::Enum(_) | Data::Union(_) => unimplemented!(),
    }
}

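The idea is to derive the trait for a struct by generating an implementation that calls the trait method `from_le_bytes()` for each member of the struct, in declaration order: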

use std::fmt::{Display, Formatter};

#[derive(Debug)]
pub enum Error {
    UnexpectedEndOfStream,
}

impl Display for Error {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::UnexpectedEndOfStream => write!(f, "unexpected end of stream"),
        }
    }
}

impl std::error::Error for Error {}

pub trait FromLeBytes: Sized {
    fn from_le_bytes<T>(bytes: &mut T) -> Result<Self, Error>
    where
        T: Iterator<Item = u8>;
}

I.e.:

#[derive(FromLeBytes)]
struct Foo {
    bar: u8,
    spamm: u16,
}

should result in an implementation like this:

impl FromLeBytes for Foo {
    fn from_le_bytes<T>(bytes: &mut T) -> Result<Self, Error>
    where
        T: Iterator<Item = u8>,
    {
        Ok(Self { bar: u8::from_le_bytes(bytes)?, spamm: u16::from_le_bytes(bytes)? })
    }
}
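
For reference, here is a minimal hand-written sketch of that target behaviour, using only the standard library. The impls for `u8` and `u16` are assumptions for illustration (the question does not show them). The calls are fully qualified because a plain `u8::from_le_bytes(...)` resolves to the inherent standard-library function that takes a byte array, not to the trait method.

```rust
#[derive(Debug, PartialEq)]
pub enum Error {
    UnexpectedEndOfStream,
}

pub trait FromLeBytes: Sized {
    fn from_le_bytes<T>(bytes: &mut T) -> Result<Self, Error>
    where
        T: Iterator<Item = u8>;
}

// Assumed impl: a u8 is simply the next byte of the stream.
impl FromLeBytes for u8 {
    fn from_le_bytes<T>(bytes: &mut T) -> Result<Self, Error>
    where
        T: Iterator<Item = u8>,
    {
        bytes.next().ok_or(Error::UnexpectedEndOfStream)
    }
}

// Assumed impl: a u16 is two bytes, least significant first.
impl FromLeBytes for u16 {
    fn from_le_bytes<T>(bytes: &mut T) -> Result<Self, Error>
    where
        T: Iterator<Item = u8>,
    {
        let low = <u8 as FromLeBytes>::from_le_bytes(bytes)?;
        let high = <u8 as FromLeBytes>::from_le_bytes(bytes)?;
        // This path resolves to the inherent `u16::from_le_bytes([u8; 2])`.
        Ok(u16::from_le_bytes([low, high]))
    }
}

#[derive(Debug, PartialEq)]
struct Foo {
    bar: u8,
    spamm: u16,
}

// What the derive is meant to generate for `Foo`, written by hand.
impl FromLeBytes for Foo {
    fn from_le_bytes<T>(bytes: &mut T) -> Result<Self, Error>
    where
        T: Iterator<Item = u8>,
    {
        let bar = <u8 as FromLeBytes>::from_le_bytes(bytes)?;
        let spamm = <u16 as FromLeBytes>::from_le_bytes(bytes)?;
        Ok(Self { bar, spamm })
    }
}
```

Reading the fields into local bindings first and then building `Self { bar, spamm }` keeps the field order explicit, and it is also the shape that is easiest to emit from a macro.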

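However, I cannot figure out how to escape the curly braces of the struct constructor within the `quote!` macro. This is the first time I am writing a macro, so I am also open to other suggestions if `quote!` is not the right tool here.

Tags: rust, macros

Self-answer: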

use proc_macro2::TokenStream;
use quote::quote;
use syn::{parse_macro_input, parse_quote, Data, DeriveInput, Fields, GenericParam, Generics};

#[proc_macro_derive(FromLeBytes)]
pub fn derive_heap_size(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
    // Parse the input tokens into a syntax tree.
    let input = parse_macro_input!(input as DeriveInput);

    // Used in the quasi-quotation below as `#name`.
    let name = input.ident;

    // Add a bound `T: lestream::FromLeBytes` to every type parameter T.
    let generics = add_trait_bounds(input.generics);
    let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();

    // Generate the body of `from_le_bytes`: one read per field, then the constructor.
    let body = impl_body(&input.data);

    let expanded = quote! {
        // The generated impl.
        impl #impl_generics lestream::FromLeBytes for #name #ty_generics #where_clause {
            fn from_le_bytes<T>(bytes: &mut T) -> lestream::Result<Self>
            where
                T: Iterator<Item = u8>
            {
                #body
            }
        }
    };

    // Hand the output tokens back to the compiler.
    proc_macro::TokenStream::from(expanded)
}

// Add a bound `T: lestream::FromLeBytes` to every type parameter T.
fn add_trait_bounds(mut generics: Generics) -> Generics {
    for param in &mut generics.params {
        if let GenericParam::Type(ref mut type_param) = *param {
            type_param.bounds.push(parse_quote!(lestream::FromLeBytes));
        }
    }
    generics
}

// Generate the statements that read each field, followed by the struct constructor.
fn impl_body(data: &Data) -> TokenStream {
    match *data {
        Data::Struct(ref strct) => {
            match strct.fields {
                Fields::Named(ref fields) => {
                    // Expands to one statement per field, in declaration order:
                    //
                    //     let x = <TypeOfX as lestream::FromLeBytes>::from_le_bytes(bytes)?;
                    //
                    // followed by `Ok(Self { x, y, z })`. The field names are
                    // collected separately so that the braces of the struct
                    // literal stay balanced inside a single `quote!` call.
                    let mut tokens = TokenStream::new();
                    let mut constructor_fields = TokenStream::new();

                    for field in &fields.named {
                        let item_name = field.ident.clone().unwrap();
                        let item_type = &field.ty;

                        tokens.extend(quote! {
                            let #item_name = <#item_type as lestream::FromLeBytes>::from_le_bytes(bytes)?;
                        });

                        constructor_fields.extend(quote! {
                            #item_name,
                        });
                    }

                    tokens.extend(quote! { Ok(Self { #constructor_fields }) });
                    tokens
                }
                _ => panic!("The FromLeBytes derive can only be applied to structs"),
            }
        }
        Data::Enum(_) | Data::Union(_) => unimplemented!(),
    }
}

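3 upvotes, Cerberus, 11/9/2023, #2

A way to do this is already described in the self-answer, but I will try to add some more context to the problem in question.

The reason for this error is that the output of `quote` must be a valid sequence of Rust tokens, or, more precisely, a sequence of `TokenTree`s. And Rust has no token for a single opening or closing brace; instead, it has the notion of a group, i.e. a subsequence of tokens placed inside a matched pair of braces (or other delimiters).

Therefore, it is invalid to have unmatched delimiters anywhere in a `TokenStream`. And that is exactly what you are trying to do with `quote!{ Self { }`.

As for why it must be this way, let's consider the following code: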
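
To make the "group" idea concrete, here is a toy model using only the standard library. It is not the real `proc_macro2` API: the names `Tree` and `to_trees` are made up for this sketch, and tokens are assumed to be separated by whitespace. The point is only that a tree-shaped token type has nowhere to store a lone brace, so unbalanced input cannot be represented at all.

```rust
/// Toy stand-in for a token tree: a leaf token or a delimited group.
/// There is deliberately no variant for a single, unmatched delimiter.
#[derive(Debug, PartialEq)]
enum Tree {
    Leaf(String),
    /// The opening delimiter and the trees between it and its closer.
    Group(char, Vec<Tree>),
}

/// Returns the closing delimiter that matches an opening one.
fn closer(open: char) -> char {
    match open {
        '(' => ')',
        '[' => ']',
        _ => '}',
    }
}

/// Groups whitespace-separated tokens into trees.
/// Returns `None` if any delimiter is unmatched or mismatched.
fn to_trees(src: &str) -> Option<Vec<Tree>> {
    // Stack of open groups; the bottom entry is the top-level sequence.
    let mut stack: Vec<(char, Vec<Tree>)> = vec![(' ', Vec::new())];
    for word in src.split_whitespace() {
        match word {
            "(" | "[" | "{" => stack.push((word.chars().next()?, Vec::new())),
            ")" | "]" | "}" => {
                let (open, children) = stack.pop()?;
                // An empty stack means we just popped the top level:
                // this closer has no matching opener.
                if stack.is_empty() || word.chars().next() != Some(closer(open)) {
                    return None;
                }
                stack.last_mut()?.1.push(Tree::Group(open, children));
            }
            _ => stack.last_mut()?.1.push(Tree::Leaf(word.to_string())),
        }
    }
    // Any group still open at the end has no closing delimiter.
    if stack.len() == 1 {
        stack.pop().map(|(_, trees)| trees)
    } else {
        None
    }
}
```

With this model, `to_trees("Self { bar , spamm }")` yields a leaf followed by one group, while `to_trees("Self {")` yields `None`, mirroring why a `quote!` call cannot produce just the opening half of a struct literal.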

fn foo() -> proc_macro2::TokenStream  {
    quote!{ { }; // (1)
    // imagine here's some code generating `TokenStream`,
    // so that the function would be valid if this `quote` is valid
}

fn bar() -> proc_macro2::TokenStream  {
    quote!{ { }; // (2)
    // imagine here's the same code as above in `foo`
    }
}

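Let's ask ourselves: how exactly should the parser walk through this code in each case?

Note that the function `bar` here actually compiles. It does nothing useful, of course, but it is correct: as written, the `quote` macro in it generates a `TokenStream` containing an empty block and a semicolon (the comments are stripped). If the comment were replaced by some code, that code would be passed to `quote` and would not be parsed by the Rust compiler, only lexed. It might well be nonsense from the parser's point of view, but since it is `quote` that receives these tokens, that "nonsense" does not actually matter.
In other words, with `bar` the parser sees the macro's opening brace and then consumes everything as-is until the matching closing brace.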

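Now imagine that we also want `foo` to compile, with `quote` yielding a `TokenStream` that holds a single opening brace. That means the parser must treat the closing brace on line (1) as the one closing the `quote` macro, and must then run itself on the remaining tokens, since they are now outside the macro context and therefore have to be parsed.

But note that while parsing lines (1) and (2) there is no way to tell the two cases apart: `foo` and `bar` are exactly the same sequence of tokens, except for one extra closing brace. To check whether that extra brace is really there, the parser would need unbounded lookahead: scan to the end of the file and then, on finding that the braces really are unmatched, rewind and start parsing again.

Moreover, strictly speaking, it may well be impossible to know which exact brace must be treated as the unmatched one. Think of it this way: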

fn foo() {
    quote::quote!{ { }; { };
}

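If Rust allowed unmatched braces in macros, this code would be ambiguous: where exactly must `quote` end? At the first closing brace, so that the pair of braces after it is parsed as a block? Or at the last one, so that `quote` itself receives a block as input (ahead of a single unmatched brace)? In a case like this the compiler could not decide and would simply raise an error again.

In short: allowing unmatched braces would open the door to too much complexity, both for compiler authors and for users of the language.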
