Generate explicit instantiations with multiple parameters with preprocessor

Asked by: wittn · Asked: 10/16/2022 · Last edited by: user12002570, wittn · Updated: 10/22/2022 · Views: 170

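Q:

In my project I want to have a bunch of explicit instantiations of templated functions to reduce build time. I now have a lot of functions, which can take different template arguments. For this reason (and in case I want to add more), I don't want to type them out by hand but have the preprocessor generate them.

Example of what I want to generate: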

template bool match_any<x>();
template bool match_any<y<x>>();
template bool match_expr<x,y,z>();

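where x is an integer between 1 and a defined max_dim, y can be one of three values (real, bool, index), and z is either 0 or 1.

I now want to generate these three functions with every possible combination of the arguments (e.g. template bool match_any<0>(); template bool match_any<1>(); template bool match_any<2>(); ...), but not only those, since I have ~100 functions with a similar structure.

By using macros to define forward declarations, I have figured out how to do a repeating pattern: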

#define REP_INT_1(f) f(1)
#define REP_INT_2(f) REP_INT_1(f) f(2)
#define REP_INT_3(f) REP_INT_2(f) f(3)
#define REP_INT_4(f) REP_INT_3(f) f(4)

#define REP_INT(n,gen) REP_INT_(n,gen)
#define REP_INT_(n,gen) REP_INT_##n(gen)

which I can then use like this:

#define GEN(x) template bool match_any<x>();
REP_INT(3, GEN)
#undef GEN

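Of course, it is simple to repeat this pattern, e.g. for strings.

But my problem now is that, due to the nature of the pattern (I pass GEN as a "function"), I can't give GEN two arguments. Sure, I could change the repeating pattern so that it also works with two arguments, but then I would have to create a new pattern like this for every number of arguments and for every ordering of them, which in the end somewhat defeats the purpose: I would have a big block for each function, and at that point I could write it by hand.

Does anybody have an idea for a different way to "loop" over the possible arguments, or for how to extend my existing approach so that it works with multiple arguments?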

Tags: C++, C-Preprocessor, metaprogramming, explicit-instantiation


A:

1 upvote · Elliott · 10/16/2022 · #1

You can add to your macros like this to get a somewhat maintainable list:

#define GEN_X(f) REP_INT(3, f)
#define GEN_Y(f, x) f(x, real) f(x, bool) f(x, index)
#define GEN_Z(f, x, y) f(x, y, 0) f(x, y, 1)

// GEN_F3 = generate functions with at least 3 arguments
#define GEN_F3(x, y, z) template bool match_any<x, y, z>();
#define GEN_F2(x, y) template bool match_any<y<x>>(); GEN_Z(GEN_F3, x, y)
#define GEN_F1(x) template bool match_any<x>(); GEN_Y(GEN_F2, x)
GEN_X(GEN_F1)

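Demo

We keep the variable generators separate, and then we can chain them together to get all the permutations: your macro generates the x's, which are piped into GEN_F1, which handles the single-argument case and passes these values on to the y-generator, and so on.

Note that although we've made the source code linearly maintainable, we can't avoid the exponential growth in executable size.

To address your extension: if you want to be able to handle any numeric permutation of two arguments, e.g. x={1,2,3}, y={1,2,3,4}, then intuitively you might want to make an adjustment like this: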

#define GEN_X(f) REP_INT(3, f)
#define GEN_Y(f, x) REP_INT(4, f)

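This almost works, but macro expansion prevents the same REP_INT_* macros from being used again within one recursive expansion (at least the way I've done it).

One workaround is to have two lists (at least one of which needs to handle variadic input, appending the number to the end). The REP_INT macros can be identical, but their names need to differ for the expansion to continue.

Then we can solve the extension like this: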

#define GEN_X(f) REP1_INT_3(f)
#define GEN_Y(f, x) REP2_INT_4(f, x)

#define GEN_F2(x, y) template bool match_any<x, y>();
#define GEN_F1(x) GEN_Y(GEN_F2, x)
GEN_X(GEN_F1)

Demo
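
The REP1_INT_* and REP2_INT_* families used above are not spelled out in this answer (they live in the linked demo). Below is a minimal sketch of what they might look like, assuming the second family simply forwards its leading arguments and appends the counter at the end. The match_any definition and the PAIR_CODE / pair_codes helpers are hypothetical and exist only so the result can be checked at compile time.

```cpp
// Hypothetical stand-in for the real function template.
template <int X, int Y>
bool match_any() { return X <= Y; }

// First family: calls f(1) ... f(n).
#define REP1_INT_1(f) f(1)
#define REP1_INT_2(f) REP1_INT_1(f) f(2)
#define REP1_INT_3(f) REP1_INT_2(f) f(3)

// Second family: same shape under a different name, so it can still expand
// while a REP1_INT_* macro is being expanded. It forwards the leading
// arguments and appends the counter: f(args..., 1) ... f(args..., n).
#define REP2_INT_1(f, ...) f(__VA_ARGS__, 1)
#define REP2_INT_2(f, ...) REP2_INT_1(f, __VA_ARGS__) f(__VA_ARGS__, 2)
#define REP2_INT_3(f, ...) REP2_INT_2(f, __VA_ARGS__) f(__VA_ARGS__, 3)
#define REP2_INT_4(f, ...) REP2_INT_3(f, __VA_ARGS__) f(__VA_ARGS__, 4)

#define GEN_X(f) REP1_INT_3(f)
#define GEN_Y(f, x) REP2_INT_4(f, x)

// Explicit instantiations match_any<1, 1>() ... match_any<3, 4>().
#define GEN_F2(x, y) template bool match_any<x, y>();
#define GEN_F1(x) GEN_Y(GEN_F2, x)
GEN_X(GEN_F1)
#undef GEN_F1
#undef GEN_F2

// Same generators, different payload: record each (x, y) pair as 10*x + y.
#define PAIR_CODE(x, y) (10 * (x) + (y)),
#define GEN_F1(x) GEN_Y(PAIR_CODE, x)
constexpr int pair_codes[] = { GEN_X(GEN_F1) };
#undef GEN_F1
#undef PAIR_CODE

static_assert(sizeof(pair_codes) / sizeof(pair_codes[0]) == 12,
              "3 * 4 combinations are generated");
static_assert(pair_codes[0] == 11 && pair_codes[11] == 34,
              "x varies slowest, y fastest");
```

Every additional numeric parameter needs one more family with its own name, which is the maintenance cost of this approach.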

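Comments

0 upvotes · wittn · 10/17/2022
This is very nice, and it works for the test case. Unfortunately I didn't include this in my example, but what do I do if two of the parameters are numbers? (E.g. if I want to generate match_any<1,1>, match_any<2,1>, match_any<3,1>, match_any<1,2>, etc.)
1 upvote · Elliott · 10/18/2022
@wittn, that's a different problem. I'm not sure it's possible without a second REP_INT family. I'll think about it.
0 upvotes · wittn · 10/18/2022
Thanks a lot for considering it! Also, by "a second REP_INT family", do you mean a second family similar in style/length to the first one, or one family for every combination? (The latter would be very long, but the former would be fine.)
1 upvote · Elliott · 10/18/2022
@wittn, no problem. I just edited my answer to address your comment, but here's an example of how to have macro families that are identical apart from their names: coliru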
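
1 upvote · Turtlefight · 10/22/2022 · #2

Assuming C++20 is available to you and you don't mind somewhat complicated macros, you can use recursive macros, leveraging the new __VA_OPT__ token and deferred expressions, which together allow (limited) recursion in macros.

Note: C++20 and __VA_OPT__ aren't strictly required, but they simplify the implementation a lot.

In your case it would allow you to write something like this (godbolt):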

#define GEN(x, y, z) template bool match_expr<x,y,z>();
FOR_EACH_COMBINATION(
  GEN,
  (1, 2, 3),
  (real, bool, index),
  (0, 1)
)
#undef GEN

This would expand to:

template bool match_expr<1,real,0>();
template bool match_expr<1,real,1>();
template bool match_expr<1,bool,0>();
template bool match_expr<1,bool,1>();
template bool match_expr<1,index,0>();
template bool match_expr<1,index,1>();
template bool match_expr<2,real,0>();
template bool match_expr<2,real,1>();
template bool match_expr<2,bool,0>();
template bool match_expr<2,bool,1>();
template bool match_expr<2,index,0>();
template bool match_expr<2,index,1>();
template bool match_expr<3,real,0>();
template bool match_expr<3,real,1>();
template bool match_expr<3,bool,0>();
template bool match_expr<3,bool,1>();
template bool match_expr<3,index,0>();
template bool match_expr<3,index,1>();

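1. How recursive macros work

Here is a good read that explains recursive macros in detail (or a shorter version here).

The basic gist is that you can force an expression to require multiple preprocessor scans before it is fully evaluated, e.g. (godbolt):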

#define FOO(a) [a]
#define DELAY DELAY_IMPL_NOTHING
#define DELAY_IMPL_NOTHING()

// normal function-like macro expansion
FOO(bar) // => [bar]

// function-like macro will not expand, due to delay
FOO DELAY() (bar) // => FOO (bar)

At the same time, you can force an additional scan of any expression by simply passing it to a function-like macro (godbolt):

#define EXPAND(...) __VA_ARGS__

// forcing a rescan on the expression will expand the function-like macro
EXPAND(FOO DELAY() (bar)) // => [bar]

Those two things combined allow us to write a "recursive" macro (godbolt):

#define RECURSE(fn, value) RECURSE_AGAIN DELAY() () (fn, fn(value))
#define RECURSE_AGAIN() RECURSE

RECURSE(FOO, bar)                         // => RECURSE_AGAIN () (FOO, [bar])
EXPAND(RECURSE(FOO, bar))                 // => RECURSE_AGAIN () (FOO, [[bar]])
EXPAND(EXPAND(RECURSE(FOO, bar)))         // => RECURSE_AGAIN () (FOO, [[[bar]]])
EXPAND(EXPAND(EXPAND(RECURSE(FOO, bar)))) // => RECURSE_AGAIN () (FOO, [[[[bar]]]])

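Note that expanding RECURSE does not directly result in a recursive call to itself. (Using RECURSE directly within RECURSE would not be possible, because the preprocessor doesn't allow macro recursion.)

Instead, the RECURSE macro produces a delayed expression, which only leads to another expansion of RECURSE if another preprocessor scan happens. So the second call to RECURSE happens inside EXPAND, after the original call to RECURSE has already been expanded.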
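
The expansions claimed above can be checked without running the preprocessor by hand: stringify the expanded result and compare it in a static_assert. This is only a sketch; STR, STR_IMPL and same_tokens are hypothetical helpers that are not part of the answer, and the comparison ignores spaces because the exact spacing of stringified macro output may vary between compilers.

```cpp
// Hypothetical helpers: STR macro-expands its argument once, then stringifies it.
#define STR(...) STR_IMPL(__VA_ARGS__)
#define STR_IMPL(...) #__VA_ARGS__

// Compares two strings while skipping spaces on both sides.
constexpr bool same_tokens(const char* a, const char* b) {
    return *a == ' '  ? same_tokens(a + 1, b)
         : *b == ' '  ? same_tokens(a, b + 1)
         : *a != *b   ? false
         : *a == '\0' ? true
                      : same_tokens(a + 1, b + 1);
}

// Macros from the answer.
#define FOO(a) [a]
#define DELAY DELAY_IMPL_NOTHING
#define DELAY_IMPL_NOTHING()
#define EXPAND(...) __VA_ARGS__
#define RECURSE(fn, value) RECURSE_AGAIN DELAY() () (fn, fn(value))
#define RECURSE_AGAIN() RECURSE

// The delayed call survives a single scan ...
static_assert(same_tokens(STR(FOO DELAY() (bar)), "FOO (bar)"),
              "FOO is not expanded yet");
// ... and is expanded by one additional scan.
static_assert(same_tokens(STR(EXPAND(FOO DELAY() (bar))), "[bar]"),
              "EXPAND forces the rescan");

// Each extra EXPAND layer performs exactly one more step of the recursion.
static_assert(same_tokens(STR(RECURSE(FOO, bar)),
                          "RECURSE_AGAIN () (FOO, [bar])"),
              "zero extra scans");
static_assert(same_tokens(STR(EXPAND(EXPAND(RECURSE(FOO, bar)))),
                          "RECURSE_AGAIN () (FOO, [[[bar]]])"),
              "two extra scans");
```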

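But so far there is no way to stop the recursion of RECURSE. This is where __VA_OPT__ comes to the rescue: __VA_OPT__ ( content ) expands to content if __VA_ARGS__ is non-empty, and to nothing otherwise. (We can use this to end the recursion once all arguments have been processed.)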
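
As a minimal illustration of that rule, the following sketch uses a hypothetical HAS_ARGS macro (not part of the answer) whose value depends only on whether any variadic argument was passed. It assumes a compiler that accepts __VA_OPT__ (C++20, or an earlier mode where the compiler offers it as an extension).

```cpp
// HAS_ARGS is a hypothetical name used only for this illustration.
// __VA_OPT__(+ 1) contributes "+ 1" only when at least one argument is passed.
#define HAS_ARGS(...) (0 __VA_OPT__(+ 1))

static_assert(HAS_ARGS() == 0, "no arguments: __VA_OPT__ expands to nothing");
static_assert(HAS_ARGS(a) == 1, "one argument: __VA_OPT__ expands to its content");
static_assert(HAS_ARGS(a, b, c) == 1, "the number of arguments does not matter");
```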

A simple FOR_EACH implementation using recursive macros might look like this (godbolt):

// forces 3 rescans (2 invocations of EXPAND + 1 for EXPAND2)
#define EXPAND2(...) EXPAND(EXPAND(__VA_ARGS__))

#define FOR_EACH(fn, ...) \
  __VA_OPT__(EXPAND2(FOR_EACH_IMPL(fn, __VA_ARGS__)))

#define FOR_EACH_IMPL(fn, first, ...) \
  fn(first) \
  __VA_OPT__( \
    FOR_EACH_IMPL_AGAIN DELAY() () (fn, __VA_ARGS__) \
  )
#define FOR_EACH_IMPL_AGAIN() FOR_EACH_IMPL

FOR_EACH(FOO, a, b) // => [a] [b]
FOR_EACH(FOO, a, b, c, d) // => [a] [b] [c] [d]

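__VA_OPT__ is used in FOR_EACH_IMPL to stop the recursion once we have called fn for each argument.

The only limiting factor of this approach is the number of additional rescans provided by the EXPAND macro (e.g. EXPAND only provides 1, EXPAND2 provides 3).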
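
To make the limit concrete, here is a sketch that feeds one element too many to the FOR_EACH above, which is built on EXPAND2. The leftover delayed call stays in the output unexpanded. STR, STR_IMPL and same_tokens are hypothetical checking helpers that are not part of the answer, and the comparison ignores spaces.

```cpp
// Hypothetical helpers: stringify the expanded result, compare ignoring spaces.
#define STR(...) STR_IMPL(__VA_ARGS__)
#define STR_IMPL(...) #__VA_ARGS__
constexpr bool same_tokens(const char* a, const char* b) {
    return *a == ' '  ? same_tokens(a + 1, b)
         : *b == ' '  ? same_tokens(a, b + 1)
         : *a != *b   ? false
         : *a == '\0' ? true
                      : same_tokens(a + 1, b + 1);
}

// Macros from the answer (FOR_EACH uses EXPAND2, i.e. 3 additional rescans).
#define FOO(a) [a]
#define DELAY DELAY_IMPL_NOTHING
#define DELAY_IMPL_NOTHING()
#define EXPAND(...) __VA_ARGS__
#define EXPAND2(...) EXPAND(EXPAND(__VA_ARGS__))

#define FOR_EACH(fn, ...) \
  __VA_OPT__(EXPAND2(FOR_EACH_IMPL(fn, __VA_ARGS__)))
#define FOR_EACH_IMPL(fn, first, ...) \
  fn(first) \
  __VA_OPT__(FOR_EACH_IMPL_AGAIN DELAY() () (fn, __VA_ARGS__))
#define FOR_EACH_IMPL_AGAIN() FOR_EACH_IMPL

// Four elements need three recursive steps, which EXPAND2 can just provide.
static_assert(same_tokens(STR(FOR_EACH(FOO, a, b, c, d)), "[a] [b] [c] [d]"),
              "fully expanded");

// A fifth element would need a fourth step; the delayed call is left behind.
static_assert(same_tokens(STR(FOR_EACH(FOO, a, b, c, d, e)),
                          "[a] [b] [c] [d] FOR_EACH_IMPL_AGAIN () (FOO, e)"),
              "ran out of rescans");
```

In real code such a leftover usually shows up as a confusing compile error at the use site, so it is worth checking the preprocessed output when a list grows.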

But this can easily be addressed by adding a few more layers of expansion; for example, this EXPAND_BIG macro already provides 86 rescans:

#define EXPAND_BIG(...) EXPAND1(EXPAND1(EXPAND1(EXPAND1(__VA_ARGS__))))
#define EXPAND1(...) EXPAND2(EXPAND2(EXPAND2(EXPAND2(__VA_ARGS__))))
#define EXPAND2(...) EXPAND3(EXPAND3(EXPAND3(EXPAND3(__VA_ARGS__))))
#define EXPAND3(...) __VA_ARGS__

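Adding another EXPAND* macro would result in 342 rescans, so each additional layer roughly quadruples the number of recursive calls available through EXPAND_BIG.

Note that this does come at a cost, namely compile time. The bigger the EXPAND macro stack, the longer compilation takes, so it is best to use the smallest number of rescans that works for your use case.

So with enough rescans you can do basically anything in macros, e.g. iteration (see FOR_EACH above), left folds (godbolt; explanation), or, as in your case, creating all possible combinations of multiple sets.


2. Implementing FOR_EACH_COMBINATION

2.1 Code

First we need a few utility macros (mostly the ones from above, plus a few extra ones):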

// Simple concat
// example: CONCAT(foo,bar) => foobar
#define CONCAT(a, b) CONCAT_IMPL(a, b)
#define CONCAT_IMPL(a, b) a##b

// returns the first argument
// example: VAARGS_HEAD(1,2,3) => 1
#define VAARGS_HEAD(head, ...) head
// returns all arguments except the first one
// example: VAARGS_TAIL(1,2,3) => 2,3
#define VAARGS_TAIL(head, ...) __VA_ARGS__

// basic preprocessor if
// examples:
//  - IIF(1)(a,b) => a
//  - IIF(0)(a,b) => b
#define IIF(value) CONCAT(IIF_,value)
#define IIF_1(true_, false_) true_
#define IIF_0(true_, false_) false_

// evaluates to 1 if it has been called with at least 1 argument, 0 otherwise
// examples:
//   - HAS_VAARGS(1,2) => 1
//   - HAS_VAARGS()    => 0
#define HAS_VAARGS(...) VAARGS_HEAD(__VA_OPT__(1,) 0)

// forces the preprocessor to repeatedly scan an expression
// this definition forces a total of 86 scans, but can easily be extended
// by adding more EXPAND*() macros (each additional one more than
// quadruples the amount of scans)
// examples:
//   - CONCAT DELAY() (a,b)         => CONCAT (a,b)
//   - EXPAND(CONCAT DELAY() (a,b)) => ab
#define EXPAND(...) EXPAND1(EXPAND1(EXPAND1(EXPAND1(__VA_ARGS__))))
#define EXPAND1(...) EXPAND2(EXPAND2(EXPAND2(EXPAND2(__VA_ARGS__))))
#define EXPAND2(...) EXPAND3(EXPAND3(EXPAND3(EXPAND3(__VA_ARGS__))))
#define EXPAND3(...) __VA_ARGS__

// evaluates to nothing, but requires an additional preprocessor scan.
// this can be used to delay macro evaluations.
// examples:
//   - CONCAT(a,b)                  => ab
//   - CONCAT DELAY() (a,b)         => CONCAT (a,b)
//   - EXPAND(CONCAT DELAY() (a,b)) => ab
#define DELAY DELAY_IMPL_NOTHING
#define DELAY_IMPL_NOTHING()

// discards all arguments, evaluates to nothing
#define SWALLOW(...)

// appends an element to a tuple
// examples:
//   - TUPLE_APPEND((a,b), c) => (a,b,c)
//   - TUPLE_APPEND((), a)    => (a)
#define TUPLE_APPEND(tuple, el) (TUPLE_APPEND_IMPL_UNPACK tuple el) 
#define TUPLE_APPEND_IMPL_UNPACK(...) __VA_ARGS__ __VA_OPT__(,)

With those macros we can build the FOR_EACH_COMBINATION macro:

// if __VA_ARGS__ is empty then it expands to fn(args);
// otherwise it'll expand to FOR_EACH_COMBINATION_IMPL_RECURSE(fn, args, __VA_ARGS__)
#define FOR_EACH_COMBINATION_IMPL(fn, args, ...) \
  IIF(HAS_VAARGS(__VA_ARGS__))( \
    FOR_EACH_COMBINATION_IMPL_RECURSE, \
    FOR_EACH_COMBINATION_IMPL_CALL \
  )(fn, args __VA_OPT__(, __VA_ARGS__))

// evaluates the user-provided function-like macro fn with arguments args.
// example: FOR_EACH_COMBINATION_IMPL_CALL(foo, (1,2)) => foo(1,2)
#define FOR_EACH_COMBINATION_IMPL_CALL(fn, args) \
  fn args

// if tuple has at least 1 element it calls FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY;
// otherwise it stops recursion.
// examples:
//   - FOR_EACH_COMBINATION_IMPL_RECURSE(fn, (), (a, b))
//     => FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY(fn, (), (a,b))
//   - FOR_EACH_COMBINATION_IMPL_RECURSE(fn, (), ())
//     => 
#define FOR_EACH_COMBINATION_IMPL_RECURSE(fn, args, tuple, ...) \
  IIF(HAS_VAARGS tuple)( \
    FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY, \
    SWALLOW \
  ) DELAY() ( \
    fn, args, tuple __VA_OPT__(, __VA_ARGS__) \
  )

// calls FOR_EACH_COMBINATION_IMPL twice;
// once with the first element of tuple appended to args,
// and a second time with the first element of tuple removed.
// examples:
//   - FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY(fn, (), (a,b), (c,d))
//     => FOR_EACH_COMBINATION_IMPL(fn, (a), (c,d))
//        FOR_EACH_COMBINATION_IMPL(fn, (), (b), (c,d))
#define FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY(fn, args, tuple, ...) \
    FOR_EACH_COMBINATION_IMPL DELAY() ( \
      fn, \
      TUPLE_APPEND(args, VAARGS_HEAD tuple) \
      __VA_OPT__(, __VA_ARGS__) \
    ) \
    \
    FOR_EACH_COMBINATION_IMPL DELAY() ( \
      fn, \
      args, \
      (VAARGS_TAIL tuple) \
      __VA_OPT__(, __VA_ARGS__) \
    )

// takes a function-like macro (fn) and an arbitrary amount of tuples.
// the tuples can be of any size (only constrained by the number
// of expansions provided by the EXPAND macro)
// fn will be evaluated for each possible combination from the tuples.
// examples:
//   - FOR_EACH_COMBINATION(foo, (a,b))
//     => foo(a) foo(b)
//   - FOR_EACH_COMBINATION(foo, (a,b), (1,2))
//     => foo(a, 1) foo(a, 2) foo(b, 1) foo(b, 2)
#define FOR_EACH_COMBINATION(fn, ...) \
  EXPAND( \
    FOR_EACH_COMBINATION_IMPL( \
      fn, \
      () \
      __VA_OPT__(, __VA_ARGS__) \
    ) \
  )

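2.2 How it works

FOR_EACH_COMBINATION_IMPL is the heart of this macro.

  • fn is the user-provided function-like macro that will be called for each combination from the passed-in tuples
  • args is a tuple that collects the arguments we need to pass to fn
  • __VA_ARGS__ holds the tuples we still need to pick an element from

Each call to FOR_EACH_COMBINATION_IMPL results in one of two cases:

  • If there are no more tuples to pick elements from (__VA_ARGS__ is empty), we simply call fn(args):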
    FOR_EACH_COMBINATION_IMPL(fn, (a, 1)) => fn(a, 1)
    
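  • If there are still tuples to pick elements from (__VA_ARGS__ contains at least one argument), we call FOR_EACH_COMBINATION_IMPL_RECURSE, which checks whether the first tuple has any options left to choose from:
    • If there are still options to choose from, this results in 2 additional calls to FOR_EACH_COMBINATION_IMPL: one that handles the branch in which the first element is picked, and another that handles all remaining elements: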
      FOR_EACH_COMBINATION_IMPL(fn, (), (a, b), (1, 2))
      =>
      FOR_EACH_COMBINATION_IMPL(fn, (a), (1, 2))
      FOR_EACH_COMBINATION_IMPL(fn, (), (b), (1, 2))
      
    • If there are no options left, it evaluates to nothing (which stops the recursion):
      FOR_EACH_COMBINATION_IMPL(fn, (), (), (1, 2)) => /* nothing */
      

Here is an example expansion that illustrates the steps taking place:

FOR_EACH_COMBINATION(foo, (a, b), (1, 2))
=>
FOR_EACH_COMBINATION_IMPL(foo, (), (a, b), (1, 2))
=>
FOR_EACH_COMBINATION_IMPL(foo, (a), (1, 2))
FOR_EACH_COMBINATION_IMPL(foo, (), (b), (1, 2))
=>
FOR_EACH_COMBINATION_IMPL(foo, (a, 1))
FOR_EACH_COMBINATION_IMPL(foo, (a), (2))
FOR_EACH_COMBINATION_IMPL(foo, (b), (1, 2))
FOR_EACH_COMBINATION_IMPL(foo, (), (), (1, 2)) // expands to nothing
=>
foo(a, 1)
FOR_EACH_COMBINATION_IMPL(foo, (a, 2))
FOR_EACH_COMBINATION_IMPL(foo, (a), ()) // expands to nothing
FOR_EACH_COMBINATION_IMPL(foo, (b, 1))
FOR_EACH_COMBINATION_IMPL(foo, (b), (2))
=>
foo(a, 1)
foo(a, 2)
foo(b, 1)
FOR_EACH_COMBINATION_IMPL(foo, (b, 2))
FOR_EACH_COMBINATION_IMPL(foo, (b), ()) // expands to nothing
=>
foo(a, 1)
foo(a, 2)
foo(b, 1)
foo(b, 2)

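2.3 Examples

Here are a few examples of FOR_EACH_COMBINATION.
Note that you can pass as many tuples as you want, and each tuple can contain as many elements as you want (the only limiting factor on size is the number of rescans provided by the EXPAND macro).

2.3.1 Simple loop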
#define GEN(x) template void do_something<x>();
FOR_EACH_COMBINATION(
    GEN,
    (1,2,3)
)
#undef GEN

Expands to:

template void do_something<1>();
template void do_something<2>();
template void do_something<3>();

2.3.2 Your match_expr function
// helper for getting a tuple of n consecutive int values, starting at 1.
// currently only works up to 5, but can be easily expanded by
// adding more INT_TUPLE_* macros.
// examples:
//   - INT_TUPLE(0) => ()
//   - INT_TUPLE(3) => (1,2,3)
//   - INT_TUPLE(5) => (1,2,3,4,5)
#define INT_TUPLE(size) (CONCAT(INT_TUPLE_, size))
#define INT_TUPLE_0 
#define INT_TUPLE_1 1
#define INT_TUPLE_2 INT_TUPLE_1, 2
#define INT_TUPLE_3 INT_TUPLE_2, 3
#define INT_TUPLE_4 INT_TUPLE_3, 4
#define INT_TUPLE_5 INT_TUPLE_4, 5

#define GEN(x, y, z) template bool match_expr<x,y,z>();
FOR_EACH_COMBINATION(
  GEN,
  INT_TUPLE(3),
  (real, bool, index),
  (0, 1)
)
#undef GEN

Expands to:

template bool match_expr<1,real,0>();
template bool match_expr<1,real,1>();
template bool match_expr<1,bool,0>();
template bool match_expr<1,bool,1>();
template bool match_expr<1,index,0>();
template bool match_expr<1,index,1>();
/* ... */
template bool match_expr<3,real,0>();
template bool match_expr<3,real,1>();
template bool match_expr<3,bool,0>();
template bool match_expr<3,bool,1>();
template bool match_expr<3,index,0>();
template bool match_expr<3,index,1>();

2.3.3 All 8-bit patterns

// all possible arrangements of 8 bits, in order.
// all8BitPatterns[1] == {0,0,0,0,0,0,0,1}
// all8BitPatterns[4] == {0,0,0,0,0,1,0,0}
// all8BitPatterns[9] == {0,0,0,0,1,0,0,1}
// etc..
#define GEN(b1,b2,b3,b4,b5,b6,b7,b8) {b1,b2,b3,b4,b5,b6,b7,b8},
int all8BitPatterns[256][8] = { 
    FOR_EACH_COMBINATION(
        GEN,
        (0, 1), (0, 1), (0, 1), (0, 1),
        (0, 1), (0, 1), (0, 1), (0, 1)
    )
};
#undef GEN

Expands to:

int all8BitPatterns[256][8] = {
  {0,0,0,0,0,0,0,0},
  {0,0,0,0,0,0,0,1},
  {0,0,0,0,0,0,1,0},
  {0,0,0,0,0,0,1,1},
  /* ... */
  {1,1,1,1,1,1,0,0},
  {1,1,1,1,1,1,0,1},
  {1,1,1,1,1,1,1,0},
  {1,1,1,1,1,1,1,1},
};
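
The follow-up case raised in the comments on the first answer (two numeric parameters, e.g. match_any<1,1>, match_any<2,1>, ...) fits the same scheme: pass two numeric tuples. The sketch below repeats the macros from section 2.1 with the comments stripped so that it is self-contained. The match_any definition and the PAIR_CODE / pair_codes helpers are hypothetical and only serve to check the generated combinations at compile time. Like the rest of this answer it assumes __VA_OPT__ support.

```cpp
// Macros from section 2.1, comments stripped.
#define CONCAT(a, b) CONCAT_IMPL(a, b)
#define CONCAT_IMPL(a, b) a##b
#define VAARGS_HEAD(head, ...) head
#define VAARGS_TAIL(head, ...) __VA_ARGS__
#define IIF(value) CONCAT(IIF_,value)
#define IIF_1(true_, false_) true_
#define IIF_0(true_, false_) false_
#define HAS_VAARGS(...) VAARGS_HEAD(__VA_OPT__(1,) 0)
#define EXPAND(...) EXPAND1(EXPAND1(EXPAND1(EXPAND1(__VA_ARGS__))))
#define EXPAND1(...) EXPAND2(EXPAND2(EXPAND2(EXPAND2(__VA_ARGS__))))
#define EXPAND2(...) EXPAND3(EXPAND3(EXPAND3(EXPAND3(__VA_ARGS__))))
#define EXPAND3(...) __VA_ARGS__
#define DELAY DELAY_IMPL_NOTHING
#define DELAY_IMPL_NOTHING()
#define SWALLOW(...)
#define TUPLE_APPEND(tuple, el) (TUPLE_APPEND_IMPL_UNPACK tuple el)
#define TUPLE_APPEND_IMPL_UNPACK(...) __VA_ARGS__ __VA_OPT__(,)
#define FOR_EACH_COMBINATION_IMPL(fn, args, ...) \
  IIF(HAS_VAARGS(__VA_ARGS__))( \
    FOR_EACH_COMBINATION_IMPL_RECURSE, \
    FOR_EACH_COMBINATION_IMPL_CALL \
  )(fn, args __VA_OPT__(, __VA_ARGS__))
#define FOR_EACH_COMBINATION_IMPL_CALL(fn, args) fn args
#define FOR_EACH_COMBINATION_IMPL_RECURSE(fn, args, tuple, ...) \
  IIF(HAS_VAARGS tuple)( \
    FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY, \
    SWALLOW \
  ) DELAY() (fn, args, tuple __VA_OPT__(, __VA_ARGS__))
#define FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY(fn, args, tuple, ...) \
  FOR_EACH_COMBINATION_IMPL DELAY() ( \
    fn, TUPLE_APPEND(args, VAARGS_HEAD tuple) __VA_OPT__(, __VA_ARGS__)) \
  FOR_EACH_COMBINATION_IMPL DELAY() ( \
    fn, args, (VAARGS_TAIL tuple) __VA_OPT__(, __VA_ARGS__))
#define FOR_EACH_COMBINATION(fn, ...) \
  EXPAND(FOR_EACH_COMBINATION_IMPL(fn, () __VA_OPT__(, __VA_ARGS__)))

// Hypothetical stand-in for the real function template.
template <int X, int Y>
bool match_any() { return X <= Y; }

// Explicit instantiations match_any<1, 1>() ... match_any<3, 4>().
#define GEN(x, y) template bool match_any<x, y>();
FOR_EACH_COMBINATION(GEN, (1, 2, 3), (1, 2, 3, 4))
#undef GEN

// Same combinations, recorded as 10*x + y so they can be inspected.
#define PAIR_CODE(x, y) (10 * (x) + (y)),
constexpr int pair_codes[] = {
    FOR_EACH_COMBINATION(PAIR_CODE, (1, 2, 3), (1, 2, 3, 4))
};
#undef PAIR_CODE

static_assert(sizeof(pair_codes) / sizeof(pair_codes[0]) == 12,
              "3 * 4 combinations are generated");
static_assert(pair_codes[0] == 11 && pair_codes[11] == 34,
              "the first tuple varies slowest");
```

Unlike the REP_INT approach, no second macro family is needed here, because the numeric ranges are passed as data rather than encoded in macro names.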

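3. Try it online!

Here is a godbolt containing the FOR_EACH_COMBINATION macro and all the examples shown above.

3.1 ... or locally

Here is the full code, including all the examples above.

To see only the preprocessor output, you can use the following options:

  • For gcc and clang: -std=c++20 -E
  • For msvc: /std:c++20 /Zc:preprocessor /P
    Warning: the msvc preprocessor is not standard-conforming by default; /Zc:preprocessor must be passed to enforce standard-conforming macro evaluation.
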
// Simple concat
// example: CONCAT(foo,bar) => foobar
#define CONCAT(a, b) CONCAT_IMPL(a, b)
#define CONCAT_IMPL(a, b) a##b

// returns the first argument
// example: VAARGS_HEAD(1,2,3) => 1
#define VAARGS_HEAD(head, ...) head
// returns all arguments except the first one
// example: VAARGS_TAIL(1,2,3) => 2,3
#define VAARGS_TAIL(head, ...) __VA_ARGS__

// basic preprocessor if
// examples:
//  - IIF(1)(a,b) => a
//  - IIF(0)(a,b) => b
#define IIF(value) CONCAT(IIF_,value)
#define IIF_1(true_, false_) true_
#define IIF_0(true_, false_) false_

// evaluates to 1 if it has been called with at least 1 argument, 0 otherwise
// examples:
//   - HAS_VAARGS(1,2) => 1
//   - HAS_VAARGS()    => 0
#define HAS_VAARGS(...) VAARGS_HEAD(__VA_OPT__(1,) 0)

// forces the preprocessor to repeatedly scan an expression
// this definition forces a total of 86 scans, but can easily be extended
// by adding more EXPAND*() macros (each additional one more than
// quadruples the amount of scans)
// examples:
//   - CONCAT DELAY() (a,b)         => CONCAT (a,b)
//   - EXPAND(CONCAT DELAY() (a,b)) => ab
#define EXPAND(...) EXPAND1(EXPAND1(EXPAND1(EXPAND1(__VA_ARGS__))))
#define EXPAND1(...) EXPAND2(EXPAND2(EXPAND2(EXPAND2(__VA_ARGS__))))
#define EXPAND2(...) EXPAND3(EXPAND3(EXPAND3(EXPAND3(__VA_ARGS__))))
#define EXPAND3(...) __VA_ARGS__

// evaluates to nothing, but requires an additional preprocessor scan.
// this can be used to delay macro evaluations.
// examples:
//   - CONCAT(a,b)                  => ab
//   - CONCAT DELAY() (a,b)         => CONCAT (a,b)
//   - EXPAND(CONCAT DELAY() (a,b)) => ab
#define DELAY DELAY_IMPL_NOTHING
#define DELAY_IMPL_NOTHING()

// discards all arguments, evaluates to nothing
#define SWALLOW(...)

// appends an element to a tuple
// examples:
//   - TUPLE_APPEND((a,b), c) => (a,b,c)
//   - TUPLE_APPEND((), a)    => (a)
#define TUPLE_APPEND(tuple, el) (TUPLE_APPEND_IMPL_UNPACK tuple el) 
#define TUPLE_APPEND_IMPL_UNPACK(...) __VA_ARGS__ __VA_OPT__(,)

// if __VA_ARGS__ is empty then it expands to fn(args);
// otherwise it'll expand to FOR_EACH_COMBINATION_IMPL_RECURSE(fn, args, __VA_ARGS__)
#define FOR_EACH_COMBINATION_IMPL(fn, args, ...) \
  IIF(HAS_VAARGS(__VA_ARGS__))( \
    FOR_EACH_COMBINATION_IMPL_RECURSE, \
    FOR_EACH_COMBINATION_IMPL_CALL \
  )(fn, args __VA_OPT__(, __VA_ARGS__))

// evaluates the user-provided function-like macro fn with arguments args.
// example: FOR_EACH_COMBINATION_IMPL_CALL(foo, (1,2)) => foo(1,2)
#define FOR_EACH_COMBINATION_IMPL_CALL(fn, args) \
  fn args

// if tuple has at least 1 element it calls FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY;
// otherwise it stops recursion.
// examples:
//   - FOR_EACH_COMBINATION_IMPL_RECURSE(fn, (), (a, b))
//     => FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY(fn, (), (a,b))
//   - FOR_EACH_COMBINATION_IMPL_RECURSE(fn, (), ())
//     => 
#define FOR_EACH_COMBINATION_IMPL_RECURSE(fn, args, tuple, ...) \
  IIF(HAS_VAARGS tuple)( \
    FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY, \
    SWALLOW \
  ) DELAY() ( \
    fn, args, tuple __VA_OPT__(, __VA_ARGS__) \
  )

// calls FOR_EACH_COMBINATION_IMPL twice;
// once with the first element of tuple appended to args,
// and a second time with the first element of tuple removed.
// examples:
//   - FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY(fn, (), (a,b), (c,d))
//     => FOR_EACH_COMBINATION_IMPL(fn, (a), (c,d))
//        FOR_EACH_COMBINATION_IMPL(fn, (), (b), (c,d))
#define FOR_EACH_COMBINATION_IMPL_RECURSE_APPLY(fn, args, tuple, ...) \
    FOR_EACH_COMBINATION_IMPL DELAY() ( \
      fn, \
      TUPLE_APPEND(args, VAARGS_HEAD tuple) \
      __VA_OPT__(, __VA_ARGS__) \
    ) \
    \
    FOR_EACH_COMBINATION_IMPL DELAY() ( \
      fn, \
      args, \
      (VAARGS_TAIL tuple) \
      __VA_OPT__(, __VA_ARGS__) \
    )

// takes a function-like macro (fn) and an arbitrary amount of tuples.
// the tuples can be of any size (only constrained by the number
// of expansions provided by the EXPAND macro)
// fn will be evaluated for each possible combination from the tuples.
// examples:
//   - FOR_EACH_COMBINATION(foo, (a,b))
//     => foo(a) foo(b)
//   - FOR_EACH_COMBINATION(foo, (a,b), (1,2))
//     => foo(a, 1) foo(a, 2) foo(b, 1) foo(b, 2)
#define FOR_EACH_COMBINATION(fn, ...) \
  EXPAND( \
    FOR_EACH_COMBINATION_IMPL( \
      fn, \
      () \
      __VA_OPT__(, __VA_ARGS__) \
    ) \
  )

// helper for getting a tuple of n consecutive int values, starting at 1.
// currently only works up to 5, but can be easily expanded by
// adding more INT_TUPLE_* macros.
// examples:
//   - INT_TUPLE(0) => ()
//   - INT_TUPLE(3) => (1,2,3)
//   - INT_TUPLE(5) => (1,2,3,4,5)
#define INT_TUPLE(size) (CONCAT(INT_TUPLE_, size))
#define INT_TUPLE_0 
#define INT_TUPLE_1 1
#define INT_TUPLE_2 INT_TUPLE_1, 2
#define INT_TUPLE_3 INT_TUPLE_2, 3
#define INT_TUPLE_4 INT_TUPLE_3, 4
#define INT_TUPLE_5 INT_TUPLE_4, 5

// EXAMPLES:

#define GEN(x) template void do_something<x>();
FOR_EACH_COMBINATION(
    GEN,
    (1,2,3)
)
#undef GEN
  
#define GEN(x, y, z) template bool match_expr<x,y,z>();
FOR_EACH_COMBINATION(
  GEN,
  INT_TUPLE(3),
  (real, bool, index),
  (0, 1)
)
#undef GEN

// all possible arrangements of 8 bits, in order.
// all8BitPatterns[1] == {0,0,0,0,0,0,0,1}
// all8BitPatterns[4] == {0,0,0,0,0,1,0,0}
// all8BitPatterns[9] == {0,0,0,0,1,0,0,1}
// etc..
#define GEN(b1,b2,b3,b4,b5,b6,b7,b8) {b1,b2,b3,b4,b5,b6,b7,b8},
int all8BitPatterns[256][8] = { 
    FOR_EACH_COMBINATION(
        GEN,
        (0, 1), (0, 1), (0, 1), (0, 1),
        (0, 1), (0, 1), (0, 1), (0, 1)
    )
};
#undef GEN

int main() {}