如何简化巨大的算术表达式?

how to simplify huge arithmatic expression?

提问人:HabibS 提问时间:7/29/2023 更新时间:7/30/2023 访问量:99

问:

我有一个巨大的表情,比如:

x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52] + FUNC1(z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51] + FUNC0(h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50] + FUNC0(a49) + FUNC2(a49, x49, y49)) + FUNC2(h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50] + FUNC0(a49) + FUNC2(a49, x49, y49), a49, x49) + y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51] + FUNC1(h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50] + FUNC0(a49) + FUNC2(a49, x49, y49) + w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50] + FUNC1(a49 + v49 + FUNC1(x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52]) + RET(x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52], y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + movr[53] + m1[53]) + RET(a49 + v49 + FUNC1(x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52]) + RET(x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52], y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + movr[53] + m1[53], x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52], y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + movr[54] + m1[54]) + RET(h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50] + FUNC0(a49) + FUNC2(a49, x49, y49) + w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50] + FUNC1(a49 + v49 + FUNC1(x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52]) + RET(x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52], y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + movr[53] + m1[53]) + RET(a49 + v49 + FUNC1(x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52]) + RET(x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52], y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) ...

我需要通过用其他变量替换重复表达式的某些部分来简化此表达式。 例如:

a = RET(v49, t49, z49) 
b= w49 + h49 + FUNC1(v49) + a + movr[50] + m1[50] 
and so on...

我的问题是;这确实是一个巨大的表达式(如 2MB 长的表达式),手动执行此操作几乎是不可能的,也没有错误。

现在我的问题是;有没有应用程序可以做这样的事情?或任何 Python 程序可以这样做?

我可以很容易地对python进行编程,但我缺乏这样的算法知识。

任何帮助表示赞赏。

Python 数学 简化

评论

0赞 Dr. V 7/29/2023
这需要一些编码,但我可能会通过重载运算符和函数来编写数据类型,从而创建一个符号图。然后,您可以识别该图中的常见表达式,然后再转储它。如果你那里没有函数,我的库会完全满足你的需要。不过,请随意浏览代码以获取想法:saltype.readthedocs.io/en/latest/api.html#simplify。也许可以帮助你,因为它以某种方式处理功能:web.casadi.orgsaltypecasadi
0赞 HabibS 7/29/2023
@Dr.V:我会试一试。
0赞 Mike 'Pomax' Kamermans 7/29/2023
听起来您正在尝试识别较大字符串中的重复子字符串,在这种情况下,像 stackoverflow.com/questions/37499968 这样的东西可能对阅读很有用。

答:

3赞 Michael Hodel 7/29/2023 #1

以下函数提取所有函数调用并将它们放入变量中。

def simplify(progstr, variable_prefix='x'):
    progstr = f' {progstr} '
    prog = []
    while progstr.count('(') > 0:
        for i, c in enumerate(progstr):
            if c == ')':
                c2, i2 = None, i
                while c2 != '(':
                    i2 -= 1
                    c2 = progstr[i2]
                i2 -= 1
                c2 = progstr[i2]
                while c2 not in [',', ' ', '(', ')']:
                    i2 -= 1
                    c2 = progstr[i2]
                variable = progstr[i2+1:i+1]
                vname = f'{variable_prefix}{str(len(prog))}'
                progstr = progstr.replace(variable, vname)
                prog.append(f'{vname} = {variable}')
                break
    prog.append(progstr[1:-1])
    return '\n'.join(prog)

expression = 'x49 + t49 + FUNC1(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51]) + RET(y49 + z49 + FUNC1(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50]) + RET(w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49, t49) + movr[51] + m1[51], w49 + h49 + FUNC1(v49) + RET(v49, t49, z49) + movr[50] + m1[50], v49) + movr[52] + m1[52]'

print(simplify(expression, 'x'))

指纹

x0 = FUNC1(v49)
x1 = RET(v49, t49, z49)
x2 = FUNC1(w49 + h49 + x0 + x1 + movr[50] + m1[50])
x3 = RET(w49 + h49 + x0 + x1 + movr[50] + m1[50], v49, t49)
x4 = FUNC1(y49 + z49 + x2 + x3 + movr[51] + m1[51])
x5 = RET(y49 + z49 + x2 + x3 + movr[51] + m1[51], w49 + h49 + x0 + x1 + movr[50] + m1[50], v49)
t49 + x4 + x5 + movr[52] + m1[52]

除了使代码更具可读性之外,这还可以避免大量重复计算,这应该会大大加快计算速度(特别是如果单个函数调用成本很高),例如,这里只执行一次,而不是 5 次。FUNC1(v49)

(编辑):工作原理:

当表达式中有括号时,请执行以下操作:从左到右遍历表达式,直到遇到右括号(调用此位置),然后向左走,直到遇到左括号,然后向左走,直到遇到空格、逗号或括号(并调用此位置)。然后,该段标记第一个函数调用。然后,只需将每个匹配项替换为变量名称,然后添加到变量列表中。jiexpression[i:j]expression[i:j]expressionxx = expression[i:j]

关于代码的一些评论:

  • 它只在一些有效且格式良好的代码上进行了简短的测试,例如,它假设每个逗号后面有一个空格,每个运算符周围有一个空格(+、*、...,没有空格),右括号前没有空格,左括号后没有空格,...),
  • 它还假设变量前缀不会与表达式中的变量和函数名称发生冲突(可以通过了解名称并提供一些已知的安全前缀或调整代码本身来选择有效的前缀来使其更加健壮),
  • 它进一步假设(对于函数和一些常量参数)总是返回相同的结果,无论事先计算了什么,并且f(*args)fargs
  • 它还为仅出现一次的子表达式创建变量,这可能是不可取的(但这也应该很容易通过对代码进行一些小的调整来更改)。

评论

0赞 HabibS 7/29/2023
这看起来很棒!将立即测试它。
0赞 HabibS 7/29/2023
荣誉!我已经测试过了,它非常适合我的目的。