Asked by: Parine · Asked: 9/27/2023 · Last edited by: Edo Akse, Parine · Updated: 9/27/2023 · Views: 112
Convert two input files into certain string rules with Python
Q:
Suppose there are two input files, as shown below:
input1.txt (consists only of hstep3_* lines)
hstep3_num00 = a5;
hstep3_num01 = 3b;
hstep3_num02 = 4f;
hstep3_num03 = 27;
input2.txt (the letters inside the braces are random characters, separated by ,)
some random strings that are not 'hstep' form
...
match hstep1_num00 = {eau,t,nb,v,d}; // MATCH
match hstep1_num01 = {c,bul,kv,e}; // MATCH
...
match hstep3_num00 = {u_ku,b,ntv,q}; // MATCH
match hstep3_num01 = {qq,rask,cb_p}; // MATCH
match hstep3_num02 = {c,a,ha,w,ykl}; // MATCH
match hstep3_num03 = {p,gu,enb_q_b,z,d}; // MATCH
...
some random strings that are not 'hstep' form
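What I want to do is pick out all the left-hand sides of the equations in input1.txt and match the corresponding braces from input2.txt to their values.
The final output.txt should therefore look like this: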
{u_ku,b,ntv,q} = a5;
{qq,rask,cb_p} = 3b;
{c,a,ha,w,ykl} = 4f;
{p,gu,enb_q_b,z,d} = 27;
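To do this with Python, I have thought of using readlines() and split().
Also, since the number of characters inside the braces {} is not uniform from line to line, I think I have to use a regular expression to delimit the range inside them, but it did not work as I expected...
Can anyone give me a solution or some guidance?
Any help would be appreciated. Thanks!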
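Answers:
2 upvotes
mozway
9/27/2023
#1
You can do this in two steps with regular expressions. The first uses re.findall on the contents of input2.txt to build a dictionary of the matches; the second loops over the lines of input1.txt and uses re.sub to perform the replacement: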
import re
with open('input2.txt') as f2:
    dic = dict(re.findall(r'match ([^\s=]+) = ([^;]+); // MATCH', f2.read()))
# {'hstep1_num00': '{eau,t,nb,v,d}', 'hstep1_num01': '{c,bul,kv,e}',
#  'hstep3_num00': '{u_ku,b,ntv,q}', 'hstep3_num01': '{qq,rask,cb_p}',
#  'hstep3_num02': '{c,a,ha,w,ykl}', 'hstep3_num03': '{p,gu,enb_q_b,z,d}'}
with open('input1.txt') as f1, open('output1.txt', 'w') as f_out:
    for line in f1:
        f_out.write(re.sub(r'^\S+', lambda m: dic.get(m.group(), ''), line))
Output file:
{u_ku,b,ntv,q} = a5;
{qq,rask,cb_p} = 3b;
{c,a,ha,w,ykl} = 4f;
{p,gu,enb_q_b,z,d} = 27;
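To make the two steps concrete, here is a minimal sketch that runs the same regular expressions on in-memory strings instead of files. The sample text is copied from the question; the helper name `build_rules` and the constants `INPUT1`, `INPUT2` and `MATCH_RE` are made up for illustration and are not part of the answer.

```python
import re

# Sample data copied from the question, held in strings so no files are needed.
INPUT1 = """\
hstep3_num00 = a5;
hstep3_num01 = 3b;
hstep3_num02 = 4f;
hstep3_num03 = 27;
"""

INPUT2 = """\
some random strings that are not 'hstep' form
match hstep1_num00 = {eau,t,nb,v,d}; // MATCH
match hstep3_num00 = {u_ku,b,ntv,q}; // MATCH
match hstep3_num01 = {qq,rask,cb_p}; // MATCH
match hstep3_num02 = {c,a,ha,w,ykl}; // MATCH
match hstep3_num03 = {p,gu,enb_q_b,z,d}; // MATCH
some random strings that are not 'hstep' form
"""

# Group 1 is the name, group 2 is everything up to the semicolon.
MATCH_RE = re.compile(r'match ([^\s=]+) = ([^;]+); // MATCH')


def build_rules(input1_text, input2_text):
    """Hypothetical helper: return the rewritten lines of input1 as one string."""
    # findall yields (name, braces) tuples, so dict() maps name -> braces.
    mapping = dict(MATCH_RE.findall(input2_text))
    out = []
    for line in input1_text.splitlines(keepends=True):
        # Swap the leading name for its braces; unknown names become ''.
        out.append(re.sub(r'^\S+', lambda m: mapping.get(m.group(), ''), line))
    return ''.join(out)


result = build_rules(INPUT1, INPUT2)
print(result, end='')
```

Passing a function as the replacement means the looked-up text is inserted literally, so backslashes in the braces would not be interpreted as regex escapes.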
Alignment
If you need to align the strings, you can modify the approach above.
Fixed width (or based on the maximum possible width):
import re
# same as previously
with open('input2.txt') as f2:
    dic = dict(re.findall(r'match ([^\s=]+) = ([^;]+); // MATCH', f2.read()))
WIDTH = max([len(v) for k, v in dic.items() if k.startswith('hstep3_')])
with open('input1.txt') as f1, open('output1.txt', 'w') as f_out:
    for line in f1:
        f_out.write(re.sub(r'^\S+', lambda m: dic.get(m.group(), '').ljust(WIDTH), line))
Dynamic width, based on the longest string actually used:
import re
# same as previously
with open('input2.txt') as f2:
    dic = dict(re.findall(r'match ([^\s=]+) = ([^;]+); // MATCH', f2.read()))
with open('input1.txt') as f1:
    WIDTH = max(len(dic.get(line.split(maxsplit=1)[0], '')) for line in f1)
with open('input1.txt') as f1, open('output1.txt', 'w') as f_out:
    for line in f1:
        f_out.write(re.sub(r'^\S+', lambda m: dic.get(m.group(), '').ljust(WIDTH), line))
Output:
{u_ku,b,ntv,q}     = a5;
{qq,rask,cb_p}     = 3b;
{c,a,ha,w,ykl}     = 4f;
{p,gu,enb_q_b,z,d} = 27;
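The padding step can also be sketched in isolation. This assumes the braces and values have already been paired up; the `pairs` list below is illustrative sample data taken from the question, not something the answer defines. `str.ljust` and the `<` format specifier should give the same left-justified result.

```python
# Illustrative data: (braces, value) pairs taken from the question.
pairs = [
    ('{u_ku,b,ntv,q}', 'a5'),
    ('{qq,rask,cb_p}', '3b'),
    ('{c,a,ha,w,ykl}', '4f'),
    ('{p,gu,enb_q_b,z,d}', '27'),
]

# Width of the longest left-hand side that is actually written out.
width = max(len(left) for left, _ in pairs)

# Pad with str.ljust ...
with_ljust = [f'{left.ljust(width)} = {value};' for left, value in pairs]
# ... or with a nested format specifier; both left-justify to `width`.
with_format = [f'{left:<{width}} = {value};' for left, value in pairs]

for line in with_ljust:
    print(line)
```

Computing the width from the strings that are written, rather than from every entry in the dictionary, keeps the column as narrow as the output allows.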
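Comments
0 upvotes
mozway
9/27/2023
@DarkKnight What did I miss?
0 upvotes
CtrlZ
9/27/2023
Variable spacing, so that everything lines up correctly
0 upvotes
mozway
9/27/2023
Is that a requirement, @DarkKnight? It isn't mentioned explicitly, so what is the rule? The longest string? A fixed maximum?
0 upvotes
mozway
9/27/2023
In any case, it's easy to handle; I've updated the answer.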
1 upvote
Edo Akse
9/27/2023
#2
The code below is not optimized; it is written to give the OP a better understanding of the steps involved.
# read input1 and turn into dict
input1 = {}
with open("input1.txt") as infile:
    for line in infile.readlines():
        key, value = line.split(" = ")
        input1[key] = value
# read input 2 and store the maxlen value
input2 = []
maxlen = 0
with open("input2.txt") as infile:
    for line in infile.readlines():
        # only process lines that start with "match hstep3"
        if line.startswith("match hstep3"):
            key = line.split(" ")[1]
            value = line.split("= ")[1].split(";")[0]
            input2.append([key, value])
            # get the maxlength and store it for future use
            maxlen = max(maxlen, len(value))
# finally, produce the required output and write to file
with open("output.txt", "w") as outfile:
    for line in input2:
        key, value = line
        # use an f-string to produce the required output
        newline = f"{value:<{maxlen}} = {input1[key]}"
        outfile.write(newline)
Contents of output.txt:
{u_ku,b,ntv,q}     = a5;
{qq,rask,cb_p}     = 3b;
{c,a,ha,w,ykl}     = 4f;
{p,gu,enb_q_b,z,d} = 27;
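One thing to watch in the first loop above: unpacking `line.split(" = ")` directly raises a `ValueError` if input1.txt contains a blank or otherwise malformed line. A minimal sketch of a more tolerant parse, assuming such lines should simply be skipped (the helper name `parse_assignments` and the sample list are hypothetical):

```python
def parse_assignments(lines):
    """Hypothetical helper: map 'name' -> 'value;' for lines like 'name = value;'."""
    result = {}
    for line in lines:
        # partition always returns three parts; sep is '' when ' = ' is absent.
        key, sep, value = line.strip().partition(' = ')
        if not sep:
            continue  # assumption: silently skip blank or malformed lines
        result[key] = value
    return result


# Illustrative input: two valid lines mixed with a blank and a malformed one.
sample = [
    'hstep3_num00 = a5;\n',
    '\n',
    'not an assignment\n',
    'hstep3_num01 = 3b;\n',
]
parsed = parse_assignments(sample)
print(parsed)
```

Note that this version strips the trailing newline from each value, so a caller writing the results to a file would need to add `"\n"` back.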
1 upvote
Muhammad Shamshad Aslam
9/27/2023
#3
If your data is in the format you mentioned, or close to it, this should work.
result_2_dict = {}
result_1_dict = {}
file_2_list = []
file_1_list = []
# input2.txt: split every line on '=' and keep the pieces
with open('input2.txt', 'r') as file:
    for line in file:
        parts = line.split('=')
        file_2_list.append(parts)
for item in file_2_list:
    # lines without '=' (the random text) have no item[1], so skip them
    if len(item) == 2 and "h" in item[0]:
        result_2_dict[item[0].strip("match").strip()] = item[1].strip().split(" ")[0].strip(";")
# input1.txt: same split, mapping each name to its value
with open('input1.txt', 'r') as file:
    for line in file:
        parts = line.split('=')
        file_1_list.append(parts)
for item in file_1_list:
    if len(item) == 2 and "h" in item[0]:
        result_1_dict[item[0].strip()] = item[1].strip().strip(";")
# pair the braces with the values for names present in both files
matches_values = {}
for key, value in result_2_dict.items():
    if key in result_1_dict:
        matches_values[value] = result_1_dict[key]
for key, value in matches_values.items():
    # the ';' was stripped while parsing, so add it back here
    print(f"{key} = {value};")
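A caveat about `strip("match")` as used above: `str.strip` treats its argument as a set of characters to remove from both ends, not as a prefix. It works on the question's data because a space follows each name, but it could eat into a name that ends in one of those letters. A small sketch of the difference, assuming Python 3.9+ for `str.removeprefix` (the helper name `strip_keyword` and the sample string are made up for illustration):

```python
def strip_keyword(text, keyword='match'):
    """Hypothetical helper: drop a leading keyword, then surrounding whitespace."""
    # removeprefix (Python 3.9+) removes the exact prefix once, if present.
    return text.removeprefix(keyword).strip()


# Illustrative name ending in letters that also occur in "match".
tricky = 'match hstep3_mat'

# strip('match') removes any of m, a, t, c, h from BOTH ends.
by_chars = tricky.strip('match').strip()
# removeprefix only removes the literal leading 'match'.
by_prefix = strip_keyword(tricky)

print(by_chars)   # hstep3_
print(by_prefix)  # hstep3_mat
```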
Comments
match …
hstep…