Asked by: Charles2a Asked: 9/27/2023 Updated: 9/27/2023 Views: 28
Incorrect incrementation of the 'lvl' variable when parsing Markdown files to generate a table of contents in Python
Q:
I am building a documentation-hosting application similar to GitBook. To keep it compatible with GitBook, I use a SUMMARY.md file to build my file tree, like this:
SUMMARY.md:
# Table of contents
* [Introduction](./)
## Active Directory <a href="#ad" id="ad"></a>
* [Category X](path/to/category-x/README.md)
  * [Subcategory 1](path/to/category-x/subcategory-1.md)
  * [Subcategory 2](path/to/category-x/subcategory-2.md)
  * [Subcategory 3](path/to/category-x/subcategory-3.md)
  * [🛠️ Tool A](path/to/category-x/tool-a.md)
  * [Subcategory 4](path/to/category-x/subcategory-4.md)
  * [Subcategory 5](path/to/category-x/subcategory-5.md)
  * [Subcategory 6](path/to/category-x/subcategory-6/README.md)
    * [Sub-subcategory 1](path/to/category-x/subcategory-6/sub-subcategory-1.md)
    * [Sub-subcategory 2](path/to/category-x/subcategory-6/sub-subcategory-2.md)
* [Category Y](path/to/category-y/README.md)
  * [Subcategory 7](path/to/category-y/subcategory-7.md)
  * [Subcategory 8](path/to/category-y/subcategory-8.md)
  * [Subcategory 9](path/to/category-y/subcategory-9.md)
  * [🛠️ Tool B](path/to/category-y/tool-b.md)
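Basically, I need to wrap every category that has subcategories in a div with the class "cat-title". The div holds the category title and an icon; clicking it expands the sub-articles and subcategories through an event listener, like this: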
<div class="cat-title">
  <li><a href="/path/to/category-x/README.md">Category X</a><div class="arrow">→</div></li>
</div>
<ul class="subcategory1">
  <li><a href="/path/to/category-x/subcategory-1.md">Subcategory 1</a></li>
  <li><a href="/path/to/category-x/subcategory-2.md">Subcategory 2</a></li>
  <li><a href="/path/to/category-x/subcategory-3.md">Subcategory 3</a></li>
  <li><a href="/path/to/category-x/tool-a.md">🛠️ Tool A</a></li>
  <li><a href="/path/to/category-x/subcategory-4.md">Subcategory 4</a></li>
  <li><a href="/path/to/category-x/subcategory-5.md">Subcategory 5</a></li>
  <li><a href="/path/to/category-x/subcategory-6/README.md">Subcategory 6</a></li>
  <ul class="subcategory2">
    <li><a href="/path/to/category-x/subcategory-6/sub-subcategory-1.md">Sub-subcategory 1</a></li>
    <li><a href="/path/to/category-x/subcategory-6/sub-subcategory-2.md">Sub-subcategory 2</a></li>
  </ul>
  <li><a href="/path/to/category-y/README.md">Category Y</a></li>
  <ul class="subcategory1">
    <li><a href="/path/to/category-y/subcategory-7.md">Subcategory 7</a></li>
    <li><a href="/path/to/category-y/subcategory-8.md">Subcategory 8</a></li>
    <li><a href="/path/to/category-y/subcategory-9.md">Subcategory 9</a></li>
    <li><a href="/path/to/category-y/tool-b.md">🛠️ Tool B</a></li>
  </ul>
</ul>
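Producing this markup first requires splitting each list line into its indentation, link text, and link target. The following is a minimal sketch of that step, not code from the application: `ITEM_RE` and `parse_item` are hypothetical names, the pattern assumes `*` bullets followed by a single `[name](link)` pair, and tabs are assumed to count as four columns.

```python
import re

# Hypothetical pattern: leading whitespace, a '*' bullet, then [name](link).
# The groups are non-greedy so brackets later on the line are not swallowed.
ITEM_RE = re.compile(r'^([ \t]*)\*\s+\[(.*?)\]\((.*?)\)')

def parse_item(line):
    """Return (indent_width, name, link) for a list line, or None for anything else."""
    m = ITEM_RE.match(line)
    if m is None:
        return None
    indent, name, link = m.groups()
    # Assumption: a tab counts as four columns of indentation.
    return len(indent.expandtabs(4)), name, link

print(parse_item('  * [Subcategory 1](path/to/category-x/subcategory-1.md)\n'))
print(parse_item('## Active Directory <a href="#ad" id="ad"></a>\n'))
```

Headings and blank lines yield `None`, so the caller can handle them separately from list items.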
To achieve this result, I use this function:
import re

def summary():
    regex_link_name = re.compile(r'\[(.*)\]\((.*)\)')
    regex_title = re.compile(r'##\s+([^<]+)')
    with open('SUMMARY.md', 'r') as f:
        lines = f.readlines()
    html = ''
    lvl = 0
    i = 0
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            i += 1
            continue
        if line.strip().startswith('*'):
            # Number of spaces before the bullet
            space_count = line.split('*')[0].count(' ')
            current_lvl = space_count
            if current_lvl > lvl:
                lvl += 1  # why tf is the first lvl 1 and not 0 when it's the last subcategory of a category, causes last subcategory1 to be a subcategory2
                # TODO : findout
                html += f'<ul class="subcategory{lvl}">\n'
            elif current_lvl < lvl:
                html += '</ul>\n'
                lvl -= 1
            matched = regex_link_name.findall(line)
            name = matched[0][0]
            link = matched[0][1]
            # Check if this line is followed by a sublist
            is_subcategory = False
            if i + 1 < len(lines):
                next_line = lines[i + 1]
                if next_line.strip().startswith('*') and next_line.split('*')[0].count(' ') > space_count:
                    is_subcategory = True
            if is_subcategory:
                html += f'<div class="cat-title">\n<li><a href="/{link}">{name}</a><div class="arrow">→</div></li>\n</div>\n'
            else:
                html += f'<li><a href="/{link}">{name}</a></li>\n'
            print(name, lvl)
        if line.strip().startswith('#'):
            # A heading closes every open list and resets the level
            html += '</ul>\n' * lvl
            lvl = 0
            titles = regex_title.findall(line)
            if titles:
                title = titles[0]
                html += f'<span>{title}</span>\n'
        i += 1
    with open('output.html', 'w', encoding='utf-8') as out_file:
        out_file.write(html)
    return html
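The problem is that whenever I reach the last level-1 subcategory of a category (which should get the class subcategory1), my debugger shows the 'lvl' variable being incremented one time too many.
I have no idea why this happens, since I never manipulate the 'lvl' variable.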
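One way to narrow this down is to replay only the level bookkeeping, separately from the HTML generation. The sketch below is a hypothetical helper (`trace_levels` is not part of the original code) that applies the same rule as the loop: it compares the raw number of leading spaces with a counter that moves by one step at a time. The indent widths passed in are assumptions, two spaces per level in the first call and one space per level in the second.

```python
def trace_levels(space_counts):
    """Replay only the lvl bookkeeping of summary() for a list of indent widths."""
    lvl = 0
    history = []
    for spaces in space_counts:
        if spaces > lvl:
            lvl += 1   # one step up, however far the indentation jumped
        elif spaces < lvl:
            lvl -= 1   # one step down, however far the indentation dropped
        history.append(lvl)
    return history

# Assumed two spaces per level: parent, three children, next parent
print(trace_levels([0, 2, 2, 2, 0]))        # [0, 1, 2, 2, 1]
# Assumed one space per level: parent, children, grandchildren, next parent, child
print(trace_levels([0, 1, 1, 2, 2, 0, 1]))  # [0, 1, 1, 2, 2, 1, 1]
```

In both runs the counter and the real depth drift apart: with two-space indentation the second child already pushes the counter to 2, and with one-space indentation a drop of two levels only lowers it by one. The two values only stay in step when every level is indented by exactly one space and the depth never drops by more than one level at once.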
It only happens when the subcategory is the last one in its category, as in this example:
* [Category X](path/to/category-x/README.md)
* [🛠️ Tool A](path/to/tool-a.md)
* [Category Y](path/to/category-y.md)
* [Category Z](path/to/category-z.md)
* [Category W](path/to/category-w.md)
* [Category V](path/to/category-v.md)
* [Category U](path/to/category-u/README.md)
* [Subcategory 1](path/to/category-u/subcategory-1.md)
* [Subcategory 2](path/to/category-u/subcategory-2.md)
* [Category T](path/to/category-t/README.md)
* [Subcategory 3](path/to/category-t/subcategory-3.md)
* [Subcategory 4](path/to/category-t/subcategory-4.md)
* [Subcategory 5](path/to/category-t/subcategory-5.md)
* [🛠️ Tool B](path/to/category-t/tool-b.md)
## Web services <a href="#web" id="web"></a>
As you can see, the subcategory tag is not even created:
<div class="cat-title">
<li><a href="/path/to/category-x/README.md">Category X</a><div class="arrow">→</div></li>
</div>
<li><a href="/path/to/category-x/tool-a.md">🛠️ Tool A</a></li>
<li><a href="/path/to/category-x/category-y.md">Category Y</a></li>
<li><a href="/path/to/category-x/category-z.md">Category Z</a></li>
<li><a href="/path/to/category-x/category-w.md">Category W</a></li>
<li><a href="/path/to/category-x/category-v.md">Category V</a></li>
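I have tried many things, such as looking for an incorrectly closed tag or a faulty increment of the 'lvl' variable, but I have not found anything.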
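A possible alternative, offered as a sketch under stated assumptions rather than a confirmed fix, is to keep a stack of the indent widths of the currently open lists instead of a single counter. That way the indentation width per level does not matter, and a drop of several levels closes several lists. `render_items`, `ITEM_RE`, and `open_indents` are hypothetical names; the sketch assumes top-level items start at column 0, handles the list items of one section only (no headings), and escapes names and links with `html.escape`, which the original function does not do.

```python
import re
from html import escape

# Hypothetical pattern: leading whitespace, a '*' bullet, then [name](link).
ITEM_RE = re.compile(r'^([ \t]*)\*\s+\[(.*?)\]\((.*?)\)')

def render_items(lines):
    """Render one section's list lines as nested <ul> blocks (sketch only)."""
    items = [(len(m.group(1)), m.group(2), m.group(3))
             for m in map(ITEM_RE.match, lines) if m]
    out = []
    open_indents = []  # indent width of every <ul> that is currently open
    for idx, (indent, name, link) in enumerate(items):
        # Close as many lists as the indentation dropped past.
        while open_indents and indent < open_indents[-1]:
            open_indents.pop()
            out.append('</ul>')
        # Open one list when the item is deeper than the innermost open one.
        if indent > (open_indents[-1] if open_indents else 0):
            open_indents.append(indent)
            out.append(f'<ul class="subcategory{len(open_indents)}">')
        anchor = f'<a href="/{escape(link)}">{escape(name)}</a>'
        has_children = idx + 1 < len(items) and items[idx + 1][0] > indent
        if has_children:
            out.append('<div class="cat-title">')
            out.append(f'<li>{anchor}<div class="arrow">→</div></li>')
            out.append('</div>')
        else:
            out.append(f'<li>{anchor}</li>')
    out.extend('</ul>' for _ in open_indents)  # close whatever is still open
    return '\n'.join(out) + '\n'

sample = [
    '* [A](a/README.md)',
    '  * [A1](a/a1/README.md)',
    '    * [A1a](a/a1/a1a.md)',
    '* [B](b.md)',
]
print(render_items(sample))
```

The class number comes from the stack depth, so the first nested list is always `subcategory1` regardless of how many spaces the file uses per level.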
A: No answers yet