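The difflib Standard Library Explained: A Handy Tool for Text Comparison in Python
difflib is a module in the Python standard library that provides classes and functions for computing and comparing differences between sequences. Here we focus on its use for text comparison.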
- Using difflib.Differ() for line-by-line difference comparison
import difflib
text1 = """The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
"""
text2 = """The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Simple is better than complex.
"""
d = difflib.Differ()
diff = d.compare(text1.splitlines(), text2.splitlines())
print('\n'.join(diff))
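Each line that Differ produces starts with a two-character code: "- " for a line found only in the first sequence, "+ " for a line found only in the second, and two spaces for a line common to both. The following is a minimal sketch of sorting the output by those prefixes; the names old_lines, new_lines, removed, added and unchanged are illustrative and not part of the original example.

```python
import difflib

old_lines = [
    "Beautiful is better than ugly.",
    "Explicit is better than implicit.",
    "Simple is better than complex.",
]
new_lines = [
    "Beautiful is better than ugly.",
    "Simple is better than complex.",
]

# Materialize the generator so it can be scanned more than once.
diff = list(difflib.Differ().compare(old_lines, new_lines))

# Strip the two-character prefix and group lines by their status.
removed = [line[2:] for line in diff if line.startswith("- ")]
added = [line[2:] for line in diff if line.startswith("+ ")]
unchanged = [line[2:] for line in diff if line.startswith("  ")]

print("removed:", removed)
print("added:", added)
print("unchanged:", unchanged)
```

When two lines are similar but not identical, Differ can also emit lines starting with "? " that mark where the differences fall within the line. The sketch above simply ignores them.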
- Using difflib.HtmlDiff() to generate a diff report in HTML format
import difflib
text1 = """The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
"""
text2 = """The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Simple is better than complex.
"""
d = difflib.HtmlDiff()
print(d.make_file(text1.splitlines(), text2.splitlines()))
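Printing the HTML to the terminal is rarely what you want; the report is usually saved to a file and opened in a browser. A possible sketch is shown here. The output file name difflib_report.html, the temporary-directory location and the column labels "text1" and "text2" are assumptions made for illustration.

```python
import difflib
import tempfile
from pathlib import Path

old_lines = [
    "Beautiful is better than ugly.",
    "Explicit is better than implicit.",
    "Simple is better than complex.",
]
new_lines = [
    "Beautiful is better than ugly.",
    "Simple is better than complex.",
]

# fromdesc and todesc label the two columns of the side-by-side table.
html = difflib.HtmlDiff().make_file(
    old_lines, new_lines, fromdesc="text1", todesc="text2"
)

# Hypothetical output location; adjust to wherever the report should live.
report_path = Path(tempfile.gettempdir()) / "difflib_report.html"
report_path.write_text(html, encoding="utf-8")
print(f"Report written to {report_path}")
```

make_file() returns a complete HTML document as a string. If only the table is needed for embedding in an existing page, HtmlDiff also offers make_table().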
- Using difflib.SequenceMatcher() for finer-grained sequence comparison
import difflib
text1 = "abcd"
text2 = "axcd"
s = difflib.SequenceMatcher(None, text1, text2)
# Each opcode describes how to turn text1[i1:i2] into text2[j1:j2].
for tag, i1, i2, j1, j2 in s.get_opcodes():
    print(f"{tag} {text1[i1:i2]} {text2[j1:j2]}")
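To make the result of get_opcodes() concrete, the sketch below collects the opcodes for the same two strings and also calls ratio(), which returns a similarity score between 0 and 1. The variable names opcodes and similarity are illustrative.

```python
import difflib

s = difflib.SequenceMatcher(None, "abcd", "axcd")

# Each tuple is (tag, i1, i2, j1, j2): indices into the first and second string.
opcodes = s.get_opcodes()
print(opcodes)
# [('equal', 0, 1, 0, 1), ('replace', 1, 2, 1, 2), ('equal', 2, 4, 2, 4)]

# ratio() is 2 * M / T, where M is the number of matching elements
# and T is the total length of both sequences: 2 * 3 / 8.
similarity = s.ratio()
print(similarity)
# 0.75
```

The first argument, None, means that no elements are treated as junk. SequenceMatcher works on any sequence of hashable items, so the same calls apply to lists of lines or tokens as well as to strings.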
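difflib offers a rich set of text comparison features and makes it easy to compare files, multi-line strings, and other sequences. In practice, pick the tool that fits the task: Differ for a readable line-by-line delta, HtmlDiff for a side-by-side HTML report, and SequenceMatcher when you need the individual edit operations.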
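File comparison follows the same pattern: read each file into a list of lines and pass the lists to difflib. The sketch below uses difflib.unified_diff(), a function not covered above, to produce output in the familiar unified format. The file names old.txt and new.txt and their contents are made up for the example and are created in a temporary directory.

```python
import difflib
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical input files, created here so the example is self-contained.
    old_path = Path(tmp) / "old.txt"
    new_path = Path(tmp) / "new.txt"
    old_path.write_text(
        "Beautiful is better than ugly.\nExplicit is better than implicit.\n",
        encoding="utf-8",
    )
    new_path.write_text("Beautiful is better than ugly.\n", encoding="utf-8")

    # keepends=True keeps the newline on every line, which unified_diff expects.
    old_lines = old_path.read_text(encoding="utf-8").splitlines(keepends=True)
    new_lines = new_path.read_text(encoding="utf-8").splitlines(keepends=True)

unified = list(
    difflib.unified_diff(old_lines, new_lines, fromfile="old.txt", tofile="new.txt")
)
print("".join(unified), end="")
```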