Quickly find differences between two large text files


2022-01-01



Problem description


I have two 3GB text files, each with around 80 million lines. About 99.9% of the lines are identical between the two files (file A has 60,000 unique lines, file B has 80,000 unique lines).


How can I quickly find those unique lines in the two files? Are there any ready-to-use command-line tools for this? I'm using Python, but I suspect it's unlikely that there is an efficient Pythonic way to load the files and compare them.

Any suggestions are welcome.
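For reference, here is a minimal sketch of one way this could be approached in pure Python. It is not taken from the question. It assumes that line order and repeated lines within a file do not matter, and that there is enough free disk space for a temporary copy of both files. The function name `unique_lines` and the `buckets` parameter are hypothetical. The idea is to split each file into bucket files by a hash of each line, so that identical lines always end up in the same bucket index, and then take set differences one bucket pair at a time so that only a fraction of the data is in memory at once.

```python
import os
import tempfile
import zlib


def unique_lines(path_a, path_b, buckets=64):
    """Return (only_in_a, only_in_b) as sets of lines (bytes, no newline).

    Hypothetical helper: set semantics, so duplicates within a file collapse.
    """
    with tempfile.TemporaryDirectory() as tmp:

        def split(path, tag):
            # Route every line to a bucket file chosen by a stable hash of its
            # content, so equal lines from A and B share the same bucket index.
            outs = [open(os.path.join(tmp, f"{tag}_{i}"), "wb")
                    for i in range(buckets)]
            try:
                with open(path, "rb") as src:
                    for line in src:
                        line = line.rstrip(b"\r\n")
                        outs[zlib.crc32(line) % buckets].write(line + b"\n")
            finally:
                for f in outs:
                    f.close()

        def load(tag, i):
            # Read one bucket back into a set of lines.
            with open(os.path.join(tmp, f"{tag}_{i}"), "rb") as f:
                return {line.rstrip(b"\n") for line in f}

        split(path_a, "a")
        split(path_b, "b")

        only_a, only_b = set(), set()
        for i in range(buckets):
            set_a, set_b = load("a", i), load("b", i)
            only_a |= set_a - set_b  # lines seen only in file A
            only_b |= set_b - set_a  # lines seen only in file B
        return only_a, only_b
```

With files of the size described, a larger `buckets` value lowers peak memory at the cost of more open file handles during the split. The right value depends on available RAM and on the operating system's open-file limit, so treat 64 as a placeholder rather than a tuned setting.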