Can only iterate once through csv reader
Problem description
So I basically have an extremely long list of strings, and a CSV file that contains a column of strings and a column of numbers. I need to loop through the extremely long list of strings, and for each one, loop through the rows of the CSV file, checking each string in the first column of the CSV to see if it occurs in my string, and if it does, add the number in the other column to something. A minimal sort of example would be:
import csv
sList = ['a cat', 'great wall', 'mediocre wall']
vals = []
with open('file.csv', 'r') as f:
    r = csv.reader(f)
    for w in sList:
        val = 0
        for row in r:
            if row[0] in w:
                val += 1
        vals.append(val)
An example of a CSV file with which I might use this could be:
a, 1
great, 2
Of course, csv.reader(f) creates an iterator that I can loop through only once. I've seen recommendations elsewhere to use itertools, but all of the recommendations I've found have been for problems that involve looping through the CSV file a small number of times, usually just twice. If I tried to use this to loop through the CSV many times, I'm unsure of what that would mean for memory consumption, and in general I'm just wondering about the smartest way to approach this problem.
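To make the one-pass behavior concrete, here is a small sketch. It uses an in-memory `io.StringIO` buffer in place of `file.csv`, which is an assumption made only so the snippet runs on its own. The second pass over the same reader yields no rows.

```python
import csv
import io

# In-memory stand-in for file.csv (assumption, so the demo is self-contained).
buf = io.StringIO("a, 1\ngreat, 2\n")
reader = csv.reader(buf)

first_pass = [row for row in reader]   # consumes every row
second_pass = [row for row in reader]  # the reader is already exhausted

print(len(first_pass), len(second_pass))
```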
Recommended answer
You need to "reset" the file iterator:
import csv
sList = ['a cat', 'great wall', 'mediocre wall']
vals = []
with open('data.csv', 'r') as f:
    r = csv.reader(f)
    for w in sList:
        val = 0
        f.seek(0)  # <-- set the iterator to beginning of the input file
        for row in r:
            print(row)
            if row[0] in w:
                val += 1
        vals.append(val)
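As an alternative to rewinding the file on every outer iteration, the rows can be read into a list once and that list reused. The sketch below is not part of the original answer. It assumes the CSV is small enough to hold in memory, and it uses an in-memory `io.StringIO` buffer with the sample data from the question in place of a real file so that it runs on its own.

```python
import csv
import io

# In-memory stand-in for the CSV file (assumption, so the demo is self-contained).
csv_text = "a, 1\ngreat, 2\n"

sList = ['a cat', 'great wall', 'mediocre wall']

# Read every row once; unlike the reader, a list can be looped over repeatedly.
with io.StringIO(csv_text) as f:
    rows = list(csv.reader(f))

vals = []
for w in sList:
    val = 0
    for row in rows:
        if row[0] in w:
            val += 1
    vals.append(val)

print(vals)
```

The memory cost here is one copy of the CSV rows, however many strings are in `sList`. The `f.seek(0)` approach keeps memory low but reparses the file on each pass.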
