Python 3: does Pool keep the original order of data passed to map?
Question
I have written a small script that distributes a workload across 4 worker processes and tests whether the results stay in the same order as the input:
from multiprocessing import Pool
import numpy as np
import time
import random

rows = 16
columns = 1000000
vals = np.arange(rows * columns, dtype=np.int32).reshape(rows, columns)

def worker(arr):
    # Sleep a random amount of time so that the processes
    # finish at different times.
    time.sleep(random.random())
    for idx in np.ndindex(arr.shape):
        arr[idx] += 1
    return arr

# The guard is needed where worker processes are spawned (e.g. Windows).
if __name__ == "__main__":
    # create the process pool
    with Pool(4) as p:
        # schedule one worker call for each row in the original data
        q = p.map(worker, [row for row in vals])
    for idx, row in enumerate(q):
        print("[{:0>2}]: {: >8} - {: >8}".format(idx, row[0], row[-1]))
For me this always results in:
[00]:        1 -  1000000
[01]:  1000001 -  2000000
[02]:  2000001 -  3000000
[03]:  3000001 -  4000000
[04]:  4000001 -  5000000
[05]:  5000001 -  6000000
[06]:  6000001 -  7000000
[07]:  7000001 -  8000000
[08]:  8000001 -  9000000
[09]:  9000001 - 10000000
[10]: 10000001 - 11000000
[11]: 11000001 - 12000000
[12]: 12000001 - 13000000
[13]: 13000001 - 14000000
[14]: 14000001 - 15000000
[15]: 15000001 - 16000000
Question: So, does Pool really keep the original input's order when storing the results of each map function in q?
Sidenote: I am asking because I need an easy way to parallelize work over several workers. In some cases the ordering is irrelevant. In others, however, the results (like those in q) have to be returned in the original order, because I apply an additional reduce function that relies on ordered data.
Performance: On my machine this operation is about 4 times faster (as expected, since I have 4 cores) than normal execution on a single process. Additionally, all 4 cores are at 100% usage during the runtime.
Recommended answer
Pool.map results are ordered. If you need order, great; if you don't, Pool.imap_unordered may be a useful optimization.
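As a rough illustration of the difference, here is a minimal sketch. The worker `slow_square` and the helper `compare` are hypothetical names, not part of the original script; the worker sleeps less for larger inputs so that later tasks tend to finish first.

```python
from multiprocessing import Pool
import time


def slow_square(x):
    # Hypothetical worker: larger inputs sleep less, so later
    # tasks tend to finish before earlier ones.
    time.sleep(0.05 * (4 - x))
    return x * x


def compare(values):
    with Pool(4) as pool:
        # map returns a list in input order; chunksize=1 hands
        # out one item per task.
        ordered = pool.map(slow_square, values, chunksize=1)
        # imap_unordered yields each result as soon as it is ready.
        unordered = list(pool.imap_unordered(slow_square, values))
    return ordered, unordered


if __name__ == "__main__":
    ordered, unordered = compare([0, 1, 2, 3])
    print("map:           ", ordered)    # always [0, 1, 4, 9]
    print("imap_unordered:", unordered)  # same values, order may differ
```

If only some call sites need ordering, keeping Pool.map there and switching the rest to Pool.imap_unordered is one possible split.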
Note that while the order in which you receive the results from Pool.map is fixed, the order in which they are computed is arbitrary.
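One way to observe this is to have each task report when it finished. The sketch below assumes a hypothetical worker `timed_worker` and helper `completion_order`; neither comes from the original script.

```python
from multiprocessing import Pool
import os
import random
import time


def timed_worker(x):
    # Hypothetical worker: sleep a random time, then report the
    # input, the worker's process ID, and the finish time.
    time.sleep(random.random() * 0.2)
    return x, os.getpid(), time.time()


def completion_order(results):
    # Input values sorted by the time their task finished.
    by_finish = sorted(results, key=lambda r: r[2])
    return [x for x, _pid, _finished in by_finish]


if __name__ == "__main__":
    with Pool(4) as pool:
        results = pool.map(timed_worker, range(8))
    # The list returned by map follows the input order ...
    print("received:", [x for x, _pid, _finished in results])
    # ... while the finish times usually show a different order.
    print("finished:", completion_order(results))
```

The "received" line should always read 0 through 7, whereas the "finished" line typically varies from run to run.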
