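Python Tutorial - Python Multiprocessing

In this article, we will learn how to implement multiprocessing in Python. We will also cover some of its more advanced concepts.

What is Multiprocessing?

Multiprocessing is a system's ability to run more than one process in parallel. Put simply, multiprocessing uses two or more CPUs (or CPU cores) within a single computer system, and it lets work be divided among several processes.

The processing units share main memory and peripherals while they execute programs at the same time. A multiprocessing application is broken into smaller parts that run independently, and the operating system assigns each process to a processor.

Python provides a built-in multiprocessing module for creating and managing processes. Before using it, we need to be familiar with the Process object.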

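Why Use Multiprocessing?

Multiprocessing matters when a computer has to perform several tasks at once. Consider a computer with a single processor and no multiprocessing, and suppose we hand it several processes at the same time.

It has to interrupt the current task and switch to another one again and again to keep every process moving. This is like a chef working alone in a kitchen, who must do every job himself (cutting, cleaning, cooking, kneading dough, baking, and so on) to prepare a meal.

With multiprocessing, several tasks can make progress at the same time without interrupting one another, and it also becomes easier to keep track of them. That is the motivation behind the concept.

A multiprocessing system can be a computer with more than one central processor.
A multi-core processor is a single computing component with two or more independent processing units (cores).

In multiprocessing, the operating system can schedule several tasks at once, each on its own processor.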

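Multiprocessing in Python

Python provides the multiprocessing module to perform multiple tasks within a single system. It offers a user-friendly and intuitive API for working with processes.

Let's look at a simple multiprocessing example.

Example -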

from multiprocessing import Process


def disp():
    print('Hello !! Welcome to Python Tutorial')


if __name__ == '__main__':
    # Run disp() in a separate process and wait for it to finish
    p = Process(target=disp)
    p.start()
    p.join()

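Output:

Hello !! Welcome to Python Tutorial

Explanation:

In the code above, we import the Process class and create a Process object whose target is the disp() function. We then launch the process with start() and wait for it to finish with join(). We can also pass arguments to the target function with the args keyword.

The following example passes arguments to the target functions.

Example - 2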

# Python multiprocessing example
# importing the multiprocessing module

import multiprocessing


def cube(n):
    # This function will print the cube of the given number
    print("The Cube is: {}".format(n * n * n))


def square(n):
    # This function will print the square of the given number
    print("The Square is: {}".format(n * n))


if __name__ == "__main__":
    # creating two processes
    process1 = multiprocessing.Process(target=square, args=(5, ))
    process2 = multiprocessing.Process(target=cube, args=(5, ))

    # Here we start process 1
    process1.start()
    # Here we start process 2
    process2.start()

    # The join() method is used to wait for process 1 to complete
    process1.join()
    # It is used to wait for process 2 to complete
    process2.join()

    # Print if both processes are completed
    print("Both processes are finished")

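Output:

The Cube is: 125
The Square is: 25
Both processes are finished

Explanation:

In the example above, we create two functions: cube() prints the cube of the given number and square() prints its square.

Next, we create two objects of the Process class, each with two arguments. The first argument, target, is the function to execute; the second, args, is a tuple of the arguments to pass to that function.

process1 = multiprocessing.Process(target=square, args=(5, ))
process2 = multiprocessing.Process(target=cube, args=(5, ))

We start the processes with the start() method.

process1.start()
process2.start()

Because both processes are started before either one is joined, they run concurrently, so the first two output lines may appear in either order. The main program calls join() on process 1 and then on process 2, and prints the final statement only after both have finished.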
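The join() method also accepts an optional timeout, and a process can be inspected with is_alive() and exitcode. The sketch below is an illustration that does not come from the examples above: nap() and run_with_timeout() are hypothetical names, and the sleep stands in for real work. It waits for a child for a limited time and terminates it if it overruns.

```python
import time
from multiprocessing import Process


def nap(seconds):
    # Hypothetical worker: sleeping stands in for real work.
    time.sleep(seconds)


def run_with_timeout(seconds, timeout):
    """Run nap() in a child process and wait at most `timeout` seconds."""
    p = Process(target=nap, args=(seconds,))
    p.start()
    p.join(timeout)        # returns after `timeout` even if the child is still running
    finished = not p.is_alive()
    if not finished:
        p.terminate()      # stop the child that overran
        p.join()           # wait for it to exit so that exitcode is set
    return finished, p.exitcode


if __name__ == "__main__":
    print(run_with_timeout(0.1, 5))   # the child finishes in time
    print(run_with_timeout(5, 0.1))   # the child is terminated
```

An exit code of 0 means the child ended normally. On Unix, a negative exit code means the child was ended by the signal with that number.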

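Python Multiprocessing Classes

The multiprocessing module provides many classes that are commonly used to build parallel programs. We will discuss its main classes: Process, Queue and Lock. We covered the Process class in the previous example; now we turn to Queue and Lock.

First, let's look at a simple example that gets the number of CPUs in the system.

Example -

import multiprocessing
print("The number of CPUs in the system: ", multiprocessing.cpu_count())

Output:

The number of CPUs in the system:  32

The number of CPUs will vary from machine to machine. On our machine it is 32.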
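The value returned by cpu_count() is the number of CPUs in the system, which can be larger than the number the current process is allowed to use. A minimal sketch of the difference follows; usable_cpus() is a hypothetical helper, and os.sched_getaffinity() is only available on some Unix platforms, so the code falls back to cpu_count() elsewhere.

```python
import multiprocessing
import os


def usable_cpus():
    # os.sched_getaffinity exists only on some Unix platforms,
    # so fall back to the system-wide count everywhere else.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return multiprocessing.cpu_count()


print("CPUs in the system:", multiprocessing.cpu_count())
print("CPUs usable by this process:", usable_cpus())
```

The usable count is a reasonable starting point when deciding how many worker processes to create.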

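Multiprocessing Using the Queue Class

Queues are an important data structure. The Queue class in multiprocessing behaves like an ordinary queue data structure and follows the "first in, first out" principle. A queue can hold any picklable Python object and plays an important role in sharing data between processes.

A queue is passed as an argument to a Process's target function so that the process can consume data from it. The put() method inserts data into the queue and the get() method retrieves data from it. Let's look at the following example.

Example -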

# Importing Queue Class  
  
from multiprocessing import Queue  
  
fruits = ['Apple', 'Orange', 'Guava', 'Papaya', 'Banana']  
count = 1  
# creating a queue object  
queue = Queue()  
print('pushing items to the queue:')  
for fr in fruits:  
    print('item no: ', count, ' ', fr)  
    queue.put(fr)  
    count += 1  
  
print('\npopping items from the queue:')  
count = 0  
while not queue.empty():  
    print('item no: ', count, ' ', queue.get())  
    count += 1

Output:

pushing items to the queue:
item no:  1   Apple
item no:  2   Orange
item no:  3   Guava
item no:  4   Papaya
item no:  5   Banana

popping items from the queue:
item no:  0   Apple
item no:  1   Orange
item no:  2   Guava
item no:  3   Papaya
item no:  4   Banana

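Explanation:

In the code above, we import the Queue class and initialize a list named fruits. We set count to 1; it numbers the items as they are processed. We then create a queue object by calling Queue(). In the for loop, put() inserts the elements into the queue one at a time, and count is incremented on each iteration. In the while loop, count is reset to 0 and get() removes and returns the items in the same order in which they were inserted. Note that empty() is documented as not reliable in general, so it is best kept to simple demonstrations like this one.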
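The example above uses the queue inside a single process. To illustrate a queue being passed as an argument to a Process target, here is a minimal producer/consumer sketch. The names produce(), consume() and SENTINEL are hypothetical, and using None as a stop marker is one common convention, not something the module requires.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # hypothetical marker that tells the consumer to stop


def produce(items, q):
    # Put every item on the queue, followed by the stop marker.
    for item in items:
        q.put(item)
    q.put(SENTINEL)


def consume(q, results):
    # get() blocks until an item arrives, so no empty() polling is needed.
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        results.put(item.upper())


if __name__ == "__main__":
    work, done = Queue(), Queue()
    consumer = Process(target=consume, args=(work, done))
    consumer.start()
    produce(["apple", "orange", "guava"], work)
    for _ in range(3):
        print(done.get())   # results arrive in insertion order
    consumer.join()
```

The results are read before join() is called, so the consumer never has to hold undelivered items while the parent waits for it to exit.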

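Multiprocessing Using the Lock Class

The Lock class in multiprocessing is used to acquire a lock so that other processes are blocked from executing the same section of code until the lock is released. It does two main things: the acquire() method claims the lock, and the release() method frees it.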
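As a rough illustration of acquire() and release(), the sketch below has several processes increment one shared counter. The deposit() function and the counter are hypothetical; the counter is a multiprocessing.Value created with lock=False, so the explicit Lock is the only thing protecting it.

```python
from multiprocessing import Lock, Process, Value


def deposit(balance, lock, times):
    # `balance.value += 1` is a read-modify-write, so it is wrapped in the
    # lock to keep concurrent updates from being lost.
    for _ in range(times):
        lock.acquire()
        try:
            balance.value += 1
        finally:
            lock.release()   # always release, even if the update raises


if __name__ == "__main__":
    lock = Lock()
    balance = Value("i", 0, lock=False)   # raw shared int, guarded by our own lock
    workers = [Process(target=deposit, args=(balance, lock, 1000)) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(balance.value)   # 4000
```

A Lock can also be used as a context manager (`with lock:`), which acquires and releases it in the same way as the try/finally pair shown here.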

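Python Multiprocessing Example

Suppose we have several tasks to complete. We create two queues: the first holds the tasks, and the second stores a log of the completed tasks. The next step is to instantiate the processes that carry out the tasks. The Queue class is already synchronized internally, so we do not need to acquire a Lock around queue operations.

The following example brings the Process and Queue classes together.

Example -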

from multiprocessing import Process, Queue, current_process
import time
import queue  # provides the queue.Empty exception


def jobTodo(tasks_to_perform, complete_tasks):
    while True:
        try:
            # Try to fetch a task from the queue.
            # get_nowait() raises queue.Empty if the queue is empty.
            task = tasks_to_perform.get_nowait()
        except queue.Empty:
            break
        else:
            # No exception was raised, so log the task as completed
            print(task)
            complete_tasks.put(task + ' is done by ' + current_process().name)
            time.sleep(.5)
    return True
  
  
def main():  
    total_task = 8  
    total_number_of_processes = 3  
    tasks_to_perform = Queue()  
    complete_tasks = Queue()  
    number_of_processes = []  
  
    for i in range(total_task):  
        tasks_to_perform.put("Task no " + str(i))  
  
    # defining number of processes  
    for w in range(total_number_of_processes):  
        p = Process(target=jobTodo, args=(tasks_to_perform, complete_tasks))  
        number_of_processes.append(p)  
        p.start()  
  
    # completing process  
    for p in number_of_processes:  
        p.join()  
  
    # print the output  
    while not complete_tasks.empty():  
        print(complete_tasks.get())  
  
    return True  
  
  
if __name__ == '__main__':  
    main()

Output:

Task no 2
Task no 5
Task no 0
Task no 3
Task no 6
Task no 1
Task no 4
Task no 7
Task no 0 is done by Process-1
Task no 1 is done by Process-3
Task no 2 is done by Process-2
Task no 3 is done by Process-1
Task no 4 is done by Process-3
Task no 5 is done by Process-2
Task no 6 is done by Process-1
Task no 7 is done by Process-3

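Python Multiprocessing Pool

The multiprocessing Pool is useful for executing a function in parallel across multiple input values. It distributes the input data among the worker processes (data parallelism). Consider the following example of a multiprocessing Pool.

Example -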

from multiprocessing import Pool  
import time  
  
w = (["V", 5], ["X", 2], ["Y", 1], ["Z", 3])  
  
  
def work_log(data_for_work):  
    print(" Process name is %s waiting time is %s seconds" % (data_for_work[0], data_for_work[1]))  
    time.sleep(int(data_for_work[1]))  
    print(" Process %s Executed." % data_for_work[0])  
  
  
def handler():  
    p = Pool(2)  
    p.map(work_log, w)  
  
if __name__ == '__main__':  
    handler()

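Output:

Process name is V waiting time is 5 seconds
Process name is X waiting time is 2 seconds
Process X Executed.
Process name is Y waiting time is 1 seconds
Process Y Executed.
Process name is Z waiting time is 3 seconds
Process V Executed.
Process Z Executed.

The pool has two worker processes, so V and X start together and the remaining items are picked up as workers become free. The exact order of the lines can vary between runs.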

Let's look at another example of a multiprocessing Pool.

Example - 2

from multiprocessing import Pool  
def fun(x):  
    return x*x  
  
if __name__ == '__main__':  
    with Pool(5) as p:  
        print(p.map(fun, [1, 2, 3]))

Output:

[1, 4, 9]
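Besides map(), a Pool offers other ways to submit work. The sketch below is not part of the original examples: power() is a hypothetical function, starmap() is used for functions that take more than one argument, and apply_async() submits a single call without blocking.

```python
from multiprocessing import Pool


def power(base, exponent):
    return base ** exponent


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        # starmap() unpacks each tuple into positional arguments.
        print(pool.starmap(power, [(2, 3), (3, 2), (10, 0)]))   # [8, 9, 1]

        # apply_async() returns immediately with a result handle;
        # get() blocks until the value is ready.
        pending = pool.apply_async(power, (2, 10))
        print(pending.get(timeout=10))                          # 1024
```

Like map(), starmap() returns results in the order of the inputs, regardless of which worker finishes first.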

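Proxy Objects

A proxy is an object that refers to a shared object, which usually lives in a different process. The shared object is called the referent of the proxy. Multiple proxy objects may have the same referent. A proxy object has methods that invoke the corresponding methods of its referent. The following is an example of a proxy object.

Example -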

from multiprocessing import Manager

if __name__ == '__main__':
    # Manager() starts a server process, so create it under the main guard
    manager = Manager()
    l = manager.list([i*i for i in range(10)])
    print(l)
    print(repr(l))
    print(l[4])
    print(l[2:5])

Output:

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
<ListProxy object, typeid 'list' at 0x7f063621ea10>
16
[4, 9, 16]

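Proxy objects are picklable, so we can pass them between processes. Each method call on a proxy is forwarded to the process that holds the referent, which is how several processes can work on the same object.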
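To illustrate passing a proxy between processes, here is a minimal sketch in which several child processes append to one managed list. The record_square() function is a hypothetical worker used only for this illustration.

```python
from multiprocessing import Manager, Process


def record_square(shared, n):
    # `shared` can be a ListProxy; append() is forwarded to the manager process.
    shared.append(n * n)


if __name__ == "__main__":
    with Manager() as manager:
        squares = manager.list()
        workers = [Process(target=record_square, args=(squares, n)) for n in range(5)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        # Slicing copies the contents into an ordinary list. The children may
        # finish in any order, so the values are sorted before printing.
        print(sorted(squares[:]))   # [0, 1, 4, 9, 16]
```

The list itself lives in the manager's server process; each child only holds a proxy to it.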

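Commonly Used Multiprocessing Functions

So far, we have discussed the basic concepts of multiprocessing in Python. Multiprocessing is a broad topic in its own right and is essential for performing many tasks within a single system. The following are some key functions and methods commonly used in multiprocessing.

Method	Description
Pipe()	Returns a pair of connection objects that represent the two ends of a pipe.
run()	The method that represents the process's activity.
start()	Starts the process.
join([timeout])	Blocks the calling process until the process whose join() method is called terminates. The timeout argument is optional.
is_alive()	Returns True if the process is alive.
terminate()	Terminates the process. On Unix this is done with the SIGTERM signal; on Windows, TerminateProcess() is used.
kill()	Same as terminate(), but uses the SIGKILL signal on Unix.
close()	Closes the Process object and releases all resources associated with it.
qsize()	Returns the approximate size of the queue.
empty()	Returns True if the queue is empty.
full()	Returns True if the queue is full.
get_nowait()	Equivalent to get(False).
get()	Removes an item from the queue and returns it.
put()	Inserts an item into the queue.
cpu_count()	Returns the number of CPUs in the system.
current_process()	Returns the Process object corresponding to the current process.
parent_process()	Returns the Process object corresponding to the parent of the current process.
task_done()	Indicates that a formerly enqueued task is complete (available on JoinableQueue).
join_thread()	Joins the queue's background thread; usable only after close() has been called.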
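Of the functions listed above, Pipe() is the only one not shown in an earlier example. A minimal sketch follows, assuming a hypothetical greet() function that sends one message through its end of the pipe.

```python
from multiprocessing import Pipe, Process


def greet(conn, name):
    # Send one message through this end of the pipe, then close it.
    conn.send("hello, " + name)
    conn.close()


if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=greet, args=(child_conn, "world"))
    p.start()
    print(parent_conn.recv())   # hello, world
    p.join()
```

A pipe connects exactly two endpoints, which makes it a lighter alternative to a Queue when only two processes need to talk to each other.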
