An introduction to multiprocessing in Python

In the following tutorial, I talk about Multiprocessing in Python. By using Multiprocessing, you can break the task between processors of the network. Moreover, the accumulated computational time will reduce.

| kaggle | GitHub | Buy a coffee |

Parallel computing

One approach to parallelize in Python is to use the pooling method from multiprocessing. The pool takes a method and tries to distribute the function for different outputs between different processors at the same time. Here is a Python multiprocessing Pool example.

Example of using a pool in python

In this example, I tried to compute the _kfold() method in parallel for two runs and two arguments. I passed the arguments as a list and returned the results as a list, including NumPy arrays for train and test splits.

Multiprocessing

Another useful method from the multiprocessing library is the Process. Each process loads a specific program into its memory without sharing the memory with other processes. To use this approach you have to define the process target, which could be a concrete job defined in a method. Therefore, you can break your work into different methods and treat them as different targets.

In the following example, I defined a method that returns a summation of the parameters. Therefore, the method would be the process target and its parameters defined in the args. The .start() will start the process and .join() make sure that it ends the process.

Example of using multiprocessing in python

Measuring the computation time

When you are using the multiprocessing approaches, you may need to check if it reduces the execution time or not. For this purpose, you can consider the python profilers such as cProfile. The cProfile, read the python program line by line and measure the number of calls for each built-in or user method and class. Also, it measures the consumed time of each method and class.

Example of time profiling in Python

In the included example, the profiler returns the output for a selected method. The output is a statistical report including the details of the execution of the program. Therefore, it has to convert to a readable file using pstats class.

Conclusion

In this article, we learned how to compute a Python program in parallel with different CPU cores with the method’s parameters distribution. Moreover, the Multiprocessing with various processes was studied. Finally, we saw how we could see the real consumed time with the Python profiler.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store