An introduction to multiprocessing in Python
In the following tutorial, I talk about Multiprocessing in Python. By using Multiprocessing, you can break the task between processors of the network. Moreover, the accumulated computational time will reduce.
Parallel computing
One approach to parallelize in Python is to use the pooling method from multiprocessing. The pool takes a method and tries to distribute the function for different outputs between different processors at the same time. Here is a Python multiprocessing Pool example.
In this example, I tried to compute the _kfold() method in parallel for two runs and two arguments. I passed the arguments as a list and returned the results as a list, including NumPy arrays for train and test splits.
Multiprocessing
Another useful method from the multiprocessing library is the Process. Each process loads a specific program into its memory without sharing the memory with other processes. To use this approach you have to define the process target, which could be a concrete job defined in a method. Therefore, you can break your work into different methods and treat them as different targets.
In the following example, I defined a method that returns a summation of the parameters. Therefore, the method would be the process target and its parameters defined in the args. The .start() will start the process and .join() make sure that it ends the process.
Measuring the computation time
When you are using the multiprocessing approaches, you may need to check if it reduces the execution time or not. For this purpose, you can consider the python profilers such as cProfile. The cProfile, read the python program line by line and measure the number of calls for each built-in or user method and class. Also, it measures the consumed time of each method and class.
In the included example, the profiler returns the output for a selected method. The output is a statistical report including the details of the execution of the program. Therefore, it has to convert to a readable file using pstats class.
Conclusion
In this article, we learned how to compute a Python program in parallel with different CPU cores with the method’s parameters distribution. Moreover, the Multiprocessing with various processes was studied. Finally, we saw how we could see the real consumed time with the Python profiler.