To run more than one piece of code at the same time on the same computer one has the choice of either using multiple processes or multiple threads.
Although a program can be made up of multiple processes, these processes are in effect completely independent of one another: different processes are not able to cooperate with one another unless one sets up some means of communication between them (such as by using sockets). If a lot of data must be transferred between processes then this can be inefficient.
On the other hand, multiple threads within a single process are intimately connected: they share their data but often can interfere badly with one another. It is often argued that the only way to make multithreaded programming "easy" is to avoid relying on any shared state and for the threads to only communicate by passing messages to each other.
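The message-passing style mentioned above can be sketched with nothing but the standard library. This is a minimal illustration written for Python 3 (where the Queue module is named queue); the names worker and square_all are hypothetical and are not part of the processing package.

```python
import queue
import threading


def worker(inbox, outbox):
    """Square each number received; stop when the None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        outbox.put(item * item)


def square_all(numbers):
    """Send a list of numbers to one worker thread and collect the replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    for n in numbers:
        inbox.put(n)
    inbox.put(None)  # sentinel: no more work
    t.join()
    # A single worker reading a FIFO queue replies in the order it was asked.
    return [outbox.get() for _ in numbers]


if __name__ == "__main__":
    print(square_all([1, 2, 3]))
```

The two threads never touch the same mutable object directly: everything they exchange goes through the queues, so no explicit locking appears in the user code.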
CPython has a Global Interpreter Lock (GIL) which in many ways makes threading easier than it is in most languages by making sure that only one thread can manipulate the interpreter's objects at a time. As a result, it is often safe to let multiple threads access data without using any additional locking as one would need to in a language such as C.
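As a rough illustration of that point, the sketch below lets several threads append to one list without any lock. It is written for Python 3 and relies on the assumption that a single list.append call is atomic in CPython; the function names are hypothetical.

```python
import threading


def append_many(shared, count):
    for i in range(count):
        # Assumption: one list.append call is atomic in CPython,
        # so no explicit lock is taken here.
        shared.append(i)


def fill_from_threads(num_threads=4, count=10000):
    """Append to one shared list from several threads; return its length."""
    shared = []
    threads = [
        threading.Thread(target=append_many, args=(shared, count))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(shared)


if __name__ == "__main__":
    print(fill_from_threads())
```

This only covers single operations on built-in objects. A compound update such as reading a counter, adding one and writing it back is made of several steps, and still needs a lock even under the GIL.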
One downside of the GIL is that on multi-processor (or multi-core) systems a multithreaded Python program can only make use of one processor at a time. This is a problem that can be overcome by using multiple processes instead.
Python gives little direct support for writing programs using multiple processes. This package allows one to write multi-process programs using much the same API that one uses for writing threaded programs.
There are two ways of creating a new process in Python:
- The current process can fork a new child process by using the os.fork() function. This effectively creates an identical copy of the current process which is now able to go off and perform some task set by the parent process. This means that the child process inherits copies of all variables that the parent process had. However, os.fork() is not available on every platform: in particular, Windows does not support it.
- Alternatively, the current process can spawn a completely new Python interpreter by using the subprocess module or one of the os.spawn*() functions. Getting this new interpreter into a fit state to perform the task set for it by its parent process is, however, a bit of a challenge.
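To give a feel for the second approach, here is a minimal Python 3 sketch that starts a fresh interpreter with the subprocess module and hands it a task by pickling the input over stdin. The names CHILD_CODE and sum_in_new_interpreter are hypothetical, and this is not how the processing package itself prepares a new interpreter.

```python
import pickle
import subprocess
import sys

# Code run by the fresh interpreter: read a pickled task from stdin,
# do the work, and write the pickled result to stdout.
CHILD_CODE = """
import pickle, sys
numbers = pickle.load(sys.stdin.buffer)
pickle.dump(sum(numbers), sys.stdout.buffer)
"""


def sum_in_new_interpreter(numbers):
    """Sum a list of numbers in a newly started Python interpreter."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=pickle.dumps(numbers),
        stdout=subprocess.PIPE,
        check=True,
    )
    return pickle.loads(result.stdout)


if __name__ == "__main__":
    print(sum_in_new_interpreter([1, 2, 3]))
```

Even in this small case the parent has to serialize its data and the child has to import whatever it needs before it can start, which hints at why getting a spawned interpreter ready is more work than forking.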
The processing package uses os.fork() if it is available since it makes life a lot simpler. Forking the process is also more efficient in terms of memory usage and the time needed to create the new process.
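The behaviour of os.fork() described above can be seen in a small Unix-only sketch (Python 3). The function name fork_and_modify is hypothetical; the pipe is only there so that the child can report the value it sees.

```python
import os


def fork_and_modify(start=10):
    """Fork, change a variable in the child, and report what each side sees.

    Unix-only: os.fork() does not exist on Windows.
    """
    counter = start
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: it works on its own copy of the parent's variables.
        os.close(read_fd)
        counter += 1
        os.write(write_fd, str(counter).encode())
        os._exit(0)  # exit at once, skipping the parent's cleanup handlers
    # Parent: read the value the child computed, then reap the child.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 64).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_value, counter


if __name__ == "__main__":
    print(fork_and_modify())
```

The child starts with the parent's value of counter, but its increment is made on a copy: the parent's own counter is unchanged after the child exits.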
In the processing package, processes are spawned by creating a Process object and then calling its start() method. processing.Process follows the API of threading.Thread. A trivial example of a multiprocess program is:

    from processing import Process

    def f():
        print 'hello world'

    if __name__ == '__main__':
        p = Process(target=f)
        p.start()
        p.join()
Here the function f is run in a child process.
For an explanation of why (on Windows) the if __name__ == '__main__' part is necessary see Programming guidelines.
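Because the process class mirrors threading.Thread, code that only uses target, args, start() and join() can often be written once and given either class. The sketch below assumes Python 3 and uses the standard library's multiprocessing.Process as a stand-in for processing.Process; greet and run_all are hypothetical names.

```python
import multiprocessing
import threading


def greet(name):
    print("hello from", name)


def run_all(worker_cls, names):
    """Start one worker per name and wait for all of them to finish."""
    workers = [worker_cls(target=greet, args=(name,)) for name in names]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return len(workers)


if __name__ == "__main__":
    # The same helper drives threads and processes alike.
    run_all(threading.Thread, ["thread-0", "thread-1"])
    run_all(multiprocessing.Process, ["process-0", "process-1"])
```

The calls are kept under the if __name__ == '__main__' guard so that a child process which has to re-import the main module does not start workers of its own.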
processing contains two main functions for creating a communication channel between related processes:
- Queue(maxsize=0)

  This is a near clone of Queue.Queue. For example:

      from processing import Process, Queue

      def f(q):
          q.put([42, None, 'hello'])

      if __name__ == '__main__':
          q = Queue()
          p = Process(target=f, args=[q])
          p.start()
          print q.get()    # prints "[42, None, 'hello']"
          p.join()

  Queues are thread and process safe.
- Pipe()

  This returns a pair of connection objects connected by a duplex (two-way) pipe, for example:

      from processing import Process, Pipe

      def f(conn):
          conn.send([42, None, 'hello'])

      if __name__ == '__main__':
          a, b = Pipe()
          p = Process(target=f, args=[b])
          p.start()
          print a.recv()    # prints "[42, None, 'hello']"
          p.join()

  Note however that the connection objects returned by Pipe() are not thread/process safe: only one thread should write to or read from a connection object at any time.
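One way to respect the restriction on connection objects is to guard each end with a lock whenever several threads share it. The following Python 3 sketch uses multiprocessing.Pipe from the standard library as a stand-in for processing.Pipe and stays within a single process; send_guarded and collect are hypothetical names.

```python
import threading
from multiprocessing import Pipe


def send_guarded(conn, lock, message):
    # Only one thread at a time may write to this end of the pipe.
    with lock:
        conn.send(message)


def collect(num_threads=4):
    """Send one small message per thread through a shared connection."""
    receiver, sender = Pipe()
    lock = threading.Lock()
    threads = [
        threading.Thread(target=send_guarded, args=(sender, lock, i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Arrival order depends on thread scheduling, so sort before returning.
    return sorted(receiver.recv() for _ in range(num_threads))


if __name__ == "__main__":
    print(collect())
```

A threading.Lock is enough here because all writers live in one process. Writers in different processes would need a lock that works across processes instead.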
processing contains equivalents of all the synchronization primitives from threading. For instance one can use a lock to ensure that only one process prints to standard output at a time:
    from processing import Process, Lock

    def f(l, i):
        # Hold the lock while printing so that lines are not interleaved.
        l.acquire()
        print 'hello world', i
        l.release()

    if __name__ == '__main__':
        l = Lock()
        for i in range(10):
            Process(target=f, args=[l, i]).start()

Without using the lock, output from the different processes is liable to get all mixed up.
As mentioned above, when doing threaded programming it is usually best to avoid using shared state as far as possible. This is particularly true when using multiple processes.
However, if you really do need to use some shared data then you can create it by using a manager. Managers can store their data in one of two ways:
Data can be stored in a shared memory map managed by an instance of LocalManager. Only those data types supported by the struct and array modules are supported. For example the following code
    from processing import Process, LocalManager

    def f(n, s, a):
        n.value = 42
        s.value = (0.75, 'hello')
        for i in range(len(a)):
            a[i] = -a[i]

    if __name__ == '__main__':
        manager = LocalManager()
        number = manager.SharedValue('i', 0)
        struct = manager.SharedStruct('d256p', (0.0, ''))
        array = manager.SharedArray('i', range(10))

        p = Process(target=f, args=[number, struct, array])
        p.start()
        p.join()

        print number
        print struct
        print array

will print

    SharedValue('i', 42)
    SharedStruct('d256p', (0.75, 'hello'))
    SharedArray('i', [0, -1, -2, -3, -4, -5, -6, -7, -8, -9])
Note that SharedValue, SharedStruct and SharedArray objects are thread and process safe. The 'i' and 'd256p' arguments used when creating number, struct and array are format strings of the type used by the struct and array modules: 'i' indicates a signed integer, 'd' indicates a double precision float and '256p' indicates a string of length less than 256.
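The format strings can be tried out directly with the struct and array modules, independently of any manager. This is a small Python 3 sketch, so the string field is given as bytes; the variable names are only illustrative.

```python
import struct
from array import array

FORMAT = "d256p"  # a double followed by a 256-byte Pascal-string field

# Pack a float and a short byte string into one fixed-size record.
packed = struct.pack(FORMAT, 0.75, b"hello")

# Unpacking recovers the original values; the 'p' field stores the length
# in its first byte, so the padding is dropped again.
unpacked = struct.unpack(FORMAT, packed)

# 'i' is the same type code that the array module uses for signed integers.
ints = array("i", range(10))

if __name__ == "__main__":
    print(len(packed), unpacked, ints.tolist())
```

Because the first byte of a 'p' field holds the length, a '256p' field can carry at most 255 bytes of data, which matches the "length less than 256" rule above.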
A manager object returned by processing.Manager() represents a server process which holds Python objects and allows other processes to manipulate them using proxies.
In addition to SharedValue, SharedStruct and SharedArray a manager returned by processing.Manager() will support types list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Queue. For example:
    from processing import Process, Manager

    def f(d, l):
        d[1] = '1'
        d['2'] = 2
        d[0.25] = None
        l.reverse()

    if __name__ == '__main__':
        manager = Manager()
        d = manager.dict()
        l = manager.list(range(10))

        p = Process(target=f, args=[d, l])
        p.start()
        p.join()

        print d
        print l

will print

    {0.25: None, 1: '1', '2': 2}
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Creating managers which support other types is not hard; see Customized managers.
Proxy objects are picklable and can be transferred between processes.
Server process managers are more flexible than shared memory managers because they support more data types (and can be made to support many more). Also, a single server process manager can be shared by different computers over the network. They are, however, slower than using shared memory.
Passing small objects between processes using processing.Queue is not significantly slower than passing the same objects between threads using Queue.Queue. (In fact on Linux it tends to be faster.) One can also use processing.Pipe which at least on Windows is quite a bit faster than processing.Queue.
The following benchmarks were performed on a single-core Pentium 4, 2.5 GHz laptop running Windows XP and Ubuntu Linux 6.10; see test_speed.py.
Number of 256-byte strings passed between processes/threads per sec:
Connection type | Windows | Linux |
---|---|---|
Queue.Queue | 49,000 | 17,000-50,000 [1] |
processing.PosixQueue | n/a | 62,000 [2] |
processing.PipeQueue | 27,000 | 37,000 [2] |
Queue managed by server | 6,900 | 6,500 |
processing.Pipe | 52,000 | 57,000 |

[1] For some reason the performance of Queue.Queue is very variable on Linux.
[2] processing.Queue is an alias for processing.PosixQueue if available, otherwise for processing.PipeQueue.
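For readers who want a rough figure for their own machine, a throughput measurement of this kind might be set up as below. This is only a sketch for the thread-to-thread queue case in Python 3, not a reproduction of test_speed.py; the function names and the default message count are assumptions.

```python
import queue
import threading
import time


def consumer(q, count):
    """Drain exactly count items from the queue."""
    for _ in range(count):
        q.get()


def strings_per_second(count=20000, size=256):
    """Time how fast strings of the given size pass between two threads."""
    payload = "x" * size
    q = queue.Queue()
    t = threading.Thread(target=consumer, args=(q, count))
    start = time.perf_counter()
    t.start()
    for _ in range(count):
        q.put(payload)
    t.join()
    elapsed = time.perf_counter() - start
    return count / elapsed


if __name__ == "__main__":
    print("%.0f strings per second" % strings_per_second())
```

Results from a sketch like this vary from run to run and with the hardware and Python version, so they are not directly comparable with the table above.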
Number of acquires/releases of a lock per sec:
Lock type | Windows | Linux |
---|---|---|
threading.Lock | 850,000 | 560,000 |
processing.Lock | 430,000 | 510,000 |
Lock managed by server | 10,000 | 8,400 |
threading.RLock | 93,000 | 76,000 |
processing.RLock | 430,000 | 500,000 |
RLock managed by server | 8,800 | 7,400 |
Number of interleaved waits/notifies per sec on a condition variable by two processes:
Condition type | Windows | Linux |
---|---|---|
threading.Condition | 26,000 | 28,000 |
processing.Condition | 22,000 | 18,000 |
Condition managed by server | 6,600 | 6,000 |
Number of integers retrieved from a sequence per sec:
Sequence type | Windows | Linux |
---|---|---|
list | 6,000,000 | 4,800,000 |
shared memory array | 120,000 | 110,000 |
list managed by server | 20,000 | 17,000 |