deutsch     english    français     Print

 

11.4 PARALLEL PROCESSING

 

 

INTRODUCTION

 

According to the von Neumann model, one can imagine a computer as a sequential machine that, based on a program, executes statement by statement in time intervals. In this model there are no simultaneous actions, so no parallel processing and no concurrency. In daily life however, parallel processes are omnipresent: Every living being exists as an independent individual and many processes run simultaneously in the human body.

The advantages of parallel processing are evident: It brings on a huge performance boost, since tasks are solved in equal time slices. Moreover, the redundancy and the odds of survival increase because the failure of a particular component does not automatically lead to the failure of the entire system.

However, parallelizing algorithms is a challenging task, which despite great efforts is still in its infancy. The problem is mainly the fact that the sub-processes usually use shared resources and have to wait on the results of other processes.

A thread is code running in parallel within the same program and a process is code that is executed in parallel by the operation system. Python provides a good support of both types of parallelism. Here, however, we only consider the use of multiple threads, so multithreading.

 

 

MULTITHREADING IS EASIER THAN IT SEEMS

 

In Python, it is very easy to run the code of one of your functions from its own thread: To do this you import the module threading and pass start_new_thread() the function name as well as possible parameter values that you pack into a tuple. The thread begins running immediately and executes the code of your function.

In your first program with threads, two turtles should draw a staircase independently of each other. To do this, you write an arbitrarily named function, in this case denoted by paint(), which may also have parameters, such as here the turtle and a flag that indicates whether the turtle draws a staircase to the left or the right. You then pass the function name and a tuple with the parameter values (the turtle and the flag) to the function thread.start_new_thread(). And now off you go!

 

from gturtle import *
import thread

def paint(t, isLeft):
    for i in range(16):
        t.forward(20)
        if isLeft:
            t.left(90)
        else:
            t.right(90)
        t.forward(20)
        if isLeft:
            t.right(90)
        else:
            t.left(90)
    
tf = TurtleFrame()
john = Turtle(tf)
john.setPos(-160, -160)
laura = Turtle(tf)
laura.setColor("red")
laura.setPenColor("red")
laura.setPos(160, -160)
thread.start_new_thread(paint, (john, False))
thread.start_new_thread(paint, (laura, True))
Highlight program code (Ctrl+C copy, Ctrl+V paste)

 

 

MEMO

 

In order to move both turtles in the same window you use a TurtleFrame tf and pass it to the turtle constructor.

A new thread is created and immediately started with start_new_thread(). The thread is terminated as soon as the passed function returns.

The parameter list has to be specified as a tuple. Be aware that an empty tuple () must be passed for a function without parameters, and that a tuple with a single element x should be written as (x, ) instead of (x)

 

 

CREATING AND STARTING THREADS AS A CLASS INSTANCE

 


You get a little more leeway when you define a separate class that is derived from the class Thread. In this class, you overwrite the method run() which contains the code to be executed.

To start the thread, you first create an instance and call the method start(). The system will then execute the method run() automatically in a new thread and stop the thread once run() is done.

 

 

from threading import Thread
from random import random
from gturtle import *

class TurtleAnimator(Thread):
    def __init__(self, turtle):
        Thread.__init__(self)
        self.t = turtle

    def run(self):
        while True:
            self.t.forward(150 * random())
            self.t.left(-180 + 360 * random())

tf = TurtleFrame()
john = Turtle(tf)
john.wrap()
laura = Turtle(tf)
laura.setColor("red")
laura.setPenColor("red")
laura.wrap()
thread1 = TurtleAnimator(john)
thread2 = TurtleAnimator(laura)
thread1.start() 
thread2.start()
Highlight program code (Ctrl+C copy, Ctrl+V paste)

 

 

MEMO

 

Even with a multiprocessor system the code is not really run in parallel, but rather in successive time slices. Therefore, it usually merely consists of quasi-parallel data processing. However, it is important to understand that the allocation of the processor to the threads takes place at unpredictable points in time, so somewhere in the middle of your code. The breakpoint and the local variables are automatically rescued when your thread is interrupted, and restored when it continues, but problems may arise if in the meantime other threads change shared global data. This also applies to the content of a graphics window. Therefore, it is not self-evident that the two turtles do not get in the way of each other [more... To avoid such effects, the Turtle graphics had specifically written for the use of thread is thread-safe].

You can see, the main part of the program finishes running, but the two threads keep performing their tasks until the window is closed.

 

 

ENDING THREADS

 

Once initiated, a thread can not be stopped directly using a method from outside, such as by another thread. In order to stop a thread, it must be ensured that the method run() comes to the end. That is why a non-breakable while loop in the method run() of a thread is never a good idea. Instead, you should use a global boolean variable isRunning for the while loop that is normally set to True, but that can also be set to False from another thread.

Both turtles execute a random movement in your program until one of the two moves out of a circular area.

from threading import Thread
from random import random
import time
from gturtle import *

class TurtleAnimator(Thread):
    def __init__(self, turtle):
        Thread.__init__(self)
        self.t = turtle

    def run(self):
        while isRunning:
            self.t.forward(50 * random())
            self.t.left(-180 + 360 * random())

tf = TurtleFrame()
john = Turtle(tf)
laura = Turtle(tf)
laura.setColor("red")
laura.setPenColor("red")
laura.setPos(-200, 0)
laura.rightCircle(200)
laura.setPos(0, 0)
thread1 = TurtleAnimator(john)
thread2 = TurtleAnimator(laura)
isRunning = True
thread1.start() 
thread2.start()

while isRunning and not tf.isDisposed():
    if laura.distance(0, 0) > 200 or john.distance(0, 0) > 200:
        isRunning = False
    time.sleep(0.001)
tf.setTitle("Limit exceeded")
Highlight program code (Ctrl+C copy, Ctrl+V paste)

 

 

MEMO

 

You should never use a "tight" loop that does not perform actions in the body, because you are wasting a lot of processor time. Always set a small waiting time of at least a few milliseconds using time.sleep(), Turtle.sleep(), or GPanel.delay().

Once a thread has ended it can not be initiated again. If you try to call start() once again, there is an error message.

 

 

STOPPING AND CONTINUING THREADS

 


In order to stop a thread only for a certain time, you can skip the actions in run() using a global flag  isPaused and continue again later using isPaused = False.

 

from threading import Thread
from random import random 
import time
from gturtle import *

class TurtleAnimator(Thread):
    def __init__(self, turtle):
        Thread.__init__(self)
        self.t = turtle

    def run(self):
        while True:
            if isPaused:
                Turtle.sleep(10)
            else: 
                self.t.forward(100 * random())
                self.t.left(-180 + 360 * random())

tf = TurtleFrame()
john = Turtle(tf)
laura = Turtle(tf)
laura.setColor("red")
laura.setPenColor("red")
laura.setPos(-200, 0)
laura.rightCircle(200)
laura.setPos(0, 0)
thread1 = TurtleAnimator(john)
thread2 = TurtleAnimator(laura)
isPaused = False
thread1.start() 
thread2.start()

tf.setTitle("Running")
while not isPaused and not tf.isDisposed():
    if laura.distance(0, 0) > 200 or john.distance(0, 0) > 200:
        isPaused = True
        tf.setTitle("Paused")
        Turtle.sleep(2000)
        laura.home()
        john.home()
        isPaused = False
        tf.setTitle("Running")
    time.sleep(0.001)
Highlight program code (Ctrl+C copy, Ctrl+V paste)

It is even smarter to stop the thread using Monitor.putSleep() and then continue later using Monitor.wakeUp().

from threading import Thread
from random import random
import time
from gturtle import *

class TurtleAnimator(Thread):
    def __init__(self, turtle):
        Thread.__init__(self)
        self.t = turtle

    def run(self):
        while True:
            if isPaused:
                Monitor.putSleep()
            self.t.forward(100 * random())
            self.t.left(-180 + 360 * random())

tf = TurtleFrame()
john = Turtle(tf)
laura = Turtle(tf)
laura.setColor("red")
laura.setPenColor("red")
laura.setPos(-200, 0)
laura.rightCircle(200)
laura.setPos(0, 0)
thread1 = TurtleAnimator(john)
thread2 = TurtleAnimator(laura)
isPaused = False
thread1.start() 
thread2.start()

tf.setTitle("Running")
while not isPaused and not tf.isDisposed():
    if laura.distance(0, 0) > 200 or john.distance(0, 0) > 200:
        isPaused = True
        tf.setTitle("Paused")
        Turtle.sleep(2000)
        laura.home()
        john.home()
        isPaused = False
        Monitor.wakeUp()
        tf.setTitle("Running")
    time.sleep(0.001)
Highlight program code (Ctrl+C copy, Ctrl+V paste)

 

 

MEMO

 

A thread can stop itself using the blocking method Monitor.putSleep() so that it does not waste any computing time. Another thread can activate it again using Monitor.wakeUp(), i.e. the blocking method Monitor.putSleep() returns.

 

 

WAITING ON THREAD RESULTS

 

In this program you employ a worker to calculate the sum of natural numbers from 1 to 1,000,000 simply by adding. You wait in the main program until the job is done and then determine the time that was required. Since this can vary slightly, you let the work be performed 10 times by a worker thread. In order to wait for the end of the thread, you use join().

from threading import Thread
import time

class WorkerThread(Thread):
     def __init__(self, begin, end):
         Thread.__init__(self)
         self.begin = begin
         self.end = end
         self.total = 0

     def run(self):
         for i in range(self.begin, self.end):
             self.total += i

startTime = time.clock()
repeat 10:
    thread = WorkerThread(0, 1000000)
    thread.start()
    thread.join() 
    print thread.total
print "Time elapsed:", time.clock() - startTime, "s"
Highlight program code (Ctrl+C copy, Ctrl+V paste)

Just like in real life, you can distribute tedious work to multiple workers. If you use two worker threads for this, so that each of them performs half of the work, you will need to wait for both to finish before you add up the sum.

from threading import Thread
import time

class WorkerThread(Thread):
     def __init__(self, begin, end):
         Thread.__init__(self)
         self.begin = begin
         self.end = end
         self.total = 0

     def run(self):
         for i in range(self.begin, self.end):
             self.total += i

startTime = time.clock()
repeat 10:
    thread1 = WorkerThread(0, 500000)
    thread2 = WorkerThread(500000, 1000000)
    thread1.start()
    thread2.start()
    thread1.join() 
    thread2.join()  
    result = thread1.total + thread2.total
    print result 
print "Time elapsed:", time.clock() - startTime, "s"
Highlight program code (Ctrl+C copy, Ctrl+V paste)

 

 

MEMO

 

You could also set a global flag isFinished() to True when the threads terminate, and test this flag in a waiting loop in the main part of the program. However, this solution is less elegant than just using join() because you are wasting computing time by constantly testing the flag.

 

 

CRITICAL SECTIONS AND LOCKS

 

Because threads execute code virtually independently, it is awkward if multiple threads modify shared data. To avoid collisions between threads, actions belonging together are encapsulated in a so-called critical program block and provided with a protection that ensures that the block is only executed uninterruptedly as a whole (atomically). If another thread tries to execute the block, it must wait until the current thread has left the block. This protection is carried out in Python using a lock. A lock is an instance of the class Lock and has two states, locked and unlocked, as well as two methods acquire() and release() with the following rules:

 state  call subsequent state/activity
 unlocked  acquire()  locked
 locked  acquire()  blocked until another thread calls
 release()
 unlocked  release()  error message (RuntimeException)
 locked  release()  unlocked

We say that a thread acquires the lock with acquire() and releases it again with release [more... In other programming languages, the lock mechanism is also known
as thread synchronization. A lock is also called monitor or semaphore
].

Specifically, to protect a critical block you proceed as follows: First you create a global lock object using  lock = Lock() that is of course unlocked in its initial state. Every thread then tries to acquire the lock using acquire() when entering the critical block. If this fails because the lock is already taken, the thread is automatically placed in a waiting state until the lock is free again. If a thread does indeed acquire the lock, it runs through the critical block and then release the lock again when done using release() so that the other threads can obtain it [more... If more than one thread is waiting for the lock, the operating system decides
which one gets it next. It ensures that each waiting thread has a chance to run and no one is "dying of starvation".
].

If you understand the passing of the critical block as a resource in a room with a closed door, you can imagine a lock as similar a key, which a thread needs in order to open the door of the room. Upon entry, it takes the key with it and closes the door from the inside. All threads that now want to enter the room must wait for a key in a line in front of the door. Once the thread has done its work in the room, it leaves and closes the door and hangs up the key. The first thread waiting in line takes the key and can now open the door to the room. If no threads are waiting in line, the key remains suspended until a new incoming thread needs it.
 

In your program, the critical block consists of the drawing and deleting of filled squares. When deleted, the square is painted over with the white background color. The main thread creates a flashing square by drawing the solid red square and then deleting it again after a certain waiting period. In a second thread MyThread, the keyboard is continuously prompted with getKeyCode(). If the user presses the spacebar, the blinking square is shifted to a random position.

It is self-evident that the critical block needs to be protected by a lock. If the shifting of the square takes place while it is still drawn or deleted, the result is a chaotic behavior.

from gpanel import *
from threading import Thread, Lock
from random import randint

class MyThread(Thread):
    def run(self):
        while not isDisposed():
            if getKeyCode() == 32:
                print "----------- Lock requested by MyThread"
                lock.acquire()
                print "----------- Lock acquired by MyThread"
                move(randint(2, 8), randint(2, 8))
                delay(500)  # for demonstration purposes
                print "----------- Lock releasing by MyThread..."
                lock.release()
            else:
                delay(1)

def square():
    print "Lock requested by main"
    lock.acquire()
    print "Lock acquired by main"
    setColor("red")
    fillRectangle(2, 2)
    delay(1000)
    setColor("white")
    fillRectangle(2, 2) 
    delay(1000)
    print "Lock releasing by main..."
    lock.release()

lock = Lock()
makeGPanel(0, 10, 0, 10)

t = MyThread()
t.start()
move(5, 5)
while not isDisposed():
    square()
    delay(1) # Give up thread for a short while
Highlight program code (Ctrl+C copy, Ctrl+V paste)

 

 

MEMO

 

You can track in the console how each thread obediently waits until the lock is releasede:

Lock requested by main
Lock acquired by main
----------- Lock requested by MyThread
Lock releasing by main...
----------- Lock acquired by MyThread
Lock requested by main
----------- Lock releasing by MyThread...
Lock acquired by main

If you deactivate the lock by commenting out, you will see that the squares are no longer correctly drawn and deleted.

Also note that you should always install a small waiting time in short loops in order to avoid consuming unnecessary processing time.

 

 

GUI-WORKERS

 

Callbacks that are triggered by GUI components run in a specific native thread (sometimes called Event Dispatch Thread (EDT)). This is responsible for ensuring that the entire graphics window along with all components (buttons, etc.) are correctly rendered on the screen. Since the rendering takes place at the end of the callback, the GUI appears to be frozen until the callback returns. This is why no graphical animations are possible in a GUI callback. You must strictly adhere to the following rule:

GUI callbacks must return quickly, i.e. no lengthy operations should be performed in GUI callbacks.

In this case, lengthy means a period of time longer than 10 ms. When dealing with this, you have to assume the worst possible case, i.e. a slow hardware and a heavy system load. If an action lasts longer, you should execute it in a separate thread called GUI worker.

You can draw a Rhodonea rose in your program by clicking on one of the two buttons. The drawing is animated and takes a certain amount of time. Therefore, you have to execute the drawing in a worker thread, which is not a problem with your current knowledge about threading.

There is yet another problem to consider: Since each click of the button creates a new thread, more drawings can be started shortly after one another, leading to chaos. You can prevent this by making the buttons gray (inactive) during the execution of the drawing [more... This is in contrast to freeze a legitimate look and feel]
from gpanel import *
from javax.swing import *
import math
import thread

def rho(phi):
    return math.sin(n * phi)

def onButtonClick(e):
    global n
    enableGui(False)
    if e.getSource() == btn1:
        n = math.e
    elif e.getSource() == btn2:
        n = math.pi
#    drawRhodonea()    
    thread.start_new_thread(drawRhodonea, ())

def drawRhodonea():
    clear()
    phi = 0
    while phi < nbTurns * math.pi:
        r = rho(phi)
        x = r * math.cos(phi)    
        y = r * math.sin(phi)    
        if phi == 0:
          move(x, y)
        else:
          draw(x, y)
        phi += dphi
    enableGui(True)

def enableGui(enable):
    btn1.setEnabled(enable)
    btn2.setEnabled(enable)
                    
dphi = 0.01
nbTurns = 100
makeGPanel(-1.2, 1.2, -1.2, 1.2)
btn1 = JButton("Go (e)", actionListener = onButtonClick)
btn2 = JButton("Go (pi)", actionListener = onButtonClick)
addComponent(btn1)
addComponent(btn2)
validate()
Highlight program code (Ctrl+C copy, Ctrl+V paste)

 

MEMO

 

Only short lasting code should be executed in GUI callbacks, otherwise the graphics system freezes. You have to outsource longer lasting code (more than 10 ms) in a separate worker thread.

At any moment, only the components of a GUI whose operation is allowed and useful may be active.

 

   

ADDITIONAL MATERIAL


 

RACE CONDITIONS, DEADLOCKS

 

Human beings work in a highly parallel way, but their logical reasoning is largely sequential. Because of this, it is difficult for us humans to retain an overview of programs with multiple threads. That is why using threads should be considered carefully, as elegant they may seem at first glance.

Besides applications based on random data, a program should always return the same results (postconditions) with the same initial conditions (preconditions). This is in no way guaranteed for programs with multiple threads accessing shared data, even if the critical areas are protected with locks. You have two threads in your program, thread1 and thread2, carrying out an addition and a multiplication with two global numbers a and b. a and b are protected by lock_a and lock_b. Generate and start both threads one after another in the main part and then wait until they reach the end. Finally, write out the values of a and b.

To generate threads in this example, you use a slightly different notation where you specify the method run() as a named parameter in the constructor of the class Thread.

from threading import Thread, Lock
from time import sleep

def run1():
    global a, b
    print "----------- lock_a requested by thread1"
    lock_a.acquire()
    print "----------- lock_a acquired by thread1"
    a += 5
#    sleep(1)
    print "----------- lock_b requested by thread1"
    lock_b.acquire()
    print "----------- lock_b acquired by thread1"
    b += 7
    print "----------- lock_a releasing by thread1"
    lock_a.release()
    print "----------- lock_b releasing by thread1"
    lock_b.release()

def run2():
    global a, b
    print "lock_b requested by thread2"
    lock_b.acquire()
    print "lock_b acquired by thread2"
    b *= 3
#    sleep(1)
    print "lock_a requested by thread2"
    lock_a.acquire()
    print "lock_a acquired by thread2"
    a *= 2
    print "lock_b releasing by thread2"
    lock_b.release()
    print "lock_a releasing by thread2"
    lock_a.release()

a = 100
b = 200
lock_a = Lock()
lock_b = Lock()

thread1 = Thread(target = run1)
thread1.start()
thread2 = Thread(target = run2)
thread2.start()
thread1.join()
thread2.join()
print "Result: a =", a, ", b =", b
Highlight program code (Ctrl+C copy, Ctrl+V paste)

 

MEMO

 

If you let the program run several times, it sometimes returns the results a = 205, b = 607 and other times a = 210, b = 621. It even happens that the programm blocks. How is this possible? The explanation is as follows:

Although thread1 is created and started in the main part before thread2, it is not certain which thread actually begins the execution first. As the first line

lock_a requested by thread1
or
lock_b requested by thread2

can be written out. The following course of events is not unique either, since the thread switch can happen anywhere. It could be that the addition or multiplication is performed with the numbers a and b first, which explains the varying results. Since both threads run together almost as if in a competition, a race condition arises.

However, it could be even worse because the program could also completely freeze. Before the "death" the following is written out:

----------- lock_a requested by thread1
lock_b requested by thread2
lock_b acquired by thread2
lock_a requested by thread2
----------- lock_a acquired by thread1
----------- lock_b requested by thread1

It takes a bit of detective work to find out what happened there. We will try: Apparently, the thread1 begins to run first and tries to get lock_a. Before it can write out the receipt, thread2 tries to get lock_b and gets this lock. Immediately afterwards, thread2 also tries to get lock_a, but apparently fails because in the meantime thread1 got it. thread2 therefore blocks. thread1 continues to run and tries to get lock_b, but also fails because thread2 has not yet returned it. This also blocks thread1 and therefore the entire program. Aptly, this situation is called a deadlock. (If you activate the two commented out lines using sleep(1), it always results in a deadlock. Think about why this is.)

As you can see, deadlocks occur when two threads thread1 and thread2 depend on two shared resources a and b and block them individually. As a consequence, it can happen that thread2 waits on lock_a and thread1 waits on lock_b and thus both are blocked, with the result that the locks are never released again.

To avoid deadlocks, adhere to the following rule:

Shared resources should be protected with a single lock whenever possible. In addition, it must be ensured that the lock is given back again.

 

 

THREAD-SAFE AND ATOMIC EXPRESSIONS

 

If there are several threads involved, you never know as a programmer the exact point in time or at which spot of the code thread switching occurs. As you have already seen before, it can lead to unexpected and incorrect behavior if the threads work with shared resources. This is especially the case if multiple threads change a window. So if you generate a worker thread in a callback in order to execute long-running code, you should almost always expect that this will result in chaos. In the previous program you avoided this by disabling the buttons during the callback. You can ensure that multiple threads can run the same code without getting in each others way by taking special precautions. Such code is called thread-safe. There is an art to writing thread-safe code so that it can be used in an environment with multiple threads without any conflicts [more.. If a thread executes a function and is then interrupted by another thread, calls the same function, so this is called
Reentrance. A thread-safe function can be easily accessed reentrant
].


There are few thread-safe libraries, since they are usually less performative and they have the risk of deadlocks. As you experienced above, the GPanel library is not thread-safe, whereas the turtle graphics is. You can move several turtles with multiple threads virtually simultaneously. In your program at each mouse click a new thread is created and a new turtle appears at the mouse position. The turtle autonomously draws a star and then fills it.

 


from gturtle import *
import thread

def onMousePressed(event):
#    createStar(event)
    thread.start_new_thread(createStar, (event,))

def createStar(event):
    t = Turtle(tf)
    x = t.toTurtleX(event.getX())
    y = t.toTurtleY(event.getY())
    t.setPos(x, y)
    t.startPath()
    repeat 9:
        t.forward(100)
        t.right(160)
    t.fillPath()        
     
tf = TurtleFrame(mousePressed = onMousePressed)
tf.setTitle("Klick To Create A Working Turtle")
Highlight program code (Ctrl+C copy, Ctrl+V paste)

 

MEMO

 

If you are not generating a new thread (commented out line), you will only see the already finished stars. However, you can write the program without its own thread if you use the named parameter mouseHit instead of mousePressed, just like you did in chapter 2.11. In this case, the thread will be automatically generated in the turtle library.

a = a + 1

or

a += 1

It is important that you know that switching threads can even be done in the middle of one line of code. For example, even in the line

a thread switch can occur between the reading and writing of the variable value.

In contrast, an expression is called atomic, if it can not be interrupted. Similar to most other programming languages, Python is also not really atomic. For example, it may happen that a print command is interrupted by print commands of other threads, resulting in a chaotic expression. It is the programmer's task to make functions, expressions, and parts of the code thread-safe and atomic by using locks.

 

 

EXERCISES

 

1.


Use a single lock in the above program and set it up so that there are no race conditions and no deadlocks.