PHP Classes

PHP Multithreading using pthreads extension


Categories: PHP Tutorials

Many PHP developers face situations in which they need to execute multiple tasks in parallel. This is a case for multithreading.

Recently I tried the pthreads extension and was pleasantly surprised. It is an extension that lets PHP run multiple tasks in parallel within the same process. There is no emulation, no magic tricks and no fake parallelism: the threads are real.

Read this article to learn more about the pthreads extension and how you can use it in your PHP applications to execute multiple parallel tasks.


Contents

Introduction

What are pthreads

What's inside pthreads

Example

PHP Settings

Polyfill

Analysis

Conclusion

Introduction

Consider this job. There is a list of tasks that need to be completed as quickly as possible. In PHP there are several solutions for this purpose that I will not cover. This article is only about the pthreads extension.

It is worth noting that the extension author, Joe Watkins, warns in his articles that multi-threading is not always easy and that we must be prepared to use it correctly. If that does not scare you, read on.

What are pthreads?

Pthreads is an object-oriented API which provides a convenient way to organize multi-threaded tasks in PHP. The API includes all the tools needed to create multi-threaded applications. PHP applications can create, read, write, execute and synchronize with threads using objects of the classes Thread, Worker and Threaded.

What's inside pthreads?

The hierarchy of the basic classes just mentioned is shown in the diagram: Thread extends Threaded, and Worker extends Thread.

[Diagram: the hierarchy of the basic classes]

Threaded - This is the basis of pthreads. It makes it possible to run code in parallel, and it provides methods for synchronization and other useful techniques.

Thread - You can create a sub-class of Thread and implement the run() method. This method starts executing in a separate thread when you call the start() method. A thread can only be started by the context that created it, and it can only be joined by that same context.

Worker - A worker thread has a persistent context that can be reused by the tasks it executes. The context remains available until the Worker object has no more references to it or the shutdown() method is called.

In addition to these classes, there is the Pool class.

Pool - A pool is a container of workers that can be used to distribute jobs among Worker objects. Pool is the simplest and most effective way to organize several threads.
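As a small illustration of the Thread class described above, here is a minimal sketch. The SquareThread class and its properties are hypothetical names made up for this example. It assumes a ZTS build of PHP with pthreads loaded, and the guard at the top simply stops the script otherwise.

```php
<?php
// Guard: without pthreads the Thread class does not exist
if (!extension_loaded('pthreads')) {
    echo 'pthreads is not loaded; run this with a ZTS build of PHP', PHP_EOL;
    return;
}

/**
 * Hypothetical example class: squares a number in a separate thread
 */
class SquareThread extends Thread
{
    private $input;
    public $result;

    public function __construct($input)
    {
        $this->input = $input;
    }

    // Executed in the new thread after start() is called
    public function run()
    {
        $this->result = $this->input * $this->input;
    }
}

// Start one thread per input value
$threads = [];
foreach ([2, 3, 4] as $n) {
    $thread = new SquareThread($n);
    $thread->start();
    $threads[] = $thread;
}

// join() waits for a thread to finish; it is called from the
// same context that created and started the thread
foreach ($threads as $thread) {
    $thread->join();
    echo $thread->result, PHP_EOL;
}
```

Each thread is started and joined by the script that created it, which is the restriction mentioned in the description of Thread.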

There is not much else to be said about the theory. So let's immediately try all this with an example.

Example

Complex problems can be solved by splitting the work across multiple threads. I was interested in solving one particular problem that seems very typical to me: there is a pool of tasks that need to be executed as fast as possible.

So let's get started. First, create a data provider object of the MyDataProvider (Threaded) class. There will be only one such object, shared by all threads.

/**
 * Data provider shared by all threads
 */
class MyDataProvider extends Threaded
{
    /**
     * @var int How many items in our imaginary database
     */
    private $total = 2000000;

    /**
     * @var int How many items were processed
     */
    private $processed = 0;

    /**
     * We go to the next item and return it
     * 
     * @return mixed
     */
    public function getNext()
    {
        if ($this->processed === $this->total) {
            return null;
        }

        $this->processed++;

        return $this->processed;
    }
}

Each thread in the pool will be an object of the MyWorker (Worker) class, which keeps a reference to the provider created in the parent process.

/**
 * MyWorker is used here to share the provider between instances of MyWork.
 */
class MyWorker extends Worker
{
    /**
     * @var MyDataProvider
     */
    private $provider;

    /**
     * @param MyDataProvider $provider
     */
    public function __construct(MyDataProvider $provider)
    {
        $this->provider = $provider;
    }

    /**
     * Called when the Pool starts this worker thread
     */
    public function run()
    {
        // In this example, we do not need to do anything here
    }

    /**
     * Returns the provider
     * 
     * @return MyDataProvider
     */
    public function getProvider()
    {
        return $this->provider;
    }
}

Processing each task from the pool is our bottleneck; let's suppose it is a time-consuming operation. So we run it in multiple threads using the MyWork (Threaded) class.

/**
 * MyWork is a task that can be executed in parallel
 */
class MyWork extends Threaded
{

    public function run()
    {
        do {
            $value = null;

            $provider = $this->worker->getProvider();

            // Fetch the next item while holding the provider's lock
            $provider->synchronized( function( $provider ) use (&$value) {
               $value = $provider->getNext();
            }, $provider);

            if ($value === null) {
                continue;
            }

            // Simulate a time-consuming operation
            $count = 100;
            for ($j = 1; $j <= $count; $j++) {
                sqrt($j+$value) + sin($value/$j) + cos($value);
            }
        }
        while ($value !== null);
    }

}

Note that to fetch data from the provider you need to use the synchronized() method. Otherwise, some items may be processed more than once, or some may be skipped.
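The same idea can be shown in isolation. The following is a minimal sketch, not taken from the example above: SharedCounter and IncrementThread are hypothetical classes, and the script assumes a ZTS build of PHP with pthreads loaded. Four threads increment one shared counter, and each increment happens inside synchronized() so that no update is lost.

```php
<?php
// Guard: without pthreads the Threaded and Thread classes do not exist
if (!extension_loaded('pthreads')) {
    echo 'pthreads is not loaded; run this with a ZTS build of PHP', PHP_EOL;
    return;
}

/**
 * Hypothetical shared counter, used only for this illustration
 */
class SharedCounter extends Threaded
{
    public $value = 0;
}

/**
 * Each thread increments the shared counter a fixed number of times
 */
class IncrementThread extends Thread
{
    private $counter;
    private $times;

    public function __construct(SharedCounter $counter, $times)
    {
        $this->counter = $counter;
        $this->times = $times;
    }

    public function run()
    {
        $counter = $this->counter;
        for ($i = 0; $i < $this->times; $i++) {
            // Only one thread at a time runs this closure for $counter,
            // so the read-modify-write of $value cannot interleave
            $counter->synchronized(function ($counter) {
                $counter->value++;
            }, $counter);
        }
    }
}

$counter = new SharedCounter();
$threads = [];
for ($i = 0; $i < 4; $i++) {
    $threads[$i] = new IncrementThread($counter, 1000);
    $threads[$i]->start();
}
foreach ($threads as $thread) {
    $thread->join();
}

// Total number of increments performed by all threads
echo $counter->value, PHP_EOL;
```

Without the synchronized() call, two threads could read the same old value and write back the same new one, which is the kind of duplicated or skipped work the note above warns about.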

Now we make it all work using the Pool object.

require_once 'MyWorker.php';
require_once 'MyWork.php';
require_once 'MyDataProvider.php';

$threads = 8;

// Create provider. This service may for example read some data
// from a file or from the database
$provider = new MyDataProvider();

// Create a pool of workers
$pool = new Pool($threads, 'MyWorker', [$provider]);

$start = microtime(true);

// In this case the load is balanced between the tasks.
// Therefore, it is good to submit as many tasks as there are workers in our pool.
$workers = $threads;
for ($i = 0; $i < $workers; $i++) {
    $pool->submit(new MyWork());
}

$pool->shutdown();

printf("Done for %.2f seconds" . PHP_EOL, microtime(true) - $start);

This is a rather elegant solution in my opinion.

That's all! Well, that is almost everything you need to do.

Actually, there is something that might disappoint the inquisitive reader. This does not work on a standard PHP, compiled with the default options. To use pthreads multi-threading support it is necessary to compile PHP with the ZTS (Zend Thread Safety) option.
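Before running the example, it can help to check what the current PHP binary supports. The following is a small sketch: describeThreadSupport() is a hypothetical helper name, while PHP_ZTS and extension_loaded() are standard PHP.

```php
<?php
/**
 * Hypothetical helper: reports whether this PHP binary can run pthreads code
 *
 * @return string
 */
function describeThreadSupport()
{
    // PHP_ZTS is 1 for a thread-safe (ZTS) build and 0 otherwise
    if (!PHP_ZTS) {
        return 'This PHP build is not thread safe (ZTS is off).';
    }

    if (!extension_loaded('pthreads')) {
        return 'ZTS build, but the pthreads extension is not loaded.';
    }

    return 'ZTS build with the pthreads extension loaded.';
}

echo describeThreadSupport(), PHP_EOL;
```

Running this with each PHP binary installed on a machine shows which one is able to execute the multi-threaded example.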

PHP Settings

The documentation says that PHP must be compiled with the --enable-maintainer-zts option. I have not tried compiling it myself. Instead, I found a PHP package for Debian that was built with that option. You can install it like this.

sudo add-apt-repository ppa:ondrej/php-zts
sudo apt update
sudo apt-get install php7.0-zts php7.0-zts-dev

So I now have two PHP builds. The old one runs from the console in the usual way, using the php command, and the Web server keeps using it as well. The new one can be run from the console as php7.0-zts.

You can then add the pthreads extension.

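git clone https://github.com/krakjoe/pthreads.git
cd pthreads
# Generate ./configure; use the phpize that belongs to the ZTS build
# (its exact name depends on the package).
phpize
./configure
make -j8
sudo make install
echo "extension=pthreads.so" > pthreads.ini
sudo cp pthreads.ini /etc/php/7.0-zts/cli/conf.d/pthreads.ini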

Now that is all. Well, almost. Imagine that you have written multi-threaded code, and PHP in a colleague's environment is not set up properly. That may cause confusion, may it not? But there is a solution.

Polyfill

Here again, thanks go to Joe Watkins, this time for the pthreads-polyfill package. The idea is as follows: the package contains classes with the same names as those in the pthreads extension. They let you run your code even if the pthreads extension is not enabled; the code simply executes in a single thread.

For this to work, you only need to install the package using composer. It checks whether the extension is installed. If it is, the extension is used. Otherwise, the polyfill classes are used, so your code still runs, just in one thread.
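To illustrate the idea only, here is a heavily simplified sketch of how such a fallback can work. This is not the code of the pthreads-polyfill package: the stand-in Thread class below and the SumTask class are hypothetical and cover just start(), join() and run().

```php
<?php
// Hypothetical stand-in, declared only when the real extension is missing.
// The real pthreads-polyfill package is far more complete.
if (!extension_loaded('pthreads')) {
    abstract class Thread
    {
        private $finished = false;

        abstract public function run();

        // No new thread is created: run() executes immediately
        // in the calling thread
        public function start()
        {
            $this->run();
            $this->finished = true;

            return true;
        }

        // The work is already done by the time join() is called
        public function join()
        {
            return $this->finished;
        }
    }
}

/**
 * The same task class works with the extension or with the stand-in
 */
class SumTask extends Thread
{
    public $result = 0;

    public function run()
    {
        for ($i = 1; $i <= 10; $i++) {
            $this->result += $i;
        }
    }
}

$task = new SumTask();
$task->start();
$task->join();

// Sum of the integers from 1 to 10
echo $task->result, PHP_EOL;
```

The calling code is identical in both cases. Only the way start() behaves differs, which is why a polyfill keeps the code working but gives no parallelism.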

Analysis

Let's now see whether the processing is really going on as we expect in multiple threads and evaluate the benefit of using this approach.

I will change the value of $threads variable from the above example and see what happens.

Information about the processor on which the tests were run:

$ lscpu
CPU(s):                8
Threads per core:      2
Cores per socket:      4
Model name:            Intel(R) Core(TM) i7-4700HQ CPU @ 2.40GHz

Let's look at the charts of the CPU core load. Everything is in line with expectations.

[Figure: CPU core load with $threads = 1]

[Figure: CPU core load with $threads = 2]

[Figure: CPU core load with $threads = 4]

[Figure: CPU core load with $threads = 8]

And now the most important part, the reason for all of this: let's compare the execution times.

$threads  Note                                Execution time, seconds

PHP without ZTS
1         without pthreads, without polyfill  265.05
1         polyfill                            298.26

PHP with ZTS
1         without pthreads, without polyfill  37.65
1                                             68.58
2                                             26.18
3                                             16.87
4                                             12.96
5                                             12.57
6                                             12.07
7                                             11.78
8                                             11.62

The first two rows show that using the polyfill costs about 13% of performance in this example, where the code runs essentially sequentially.

As for PHP with ZTS, do not pay attention to the large difference in execution time compared to PHP without ZTS (37.65 versus 265.05 seconds). I did not try to give both builds the same settings. For example, the PHP build without ZTS has XDebug enabled.

As can be seen, with 2 threads the program runs approximately 1.5 times faster than the sequential code, and with 4 threads about 3 times faster.

Note that although the processor reports 8 CPUs, the execution time barely changes once more than 4 threads are used. This is most likely because my CPU has only 4 physical cores. For clarity, the table is also shown as a chart.
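The ratios quoted above can be recomputed from the table. The following sketch takes the sequential run on PHP with ZTS (37.65 seconds) as the baseline. The speedup() function and the variable names are hypothetical, while the numbers are copied from the table.

```php
<?php
// Execution times in seconds, copied from the table above (PHP with ZTS)
$baseline = 37.65; // sequential run, without pthreads and without polyfill
$timesByThreads = [
    1 => 68.58,
    2 => 26.18,
    3 => 16.87,
    4 => 12.96,
    5 => 12.57,
    6 => 12.07,
    7 => 11.78,
    8 => 11.62,
];

/**
 * Speedup relative to the baseline; values above 1 mean faster
 *
 * @param float $baseline
 * @param float $time
 * @return float
 */
function speedup($baseline, $time)
{
    return $baseline / $time;
}

foreach ($timesByThreads as $threads => $time) {
    printf("%d thread(s): %.2fx" . PHP_EOL, $threads, speedup($baseline, $time));
}

// Relative slowdown of the polyfill run on PHP without ZTS
$polyfillOverhead = (298.26 - 265.05) / 265.05;
printf("polyfill overhead: %.1f%%" . PHP_EOL, 100 * $polyfillOverhead);
```

With these numbers the speedup is about 1.44 for 2 threads, about 2.9 for 4 threads and only about 3.2 for 8 threads, which matches the observation that going beyond the 4 physical cores brings little extra benefit.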

Conclusion

Working with multiple threads in PHP using the pthreads extension can be quite elegant and convenient. As you have seen, it can give a significant performance boost.

If you liked this article, please share it with other PHP developers. If you have questions, please post a comment to this article.



