Contents
The Pros and Cons of a Dynamic Programming Language
Dynamic class variables
The PHP 5.4 Class Variable Access Optimization
Future PHP 6 variable type optimizations
Conclusions
The Pros and Cons of a Dynamic Programming Language
As we all know, PHP is a dynamic language. Most of us PHP developers enjoy that characteristic because it makes programming more relaxed and flexible.
In practice, this means you do not need to declare in advance many of the resources you use, such as the names and types of variables, as you must in statically typed languages.
Of course, dynamic programming comes at a price. If the types of variables are not known in advance, the engine cannot apply many of the optimizations that are possible when variable types are fixed and known.
Dynamic class variables
In PHP, objects of a class may have a variable number of variables. Although you can declare the variables that you know will be necessary, you can also dynamically assign values to new variables that were not previously declared.
This implies that the PHP runtime engine needs to take this possibility into account when managing access to class variable values. In practice, this means that internally class variables need to be stored in a dynamic data structure that can be expanded as necessary.
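As a minimal sketch of the difference between declared and dynamic class variables, consider the hypothetical class below (the class and variable names are invented for illustration). The `#[\AllowDynamicProperties]` line is an assumption about the PHP version you run: versions before PHP 8 read it as a plain comment, while PHP 8.2 and later use it to silence the deprecation notice for dynamic properties.

```php
<?php
// Hypothetical class used only for illustration.
#[\AllowDynamicProperties]
class Customer
{
    // Declared class variables: known when the class is loaded.
    public $name;
    public $email;
}

$customer = new Customer();
$customer->name = 'Ana';        // assigns a declared variable
$customer->nickname = 'ana77';  // creates a dynamic variable at runtime

// get_object_vars() lists declared and dynamic variables alike.
$properties = array_keys(get_object_vars($customer));

// property_exists() on the class name only sees declared variables,
// while on the object it also sees the dynamic one.
$declaredInClass = property_exists('Customer', 'nickname');
$presentOnObject = property_exists($customer, 'nickname');
```

Only `nickname` forces the engine to keep an expandable structure for this particular object, because it is not part of the class declaration.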
The PHP 5.4 Class Variable Access Optimization
Rasmus Schulz (not to be confused with PHP creator Rasmus Lerdorf) is a Danish developer who not long ago raised the concern that PHP could run faster if it were executed on top of a JIT compiler engine.
More recently he raised another concern on the PHP internals mailing list regarding the excessive amount of memory used when creating many objects of the same class. His assertion was that this happened because PHP created a dynamic data structure (a hash table) for every object to store its class variables.
This makes sense if PHP expects the list of class variables to change during the lifetime of each object. But if the list of variables never changes, it is a waste of memory, because all objects of the same class have the same list of variables. The class could instead keep that list once, in a more efficient data structure shared by all its objects.
That was when Portuguese core developer Gustavo Lopes explained that, starting with PHP 5.4, the dynamic properties hash table is created only if variables are added dynamically to objects at runtime.
Additionally, Tom Boutell reported in the same discussion thread that, after running an updated test script based on the one initially provided by Rasmus Schulz, a version using only pre-declared class variables not only used much less memory but also ran 20% faster.
By now it should be clear that you should avoid using class variables that were not declared explicitly in the class declaration. This means you should declare all class variables, even if they do not have an initial value.
Although explicitly declaring all class variables is common practice, some developers do not declare every class variable they use, especially private ones that do not need an initial value.
The gains of this optimization are only noticeable if you create many objects of the same class in a PHP script. If you create just a few objects of a class, you may not notice much difference.
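If you want to check the effect on your own PHP version, a rough measurement could look like the sketch below. This is not the test script from the mailing list thread: the classes, the `measureObjects()` helper and the object count are all invented for illustration, and the exact numbers depend on the PHP version and platform.

```php
<?php
// Hypothetical classes, invented for this comparison.
class DeclaredPoint
{
    public $x;
    public $y;
}

// Read as a comment before PHP 8; allows dynamic variables in PHP 8.2+.
#[\AllowDynamicProperties]
class DynamicPoint
{
    // No declarations: x and y are only added at runtime.
}

// Returns the number of bytes allocated while creating and
// holding $count objects of the class named $class.
function measureObjects($class, $count)
{
    $objects = array();
    $before = memory_get_usage();
    for ($i = 0; $i < $count; $i++) {
        $object = new $class();
        $object->x = $i;
        $object->y = $i;
        $objects[] = $object;
    }
    return memory_get_usage() - $before;
}

$declaredBytes = measureObjects('DeclaredPoint', 10000);
$dynamicBytes  = measureObjects('DynamicPoint', 10000);

echo "declared: $declaredBytes bytes, dynamic: $dynamicBytes bytes\n";
```

With only a handful of objects the two figures are too close to matter, which is why the sketch creates thousands of them.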
Future PHP 6 variable type optimizations
This PHP 5.4 optimization works by making the lookup of class variables by name at runtime more efficient: when a class is loaded, the engine already knows all of its variables, assuming that no dynamic variables are added later.
More optimizations would be possible if the PHP engine also knew the types of variables and function return values in advance. This matters little for the current Zend Engine, which works by interpreting compiled Zend opcodes, but it matters a great deal for a JIT (Just In Time) PHP compiler engine.
From what I know, there are at least three PHP JIT compilers: Facebook HipHop PHP, Phalanger and Quercus, the latter two of which use the .NET and Java JIT compiler engines respectively. There may be others that I am not aware of, but I assume they all revolve around the same concepts.
JIT compilers are at least one generation ahead of the Zend Engine 2 used in the PHP 5.x series since 2004. JIT compilers can make PHP run faster by generating and executing native machine code optimized for the current CPU.
Currently JIT compiler engines try to guess variable and function return value types from the PHP code context to make the generated machine code more efficient. For instance, consider this code:
$length = strlen($string);
strlen is a function that always returns an integer when it is given a string. Therefore the JIT compiler can declare a variable of type integer, because it is certain the value is always an integer. This kind of value type guessing is called type inference.
When the type of a variable cannot be guessed with certainty, JIT compilers use variables of a type called variant.
Variant variables are much less efficient, because operations on them may require many type conversions: the compiler cannot assume what type of value is currently stored in the variable. This means that the JIT-compiled code uses more native machine code, more memory and more CPU cycles, and therefore performs worse.
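To make the contrast concrete, here is a small sketch of one variable whose type can be inferred and two whose type cannot. The `parseSetting()` helper is hypothetical and invented for illustration; the comments describe what a JIT compiler could conclude, not what any particular compiler actually does.

```php
<?php
// Hypothetical helper: its return type depends on the input value,
// so it cannot be inferred from the code alone.
function parseSetting($raw)
{
    return is_numeric($raw) ? (int) $raw : $raw;
}

// strlen() returns an integer for any string, so a JIT compiler
// could keep $length in a native integer variable.
$length = strlen('hello');

// Here the result may be an integer or a string, so a JIT compiler
// would have to fall back to a variant-like representation.
$port = parseSetting('8080');
$mode = parseSetting('debug');

echo gettype($length), ' ', gettype($port), ' ', gettype($mode), "\n";
```

Every later operation on `$port` or `$mode` would need a type check first, which is the extra work described above.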
In the cases where the JIT compiler is unable to guess the variable types from the context, it would be helpful if the programmer stated the variable type explicitly. This would not only make the generated code more efficient but could also make JIT compilation faster, as the JIT compiler would not have to guess variable types.
Typed class variable declarations could look like this:
public integer $l = 0;
Traditionally, PHP core developers have been reluctant to accept any move towards stricter variable typing in the PHP language.
But the idea that I am proposing here is not to make PHP a strictly typed language. The idea is to let PHP developers optionally declare the type of at least some variables, so JIT compilers can do a better job of generating more efficient native machine code in less time.
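For comparison, PHP 7.4 and later accept optional type declarations on class variables, written with `int` rather than the `integer` keyword used above. The sketch below assumes PHP 7.4 or newer. The class and its members are hypothetical, and the sketch makes no claim about how any engine uses these declarations for optimization.

```php
<?php
// Hypothetical class; requires PHP 7.4 or later for typed class variables.
class Counter
{
    public int $count = 0;   // typed: only integers may be stored
    public $label;           // untyped: any value is accepted

    public function increment(): int
    {
        return ++$this->count;
    }
}

$counter = new Counter();
$counter->increment();
$counter->label = 'visits';

try {
    // A non-numeric string cannot be converted to int, so this throws.
    $counter->count = 'many';
    $rejected = false;
} catch (TypeError $error) {
    $rejected = true;
}
```

The untyped `$label` shows that the declarations stay optional: typed and untyped variables can coexist in the same class.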
For PHP 5.x and Zend Engine 2, declaring the types of variables and function return values can hardly be useful, due to the dynamic nature of Zend Engine 2.
However, for an eventual PHP 6 based on Zend Engine 3 with support for JIT compilation, any type hints that the PHP code can pass to the JIT engine will help make the compiled PHP code much more efficient.
As far as I know, the plans for PHP 6 and Zend Engine 3 are not public, if they exist at all. But it is neither too soon nor too late to discuss what they will be. They may well address this matter of efficiency with the benefits of JIT compilers in sight. I suspect that the PHP community will talk more and more about these matters, because this seems to be the next logical step. We will see.
Conclusions
To be accurate, the PHP 5.4 optimization was not really secret. The fact is that it was not discussed much among developers outside the PHP core.
As for the PHP 6 and Zend Engine 3 speculations mentioned above, if PHP core developers have already discussed them, then to me at least they really are secret. My speculations are only guesses at what it may be and, in my probably not so humble opinion, what it should be.
Anyway, since the optimization was introduced in PHP 5.4, this is yet another reason that may convince you to upgrade in case you were wondering if it was really worth it. In any case, you should always evaluate carefully the pros and cons of upgrading to PHP 5.4 before you decide to do it.
So what do you think? Are you willing to upgrade to PHP 5.4 to benefit from this and other optimizations, or do you still have concerns that prevent you from upgrading?
What about PHP 6 and Zend Engine 3 speculations? Do you think the proposed optional type declaration for variables and functions would be something you would use to make PHP run faster?
Feel free to post a comment with your thoughts or questions.