Archive for April 2008
I have bought my first copy of php|architect magazine. The PDF version only costs $5. The quality of the first two articles I have read so far is great.
The first article was about the SimpleTest framework, a really nice and interesting approach compared to the default way of unit testing. It makes it easy to test badly written code that may be harder to hook into using traditional unit tests. At times the article goes into so much detail that it reads more like a documentation page.
The second article was about internationalization (i18n) in PHP. It definitely was interesting. The intl extension, based on ICU, takes care of string comparison (collation), number formatting (currency included), message formatting, Unicode string normalization, locale handling (parsing, lookups) and date formatting. Internationalization has never been easier.
I like the php|architect slogan “It won’t make you smarter but it will make you a better PHP developer”. I hope it will.
I’m currently preparing to take the Zend PHP certificate exam and have just finished reading the Zend PHP 5 Certification Study Guide. To help myself and other PHP developers to prepare for the exam I thought I could review every Zend PHP5 certification study guide chapter to provide some highlights on things that may not be known to everyone.
Chapter I – PHP basics
- Anatomy of a PHP script
- Data types
You must know that PHP syntax is derived from the C language. It has also been influenced by Perl and, in its latest OOP additions, Java.
PHP supports these opening tags: <?php ?>, <?= ?>, <? ?>, <script language="php"></script>, <% %>. Interestingly no one knows why <% %> were introduced at all :). Short tags, script tags and ASP tags are all considered deprecated and their use is strongly discouraged.
The PHP parser strips a single newline that directly follows the ?> closing tag. An easy way to prevent spurious output from an include file is to omit the closing tag at the end, which the parser considers perfectly legal.
1.2 Anatomy of a PHP script
It is possible to skip the last semicolon in a PHP script though that is considered a parser quirk and should be avoided.
There are three comment styles in PHP: /* */, // and #. A single-line comment (// or #) ends at a newline or at the PHP closing tag ?>.
Interestingly, echo is not a function and, as such, it does not have a return value. If you need to output data from within an expression, you can use print instead: it is also a language construct, but it always returns 1.
An important function is die(), which is an alias of exit(). Passing a string prints it before the script terminates; passing an integer returns it as a numeric exit status to the process that called PHP.
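A minimal sketch of the difference between echo and print. The variable name $result is only for illustration.

```php
<?php
// print always returns 1, so it can be used inside an expression.
$result = print "Hello\n";
echo $result, "\n"; // 1

// echo has no return value, but it accepts several comma-separated arguments.
echo 'a', 'b', "\n"; // ab
```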
1.3 Data types
PHP is loosely typed, meaning that it will implicitly change the type of a variable as needed, depending on the operation being performed on its value.
All data types in PHP are divided into two categories: scalars and composites. Scalars are ints, strings, floats and booleans. Numbers can be declared using several different notations: decimal, octal and hexadecimal. Octal numbers can be easily confused with decimal numbers and can lead to some… interesting consequences!
PHP supports two different notations for expressing floats: decimal and exponential. For example, 1e2 equals 100. The range and precision of numeric values depend on the platform; integers, for instance, are wider on 64-bit systems than on 32-bit ones. Be aware that PHP does not warn you about overflows, so any operation with big scary numbers can have catastrophic consequences for the reliability of your application. Also be aware that basic operations with floats are not always precise. For example, echo (int) ((0.1 + 0.7) * 10); outputs 7 instead of 8, because internally the float value is approximately 7.99999 and the cast to an integer truncates it to 7. To avoid this, use an arbitrary-precision extension such as BCMath.
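A quick sketch of the three integer notations, showing how a leading zero silently changes the value. The variable names are just for illustration.

```php
<?php
$decimal = 10;   // base 10
$octal   = 010;  // a leading zero means base 8, so this is 8
$hex     = 0x1A; // base 16, so this is 26

var_dump($decimal === $octal);        // bool(false)
echo $decimal + $octal + $hex, "\n";  // 44
```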
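A small sketch of what happens at the integer limit. It assumes nothing beyond the built-in PHP_INT_MAX constant: the result quietly becomes a float, with no warning, and float precision rules apply from then on.

```php
<?php
$max  = PHP_INT_MAX; // largest integer on this platform
$next = $max + 1;    // no error or warning is raised here

var_dump(is_int($max));    // bool(true)
var_dump(is_float($next)); // bool(true): the value was converted to a float
```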
Strings are ordered collections of binary data. They can store anything from text to music recordings.
When converted from an integer, a Boolean becomes false if the integer is zero and true otherwise. A string is converted to false only if it is empty or if it contains the single character 0. If it contains any other data, even multiple zeros, it is converted to true. When converted to a number or a string, a Boolean becomes 1 if it is true; false becomes 0 as a number and an empty string as a string.
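The conversion rules for Booleans can be checked with a few casts. This is only a sketch; the values are picked to cover the edge cases.

```php
<?php
var_dump((bool) 0);       // bool(false)
var_dump((bool) -5);      // bool(true)
var_dump((bool) '');      // bool(false)
var_dump((bool) '0');     // bool(false)
var_dump((bool) '00');    // bool(true): more than the single character 0
var_dump((int) true);     // int(1)
var_dump((int) false);    // int(0)
var_dump((string) true);  // string(1) "1"
var_dump((string) false); // string(0) ""
```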
Arrays are containers of ordered data elements; an array can be used to store and retrieve any other data type, including numbers, Boolean values, strings, objects and even other arrays.
Objects are containers of both data and code. They form the basis of object-oriented programming, also known as OOP.
NULL indicates that a variable has no value. A variable is considered to be NULL if it has been assigned the special value NULL, or if it has not yet been assigned a value at all.
The resource data type is used to indicate external resources that are not used natively by PHP, but that have meaning in the context of a special operation, such as handling files or manipulating images.
You can force PHP to convert some types to others with a cast, for example: echo (int) $x; You cannot convert any data type to a resource, but the opposite conversion is possible and lets you get hold of a resource ID.
Variable names can only contain letters, numbers and underscores, and must start with a letter or an underscore. Variables and constants are the only two identifier types that are case sensitive.
PHP supports variable variables:
$name = '123';
/* 123 is your variable name, this would normally be invalid. */
$$name = '456';
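To read such a variable back you can go through $$name again, or use brace syntax with the literal name. A short sketch:

```php
<?php
$name = '123';
$$name = '456';        // creates a variable literally named "123"

echo $$name, "\n";     // 456
echo ${'123'}, "\n";   // 456: braces are needed because $123 is a parse error
```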
Variables can also hold function names and functions can be called through variables like this:
$f = 'myFunc';
$f(); // will call myFunc();
To determine whether a variable exists use isset(). It will return true when a variable is defined and is not NULL.
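A short sketch of the three cases isset() has to distinguish. The variable names are illustrative only.

```php
<?php
$a = 'value';
$b = null;

var_dump(isset($a)); // bool(true)
var_dump(isset($b)); // bool(false): defined, but NULL
var_dump(isset($c)); // bool(false): never assigned
```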
Constants can only contain scalar values and follow the same naming conventions as variables. They are also case sensitive.
What do you think the output of this code snippet would be:
$array = array (0.1 => 'a', 0.2 => 'b'); echo count ($array);
It’s 1! In PHP array keys can only be made from integers and strings. Strangely enough it accepts floats too and truncates them to integers, so both 0.1 and 0.2 become the key 0 and the second element overwrites the first.
While reviewing old code you might find stuff like:
Note that the $array is passed by reference. Yet again, it turns out this kind of technique is deprecated, and one should instead redeclare the function to accept the $array by reference, as is common in PHP itself. Sadly I didn’t get a helpful warning that this type of technique is deprecated.
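A minimal sketch of the recommended form, using a hypothetical appendItem() function that is not from the original code: the reference is declared once in the function signature, and the call site stays plain.

```php
<?php
// Hypothetical helper: the & in the declaration makes $array a reference.
function appendItem(array &$array, $item)
{
    $array[] = $item;
}

$list = array('a');
appendItem($list, 'b'); // no & needed at the call site
print_r($list);         // the caller's array now holds 'a' and 'b'
```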
Here’s an interesting question from the Zend exam practice test:
“Absent any actual need for choosing one method over the other, does passing arrays by value to a read-only function reduce performance compared to passing them by reference?”
I reasoned that PHP would have to make a copy of the variable being passed, so passing by value would be slower than passing by reference. I was wrong. It turns out PHP uses a lazy-copy mechanism (also called copy-on-write) that does not actually create a copy of a variable until it is modified. And since PHP must create a set of structures that it uses to maintain the reference, it is actually “slower” to pass a variable to a read-only function by reference.
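One rough way to observe copy-on-write is to compare memory usage inside a function that only reads a by-value array with one that writes to it. This is a sketch with hypothetical helper names, and the exact byte counts will vary between PHP versions.

```php
<?php
// Hypothetical helpers: both receive the array by value.
function memoryAfterRead(array $data)
{
    $count = count($data);   // reading does not trigger a copy
    return memory_get_usage();
}

function memoryAfterWrite(array $data)
{
    $data[] = 0;             // the first write forces PHP to copy the array
    return memory_get_usage();
}

$big = range(1, 100000);

$afterRead  = memoryAfterRead($big);
$afterWrite = memoryAfterWrite($big);

echo 'Extra bytes used by the write: ', $afterWrite - $afterRead, "\n";
echo 'Caller still has ', count($big), " elements\n"; // 100000
```

The read-only call should report roughly the same memory as before the call, while the writing call shows the cost of the real copy.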
Having programmed for quite a few years I could have thought of that myself. Strangely this was not in the study guide.
echo (int) ( (0.1+0.7) * 10 );
What do you think the output would be? Surprisingly it’s 7. Internally in PHP the result is stored as approximately 7.999999. Thankfully this problem is described in the PHP manual's notes on float precision.
And sadly, how many developers starting out with PHP applications read about integers and floats? Probably not many make it a priority. When I started with PHP I thought to myself: hey, it’s there, it’s simple and it works, what could be wrong with it? Then after a while you wake up developing so-called “enterprise applications”, and you might start banging your head against the wall when you find out your reports are not as accurate as you thought.
Turns out this is a usual thing. A quote from PHP.net manual:
"It is quite usual that simple decimal fractions like 0.1 or 0.7 cannot be converted into their internal binary counterparts without a little loss of precision."
The PHP.net manual suggests that developers use the bcmath extension if higher precision is needed. Not only should you avoid casting floats directly to integers, you also have to be careful when comparing floats.
You might never run into this problem. But when you do, it may come as an unpleasant surprise while you try to find where that single penny got lost.
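A sketch of the usual workarounds: round before casting, compare against a small tolerance, and use bcmath where it is installed. The $epsilon value is an assumption and should be chosen to fit your data.

```php
<?php
$value = (0.1 + 0.7) * 10;

echo (int) $value, "\n";        // 7: the cast truncates 7.999...
echo (int) round($value), "\n"; // 8: round first, then cast

// Compare floats against a small tolerance instead of using ==.
$epsilon = 0.00001; // assumed tolerance
var_dump(0.1 + 0.2 == 0.3);                  // bool(false)
var_dump(abs((0.1 + 0.2) - 0.3) < $epsilon); // bool(true)

// bcmath works on decimal strings; guarded because the extension is optional.
if (function_exists('bcadd')) {
    echo bcmul(bcadd('0.1', '0.7', 1), '10', 0), "\n"; // 8
}
```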
Did you know that the most widespread encryption algorithm in electronic commerce is RSA? RSA uses a public key and a private key so that two computers that have never communicated before can start a secure conversation.
For example if Tom wanted to send a secure message to Suzi he would encode his message with Suzi’s public key and send it to her. The message can only be decoded using Suzi’s secret private key that only she knows.
The interesting thing is that the private key is derived from two large, distinct, random prime numbers, and the public key contains the product of those two numbers. So RSA is only as safe as it is hard to recover those two primes from the public key in order to get the private key. More interestingly, that turns out to be almost impossible: no efficient algorithm is known for finding those two primes from the public key on an ordinary computer. It would take years to factor a single number of that size. You can actually earn some money trying to decrypt RSA messages.
RSA keys are typically 1024 or 2048 bits long. A shorter 256-bit key can easily be broken using a home computer. A 512-bit key can be broken with a computer cluster. Even 1024 bits is not considered really safe anymore. What's more, Shor’s algorithm shows that an RSA key could be broken in polynomial time, roughly in the same league as everyday algorithms such as quick sort. That would require a quantum computer, though. It’s a theoretical device for now, but who knows what the future holds?
There are other known attacks, but they are far too complex for a simple presentation like this. So..
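To make the idea concrete, here is a toy sketch with the tiny textbook primes 61 and 53. Real keys use primes hundreds of digits long plus padding, and the helper modPow() is a hypothetical name introduced only for this illustration.

```php
<?php
// Hypothetical helper: modular exponentiation by repeated squaring.
function modPow($base, $exponent, $modulus)
{
    $result = 1;
    $base %= $modulus;
    while ($exponent > 0) {
        if ($exponent & 1) {
            $result = ($result * $base) % $modulus;
        }
        $base = ($base * $base) % $modulus;
        $exponent >>= 1;
    }
    return $result;
}

$p = 61;                   // secret prime
$q = 53;                   // secret prime
$n = $p * $q;              // 3233, published as part of the public key
$e = 17;                   // public exponent
$d = 2753;                 // private exponent: (17 * 2753) % 3120 == 1

$message   = 65;
$cipher    = modPow($message, $e, $n); // encrypt with the public key
$decrypted = modPow($cipher, $d, $n);  // decrypt with the private key

echo $cipher, "\n";    // 2790
echo $decrypted, "\n"; // 65
```

Anyone who can factor 3233 back into 61 and 53 can recompute $d, which is why the size of the primes matters so much.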