The Developer Day | Staying Curious

Archive for March 2009



PHP XML formatter tool rewrite

A while ago I blogged about an XML Beautifier Tool which tokenizes an XML string and outputs it in a human-readable format. Strangely enough, I noticed that I get quite a few page views from people searching for such a tool. Though the tool may be useful to a lot of people, I think it is flawed and has serious issues with formatting, speed and memory consumption.

This inspired me to write a new version of the XML formatter. It’s based on a SAX parser, which is kind of “ugly” to implement and build around, but because of its event-based nature it’s super fast and has a very low memory footprint. The new version of the formatter shouldn’t peak higher than 200-300 KB of memory even when the XML files start to weigh over a megabyte. It also should not have any problems with indentation, because it no longer tries to tokenize the XML itself and uses libxml to do the job. I also tried to make the tool documented and extensible.

Its usage is really simple. All you have to do is create an XML_Formatter object by passing it an XML input stream, an XML output stream and an array of options, and then call the format method. You might wonder why it takes input and output streams instead of a file name or a string. It does that to avoid high memory consumption. Here’s an example of how one might use XML_Formatter:

$input = fopen("input.xml", "r");
$output = fopen("output.xml", "w+");
try {
    $formatter = new XML_Formatter($input, $output);
    // Read from $input and write the formatted XML to $output.
    $formatter->format();
    echo "Success!";
} catch (Exception $e) {
    echo $e->getMessage(), "\n";
}

Although this tool is quite powerful in what it can do (I was able to format other websites’ XHTML or tidied HTML sources), it also has some problems which are not actually related to the formatter but may seem odd to the user. The PHP XML parser does not understand entities such as &nbsp; or unescaped ampersands like in ?x=1&y=1. So it’s the user’s responsibility to provide “correct” XML to the formatter.

Other than that I hope it will prove useful to someone. Download the latest version of the XML_Formatter.


For quite a while I’ve had a desire to offer my help to an open source project I like. One of my favorite candidates is Drizzle. I should say my knowledge of C is really poor, and there’s a whole crazy world out there full of C applications and build tools.

Nevertheless I decided to at least try and see if I would be able to build it and maybe change something and run some tests. As I am a Windows user, I found out that the only way for me to build Drizzle is through Cygwin. I started by installing the latest stable version of Cygwin, 1.5.25-15. I must say that their installer is really nice, but I would suggest adding a package search feature. It would help when you want to install numerous packages.

So what’s next? I found this wiki page about building Drizzle and figured the first thing I should do is get Bazaar. I installed the following packages using the Cygwin installer:

  • bison
  • bzr
  • gettext
  • readline
  • libpcre0
  • pcre
  • pcre-devel
  • libtoolize
  • gperf
  • e2fsprogs

And then went on to get the Drizzle sources:

mkdir ~/bzrwork
bzr init-repo ~/bzrwork
cd ~/bzrwork
bzr branch lp:drizzle

Now onto building. Here’s where all the fun begins.

Drizzle requires a library named libevent, which is not available through the Cygwin installer, so you must build it yourself. And still you can’t build libevent with the latest version of Cygwin, because it lacks certain functionality. After some googling I found a patched IPv6 version of Cygwin that fixes these issues. I added #define EAI_SYSTEM 11 to http.c and was finally able to ./configure && make && make install libevent.

You also need protobuf installed, and there’s no package for that either. Protobuf is actually quite nice stuff. Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.

Now that we seem to have all the packages installed, we can start building Drizzle. It should be as easy as this:

cd drizzle
make install

It is not. First, to be able to compile Drizzle you need to have gcc4. And even if you do, ./configure needs to know where it is, so we need to set the additional variables CC and CXX. Then you need to show ./configure where libevent is installed by adding the flag --with-libevent-prefix=/usr/local, or whatever other place you have it in. I also found a really ugly problem with warnings. I wasn’t able to compile Drizzle because it stopped somewhere in gnulib, complaining about some warnings that were treated as errors. Funnily enough, there is a sarcastic option to disable these warnings: --disable-pedantic-warnings. You also probably want to install Drizzle somewhere other than the usual location by using --prefix=/some/deploy/dir.

In the end you come up with something like this:

./configure CC=gcc-4 CXX=g++-4 --with-libevent-prefix=/usr/local --disable-pedantic-warnings --prefix=/some/deploy/dir

That’s how far I’ve got with it, though I’m still not able to compile it. I get an error somewhere in the mystrings library that is related to some datatype casting issues. Hopefully I’ll be able to hack through this ;)




Sample PHP MVC application

Every web developer has probably heard something about MVC at some point, unless he or she was living in a cave. I definitely have heard and read a lot about it. I probably won’t be lying too much if I say that most people know MVC as the de facto design pattern for web applications nowadays. At least for PHP it is.

If you have ever had an interest in design patterns and did some research on them, you may know that a design pattern may be interpreted and implemented differently every time someone tries to. MVC is no exception to this rule. In my own career I have seen many projects that claim to implement the MVC design pattern. And if a project actually doesn’t, it may be called a hybrid of MVC. As ridiculous as it may be, I think that because of the MVC hype and everyone wanting to be able to claim “yes, we use MVC”, it is one of the most misunderstood patterns of them all. And because of this there are a LOT, and I mean a LOT, of articles and blogs and forums trying to explain MVC the way it should be.

I myself have read a lot of versions of these blogs and articles. And to be honest, I couldn’t tell you, for example, what a controller should and should not do. Well, of course I know it shouldn’t contain any business logic. If you tried to research that, you would probably find people saying that the controller should initiate the model, do something with the model, pass the result to the view and render it. You can even find some examples.

But to some extent I find it all synthetic and not very realistic. Most examples are at the level of a Hello World program. I think the devil is in the details. If you tried to find any sample PHP MVC applications, you probably wouldn’t find much. There are a few very simplistic sample MVC projects, but I don’t find any of them to be an eye opener that goes deep into the details.

I think the PHP community needs such an example. I believe Zend Framework is a great start for MVC. But it isn’t enough. It still doesn’t show you what a real-life model or controller would look like, or what each part of MVC would and would not do. I believe that one good example is better than a thousand words. I feel truly interested to try and find the “Equilibrium” of the famous MVC design pattern. Don’t you?




PHP session cluster with memcache

While reading, I found this great post about how to solve PHP session clustering in an easy way. It’s no silver bullet, but it’s definitely worth knowing about. It uses a few instances of memcache that each store a copy of the same sessions. If one instance goes down, the others keep serving the sessions. It’s nice to have that kind of failover. Of course, after recovery you have to sync the instances. And your performance is going to suffer more and more as you add more memcache instances.

Still, I feel this is a great poor man’s solution ;) Might end up using it myself.


Most developer teams work with version control tools like SVN or Git. Most of those teams also use project management tools like Basecamp or, in our case, FogBugz.

We came up with an idea to require developers to write down task numbers in SVN commit messages. This helps to relate code changes to actual tasks if such a need arises. To do that, we’ve created a pre-commit hook that requires developers to include the task number from our internal project management tool in the SVN commit comment field.

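A pre-commit hook can be a simple bash script. In our case it was this script:

#!/bin/bash
# Subversion passes the repository path and the transaction name as arguments.
REPOS="$1"
TXN="$2"
# Path to svnlook; adjust it to match your installation.
SVNLOOK=/usr/bin/svnlook

# Make sure that the log message contains a FogBugz case number.
$SVNLOOK info -t "$TXN" "$REPOS" | grep -P 'FB:\s*\d{3,6}' > /dev/null
if [ $? -ne 0 ]; then
    echo -e "FogBugz case number is missing in the comment.\n" 1>&2
    echo -e "Please add text 'FB: CASE_NUMBER' to your comment." 1>&2
    exit 1
fi
exit 0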
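
Before installing a hook like this, it can help to try the pattern against a few sample messages. The following is only a rough sketch under a couple of assumptions: the helper name has_case_number and the sample messages are made up for illustration, and it uses grep -E with POSIX character classes instead of grep -P so that it also runs where grep was built without PCRE support.

```shell
#!/bin/bash
# Hypothetical helper (not part of the hook above): succeeds when the
# message contains "FB:" followed by optional whitespace and 3 to 6 digits.
has_case_number() {
    printf '%s\n' "$1" | grep -E 'FB:[[:space:]]*[0-9]{3,6}' > /dev/null
}

# Made-up sample commit messages to exercise the pattern.
for msg in "FB: 1234 fix login redirect" "FB:567 typo" \
           "fix login redirect" "FB: 12 too short"; do
    if has_case_number "$msg"; then
        echo "accepted: $msg"
    else
        echo "rejected: $msg"
    fi
done
```

Note that the pattern is not anchored, so a number longer than six digits would still be accepted, because its first digits already satisfy the expression.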




Making your life easier with FirePHP

I have been a user of Firebug for quite a long time and it has saved me a lot of hours of pain debugging AJAX applications. When I see developers trying to figure out why their AJAX applications are not working without using Firebug, it sometimes looks like an inexperienced driver trying to park a car without any luck whatsoever.

Then recently I found FirePHP while reading php|architect, and it seems to me such a nice tool that it’s well worth blogging about. If you want to debug AJAX applications with Firebug alone, you must do your own variable dumps, which break the output. FirePHP, on the other hand, is capable of sending all this data through HTTP headers without breaking your JSON, XML or image responses. It also provides a very nice library for logging various stuff.

One of the things I really liked is the ability to easily view backtraces through FirePHP, though it seems to me it can only show backtraces of calls to your own locally defined PHP functions. It would be nice to see FirePHP integrated with Xdebug. Then it could do a full stack trace with all the parameters involved and even memory usage.

I also love the fact that there is a Zend_Log_Writer for Firebug. It can also send information to the Firebug console, but it’s not as “sweet & cute” as FirePHP.

I believe that Firebug, FirePHP and Xdebug are must-have tools for every PHP web developer. They save more than a reasonable amount of time spent debugging.
