The Developer Day | Staying Curious

TAG | xml

Mar/09

31

PHP XML formatter tool rewrite

A while ago I blogged about an XML Beautifier Tool that tokenizes an XML string and outputs it in a human-readable format. Strangely enough, I noticed that I get quite a few pageviews from people searching for such a tool. Though this tool may be useful to a lot of people, I think it is flawed and has serious issues with formatting, speed and memory consumption.

This inspired me to write a new version of the XML formatter. It’s based on a SAX parser, which is kind of “ugly” to implement and build around, but because of its event-based nature it’s super fast and has a very low memory footprint. The new version of the formatter shouldn’t peak higher than 200-300 KB of memory even when the XML files start to weigh over a megabyte. It also should not have any problems with indentation because it no longer tries to tokenize the XML itself and uses libxml to do the job. I also tried to make the tool documented and extendable.
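To illustrate the event-based idea, here is a minimal sketch built on PHP’s XML Parser extension. It is not the XML_Formatter source: indentXmlStream() is a hypothetical name, attributes are dropped for brevity, and text nodes are simply trimmed. The point is only that the document is fed to the parser in small chunks, so memory use does not grow with file size.

```php
<?php
// Hypothetical sketch, not the actual XML_Formatter implementation.
// Reads XML from $input in chunks and writes an indented copy to $output.
function indentXmlStream($input, $output, string $indent = '  '): void
{
    $depth = 0;
    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Called for every opening tag (attributes are ignored in this sketch).
        function ($p, $name, $attrs) use (&$depth, $output, $indent) {
            fwrite($output, str_repeat($indent, $depth) . "<$name>\n");
            $depth++;
        },
        // Called for every closing tag.
        function ($p, $name) use (&$depth, $output, $indent) {
            $depth--;
            fwrite($output, str_repeat($indent, $depth) . "</$name>\n");
        }
    );

    // Called for text; the parser hands over decoded text, so escape it again.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$depth, $output, $indent) {
            $text = trim($data);
            if ($text !== '') {
                $escaped = htmlspecialchars($text, ENT_XML1 | ENT_NOQUOTES);
                fwrite($output, str_repeat($indent, $depth) . $escaped . "\n");
            }
        }
    );

    // Feed the parser 8 KB at a time; the last call is flagged as final.
    while (!feof($input)) {
        $chunk = (string) fread($input, 8192);
        if (!xml_parse($parser, $chunk, feof($input))) {
            $error = xml_error_string(xml_get_error_code($parser));
            throw new RuntimeException("XML error: $error");
        }
    }
}
```

The three callbacks are the “ugly” part: state such as the current depth has to live outside the handlers, because the parser only reports one event at a time.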

Its usage is really simple. All you have to do is initialize an XML_Formatter object by passing an XML input stream, an XML output stream and an array of options, and then call the format method. You might wonder why it requests input and output streams instead of a file name or a string. It does that to avoid high memory consumption. Here’s an example of how one might use XML_Formatter:

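require 'XMLFormatter.php';

// Open the source and destination as streams so the document is never loaded as a whole.
$input = fopen("input.xml", "r");
$output = fopen("output.xml", "w+");
if ($input === false || $output === false) {
    exit("Could not open input.xml or output.xml\n");
}

try {
    // The options array is omitted here, so the formatter's defaults apply.
    $formatter = new XML_Formatter($input, $output);
    $formatter->format();
    echo "Success!";
} catch (Exception $e) {
    echo $e->getMessage(), "\n";
}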
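If the XML is already in a string, it can still be handed over as a stream. The sketch below assumes nothing about XML_Formatter beyond the constructor shown above; stringToStream() is a hypothetical helper, and the commented-out lines only indicate how the streams might be wired together.

```php
<?php
// Hypothetical helper: expose an in-memory string as a readable stream.
function stringToStream(string $xml)
{
    // php://temp keeps small payloads in memory and spills to a temp file when they grow.
    $stream = fopen('php://temp', 'w+');
    fwrite($stream, $xml);
    rewind($stream);
    return $stream;
}

$input  = stringToStream('<root><item>1</item></root>');
$output = fopen('php://temp', 'w+');

// With XMLFormatter.php loaded, the streams would be used as in the example above:
// $formatter = new XML_Formatter($input, $output);
// $formatter->format();
// rewind($output);
// $formatted = stream_get_contents($output);
```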

Although this tool is quite powerful (I was able to format other websites’ XHTML or tidied HTML sources), it also has some problems that are not actually caused by the formatter but may seem odd to the user. The PHP XML parser does not understand HTML entities such as &nbsp; or unescaped ampersands like the one in ?x=1&y=1. So it’s the user’s responsibility to provide “correct” XML to the formatter.
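One possible way to pre-clean such input before it reaches the parser is sketched below. escapeForXmlParser() is a hypothetical helper, not part of XML_Formatter. It only handles the two cases mentioned above: other named HTML entities would still be rejected, and content inside CDATA sections would be escaped as well.

```php
<?php
// Hypothetical pre-processing step; not part of XML_Formatter.
function escapeForXmlParser(string $markup): string
{
    // &nbsp; is an HTML entity that XML does not predefine; use the numeric form instead.
    $markup = str_replace('&nbsp;', '&#160;', $markup);

    // Escape every "&" that does not already start an entity or character reference.
    return preg_replace(
        '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/',
        '&amp;',
        $markup
    );
}
```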

Other than that I hope it will prove useful to someone. Download the latest version of the XML_Formatter.

Oct/08

15

PHP SoapClient absolute certificate path bug

I have found a bug in PHP 5.2.6 related to SoapClient. If you pass a relative path as the local_cert option to SoapClient on a Windows machine, the client does not work and refuses to connect to the service. This is actually the first bug I have reported, and it got fixed. (I was worried I might be one of those annoying pests who report bogus stuff.) I’m happy I did a tiny itsy bitsy amount of good for PHP.
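On an affected version, one workaround would be to resolve the certificate path to an absolute one before creating the client. This is only a sketch under that assumption: soapCertOptions() is a hypothetical helper, and the WSDL URL and certificate file name in the usage comment are placeholders.

```php
<?php
// Hypothetical helper: build SoapClient options with an absolute certificate path.
function soapCertOptions(string $certPath): array
{
    // realpath() resolves a relative path against the current working directory.
    $absolute = realpath($certPath);
    if ($absolute === false) {
        throw new InvalidArgumentException("Certificate not found: $certPath");
    }
    return ['local_cert' => $absolute];
}

// Usage sketch (placeholder URL and file name):
// $client = new SoapClient('https://example.com/service?wsdl', soapCertOptions('certs/client.pem'));
```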

May/08

20

PHP SoapClient proxy port problem

We couldn’t get SoapClient working with a proxy on our local machines. I was very lucky to find the following comment on php.net:

I kept having a problem using an HTTP proxy with SOAP. The proxy_port parameter has to be an integer, ie. "proxy_port" => "80" won't work, you'll have to use "proxy_port" => 80.

Bugs like these are incredibly hard to catch. It’s probably a bug that should be reported. It would be nice to have a note in the documentation to save other developers from running into the same problem.
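A defensive way to avoid the problem is to cast the port before it goes into the options array, which helps when the value comes from a config file or a form and arrives as a string. soapProxyOptions() is a hypothetical helper; proxy_host and proxy_port are the SoapClient option names from the comment above.

```php
<?php
// Hypothetical helper: normalise proxy settings before passing them to SoapClient.
function soapProxyOptions(string $host, $port): array
{
    return [
        'proxy_host' => $host,
        // Cast so that a string such as "80" becomes the integer 80.
        'proxy_port' => (int) $port,
    ];
}

// Usage sketch (placeholder URL and proxy host):
// $client = new SoapClient('https://example.com/service?wsdl', soapProxyOptions('proxy.local', '8080'));
```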

Jun/07

20

XML beautifier tool

This post describes a tool that I wrote a long time ago. I have since published a new, refactored version of the XML beautifier that solves a few problems of the original tool.

I’m currently working with a system that communicates with various third parties using web services. There are times when I need to view some of the XML documents or output them for some particular reason. Because of the latter, I wanted the beautifier to be written in PHP so that I could integrate it easily. After spending some time googling with keywords like “php xml indent”, “php xml output”, “php xml human readable” and “php xml beautifier”, I found only code snippets and incomplete or not fully working classes and functions.

I’ve also tried pear.php.net, and there is indeed a package named “XML Beautifier”, but it is no longer maintained. I was also frightened away by lots of dependencies: PEAR, XML Parser, XML Utilities, XML Beautifier .. Brrrr ..

Finally I visited phpclasses.org and found a class named “Beauty XML”. It’s simple and it works, but it has a few bugs. So I modified the class to do the following things:

  • To not add a new indentation level after it finds a self-closing tag like <tag />
  • To remove all XML headers
  • To remove white spaces from situations like “</tag2> </tag1>”
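The three changes can be sketched roughly as follows. This is not the modified Beauty XML code: normaliseXmlString() and isSelfClosingTag() are hypothetical names, and the regular expressions assume reasonably simple input (for example, no ">" followed by whitespace and "<" inside CDATA sections or comments).

```php
<?php
// Hypothetical sketch of the clean-up steps listed above; not the Beauty XML source.
function normaliseXmlString(string $xml): string
{
    // Drop XML headers (declarations that start with "<?xml").
    $xml = preg_replace('/<\?xml\b.*?\?>/s', '', $xml);

    // Remove whitespace that sits only between two tags, e.g. "</tag2> </tag1>".
    $xml = preg_replace('/>\s+</', '><', $xml);

    return trim($xml);
}

// A self-closing tag such as <tag /> must not open a new indentation level.
function isSelfClosingTag(string $tag): bool
{
    return (bool) preg_match('/\/\s*>$/', $tag);
}
```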

You can download Beauty XML or try it out yourself.

I think there may still be some XML documents that don’t get indented nicely, and if I don’t find that out myself, maybe you will. So please tell me if you do. I’ve also added a simple interface file to view XML in your web browser.

Maybe you already have or know of a tool that does this job perfectly. If so, I would be happy if you shared it with me.
