A while ago I blogged about an XML Beautifier Tool that tokenizes an XML string and outputs it in a human-readable format. Strangely enough, I noticed that I get quite a few pageviews from people searching for such a tool. Though that tool may be useful to a lot of people, I think it is flawed and has serious issues with formatting, speed and memory consumption.
This inspired me to write a new version of the XML formatter. It’s based on a SAX parser, which is kind of “ugly” to implement and build around, but because of its event-based nature it’s super fast and has a very low memory footprint. The new version of the formatter shouldn’t peak higher than 200-300 KB of memory even when the XML files start to weigh over a megabyte. It also should not have any problems with indentation, because it no longer tries to tokenize the XML itself and uses libxml to do the job. I also tried to make the tool documented and extendable.
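To give a rough idea of what “event based” means here, below is a minimal sketch of the approach using PHP’s built-in XML parser functions. It is not the XML_Formatter source: the function name sketchIndentStream and the chunk size are made up for illustration, and the sketch only prints element names (no attributes, text or comments). The point is that the input is fed to the parser one small chunk at a time, so memory use stays roughly constant no matter how large the file is.

```php
<?php
// Hypothetical sketch, NOT the actual XML_Formatter code: indent element
// names with PHP's event-based XML parser while reading the input in chunks.
function sketchIndentStream($input, $output, int $chunkSize = 4096): void
{
    $depth = 0;
    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

    xml_set_element_handler(
        $parser,
        // Called for every opening tag: print it, then go one level deeper.
        function ($p, string $name, array $attrs) use (&$depth, $output) {
            fwrite($output, str_repeat('  ', $depth) . "<$name>\n");
            $depth++;
        },
        // Called for every closing tag: go one level up, then print it.
        function ($p, string $name) use (&$depth, $output) {
            $depth--;
            fwrite($output, str_repeat('  ', $depth) . "</$name>\n");
        }
    );

    // Only one chunk of the document is held in memory at any time.
    while (!feof($input)) {
        $chunk = (string) fread($input, $chunkSize);
        if (!xml_parse($parser, $chunk, feof($input))) {
            $message = xml_error_string(xml_get_error_code($parser));
            throw new RuntimeException("XML error: $message");
        }
    }
}
```

Because the parser only reports “element opened” and “element closed” events, a self-closed tag comes out as an explicit open and close pair in this sketch.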
Its usage is really simple. All you have to do is create an XML_Formatter object, passing it an XML input stream, an XML output stream and an array of options, and then call the format method. You might wonder why it asks for input and output streams instead of a file name or a string. It does that to avoid high memory consumption. Here’s an example of how one might use XML_Formatter:
    require('XMLFormatter.php');

    $input = fopen("input.xml", "r");
    $output = fopen("output.xml", "w+");

    try {
        $formatter = new XML_Formatter($input, $output);
        $formatter->format();
        echo "Success!";
    } catch (Exception $e) {
        echo $e->getMessage(), "\n";
    }
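If all you have is a string rather than a file, PHP’s php://temp stream can bridge the gap. The two helpers below (stringToStream and streamToString) are hypothetical names, not part of XML_Formatter, and the commented-out usage assumes the formatter leaves the output stream open after format() returns, which I have not verified for every case.

```php
<?php
// Hypothetical helper, not part of XML_Formatter: copy a string into a
// rewound temporary stream so it can be passed where a stream is expected.
function stringToStream(string $xml)
{
    // php://temp keeps small payloads in memory and spills to a temporary
    // file once they grow past a threshold (2 MB by default).
    $stream = fopen('php://temp', 'w+');
    fwrite($stream, $xml);
    rewind($stream);
    return $stream;
}

// Hypothetical helper: read everything written to a stream back as a string.
function streamToString($stream): string
{
    rewind($stream);
    return stream_get_contents($stream);
}

// Intended use with the formatter (commented out so this snippet runs
// without XMLFormatter.php; $xmlString is a placeholder variable):
// $input  = stringToStream($xmlString);
// $output = fopen('php://temp', 'w+');
// (new XML_Formatter($input, $output))->format();
// $formatted = streamToString($output);
```

Keep in mind that this trades away the memory benefit for convenience, since the whole document ends up in a string anyway. For small inputs that is usually fine.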
Although this tool is quite powerful in what it can do (I was able to format other websites’ XHTML or tidied HTML sources), it also has some problems which are not actually related to the formatter but may seem odd to the user. The PHP XML parser rejects entities it does not know about, as well as unescaped ampersands such as the one in ?x=1&y=1. So it’s the user’s responsibility to provide “correct” XML to the formatter.
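For the ampersand case, one possible workaround is to escape bare ampersands before handing the document to the formatter. The function below is a hypothetical helper, not something XML_Formatter provides. It is a sketch with clear limits: it works on a whole string (so it gives up the streaming benefit), it will also rewrite ampersands inside CDATA sections and comments, and it does nothing for named entities the parser does not know.

```php
<?php
// Hypothetical helper, not part of XML_Formatter: turn every "&" that does
// not start an entity or character reference into "&amp;".
function escapeBareAmpersands(string $text): string
{
    return preg_replace(
        // Leave "&name;", "&#123;" and "&#x1F;" alone; escape everything else.
        '/&(?!(?:[A-Za-z_][A-Za-z0-9_.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/',
        '&amp;',
        $text
    );
}

echo escapeBareAmpersands('<a href="?x=1&y=1">Q&amp;A</a>'), "\n";
// prints: <a href="?x=1&amp;y=1">Q&amp;A</a>
```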
Other than that I hope it will prove useful to someone. Download the latest version of the XML_Formatter.
16 Comments for PHP XML formatter tool rewrite
The Developer Day » Blog Archive » XML beautifier tool | September 8, 2009 at 2:34 AM
Aswin Anand | September 28, 2009 at 9:03 AM
Do you have a zip archive of this?
hakre | January 25, 2010 at 5:50 PM
Thanks for providing the source code. Can you say under which license you released it?
Ken S | February 27, 2010 at 1:10 PM
Thanks a TON! I was trying to figure out my XML doc for about 6hrs. After I used your MAGIC rewrite tool, I had it figured out in about 5 mins!!!
Nick Weavers | October 4, 2010 at 10:04 PM
Very useful, thanks for sharing it. Would be nice to have it work with strings too. I just want to format SOAP headers so passing in a string variable and getting another out would be very convenient.
Rob | October 14, 2010 at 9:46 PM
Shot - this is awesome
Eric M | November 19, 2010 at 1:27 PM
This is the only thing I’ve seen to do what I needed. Thanks a ton!!! So easy to use too, even for a novice.
Tommy K. | November 26, 2010 at 10:49 AM
Excuse me for my *maybe* noob request, but can you make this tool available with a textbox input like the older version (paste the code hit the button and it’s beautifully formatted), I’m working as a front-end developer and sometimes I need to make minor modifications in xsl files which are horribly formatted, and this would be a really big help.
Much appreciated your work!
Tommy K. | November 26, 2010 at 11:08 AM
Yeap, it does what I need beautifully, would you make this available for download?
Jon | October 16, 2011 at 12:47 PM
Great effort Žilvinas - thank you.
I am finding that it removes comments, which I would like to preserve. Also, it replaces self-closed tags with a manual close - I’d like to respect the choice of the original XML.
I’ll have a hack about to see if I can sort these things out, but do let me know if there’s any known fixes for them
Jon | October 16, 2011 at 5:24 PM
I’ve got a fix for the close tag issue. I suspect this works fine on your install, since you are probably using libxml2; since I am using Expat, the tag is not consumed until after the start_element_handler is called, and accordingly the _open value is wrongly set.
I will push my fix to Github in due course.
XMLFormatter2 | Crowdedplace | January 16, 2013 at 1:41 PM
[...] is a XML beautifying tool deriving from XML_Formatter originally developed by Žilvinas Šaltys. If you are in need of handling big files you should still resort to that solution, since it reads [...]
[...] post describes a tool that I wrote long time ago. By now I have published a new refactored version of the XML beautifier which solves a few problems of the original [...]