The Developer Day | Staying Curious

CAT | SEO

Sep/08

12

If you found this blog post while searching for a PHP Google PageRank class that works on non-Windows machines, then it is your lucky day! You found it! Congratulations!

To my knowledge this is the first available PHP Google PageRank retrieval implementation that works on any platform. We spent hours searching for such a thing online but couldn’t find one. There are some PHP Google PageRank tools online, and they all work, but only on Windows machines.

Why is that, you might ask? Well, the Google PageRank retrieval algorithm was never made public. But Google made a browser plugin that was able to show the PageRank of any website you visit. So some freaky geeks disassembled that plugin and got their hands on its PageRank retrieval implementation. That implementation was then ported to various languages such as JavaScript and PHP. The PageRank lookup is sort of protected by a “unique” hash calculated from the given URL. And here the MAGIC begins.

To calculate this hash the algorithm overflows 32-bit integers in XOR operations. Aaaand.. 32-bit XOR overflows work quite differently on Windows and Linux in PHP! If you overflow a 32-bit integer on Windows, PHP just truncates the result to its lowest 32 bits and returns a new integer. SMART! On our Linux machines the overflowing XOR just returned the maximum integer value. What did we do? Oh.. We created a simple class that simulates the Windows 32-bit XOR overflow using the PHP GMP extension. Tadam! We have also cleaned up the code, documented it and made it look shiny ;)
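To make the overflow trick a bit more concrete, here is a minimal sketch of a 32-bit wrapping XOR. The function name xor32() is hypothetical and the code is not taken from the GooglePageRank class. It uses GMP when the extension is loaded and otherwise falls back to plain masking, which assumes a 64-bit PHP build.

```php
<?php
// Hypothetical helper, not part of the GooglePageRank class.
// XORs two integers and wraps the result into the signed 32-bit range,
// the way two's-complement overflow would.
function xor32($a, $b)
{
    if (extension_loaded('gmp')) {
        $mask = gmp_init('0xFFFFFFFF');
        // Keep only the low 32 bits of each operand, then XOR them.
        $x = gmp_xor(gmp_and($a, $mask), gmp_and($b, $mask));
        // Anything above 2^31 - 1 stands for a negative 32-bit number.
        if (gmp_cmp($x, gmp_init('0x7FFFFFFF')) > 0) {
            $x = gmp_sub($x, gmp_init('0x100000000'));
        }
        return gmp_intval($x);
    }

    // Fallback: assumes a 64-bit PHP build, where 0xFFFFFFFF is still an int.
    $x = ((int) $a ^ (int) $b) & 0xFFFFFFFF;
    return $x > 0x7FFFFFFF ? $x - 0x100000000 : $x;
}
```

Operands that do not fit into a 32-bit integer can be passed as decimal strings, for example xor32('4294967296', 5).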

You can download it and use it however you like. I hope it helps you. If it did, just leave a comment and say thanks, because we are such nice guys to help you out ;)

To use the class try:

require_once("GooglePageRank.php");
echo GooglePageRank::get("http://www.yahoo.com");

Happy Programmers Day!

P.S. Google™ search engine and PageRank™ algorithm are trademarks of Google Inc.

Update: the PageRank class relies on the GMP extension, which is not always enabled by default. On Ubuntu Linux it comes as a separate package, php5-gmp.
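If you are not sure whether GMP is available on your server, a small check like the following can tell you before you load the class. It is only a sketch, and missingExtensions() is a hypothetical helper name, not something the class provides.

```php
<?php
// Hypothetical helper: returns the names of required extensions
// that are not loaded in the running PHP.
function missingExtensions(array $required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// Warn early instead of failing somewhere inside the hash calculation.
$missing = missingExtensions(array('gmp'));
if ($missing) {
    echo 'Please enable: ' . implode(', ', $missing) . PHP_EOL;
}
```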


Jul/07

9

Free multiple XHTML, HTML validator tool

If you’re working with SEO you want to make sure your website is valid XHTML. Search engine crawlers understand a website better if it is valid XHTML. To check that, one would use the validator at http://validator.w3.org.

A problem arises when you need to check fifty or a hundred pages. For that reason I wrote yet another extremely simple tool to validate lists of URLs. It uses the same w3.org validator: it opens the validator result for each URL and searches it for the phrase “page is valid”. You can check out my XHTML validator online or download the source code of the HTML validator.
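The idea can be sketched in a few lines. This is not the source of the tool itself. The function names are hypothetical, the "check?uri=" address form is an assumption about the validator, and the phrase it looks for is the one mentioned above, which may not match what the validator prints today.

```php
<?php
// Illustrative sketch only; the function names are hypothetical and are
// not taken from the original validator tool.

// Build the W3C validator address for one page (assumed "check?uri=" form).
function validatorUrl($pageUrl)
{
    return 'http://validator.w3.org/check?uri=' . urlencode($pageUrl);
}

// A result page counts as valid if it contains the phrase from the post.
function looksValid($resultHtml)
{
    return is_string($resultHtml)
        && stripos($resultHtml, 'page is valid') !== false;
}

// Check a list of URLs; $fetch is any callable that returns the body
// of a URL as a string (or false on failure).
function validateList(array $urls, callable $fetch)
{
    $report = array();
    foreach ($urls as $url) {
        $report[$url] = looksValid($fetch(validatorUrl($url)));
    }
    return $report;
}
```

Passing 'file_get_contents' as $fetch would query the real validator, while a stub function keeps the sketch testable offline.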

Basically all you need is a list of URLs and the validator will tell you which pages are valid and which ones are not. I hope you will find it useful. Let me know what you think by leaving a comment.


Jun/07

22

Free Online SEO Sitemap Crawler Tool

The company that I work for has a lot to do with SEO, SEM and PPC. In SEO there are such things as sitemaps and URL lists, which help search engines index your site more easily and thoroughly.

We have a lot of sites with some SEO development going on. Generating these URL lists is not a task for a human being, and I didn’t want to develop a custom sitemap generator for each of our projects.

I searched for a tool that could generate these URL lists given only a single root URL. That was a while ago, so I cannot say that such tools don’t exist. I also like my tools to be simple and easy to extend, so I developed a tool of my own. It’s a simple crawler that only needs to know the start URL and the base URL, and it can find all your URLs in no time. The tool remembers the pages it has already visited. So, for example, let’s say I wanted to know all my blog URLs. I just feed the crawler two URLs and wait for the results. The tool also ignores any outside URLs and adds the base URL to relative links.
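Roughly, the crawl loop described above could look like the sketch below. It is not the source of the tool. The crawl() function, its $fetch and $limit parameters and the link regex are all hypothetical, and the link handling is simplified: a real crawler would use a proper HTML parser and a stricter same-site check than a prefix match.

```php
<?php
// Hypothetical sketch; crawl(), $fetch, $limit and the link regex are
// illustrative and not taken from the original tool.
function crawl($startUrl, $baseUrl, callable $fetch, $limit = 500)
{
    $baseUrl = rtrim($baseUrl, '/');
    $queue   = array($startUrl);
    $visited = array();   // URL => true, so each page is fetched only once

    while ($queue && count($visited) < $limit) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue;
        }
        $visited[$url] = true;

        $html = $fetch($url);
        if (!is_string($html)) {
            continue;     // fetch failed, nothing to parse
        }

        preg_match_all('/href\s*=\s*["\']([^"\'#]+)/i', $html, $m);
        foreach ($m[1] as $link) {
            if (!preg_match('#^[a-z][a-z0-9+.-]*:#i', $link)) {
                // Relative link: add the base URL in front of it.
                $link = $baseUrl . '/' . ltrim($link, '/');
            }
            // Ignore anything that does not start with the base URL.
            if (strpos($link, $baseUrl) === 0 && !isset($visited[$link])) {
                $queue[] = $link;
            }
        }
    }

    return array_keys($visited);
}
```

With 'file_get_contents' as $fetch the function would crawl a live site; the $limit argument is there only so the sketch cannot run forever.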

You can download the crawler or try it out for yourself.

Please read the comments below if you are interested in why the second input field is needed.

