CAT | MySQL
30
PyDumpy – Partial sorted MySQL database dumps
0 Comments | Posted by Žilvinas Šaltys in MySQL, Tools
PyDumpy is a simple Python utility that might be helpful for developers struggling to get fast and partial database snapshots from production databases. It does it’s job by checking the database information schema to find out the approximate rows count available in each table and limits the table if needed to avoid dumping too much data as some databases may have hundreds of gigabytes of data. It then passes all the limits information it gathers to mysqldump a tool created by MySQL to do the actual dumping.
Python does not have a built in package to connect to MySQL as for example PHP does and therefore PyDumpy relies on MySQL for Python package to work. PyDumpy also relies on mysqladmin to do the actual dumping.
PyDumpy is very simple to use. For example to dump a maximum of 50 000 rows from each table type:
./pydumpy.py -H host -u user -p pass -n dbname -limit=50000
PyDumpy also allows to specify row limits and sorting preferences for each table specifically:
./pydumpy.py -H host -u user -p pass -n dbname -limit=50000 –ask-to-limit –ask-to-sort
If you find this tool useful please feel free to provide feedback by leaving a comment.
26
Building Drizzle on Cygwin or getting as far as possible
0 Comments | Posted by Žilvinas Šaltys in MySQL, Tools
It’s been quite a while i have this sort of desire to offer my help for some opensource project i like. One of my most favorite candidates is Drizzle. I should say my knowledge of C is really poor and there’s a whole crazy world out there full of C applications and build tools.
Nevertheless i decided to atleast try and see if i would be able to build it and maybe change something, run some tests. As I am a Windows user i found out the only way for me to build Drizzle is through Cygwin. I started with installing the latest stable version of Cygwin 1.5.25-15. I must say that their installer is really nice but i would offer to add a package search feature. Might help when you want to install numerous packages.
So what’s next? I found this wiki page about building drizzle and figured first thing i should do is get Bazaar. I installed the following packages using Cygwin installer:
- bison
- bzr
- gettext
- readline
- libpcre0
- pcre
- pcre-devel
- libtoolize
- gperf
- e2fsprogs
And then went on to get the Drizzle sources:
mkdir ~/bzrwork
bzr init-repo ~/bzrwork
cd ~/bzrwork
bzr branch lp:drizzle
Now onto building. Here’s where all the fun begins.
Drizzle requires a tool named libevent which is not available through Cygwin installer and you must build it yourself. And still you can’t build libevent with the latest version of Cygwin because it lacks certain functionality. After some googling i found a patched IPV6 version of Cygwin that fixes these issues. Added the #define EAI_SYSTEM 11 to http.c and finally were able to ./configure && make && make install libevent.
You also need protobuf installed. And there’s no package for that either. Actually this protobuf is quite nice stuff. Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.
Now that we seem to have all the packages installed we can start building drizzle. It should be as easy as this:
cd drizzle
./config/autorun.sh
./configure
make
make install
It is not. First to be able to compile Drizzle you need to have gcc4. And even if you do, ./configure must need to know where it is. So we need to use additional flags CC and CXX. Then you need to show ./configure where libevent is installed by adding a flag –with-libevent-prefix=/usr/local or any other place you have it in. I also found a really ugly problem with warnings. I wasn’t able to compile drizzle because it stopped somwhere in gnulib complaining about some warnings that were treated as errors. Funny enough there is a sarcastic option to disable these warnings: –disable-pedantic-warnings. You also probably want to install Drizzle somwhere else than usual by using: –prefix=/some/deploy/dir.
In the end you come up with something like this:Â
./configure CC=gcc-4 CXX=g++-4 –with-libevent-prefix=/usr/local –disable-pedantic-warnings –prefix=/some/deploy/dir
That’s how far i’ve got with it. Though i’m still not able to compile it. Â I get an error somwhere in mystrings library that is related to some datatype casting issues. Hopefully i’ll be able to hack through this
MySQL has atleast one funny named command line option that made me chuckle. The option is named “–i-am-a-dummy”. From the MySQL manual:
Allow only those UPDATE and DELETE statements that specify which rows to modify by using key values. If you have set this option in an option file, you can override it by using –safe-updates on the command line.
This option is an alias of “–safe-upfates”. I wonder for a while what is the story behind having two alternate names for the same thing. Maybe it’s just the MySQL folks having some fun.
8
MySQL – ORDER BY RAND() optimization
0 Comments | Posted by Žilvinas Šaltys in Lazyweb, MySQL
Interesting read about order by rand optimization by Jan Kneschke. Haven’t used this myself but seems might be rather useful. The performance difference is huge. I wonder if MySQL could optimize ORDER BY RAND() itself when there are no data holes.
I’ve recently got listed on the Zend yellow pages. Even managed to get a comment from a Zend employee on my last blog post about PHP5 certification.
I’m also interested in MySQL developer and MySQL DBA exams. I have been working with MySQL for more than 5 years and I still don’t feel that I know all it’s angles really well. Recently I’ve started reading the MySQL 5.0 certification study guide. It will be interesting to compare the Zend and MySQL certification programs. While reading the book it was odd to find out that MySQL Query Browser was the third chapter of the book. And you must be able to know your way around the tool to take the developer exam. I have never used MySQL Query Browser and phpmyadmin has allways sufficed my needs as a web applications developer. Though I agree it is very useful tool. The study guide book comes with a CD that has training exercises in it. I was a little disappointed when I didn’t find a mock exam application to test what I’ve learned but instead found a PDF e-book with questions from every chapter. Still better than nothing. Even though the guide states that it is designed to make you think instead of memorize the material I think it is more effective to learn by taking tests.
19
Percona offers XtraDB to replace InnoDB
0 Comments | Posted by Žilvinas Šaltys in Lazyweb, MySQL
Who could have guessed Percona would decide to release their own storage engine named XtraDB for MySQL. It’s a drop in replacement for InnoDB. They are forking a version of InnoDB and applying patches of their own. Not only that they are going to put more effort to it to make it something more than just a small variation of InnoDB. Even now it seems to offer better performance and scalability than InnoDB. It also frees XtraDB from Oracle that did an awesome job creating InnoDB but it seems percona want’s to go to the next level. Happy to see XtraDB will be available for others to modify and submit patches. It’s also interesting to find out Percona will be trying to sponsor future changes of XtraDB through their customers.
It is also nice too see that there are some thoughts to merge XtraDB into Drizzle.
While reading planetmysql I’ve found a very interesting article about Top N Sort algorithm. It would be nice to have this implemented in MySQL. There are millions of applications that have sorted and paginated lists of data. Some of those lists are really big. For example company that I work for has a web application with a report that has milions of rows and it has to be sorted and it only shows a few hundred rows in a single page. This would definitely be a great improvement to MySQL.
8
MySQL character sets and collations
0 Comments | Posted by Žilvinas Šaltys in MySQL, Optimization, PHP
Another great article I’ve found about MySQL character sets and collations that demystifies all the so often found problems in projects made by confused developers. This also can help you to save some space your database takes and improve your queries performance because more of your database can fit into memory.
18
What is the future of relational databases?
0 Comments | Posted by Žilvinas Šaltys in MySQL
A great recorded video keynote from the OpenSQL camp conference featuring Brian Aker. He asks some interesting questions about the state of databases. He has some interesting points about the way we currently use hardware and problems that are at our door and how opensource projects should step up to solve them.
Found some nice MySQL data types tips how to create a database schema. Valuable information for intermediate and begginer developers to avoid the common database schema design pitfalls.
