The Developer Day | Staying Curious

TAG | MySQL

Dec/09

30

PyDumpy – Partial sorted MySQL database dumps

PyDumpy is a simple Python utility that might be helpful for developers struggling to get fast and partial database snapshots from production databases. It does it’s job by checking the database information schema to find out the approximate rows count available in each table and limits the table if needed to avoid dumping too much data as some databases may have hundreds of gigabytes of data. It then passes all the limits information it gathers to mysqldump a tool created by MySQL to do the actual dumping.

Python does not have a built in package to connect to MySQL as for example PHP does and therefore PyDumpy relies on MySQL for Python package to work. PyDumpy also relies on mysqladmin to do the actual dumping.

PyDumpy is very simple to use. For example to dump a maximum of 50 000 rows from each table type:

./pydumpy.py -H host -u user -p pass -n dbname -limit=50000

PyDumpy also allows to specify row limits and sorting preferences for each table specifically:

./pydumpy.py -H host -u user -p pass -n dbname -limit=50000 –ask-to-limit –ask-to-sort

If you find this tool useful please feel free to provide feedback by leaving a comment.

, , , Hide

Mar/09

26

Building Drizzle on Cygwin or getting as far as possible

It’s been quite a while i have this sort of desire to offer my help for some opensource project i like. One of my most favorite candidates is Drizzle. I should say my knowledge of C is really poor and there’s a whole crazy world out there full of C applications and build tools.

Nevertheless i decided to atleast try and see if i would be able to build it and maybe change something, run some tests. As I am a Windows user i found out the only way for me to build Drizzle is through Cygwin. I started with installing the latest stable version of Cygwin 1.5.25-15. I must say that their installer is really nice but i would offer to add a package search feature. Might help when you want to install numerous packages.

So what’s next? I found this wiki page about building drizzle and figured first thing i should do is get Bazaar. I installed the following packages using Cygwin installer:

  • bison
  • bzr
  • gettext
  • readline
  • libpcre0
  • pcre
  • pcre-devel
  • libtoolize
  • gperf
  • e2fsprogs

And then went on to get the Drizzle sources:

mkdir ~/bzrwork
bzr init-repo ~/bzrwork
cd ~/bzrwork
bzr branch lp:drizzle

Now onto building. Here’s where all the fun begins.

Drizzle requires a tool named libevent which is not available through Cygwin installer and you must build it yourself. And still you can’t build libevent with the latest version of Cygwin because it lacks certain functionality. After some googling i found a patched IPV6 version of Cygwin that fixes these issues. Added the #define EAI_SYSTEM 11 to http.c and finally were able to ./configure && make && make install libevent.

You also need protobuf installed. And there’s no package for that either. Actually this protobuf is quite nice stuff. Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.

Now that we seem to have all the packages installed we can start building drizzle. It should be as easy as this:

cd drizzle
./config/autorun.sh
./configure
make
make install

It is not. First to be able to compile Drizzle you need to have gcc4. And even if you do, ./configure must need to know where it is. So we need to use additional flags CC and CXX. Then you need to show ./configure where libevent is installed by adding a flag –with-libevent-prefix=/usr/local or any other place you have it in. I also found a really ugly problem with warnings. I wasn’t able to compile drizzle because it stopped somwhere in gnulib complaining about some warnings that were treated as errors. Funny enough there is a sarcastic option to disable these warnings: –disable-pedantic-warnings. You also probably want to install Drizzle somwhere else than usual by using:  –prefix=/some/deploy/dir.

In the end you come up with something like this: 

./configure CC=gcc-4 CXX=g++-4 –with-libevent-prefix=/usr/local –disable-pedantic-warnings –prefix=/some/deploy/dir

That’s how far i’ve got with it. Though i’m still not able to compile it.  I get an error somwhere in mystrings library that is related to some datatype casting issues. Hopefully i’ll be able to hack through this ;)

, , , Hide

Jan/09

16

Funny MySQL command line option

MySQL has atleast one funny named command line option that made me chuckle. The option is named “–i-am-a-dummy”. From the MySQL manual:

Allow only those UPDATE and DELETE statements that specify which rows to modify by using key values. If you have set this option in an option file, you can override it by using –safe-updates on the command line.

This option is an alias of “–safe-upfates”. I wonder for a while what is the story behind having two alternate names for the same thing. Maybe it’s just the MySQL folks having some fun.

, Hide

Jan/09

8

MySQL – ORDER BY RAND() optimization

Interesting read about order by rand optimization by Jan Kneschke. Haven’t used this myself but seems might be rather useful. The performance difference is huge. I wonder if MySQL could optimize ORDER BY RAND() itself when there are no data holes.

, , , Hide

Dec/08

19

Percona offers XtraDB to replace InnoDB

Who could have guessed Percona would decide to release their own storage engine named XtraDB for MySQL. It’s a drop in replacement for InnoDB. They are forking a version of InnoDB and applying patches of their own. Not only that they are going to put more effort to it to make it something more than just a small variation of InnoDB. Even now it seems to offer better performance and scalability than InnoDB. It also frees XtraDB from Oracle that did an awesome job creating InnoDB but it seems percona want’s to go to the next level. Happy to see XtraDB will be available for others to modify and submit patches. It’s also interesting to find out Percona will be trying to sponsor future changes of XtraDB through their customers.

It is also nice too see that there are some thoughts to merge XtraDB into Drizzle.

, , , Hide

Dec/08

12

MySQL Top N Sort Algorithm

While reading planetmysql I’ve found a very interesting article about Top N Sort algorithm. It would be nice to have this implemented in MySQL. There are millions of applications that have sorted and paginated lists of data. Some of those lists are really big. For example company that I work for has a web application with a report that has milions of rows and it has to be sorted and it only shows a few hundred rows in a single page. This would definitely be a great improvement to MySQL.

, , Hide

Dec/08

8

MySQL character sets and collations

Another great article I’ve found about MySQL character sets and collations that demystifies all the so often found problems in projects made by confused developers. This also can help you to save some space your database takes and improve your queries performance because more of your database can fit into memory.

, , , Hide

Nov/08

18

What is the future of relational databases?

A great recorded video keynote from the OpenSQL camp conference featuring Brian Aker. He asks some interesting questions about the state of databases. He has some interesting points about the way we currently use hardware and problems that are at our door and how opensource projects should step up to solve them.

, , Hide

Nov/08

18

Common wrong Data Types compilation

Found some nice MySQL data types tips how to create a database schema. Valuable information for intermediate and begginer developers to avoid the common database schema design pitfalls.

, Hide