Importing data into databases

2007-01-08 2-minute read

I’ve spent a large chunk of my life importing data from one database into another and have only recently realized the time savings involved with scripting that work. My previous method had been to convert the original data into a format I could manipulate in a spread sheet. Then, manually tweak the data (using calculations whenever possible). Then saving the spread sheet in a format that MySQL could import.

It makes sense and it works.

The only problem is that the data entry has to freeze during this process, which can often take days. Furthermore, after spending hours doing the import, you may realize that you made mistake early on. Often, this means repeating the entire painful process.

There is a better way!

Although it takes more time up front, you will often save time by scripting the entire process. You may need one manual step to dump the data from the original database (this could be automated depending on where you get the data from). Then - write a script that parses this data, tweaks what needs to be tweaked, and then automatically imports it into the destination database.

The two advantages are:

  1. You can work on a copy of the data all you want without having to freeze data entry. Once you have it working and you are sure the import was successful, you can do the final import with a single command.

  2. If you make mistakes (woops, switched a column or forgot a chunk of the original data) you can easily make a small tweak to your script and re-run the whole process.

  3. More enjoyable. Who wants to tweak data by hand when you can be programming?

An important tool in this type of endeavor is a good script. Attached to this blog is a PHP class for reading and parsing delimited or variable width data files. By default, it works pretty well on your standard tab delimited data file, but can be used for any type of delimited file or even variable width files.

The top of the class has sample php code for how to use it. If you like it, leave a comment on the blog!