Reducing data footprints on test systems

Picture the scene:

You are a DBA. You have to make everyone happy, including DEVs (yes, including them too). Part of that is to ensure that your DEVs have a fresh copy of your production data so that they can test their programming with current data and structures. The issue with that is your production database is multiple hundreds of GB in size and your test system doesn’t have the capacity to store a full copy of your database. Your IT Budget will also not stretch to purchasing more storage! What can you do?

I had a similar situation recently, took a look at the possibilities, and here is what I came up with:

1. Try out tools like SQL Virtual Restore to allow you to “mount” a backup from a remote server, avoiding the space issue altogether. This, of course, is a solution that requires a license (no budget, maybe?)

2. Use the great features offered to you by SQL Server Developer Edition: data compression to the rescue!

So of course, I chose door number 2.

As you may know, Developer Edition offers the same features as Enterprise Edition with the caveat that the instance cannot be used for production purposes in any way. This means that the seriously powerful and useful data compression becomes available for your test servers! This counts even if you only use Standard Edition in production – joy of joys! Better still, Developer Edition only costs around $50 (or less if you have an MSDN subscription) – even more joys!

If you have never seen or used compression, take a quick look over on MSDN to see what it is all about (Data Compression). Basically, you can compress data in indexes and tables at the storage level, avoiding any query rewrites while still profiting from the storage savings. This can be a major advantage in terms of raw storage needs, but can also benefit you in high-read environments with low RAM sizes. The data remains compressed when held in the buffer pool and is only de-compressed when being accessed. This means that you can keep more data in the buffer pool and reduce hard disk I/O (obviously not for free – compressing and de-compressing costs CPU cycles). This may be acceptable on a test system with extremely constrained storage space.
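To give you an idea of what this looks like in practice, here is a minimal sketch (the table and index names are made up for illustration). SQL Server can even estimate the saving for you before you commit to a rebuild:

```sql
-- Preview the expected saving for PAGE compression on a hypothetical dbo.Orders table
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'Orders',
     @index_id         = NULL,   -- all indexes
     @partition_number = NULL,   -- all partitions
     @data_compression = 'PAGE';

-- Compress the heap/clustered index and a nonclustered index at PAGE level
ALTER TABLE dbo.Orders REBUILD WITH (DATA_COMPRESSION = PAGE);
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REBUILD WITH (DATA_COMPRESSION = PAGE);
```

PAGE compression generally saves more than ROW compression at a slightly higher CPU cost – on a storage-starved test box, PAGE is usually the one you want.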

The usage in my example scenario is now an even better proposition: not only can I reduce my storage footprint, I can also potentially increase test system performance on the I/O side of the equation (who doesn’t have I/O issues, especially on a test box!).

The next hurdle is of course identifying which indexes and tables in the database you want to squeeze down. This is possible via SSMS’ object explorer, but only if you want to spend an eternity doing so! The best way is to take a look at the meta-data tables/views/functions to interrogate the system objects. These are really interesting on their own (at least they are to me), but after writing my Index Script Creator, I realised the potential for using the information in these system objects to allow me to automate certain tasks like this one.
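As a sketch of the kind of interrogation I mean, the following query lists every partition that is not yet compressed, joining the relevant catalog views:

```sql
-- Find all uncompressed partitions, with their schema, table and index names
SELECT  s.name AS schema_name,
        t.name AS table_name,
        i.name AS index_name,          -- NULL for heaps
        p.partition_number,
        p.data_compression_desc        -- NONE, ROW or PAGE
FROM    sys.partitions p
JOIN    sys.indexes    i ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN    sys.tables     t ON t.object_id = p.object_id
JOIN    sys.schemas    s ON s.schema_id = t.schema_id
WHERE   p.data_compression = 0;        -- 0 = NONE
```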

Similarly to the Index Script Creator, my Index Compressor Script (uploaded as a doc, but it is a .sql file really – stupid WordPress restrictions!) runs off and investigates all tables and indexes (down to the partition level) and then generates an index rebuild command to compress any currently non-compressed partitions of indexes/tables. The resulting commands can then be run and space can be reclaimed – just be aware that rebuilding indexes does cause log activity.
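The core idea of command generation can be sketched like this (simplified to rebuild whole objects – the full script works at the partition level):

```sql
-- Generate a PAGE-compression rebuild command for every uncompressed table/index
SELECT CASE WHEN i.index_id = 0  -- heap: rebuild via ALTER TABLE
            THEN 'ALTER TABLE ' + QUOTENAME(s.name) + '.' + QUOTENAME(t.name)
                 + ' REBUILD WITH (DATA_COMPRESSION = PAGE);'
            ELSE 'ALTER INDEX ' + QUOTENAME(i.name)
                 + ' ON ' + QUOTENAME(s.name) + '.' + QUOTENAME(t.name)
                 + ' REBUILD WITH (DATA_COMPRESSION = PAGE);'
       END AS rebuild_cmd
FROM   sys.partitions p
JOIN   sys.indexes    i ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN   sys.tables     t ON t.object_id = p.object_id
JOIN   sys.schemas    s ON s.schema_id = t.schema_id
WHERE  p.data_compression = 0;
```

Copy the generated commands out of the result set, sanity-check them, and run them in a maintenance window – each rebuild is fully logged.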

After compressing your tables and indexes you should have a much smaller amount of data (depending on how well your data compresses) and be able to reclaim some disk space.

I realise that this method will mean that you have to have a large enough disk to have the database in its uncompressed state to begin with, but you can end up with a cost effective solution to a budget-constrained test environment.

Happy zipping!

6 thoughts on “Reducing data footprints on test systems”

    1. WilliamD

      Hi John,

      It is something that seems obvious after the fact, right?

      As we will be doing dev work on workstations it’ll definitely come in handy.

      Thanks for the comment


  1. thomasrushton

    The real problem is the disk space on dev/test servers, yes?

    In which case, what about offloading the compression to a dev workstation, which could well have lots of disk space free (what price an internal 1TB drive?), do the compression there over a weekend, and then backup & restore to the dev/test servers?

    Just a thought…

    1. WilliamD

      Hi Thomas,

      Valid point about offloading the work – it is something we will be doing anyway, but keeping it zipped on a dev server afterwards is something to consider too.

      Thanks for the tip!


  2. Dirk Hondong

    Interesting reading.
    So do you only hand over production data for testing/developing purposes to the devs who run their local installation, or are there dedicated test environments?
    It’s a bit difficult to figure out up to which point you can use a Developer Edition and when you need a Standard or Enterprise license, especially in test environments.
    For example: someone who is primarily testing software (and just a bit of developing) in a test env with a SQL Server in the back end is working “productively” in that case. So would you then need a “full” license and not a dev license? I think that’s a grey area where you can save or lose a lot of money.

    1. WilliamD

      Hi Dirk,

      The way I understand it is that a development server, either local or centralised, that really is only for development uses Developer Edition. The dev is actually working on it, but not serving data for day-to-day work. It is definitely an area where you have to be careful though. When in doubt, speak to a Microsoft licensing operator.

      Cheers
      William

