Register
Sunday, February 05, 2012
 
 DBAs And ProgrammersBlog
  
News! Minimize
   
 
 Print   
 
Misc Blog Stuff Minimize
   
 
 Print   
 
The Reluctant DBA Minimize
 
 
 
 Print   
 
Reluctant DBA Minimize
   
 
  
 
Reluctant DBA Minimize
   
 
  
 
The Reluctant DBA Minimize
 
Apr28

Written by:CarpDeus
4/28/2008 7:39 AM 

One fairly common job of a DBA is to import data from another source for use by a company. I have several third party databases that I import regularly for my job and I've encountered plenty of interesting data formats.

But I have been doing performance tuning on an import process of read-only data that we get and have re-discovered one of those old points of performance tuning: Functions are slow, even when they are system functions.

A good example of this is that I have a very denormalized table of data that I get in that I need to normalize for use by our system. One requirement is that we have SEO friendly names for everything, so we take any lookup field (category, subcategory, etc) and create a lookup version of the name that could be placed in a url path. So something like "Large Widgets" becomes large-widgets. We also require a way to pull up items with categories and subcategories that are NULL, so we make the NULLs into a single dash.

Yes, I know that a NULL means no data and it seems like using a - to represent a NULL doesn't make sense but I don't write the requirements, I just have to implement them.

So, what I was doing was loading the reference tables for the main table using an ISNULL(columname, '-') and then including that when I did the join. This drains the server and takes a long time to run and was filling up several gigabytes of tempdb space.

When you stop and think about it, it makes sense. First, before SQL Server can process the join, it needs to create a temporary version of the table with the result of the function for that column. Which, on a multi-million row table, requires a multi-million row temp table. Which then probaby needs to be reindexed to handle the join and... well, you get the idea.

So, I took the procedures that I was running to populate the refernece tables and added:

UPDATE inputtable SET nullablereferencecolumn = '-' WHERE nullablereferencecolumn IS NULL

The result? 90 minutes worth of processing reduced to half an hour and the tempdb kept fit and trim the way it should be.

Tags:

Your name:
Your email:
(Optional) Email used only to show Gravatar.
Your website:
Title:
Comment:
Security Code
Enter the code shown above in the box below
Add Comment  Cancel 
 
 
  
 
Privacy Statement | Terms Of Use Copyright 2001-2008 by ReluctantDBA.com