Published on: April 7th 2010

Hello Guys,

It is usually pretty straightforward to read an execution plan, to understand why one query would use one index instead of another, to understand the need for a nested loop or a hash join, etc...

It is usually....

But then come.... the horrible tables! Usually you get to be the firefighter at a client that developed a "monster" table (that is usually what the client calls it), and you look at all their queries hammering the server with IOs, CPU, memory swapping, etc... The catastrophe!

Then you look closer and you realize the table is millions of rows long and 250 columns wide with, obviously to make it fun, a large number of text, image or other exotic datatype columns.

There are obvious improvements to make on the query side, like:

1. make sure the client does not do a "select *", selecting useless columns that do not reside on the index you scan or seek and ultimately performing a very large key lookup due to the poor page density (a sketch of this follows the list).

2. make sure the drive where the data and index files reside is properly formatted (cluster size and disk alignment).

3. make sure the application performs the select only once or twice instead of thousands of times per page, especially if it does not reuse the same execution plan (also covered in the sketch below)...
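To make points 1 and 3 concrete, here is a minimal T-SQL sketch. The table, column and index names (dbo.MonsterTable, IX_MonsterTable_CustomerId, and so on) are hypothetical placeholders, not the actual client schema.

```sql
-- Bad: drags all 250 columns (including the LOBs) through a key lookup.
SELECT *
FROM dbo.MonsterTable
WHERE CustomerId = 42;

-- Better: a narrow column list covered by a dedicated index, so the query
-- can be answered by an index seek without a key lookup on the wide rows.
CREATE NONCLUSTERED INDEX IX_MonsterTable_CustomerId
    ON dbo.MonsterTable (CustomerId)
    INCLUDE (OrderDate, Amount);   -- covering columns (hypothetical)

SELECT CustomerId, OrderDate, Amount
FROM dbo.MonsterTable
WHERE CustomerId = 42;

-- Point 3: issue the statement through sp_executesql so the parameterized
-- plan is compiled once and reused instead of being recompiled on every call.
EXEC sp_executesql
    N'SELECT CustomerId, OrderDate, Amount
      FROM dbo.MonsterTable
      WHERE CustomerId = @CustomerId;',
    N'@CustomerId int',
    @CustomerId = 42;
```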

 

All of that is fine but sometimes you realize that SQL Server decides to use one index instead of another. One could say randomly, but it is usually based on the number of IOs it estimates it will have to read versus the memory it might use.

Also, on large tables, the plan estimates get worse and forcing an index can be the solution for better performance. You can definitely see the difference running the query before and after optimization in the profiler. SQL Server Profiler is your empirical friend 😉
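As an illustration of forcing an index, here is a hedged sketch using a table hint; the table and index names are again hypothetical, and the hint should only stay in place once Profiler (or SET STATISTICS IO/TIME ON) confirms it actually reads less.

```sql
-- Let the optimizer choose: on a very large table with skewed statistics it may
-- pick a scan or a wider index than you expect.
SELECT OrderId, OrderDate, Amount
FROM dbo.MonsterTable
WHERE OrderDate >= '2010-01-01';

-- Force the index you believe is better, then compare reads, CPU and duration
-- of both versions in SQL Server Profiler.
SELECT OrderId, OrderDate, Amount
FROM dbo.MonsterTable WITH (INDEX (IX_MonsterTable_OrderDate))
WHERE OrderDate >= '2010-01-01';
```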

Bottom line is:

- try to avoid BLOB/in-row LOB data if possible, or at least put those columns in a different table to minimize the storage within the same clustered index (see the sketch after these bullet points).

- try to increase page density; it makes your indexes less fragmented during large update and delete operations and makes the work of your DBA much more conventional. The execution plans are correctly estimated, the fragmentation grows organically and is easy to maintain, etc...
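Both bullet points could look roughly like the sketch below. dbo.MonsterTable, dbo.MonsterTableLob and the column names are made up for illustration; the DMV query simply shows one way to measure page density before and after the change.

```sql
-- Move the wide LOB columns out of the hot clustered index into a 1:1 side table,
-- so far more of the remaining narrow rows fit on each page of dbo.MonsterTable.
CREATE TABLE dbo.MonsterTableLob
(
    OrderId     int            NOT NULL
        CONSTRAINT PK_MonsterTableLob PRIMARY KEY
        CONSTRAINT FK_MonsterTableLob_MonsterTable
            REFERENCES dbo.MonsterTable (OrderId),
    BigComments nvarchar(max)  NULL,
    ScannedDoc  varbinary(max) NULL
);

-- Check page density and fragmentation of the clustered index (index_id = 1).
SELECT avg_page_space_used_in_percent, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.MonsterTable'), 1, NULL, 'DETAILED');

-- Rebuild to compact the pages once the wide columns are gone.
ALTER INDEX ALL ON dbo.MonsterTable REBUILD;
```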

The worst, and it happens to too many clients, is when they mix very very low page density on a large table with a random-GUID clustered index. This is the worst. You do not even have a way of partitioning your table correctly (and by the way, most of those clients cannot afford Enterprise Edition anyway)...
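For what it is worth, that bad pattern usually looks like the first definition below; if the key really has to be a GUID, NEWSEQUENTIALID() (usable only as a column default) at least keeps the inserts at the end of the clustered index. The table names are made up for illustration.

```sql
-- The painful pattern: a random GUID as the clustered primary key means every
-- insert lands on a random page, causing page splits and low page density.
CREATE TABLE dbo.MonsterTable
(
    Id uniqueidentifier NOT NULL
        CONSTRAINT DF_MonsterTable_Id DEFAULT NEWID()
        CONSTRAINT PK_MonsterTable PRIMARY KEY CLUSTERED
    -- ... 250 more columns ...
);

-- Less painful: a sequential default appends new rows at the end of the
-- clustered index instead of splitting pages all over the table.
CREATE TABLE dbo.MonsterTableSequential
(
    Id uniqueidentifier NOT NULL
        CONSTRAINT DF_MonsterTableSequential_Id DEFAULT NEWSEQUENTIALID()
        CONSTRAINT PK_MonsterTableSequential PRIMARY KEY CLUSTERED
    -- ... 250 more columns ...
);
```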

Hopefully, and this is a plea to clients: have the DBA create the data model before your developers go haywire with monster tables!