SQL Server: how to improve multi-threaded inserts on frequently-accessed tables
Today we will talk about an issue that I have seen at two of my clients.
One of them is a very successful e-commerce company that persists shopping carts server-side (not in a cookie) and has millions of users filling their carts, shopping massively between 8 AM and 10 AM. The shopping cart lifetime is 15 minutes; once a cart is released, its data is no longer useful and can be deleted. The table is not that big, yet it is heavily inserted into and updated.
The other one is a simple log table that gets hit all the time with very large BLOBs. An insert can then take a long time because of the very large size of a single row.
Both companies face blocking when inserting or updating: the pages holding the data are locked (for updates), and the PFS contention is heavy.
Here is a definition of PFS (Page Free Space):
The PFS page records information about each data page, using 1 byte of storage per page. This byte indicates how much space is left on the page and what it is used for, which means that a single PFS page can track roughly 64 MB worth of pages. Therefore, you'll find a new PFS page at close to 64 MB intervals throughout a database data file.
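Before changing anything, it helps to confirm that PFS contention is really what you are seeing. A minimal sketch using sys.dm_os_waiting_tasks (the page-id arithmetic in the comment follows from the ~64 MB interval above; interpreting the results is up to you):

```sql
-- Sketch: list tasks currently waiting on page latches.
-- PFS contention shows up as PAGELATCH waits whose resource_description
-- is db_id:file_id:page_id with page_id = 1 or page_id % 8088 = 0
-- (8088 pages x 8 KB is the ~64 MB PFS interval).
SELECT wt.session_id,
       wt.wait_type,
       wt.wait_duration_ms,
       wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.wait_type LIKE 'PAGELATCH%';
```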
To avoid PFS contention and speed up performance, we can add a persisted computed column and partition the table on it to spread the heavy insert load.
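The table definition below references a partition scheme named Ps. Here is a minimal sketch of how the underlying function and scheme could be created (the function name PfHash is an assumption, and everything is mapped to the PRIMARY filegroup here; you can map partitions to separate filegroups instead):

```sql
-- Hypothetical partition function: 8 partitions for hash values 0 through 7
CREATE PARTITION FUNCTION PfHash (tinyint)
AS RANGE RIGHT FOR VALUES (1, 2, 3, 4, 5, 6, 7);

-- Partition scheme used by the table below; all partitions on PRIMARY
CREATE PARTITION SCHEME Ps
AS PARTITION PfHash ALL TO ([PRIMARY]);
```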
This setup for example creates the computed column and applies the partition scheme based on the HashId column:
CREATE TABLE <schema>.<Table>
(
  [Id] INT IDENTITY(1,1) NOT NULL,
  [HashID] AS (CONVERT(tinyint, ABS([Id] % 8))) PERSISTED NOT NULL, --> create the persisted computed column
  CONSTRAINT [PK_ServiceLog] PRIMARY KEY CLUSTERED
  (
    [Id] ASC,
    [HashID] ASC --> add the partition column to the clustered key
  )
) ON Ps([HashID]) --> put the table on the partition scheme
In this example we use 8 partitions, but you can use more; just keep the modulus in the computed column (Id % 8) in sync with the number of partitions.
In fact, you can even get creative and add a rolling window: use values 1 to 8 for week 1 and then 9 to 16 for week 2. This would let you build a switch-out solution for large purges when you only want to keep a given number of days, weeks, or months (but that is for another article).
Once this is set up, the PFS contention is gone and DML operations on the table are much faster.
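One quick way to check that inserts are actually spreading across the partitions is a sketch like this against the sys.partitions catalog view (replace the placeholder table name with your own):

```sql
-- Sketch: row count per partition of the clustered index
SELECT p.partition_number, p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'<schema>.<Table>')
  AND p.index_id = 1 -- clustered index
ORDER BY p.partition_number;
```

With the IDENTITY-based hash, you should see the 8 partitions fill at roughly the same rate.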
On large selects, you might take a small performance hit because the rows are spread across 8 different logical locations (and maybe 8 different files, if you set your partitions up on different files).
Let me know what you think, guys! Good solution? Bad solution?
I know it works great, even with the partition scheme on the PRIMARY filegroup!