Clement Huge
08.07.2018

Rolling Snapshot DBS Setup to Correct Quickly Large Errors on Production

You might have wondered if there was a solution that could keep a version of your database in case you/a process made a huge mistake (large update without the where for example) on a production database.

You also want the solution to avoid calling your dba systematically to restore the database to sync back the data and you need a quick way to correct the issue.

WELL, THERE IS ONE! and even a feature that is now in the standard edition of sql server 2016!

Why not using the database snapshot solution. I implemented at some of my clients this solution. Here it is

The trick is you want to avoid keeping the snapshot for too long but if it is available for a few hours, there is no real harm. It also does work nicely even with always-on on. For mirroring database (on Sql server 2005, I also added a function that check the mirroring status of the database but I removed it because nowdays, 2005 version is from the past).

My solution is to setup a sql server job running on an hourly schedule with the following T-SQL:

EXEC DBATool.Tool.RefreshSnapshot @DBName = 'DBNAME'

Then , as many DBA do, you have an administration database. Mine is called DBATool.

In DBATool, I create the following stored procedures:

CREATE PROCEDURE [Tool].[RefreshSnapshots]
    @DbName Varchar(128) = 'ECC',
    @Suffix Varchar(128) = '_Snapshot',
    @ActionType TINYINT = 3
AS
BEGIN TRY

    DECLARE @LogicalNameSnapshot Varchar(128), 
            @PhysicalNameSnapshot Varchar(512)

    DECLARE @DbNameSnapshot Varchar(128), 
            @DbNameSnapshotTmp Varchar(128)

    SET @DbNameSnapshot = @DbName + @Suffix
    SET @DbNameSnapshotTmp = @DbNameSnapshot -- + '_Tmp'

    DECLARE @LogicalName Varchar(128), 
            @PhysicalName Varchar(512)

    DECLARE @LogicalNameSnapshotTmp Varchar(128), 
            @PhysicalNameSnapshotTmp Varchar(512)

    DECLARE @Files NVARCHAR(2048)
    SET @Files = N''

    SELECT @Files = @Files + ISNULL(
        N',(NAME = ''' + Name + ''',FILENAME = ''' 
        + LEFT(Physical_Name, LEN(Physical_Name) - CHARINDEX('.', REVERSE(Physical_Name))) 
        + '_' + CONVERT(VARCHAR(16), CONVERT(INT, rand()*1000000)) 
        + '.ss' + RIGHT(Physical_name,3) + ''')','')
    FROM Master.sys.Master_files 
    WHERE db_name(database_id) = @DbName 
      AND type = 0 
      AND state_desc = 'ONLINE'

    SET @Files = RIGHT(@Files, LEN(@Files)-1)

    SET @LogicalNameSnapshot = @LogicalName
    SET @PhysicalNameSnapshot = LEFT(@PhysicalName, LEN(@PhysicalName) - CHARINDEX('.', REVERSE(@PhysicalName))) + '.ss'
    SET @PhysicalNameSnapshotTmp = LEFT(@PhysicalName, LEN(@PhysicalName) - CHARINDEX('.', REVERSE(@PhysicalName))) 
                                   + '_' + CONVERT(VARCHAR(16), CONVERT(INT,rand()*1000000)) + '.ss'

    DECLARE @Stmt nvarchar(4000)
    DECLARE @NbTries TINYINT,
            @NbMaxTries TINYINT,
            @Success BIT,
            @Msg VARCHAR(1024)

    -- Drop snapshot if exists
    IF @ActionType & 2 = 2
    BEGIN
        IF EXISTS (
            SELECT * 
            FROM sys.databases 
            WHERE Name = @DbNameSnapshot
              AND source_database_id IS NOT NULL
        )
        BEGIN
            SET @Stmt = 'DROP DATABASE ' + @DbNameSnapshot
            EXEC sp_ExecuteSql @Stmt
        END
    END

    SET @NbMaxTries = 3
    SET @NbTries = 0
    SET @Success = 0

    WHILE @NbTries < @NbMaxTries AND @Success = 0
    BEGIN
        BEGIN TRY
            IF @ActionType & 1 = 1
            BEGIN
                IF EXISTS (SELECT * FROM sys.databases WHERE Name = @DbNameSnapshot)
                BEGIN
                    RAISERROR('The snapshot does already exist', 16, 1)
                    RETURN
                END

                SET @Stmt = 'CREATE DATABASE [' + @DbNameSnapshotTmp + '] ON ' 
                            + @Files + ' AS SNAPSHOT OF [' + @DbName + ']'
                SELECT @Stmt
                EXEC sp_ExecuteSql @Stmt
            END
            SET @Success = 1
        END TRY
        BEGIN CATCH
            PRINT ERROR_MESSAGE()
            WAITFOR DELAY '00:00:10'
        END CATCH

        SET @NbTries = @NbTries + 1
    END

    IF @Success = 1
    BEGIN
        SET @Msg = 'Refresh of snapshot successful after ' 
                   + CONVERT(VARCHAR(10), @NbTries) 
                   + ' tries for the database ' + @DbName 
                   + ' with suffix ' + @Suffix + ''
        PRINT @Msg
        -- IF @@TRANCOUNT > 0 COMMIT TRAN
    END
    ELSE
    BEGIN
        SET @Msg = 'Refresh of snapshot FAILED for the database ' 
                   + @DbName + ' after ' 
                   + CONVERT(VARCHAR(10), @NbTries) + ' tries !'
        -- IF @@TRANCOUNT > 0 ROLLBACK TRAN
        RAISERROR (@Msg, 16, 1)
    END

    -- EXECUTE sp_renamedb @DbNameSnapshotTmp, @DbNameSnapshot

END TRY
BEGIN CATCH
    -- DECLARE @Error VARCHAR(1024)
    -- SET @Error = ERROR_MESSAGE()
    EXEC error.P_DBA_RethrowError
END CATCH
GO

-- this sp manages the hourly schedule and keep only the last 4 hours of snapshots.

CREATE PROCEDURE [Tool].[RefreshSnapshot]
    @DbName varchar(128)
AS
BEGIN TRY
    DROP TABLE #time
END TRY
BEGIN CATCH
END CATCH

CREATE TABLE #Time 
(
    Time varchar(16),
    ActionType varchar(16)
)
INSERT INTO #time (Time)
SELECT TOP 24 
       ROW_NUMBER() OVER (ORDER BY Object_id) AS Time
FROM Sys.objects

UPDATE #Time 
SET Time = '0' + Time 
WHERE LEN(Time) = 1

DECLARE @CurrentTime VARCHAR(16) = DATEPART(HH, GETDATE())

IF LEN(@CurrentTime) = 1 
    SET @CurrentTime = '0' + @CurrentTime

DECLARE @Time TABLE (Time varchar(16))

INSERT INTO @Time
VALUES
    (DATEPART(HH, GETDATE())),
    (DATEPART(HH, DATEADD(HH, -1, GETDATE()))),
    (DATEPART(HH, DATEADD(HH, -2, GETDATE()))),
    (DATEPART(HH, DATEADD(HH, -3, GETDATE())))

UPDATE @Time 
SET Time = '0' + Time 
WHERE LEN(Time) = 1

UPDATE PUB
SET ActionType = CASE 
                    WHEN SUB.Time IS NULL THEN 2 
                    ELSE 3 
                 END
FROM #Time PUB
LEFT JOIN @Time SUB 
       ON PUB.Time = SUB.Time

DECLARE @Stmt varchar(max) = ''

SELECT @Stmt = @Stmt + 
               'EXEC [Tool].[RefreshSnapshots] @DbName = ''' + @DbName 
               + ''', @Suffix = ''_Snapshot_' + Time + ''', @ActionType = ' 
               + ActionType + ';'
FROM #time
WHERE Time = @CurrentTime OR ActionType = 2

PRINT @Stmt
EXEC (@Stmt)
GO

And voilà! This code actually gives you 4 snapshot dbs of the same database that corresponds to the last 4 hours. This keeps rolling, keeping the snapshots reasonnably close, to avoid difficult maintenance on old snapshots but keeping enough history to realize mistakes and correct them quickly without waiting painful hours of restoration of a large database...

Please share your thoughts: if you find the solution risky, elegant, useless, etc... Do not hesitate!