Controlling Large Deletions and Transaction Log Growth

Deleting from very large tables can cause uncontrolled transaction log growth.  The way around this is to perform an iterative TOP(n) delete where you limit the number of rows to be deleted in each batch.  I got this idea from an article by Hugo Kornelis in the first volume of SQL Server MVP Deep Dives.  It’s titled Set-based iteration, the third alternative.  It’s a worthwhile read.

A basic iterative deletion script might look something like this. We set a batch size variable, @BatchSize equal to the number of rows to delete per iteration.  As long as the number of rows deleted is equal to the batch size, the WHILE loop will continue.  Otherwise it will terminate:

1
2
3
4
5
6
7
8
9
10
DECLARE @BatchSize int, @RowCnt int;
SET @BatchSize = 10000;
SET @RowCnt = @BatchSize;
WHILE @RowCnt = @BatchSize
BEGIN;
DELETE TOP (@BatchSize)
FROM   dbo.sometable
WHERE  somecolumn < 2395;
SET @RowCnt = @@ROWCOUNT;
END;

The script shown above isn’t really suitable for production.  Below is a more production-ready script.  I’ve incorporated similar scripts into scheduled SQL Agent jobs to maintain a specific data retention target.  For example, seven years.  The script has a couple of noteworthy features.  First, it incorporates a WAITFOR DELAY into the loop.  This gives other processes a certain amount of time to use the physical resources of the computer.  It also gives processes a chance to access the table you are deleting.  Second, it incorporates a cutoff time of day.  After that time, the loop will terminate.  For example, you might use SQL Agent to schedule this process every Sunday at 4:00 PM and set the cutoff time to 8:00 PM, because 8:00 PM is when other maintenance processes start up.  Note that you will have to tune the @BatchSize and WAITFOR DELAY parameters to suit your particular situation.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
--controlled deletion prevents long-term locks & blowing out transaction log
 
DECLARE @BatchSize INT ,
    @sPurgeMessage VARCHAR(64) ,
    @RowCnt INT ,
    @StartRowCount BIGINT ,
    @EndRowCount BIGINT ,
    @sMessage VARCHAR(512) ,
	@dStart DATETIME, 
    @StopTime DATETIME ,
	@cutoffDate DATETIME;
 
	--Set the stopping time for 18:00 on the current day
        SELECT  @StopTime = DATEADD(HOUR, 16,
                                    CAST(FLOOR(CAST(GETDATE() AS FLOAT)) AS DATETIME))
 
	-- @cutoffDate is the date value below which rows are purged
        SELECT  @cutoffDate = DATEADD(DAY, -2555, GETDATE())
 
 
	--informational - get & store starting row count
	--BTW, this method of counting rows is considered to be optimal in terms of speed
        SELECT  @StartRowCount = SUM(row_count)
        FROM    sys.dm_db_partition_stats
        WHERE   object_id = OBJECT_ID('dbo.SomeGynormousTable')
                AND ( index_id = 0
                      OR index_id = 1
                    );
 
        SET NOCOUNT ON;
	--set number of deletions per batch
        SET @BatchSize = 1;
 
        SET @RowCnt = @BatchSize
 
        WHILE ( @StopTime > GETDATE() )
            AND @RowCnt > 0
            BEGIN
                SELECT  @dStart = GETDATE()
                BEGIN TRAN
                DELETE TOP ( @BatchSize )
                FROM    dbo.SomeGynormousTable
                WHERE   [Date] < @cutoffDate;
                SET @RowCnt = @@ROWCOUNT
                COMMIT;
                SELECT  @sPurgeMessage = 'Purged '
                        + CAST(@RowCnt AS VARCHAR(10)) + ' rows in '
                        + CAST(DATEDIFF(SECOND, @dStart, GETDATE()) AS VARCHAR(6))
                        + ' seconds.'
                RAISERROR (@sPurgeMessage, 1, 1) WITH NOWAIT  
                WAITFOR DELAY '00:00:05' -- give other spids a chance to work
            END;
 
		--informational - get & store ending row count
        SELECT  @EndRowCount = SUM(row_count)
        FROM    sys.dm_db_partition_stats
        WHERE   object_id = OBJECT_ID('dbo.SomeGynormousTable')
                AND ( index_id = 0
                      OR index_id = 1
                    );
	-- optionally send out a completion notice
        SET @sMessage = CAST(( @StartRowCount - @EndRowCount ) AS VARCHAR(32))
            + ' rows were deleted'
 
        EXEC msdb.dbo.sp_send_dbmail @profile_name = 'someprofile',
            @recipients = 'somedba@gmail.com', @body = @sMessage,
            @subject = 'purge of dbo.SomeGynormousTable done';
This entry was posted in Uncategorized. Bookmark the permalink.