«

»

Mar 26

How to move data or a table from one data file to another in the same filegroup! – Part 2 of 2

This post is a continuation of the previous post, How to move data or a table from one data file to another in the same filegroup! – Part 1 of 2, and will show you how SQL handles data when it has more than 1 file. I will show you what happens when someone adds a second data file because they ran out of space on the first. I have seen this done when a disk is out of space, so someone adds a second file to the file group on a different disk. In this demo, we will use the same disk, but limit the size and growth of the first data file.

To start, here is a section from BOL on Database Files and Filegroups:

For example, three files, Data1.ndf, Data2.ndf, and Data3.ndf, can be created on three disk drives, respectively, and assigned to the filegroup fgroup1. A table can then be created specifically on the filegroup fgroup1. Queries for data from the table will be spread across the three disks; this will improve performance. The same performance improvement can be accomplished by using a single file created on a RAID (redundant array of independent disks) stripe set. However, files and filegroups let you easily add new files to new disks.

Now I will show you how you do not get to decide which file within the group the data will go to.

Demo 1 – Adding a file to a full database in order to give some free space

First we will create a database with one data file in the PRIMARY filegroup and check the space of the table:

USE [master];
GO

--Check for existence of database and drop if exists
IF EXISTS (
 SELECT 1
 FROM sys.databases
 WHERE NAME = 'TestDataFileAllocation'
 )
 DROP DATABASE TestDataFileAllocation;
GO

--Create a database with 1 data file in the PRIMARY filegroup
CREATE DATABASE TestDataFileAllocation ON PRIMARY (
 NAME = N'TestDataFileAllocation'
 ,FILENAME = N'F:\DATA\TestDataFileAllocation.mdf'
 ,SIZE = 10 MB
 ,MAXSIZE = 100 MB
 ,FILEGROWTH = 0
 ) LOG ON (
 NAME = N'TestDataFileAllocation_log'
 ,FILENAME = N'F:\Log\TestDataFileAllocation_log.ldf'
 ,SIZE = 1024 MB
 ,MAXSIZE = UNLIMITED
 ,FILEGROWTH = 1
 );
GO

--Switch database context to new database
USE TestDataFileAllocation;
GO

--Create a table for some data
CREATE TABLE SomeData (
 idCol INT IDENTITY(1, 1)
 ,NAME VARCHAR(MAX)
 ,number INT
 );
GO

--Add a clustered primary key to our new table
ALTER TABLE SomeData ADD CONSTRAINT PK_idCol PRIMARY KEY CLUSTERED (idCol);
GO

--Look at the file allocation using system tables and DMVs
SELECT OBJECT_NAME(p.object_id) AS tableName
 ,df.file_id
 ,df.name AS file_logical_name
 ,df.physical_name AS file_physical_name
 ,CAST(df.size AS BIGINT)*8/1024 AS file_size_MB
 ,CAST(df.max_size AS BIGINT)*8/1024 AS file_max_size_MB
 ,df.growth AS file_autogrow_ind
 ,fsu.total_page_count AS file_total_page_count
 ,fsu.allocated_extent_page_count AS file_allocated_extent_page_count
 ,fsu.unallocated_extent_page_count AS file_unallocated_extent_page_count
 ,au.total_pages AS table_allocated_pages
 ,au.used_pages AS table_used_pages
 ,au.data_pages AS table_data_pages
 ,p.rows AS table_rows
FROM sys.allocation_units au
INNER JOIN sys.database_files df ON au.data_space_id = df.data_space_id
INNER JOIN sys.partitions p ON au.container_id = p.hobt_id
INNER JOIN sys.dm_db_file_space_usage fsu ON df.file_id = fsu.file_id
WHERE au.type IN (
 1
 ,3
 )
 AND OBJECT_NAME(p.object_id) = 'SomeData';
GO

Notice the results of the allocation on the file:

NoData

So far, 328 pages are allocated while 952 are unallocated for the file itself. This particular table is currently empty and has 0 pages. Now lets add dome data to the table and see what happens.

--Insert data into the SomeData table
INSERT INTO SomeData (
NAME
,number
)
VALUES (
REPLICATE('sqlknowitall', 1000)
,10000
);
GO

And using the same query to look at the data as we used earlier:

1Row

Now we have 1 row in the file and we can see that nothing has changed on the file itself, but that the table has added 1 data page. So, not much has changed… However, what happens if we add so much data that the file runs out of space?

--Insert data into the SomeData table and max out the file
INSERT INTO SomeData (
 NAME
 ,number
 )
VALUES (
 REPLICATE('sqlknowitall', 1000)
 ,10000
 );
GO 10000

Run the query to see what happened:

MaxData

We now have the file full. The file has no more unallocated pages and the table itself has 953 allocated and used pages. To allow more data into this table, we need to either grow the file or add a new one. For the sake of this article and the point I am trying to make, we will add a second file to the file group.

 --Add a second file to the filegroup
ALTER DATABASE TestDataFileAllocation ADD FILE (
NAME = TestDataFileAllocation_2
,FILENAME = N'F:\DATA\TestDataFileAllocation_2.ndf'
,SIZE = 1024 MB
,MAXSIZE = UNLIMITED
,FILEGROWTH = 512 MB
) TO FILEGROUP [PRIMARY];
GO

Look at the allocation:

SecondFile

We now see the new file with 8 pages. The only reason we are still showing table data with this new file is because the DMVs don’t tell us which file the table resides on, only which FILEGROUP. However, since we only have 1 table with no changes, we know that it still all resides on the first file.

The suggestion was to drop and recreate the clustered index to move the table. Well, let’s see what that does:

--Drop the clustered index
ALTER TABLE SomeData
DROP CONSTRAINT PK_idCol;
GO
--Recreate the same clustered index
ALTER TABLE SomeData ADD CONSTRAINT PK_idCol PRIMARY KEY CLUSTERED (idCol);
GO

Results:
SecondFileRebuildCI

Hmm… Well, some of the data moved to the new file. But how much? Well… There are several more technical ways to come to this, but to keep it simple I will just look at the starting. Before we had any data in the table, the first file showed 328 allocated pages. Before we had any data in the second file, it had 8 allocated pages. After the drop and recreate on the clustered index, the first file now has 336 pages allocated while the second has 968. We can see that not all of the data has moved off of the first file, but a majority did. According to Paul Randall:

SQL Server’s allocation system uses a round-robin mechanism where allocations are spread over all the data files in a filegroup – an allocation from file 1, then an allocation from file 2, and so on, then back to file 1 again. Combined with that, it also uses an algorithm called proportional fill that aims to allocate data from the files according to how much free space they have relative to the other files in the filegroup.
The basic premise of proportional fill is that each file has a weighting, where files with less free space will have a higher weighting, and files with lots of free space will have a low weighting. The files with the lower weightings will be allocated from more frequently – i.e. those files with more free space will be allocated from the most.
This means that if you add a new data file to a filegroup that has mostly full data files, the proportional fill weightings will be such that the new file will be the one where allocations come from until it fills up to the same level as the older files. The new file in essence becomes an allocation hot spot.

In closing, you cannot decide how the data moves between files within the same filegroup. SQL Server uses its own algorithm for doing this and it is out of our control. However, what if you decided that having 2 files was hurting the performance or was too hard to manage? Can you somehow get rid of 1 of them? Yes, you can… but ONLY if the file you are trying to remove is not the first file created when the database was created.

--Empty file 1 and remove it
CHECKPOINT
GO

DBCC SHRINKFILE (TestDataFileAllocation, EMPTYFILE);
GO

ALTER DATABASE TestDataFileAllocation REMOVE FILE TestDataFileAllocation;
GO 

This will result in an error:

EmptyFileError

The only thing we can do in this case is move our first file to a larger drive (if that was the issue to begin with, who knows) and move everyting from file 2 to the original after growing the file to accomodate the data.

--Grow our first file and move everything there to get rid of second file
USE master;
GO
ALTER DATABASE TestDataFileAllocation
MODIFY FILE
(NAME = TestDataFileAllocation,
SIZE = 1024MB);
GO

USE TestDataFileAllocation;
GO

CHECKPOINT;
GO

DBCC SHRINKFILE (TestDataFileAllocation_2, EMPTYFILE);
GO

ALTER DATABASE TestDataFileAllocation REMOVE FILE TestDataFileAllocation_2;
GO

Just double-checking with our allocation query:

FinalFile

Tada!

 

 

 

Leave a Reply