SQL Server - How to split single mdf file into multiple, and partition table data across them?

N73k

6/15/23, 4:25 PM

In the hopes of improving SQL Server performance, I'd like to take my simple 1-file (1 mdf & 1 ldf) database and split up the mdf files (and maybe the ldf files) across multiple disk drives. I'm thinking that the most performant way to do this would be to partition the big tables across those multiple mdf files. (I'm planning just to use a partition function which is a modulus of the primary key (which is an int or bigint), like (MyTable.Id % 8), where 8 is the number of disks I have).

What's the fastest/best/correct way to do this? Like, what SQL commands or SSMS operations do I need to do to get me from where I am now to where I want to go.

Any other suggestions would be appreciated too. (Combining the drives via Storage Spaces just doesn't look like an option, because, for some reason, I'm just not getting good write IOPS performance when I do that.)

0 + 0

table-partitioning

sql-server

Score:1

Server

Ben Thul

6/15/23, 8:51 PM

You don't have to involve partitioning (but could if you have another reason to). If you create a new filegroup containing the physical files you'd like and rebuild the index on the new filegroup, that should be sufficient to spread the data across those physical files.

The rebuild process will look something like this. Say the current index has the following definition:

create clustered index [CIX_foo] on dbo.foo (FooID) on [PRIMARY];

You could rebuild it like this:

create clustered index [CIX_foo] on dbo.foo (FooID) with (drop_existing = on) on [NewFileGroup];

Use whatever other options you'd normally use in creating that index (i.e. sort_in_tempdb, pad_index, online, resumable, etc). Also consider whether changing the compression setting of the existing index makes sense - you're rewriting the entire index so now would be the time!

Lastly, as a recommendation, I'd create at least two files per drive in the new filegroup. Why? If your SWAG of 8 files (and presumably 8 drives) is not sufficient for your use case, you'll have to go through the above index rebuild process again. But if you do, say two files per drive (for a total of sixteen), you can provision the new storage and merely do either a filesystem file copy (while the db is offline) or a restore specifying the new locations. Either of those should be faster than rebuilding the indexes.

0 + 0

N73k

6/16/23, 3:06 PM

Thanks Ben. Question 1: How would SQL Server know which file a given record is in? Question 2: Would there be some speed advantage in using a partition function like I was saying in my original question? Question 3: I'm guessing that if I want to move indexes (and not just tables) to that new filegroup (and spread the data across the files), I'd have to do the same kind of command for each index.

Ben Thul

6/16/23, 6:38 PM

1. Somewhat beyond the scope of the question. That is, what problem does having that knowledge help you solve? 2. Partitioning in mssql is more of a data management sol'n than a performance one. It allows you to swap data in/out of a table as a metadata operation. The classic case is loading a DW where you'd ETL a day's worth of data into a staging table and then swap that in when you're done. 3. Yes - every heap or B-tree exists on a data space. A file group is a type of data space (the other being a partition scheme). Any index you want in your new FG would need to be moved explicitly.

N73k

6/16/23, 7:05 PM

For Q1: I'm interested in performance. So, if SQL Server wants to upsert some records, will it have to check every drive to see where those records are or should be, or does it have a quick way of knowing (even without a partition function) where that data is (or should go) so that only 1 disk is accessed?

Ben Thul

6/16/23, 10:44 PM

Roughly speaking, the data will be spread evenly across all files in the FG. Addressing your concern, SQL doesn't do scatter/gather to locate the data but rather navigates the B-tree to locate the appropriate data pages (and, by extension, the files that contain those pages).

N73k

6/17/23, 3:40 PM

Well Ben, it's been a while and you still don't have a gold. But not anymore.

Elon Musk

I sit in a Tesla and translated this thread with Ai:

EN: SQL Server - How to split single mdf file into multiple, and partition table data across them?

TH: SQL Server - วิธีแยกไฟล์ mdf เดียวออกเป็นหลาย ๆ ไฟล์และข้อมูลตารางพาร์ติชันระหว่างกัน

RO: SQL Server - Cum să împărțiți un singur fișier mdf în mai multe și să partițiți datele din tabelul peste ele?

RU: SQL Server - Как разбить один файл mdf на несколько и разделить данные таблицы между ними?

VI: SQL Server - Làm cách nào để chia một tệp mdf thành nhiều và dữ liệu bảng phân vùng trên chúng?

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.