Score:0

SQL Server - How to split single mdf file into multiple, and partition table data across them?

pl flag

In the hopes of improving SQL Server performance, I'd like to take my simple 1-file (1 mdf & 1 ldf) database and split up the mdf files (and maybe the ldf files) across multiple disk drives. I'm thinking that the most performant way to do this would be to partition the big tables across those multiple mdf files. (I'm planning just to use a partition function which is a modulus of the primary key (which is an int or bigint), like (MyTable.Id % 8), where 8 is the number of disks I have).

What's the fastest/best/correct way to do this? Like, what SQL commands or SSMS operations do I need to do to get me from where I am now to where I want to go.

Any other suggestions would be appreciated too. (Combining the drives via Storage Spaces just doesn't look like an option, because, for some reason, I'm just not getting good write IOPS performance when I do that.)

Score:1
cn flag

You don't have to involve partitioning (but could if you have another reason to). If you create a new filegroup containing the physical files you'd like and rebuild the index on the new filegroup, that should be sufficient to spread the data across those physical files.

The rebuild process will look something like this. Say the current index has the following definition:

create clustered index [CIX_foo] on dbo.foo (FooID) on [PRIMARY];

You could rebuild it like this:

create clustered index [CIX_foo] on dbo.foo (FooID) with (drop_existing = on) on [NewFileGroup];

Use whatever other options you'd normally use in creating that index (i.e. sort_in_tempdb, pad_index, online, resumable, etc). Also consider whether changing the compression setting of the existing index makes sense - you're rewriting the entire index so now would be the time!

Lastly, as a recommendation, I'd create at least two files per drive in the new filegroup. Why? If your SWAG of 8 files (and presumably 8 drives) is not sufficient for your use case, you'll have to go through the above index rebuild process again. But if you do, say two files per drive (for a total of sixteen), you can provision the new storage and merely do either a filesystem file copy (while the db is offline) or a restore specifying the new locations. Either of those should be faster than rebuilding the indexes.

N73k avatar
pl flag
Thanks Ben. Question 1: How would SQL Server know which file a given record is in? Question 2: Would there be some speed advantage in using a partition function like I was saying in my original question? Question 3: I'm guessing that if I want to move indexes (and not just tables) to that new filegroup (and spread the data across the files), I'd have to do the same kind of command for each index.
Ben Thul avatar
cn flag
1. Somewhat beyond the scope of the question. That is, what problem does having that knowledge help you solve? 2. Partitioning in mssql is more of a data management sol'n than a performance one. It allows you to swap data in/out of a table as a metadata operation. The classic case is loading a DW where you'd ETL a day's worth of data into a staging table and then swap that in when you're done. 3. Yes - every heap or B-tree exists on a data space. A file group is a type of data space (the other being a partition scheme). Any index you want in your new FG would need to be moved explicitly.
N73k avatar
pl flag
For Q1: I'm interested in performance. So, if SQL Server wants to upsert some records, will it have to check every drive to see where those records are or should be, or does it have a quick way of knowing (even without a partition function) where that data is (or should go) so that only 1 disk is accessed?
Ben Thul avatar
cn flag
Roughly speaking, the data will be spread evenly across all files in the FG. Addressing your concern, SQL doesn't do scatter/gather to locate the data but rather navigates the B-tree to locate the appropriate data pages (and, by extension, the files that contain those pages).
N73k avatar
pl flag
Well Ben, it's been a while and you still don't have a gold. But not anymore.
mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.