Adding a large amount of storage (500 TB) to a single VM


What is the best way to add a large amount of storage to a single Windows host in VMware vSphere 7? Say 500 TB? In the past I've made a bunch of VMFS datastores and split them up per drive in Windows: one 50 TB VMFS datastore and one 50 TB Windows drive, times 10. However, a new application would work better if that were all combined into one drive. I know I can combine them in Windows, but that seems like an inefficient mess.

paladin
You build a super NAS server: 25x 20TB SAS drives, attached to multiple SAS controllers on a high-performance server mainboard in a big chassis such as the _SuperMicro SuperChassis 847BE2C-R1K23WB_. Don't forget to buy an external tape drive or tape server for backups (at least LTO-9!). The NAS server will combine all the drives for you and serve them over the network, and the share can be mapped as a network drive in Windows. I recommend using BTRFS as the filesystem for the super NAS.
@paladin I have the storage already, I just need to add it all to a single VM

I don’t know how seriously to take this question given the massive budget you obviously have combined with the lack of experience your question indicates, but I will answer as if you’re serious.

As far as I know, a single virtual hard drive can be at most 62 TB in size.
You’d need to set up a software RAID in Windows consisting of multiple smaller virtual disks. This would quickly get complex in terms of maintenance.
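
As a rough sizing check, the number of virtual disks follows from a ceiling division. The sketch below assumes the 62 TB per-disk limit mentioned above and the 500 TB target from the question; the function name disks_needed is made up for illustration.

```shell
#!/bin/sh
# Sizing sketch: how many virtual disks of at most LIMIT_TB are needed
# to reach TARGET_TB. Both numbers come from the discussion above.

# disks_needed TARGET_TB LIMIT_TB -> prints ceil(TARGET_TB / LIMIT_TB)
disks_needed() {
    target_tb=$1
    limit_tb=$2
    # Integer ceiling division: (a + b - 1) / b
    echo $(( (target_tb + limit_tb - 1) / limit_tb ))
}

disks_needed 500 62
```

With these numbers the script prints 9, so Windows would have to join at least nine virtual disks into one volume.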

You would probably be better off looking at scalable open source or proprietary storage solutions - and don’t forget to budget for backups and disaster recovery.

At that point you really also should discuss the software design with your developers: how will this storage be used? Can it be broken down into smaller chunks? Does all of it need to be accessible with identical priority, or can the storage be split up into high-performance and lower-cost pools respectively? Is hosting the solution with a cloud provider an option? And so on.


The best option would be to combine all those filesystems into a single filesystem and mount that single filesystem in Windows. Building some kind of network JBOD is not very performance-friendly and has many disadvantages. If you really insist on doing so, I would suggest the following.

Configure each of your ten 50TB-NAS systems to serve their individual combined disk space as an iSCSI device to some kind of MasterNAS server.

The MasterNAS server (should be a Linux system) should then combine all 10 iSCSI devices as a btrfs super filesystem.

You'll need to run sudo apt install open-iscsi btrfs-progs (on an Ubuntu/Debian server) for this.

  1. You configure all 10 of your NAS servers to serve their entire disk space as a combined iSCSI device. Example: your NAS servers use a hardware RAID controller which maps all the disk drives to 1 logical drive. This 1 logical drive is exported by an iSCSI target.

  2. You use your new, performant, and independent MasterNAS server, which "collects/mounts" all 10 iSCSI targets from all 10 NAS servers. I suggest this MasterNAS runs a modern Linux and uses btrfs as the filesystem for the iSCSI NAS drives. You then just create a super btrfs filesystem in single mode (which is essentially a kind of JBOD device where data is distributed across the drives at the filesystem level). You can do this like so:

sudo mkfs.btrfs --data single /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj (this combines all 10 drives into a JBOD-like linear btrfs filesystem)

You can mount that super btrfs filesystem just by using any of its devices, like so: mount /dev/sda /mnt.
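
Before the mkfs.btrfs step can work, the MasterNAS has to log in to the ten iSCSI targets. The following is only a dry-run sketch that prints the open-iscsi commands instead of running them. The portal addresses are hypothetical placeholders from the 192.0.2.0/24 documentation range, and the function name print_iscsi_attach is made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the open-iscsi commands that
# would attach one target per NAS. The portal addresses are hypothetical
# placeholders; replace them with the real addresses of the NAS servers.

# print_iscsi_attach FIRST LAST -> prints discovery and login commands
# for the portals 192.0.2.FIRST .. 192.0.2.LAST
print_iscsi_attach() {
    i=$1
    while [ "$i" -le "$2" ]; do
        portal="192.0.2.$i"
        echo "iscsiadm -m discovery -t sendtargets -p $portal"
        echo "iscsiadm -m node -p $portal --login"
        i=$((i + 1))
    done
}

print_iscsi_attach 11 20
```

After a real login, the new block devices will not necessarily be named /dev/sda to /dev/sdj (on many systems /dev/sda is the system disk), so it is worth checking with lsblk which devices are the iSCSI disks before running mkfs.btrfs, which destroys any existing data on them.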

The advantage of a JBOD-like btrfs device is that if one of the NAS servers goes down, your filesystem won't crash; you'll "just lose" the files from the offline NAS server. (OK, it will crash with default settings, but data recovery is much easier because the distribution of files across the drives happens at the filesystem level with btrfs.)

After doing all this, you install a Samba server on your MasterNAS and share this super btrfs filesystem with your Windows VM.
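
A minimal sketch of the Samba side, assuming the filesystem is mounted at /mnt as in the mount example above. The share name storage and the function name print_share are made up for illustration, and the stanza only covers the basics (no user setup or tuning).

```shell
#!/bin/sh
# Sketch: print a minimal smb.conf share stanza for the btrfs mount point.
# The share name is a hypothetical placeholder.

# print_share NAME PATH -> prints a share section for smb.conf
print_share() {
    share_name=$1
    share_path=$2
    cat <<EOF
[$share_name]
   path = $share_path
   read only = no
   browseable = yes
EOF
}

print_share storage /mnt
```

The output would be appended to /etc/samba/smb.conf on the MasterNAS and can be checked with testparm before restarting Samba. The Windows VM would then map the share as a network drive.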

PS: There is also a free btrfs driver for Windows, but I have no experience with it.

Mikael H
Stacking physical and virtual RAIDs like this would become extremely brittle. At this scale I’d say a storage cluster like Ceph would be the way to go if the OP insists on managing the solution in-house. Reconsidering the software design to utilize object storage rather than traditional file systems or network shares could further enhance the resilience and scalability of the solution.

Probably not what you want to hear but have you considered using some form of public cloud storage for this? 500TB is going to take a while to fill, so why pay for it until you know for a fact you need it. Obviously you'd need to pick your cloud and which service but I just wanted you to consider it before you race out and spaff a load of day-one cash up the wall.

