Score:1

AWS i3en.3xlarge really low IOPS


I just launched a new EC2 instance of type i3en.3xlarge. The operating system is Ubuntu. I mounted the NVMe instance store, but every speed test I run comes out incredibly low, at around 7k IOPS. What am I doing wrong?

Here are the steps I took:

1) Check available SSDs with nvme list:

Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     vol012301587a8724842 Amazon Elastic Block Store               1           8.59  GB /   8.59  GB    512   B +  0 B   1.0
/dev/nvme1n1     AWS16AAAC6C7BFAC4972 Amazon EC2 NVMe Instance Storage         1           7.50  TB /   7.50  TB    512   B +  0 B   0
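
Note: the nvme command comes from the nvme-cli package on Ubuntu; if it isn't present on the AMI, something like this installs it (a minimal sketch, assuming an apt-based Ubuntu image):

sudo apt-get update
sudo apt-get install -y nvme-cli   # provides the nvme command
sudo nvme list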

2) Create a new XFS file system on nvme1n1:

sudo mkfs -t xfs /dev/nvme1n1
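
If mkfs complains that mkfs.xfs cannot be found, the XFS userspace tools live in the xfsprogs package on Ubuntu (sketch, assuming apt):

sudo apt-get install -y xfsprogs   # provides mkfs.xfs, which mkfs -t xfs dispatches to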

3) Mount it at /home:

sudo mount /dev/nvme1n1 /home
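
Optionally, to have the mount come back after a reboot, an /etc/fstab entry along these lines is a common approach. This is only a sketch: the UUID placeholder must be replaced with the value blkid prints, and nofail keeps the instance booting even if the ephemeral device is missing.

sudo blkid /dev/nvme1n1                                   # note the filesystem UUID
echo 'UUID=<uuid-from-blkid> /home xfs defaults,nofail 0 2' | sudo tee -a /etc/fstab   # <uuid-from-blkid> is a placeholder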

4) Check df -h:

ubuntu@ip-172-31-35-146:/home$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       7.7G  2.8G  4.9G  37% /
devtmpfs         47G     0   47G   0% /dev
tmpfs            47G     0   47G   0% /dev/shm
tmpfs           9.4G  852K  9.4G   1% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            47G     0   47G   0% /sys/fs/cgroup
/dev/loop0       25M   25M     0 100% /snap/amazon-ssm-agent/4046
/dev/loop3       43M   43M     0 100% /snap/snapd/14066
/dev/loop2       68M   68M     0 100% /snap/lxd/21835
/dev/loop1       56M   56M     0 100% /snap/core18/2284
/dev/loop4       62M   62M     0 100% /snap/core20/1242
/dev/loop6       56M   56M     0 100% /snap/core18/2253
/dev/loop5       44M   44M     0 100% /snap/snapd/14549
/dev/loop7       62M   62M     0 100% /snap/core20/1328
tmpfs           9.4G     0  9.4G   0% /run/user/1000
/dev/nvme1n1    6.9T   49G  6.8T   1% /home

5) Run a random read test with fio:

fio -direct=1 -iodepth=1 -rw=randread -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=iotest -name=Rand_Read_Testing

Fio Results:

fio-3.16
Starting 1 process
Rand_Read_Testing: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [r(1)][100.0%][r=28.5MiB/s][r=7297 IOPS][eta 00m:00s]
Rand_Read_Testing: (groupid=0, jobs=1): err= 0: pid=1701: Sat Jan 29 22:28:17 2022
  read: IOPS=7139, BW=27.9MiB/s (29.2MB/s)(1024MiB/36717msec)
    slat (nsec): min=2301, max=39139, avg=2448.98, stdev=311.68
    clat (usec): min=32, max=677, avg=137.06, stdev=26.98
     lat (usec): min=35, max=680, avg=139.59, stdev=26.99
    clat percentiles (usec):
     |  1.00th=[   35],  5.00th=[   99], 10.00th=[  100], 20.00th=[  124],
     | 30.00th=[  125], 40.00th=[  126], 50.00th=[  139], 60.00th=[  141],
     | 70.00th=[  165], 80.00th=[  167], 90.00th=[  169], 95.00th=[  169],
     | 99.00th=[  172], 99.50th=[  174], 99.90th=[  212], 99.95th=[  281],
     | 99.99th=[  453]
   bw (  KiB/s): min=28040, max=31152, per=99.82%, avg=28506.48, stdev=367.13, samples=73
   iops        : min= 7010, max= 7788, avg=7126.59, stdev=91.80, samples=73
  lat (usec)   : 50=1.29%, 100=9.46%, 250=89.19%, 500=0.06%, 750=0.01%
  cpu          : usr=1.43%, sys=2.94%, ctx=262144, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=27.9MiB/s (29.2MB/s), 27.9MiB/s-27.9MiB/s (29.2MB/s-29.2MB/s), io=1024MiB (1074MB), run=36717-36717msec

Disk stats (read/write):
  nvme1n1: ios=259894/5, merge=0/3, ticks=35404/0, in_queue=35404, util=99.77%

According to benchmarks like the ones linked here, the IOPS performance should be way better.

So am I missing something here?

Thanks in advance

Tim
Hopefully someone can help you. If not, with big instances like that I think you might find AWS Support really useful; developer support isn't particularly expensive, and they can be really helpful.
Score:1

Thanks to @shearn89's response and AWS support, I figured out that the way I ran the fio test was the issue.

Here's what AWS told me:

To begin with, the instance type i3.4xlarge has a listed read/write IOPS of 825k and 360k respectively [1]. This IOPS performance can be obtained using up to 4KB block size and at queue depth saturation.

The volume queue length is the number of pending I/O requests for a device. Optimal queue length varies for each workload, depending on your particular application's sensitivity to IOPS and latency. If your workload is not delivering enough I/O requests to fully use the performance available to your EBS volume, then your volume might not deliver the IOPS or throughput that you have provisioned [2].

To determine the optimal queue length for your workload on SSD-backed volumes, we recommend that you target a queue length of 1 for every 1000 IOPS available [3]. Increasing the queue length is beneficial until you achieve the provisioned IOPS, throughput or optimal system queue length value, which is currently set to 32. For more information on queue depth, please refer to these third-party articles which explain the term in great detail [4][5][6].

To replicate your issue, I launched an instance of the same type and AMI, created a RAID 0 array using the 2 instance store NVMe devices [7], and ran fio with the same parameters you provided. The results are similar to what you achieved:

$ sudo fio -direct=1 -iodepth=1 -rw=randread -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=iotest -name=Rand_Read_Testing
iops        : min= 8820, max= 9196, avg=8905.17, stdev=102.04, samples=58

$ sudo fio -direct=1 -iodepth=1 -rw=randwrite -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=iotest -name=Rand_Read_Testing
iops        : min= 1552, max= 2012, avg=1883.84, stdev=59.06, samples=278

I repeated the test above and was able to reach R/W IOPS of 824k and 460k respectively, by setting the parameters "iodepth=32" and "numjobs=16":

$ sudo fio --directory=/mnt/raid --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G --numjobs=16 --time_based --runtime=180 --group_reporting --norandommap --iodepth=32 --ioengine=libaio
iops        : min=572631, max=910386, avg=824619.49, stdev=3518.58, samples=5745
   
$ sudo fio --directory=/mnt/raid --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G --numjobs=16 --time_based --runtime=180 --group_reporting --norandommap --iodepth=32 --ioengine=libaio
iops        : min=291970, max=509505, avg=360163.50, stdev=2193.22, samples=5760

Please be reminded that the instance store IOPS is also dependent on many factors including the ones already mentioned above, such as I/O type, block size, I/O size, I/O engine, I/O depth, number of files/devices, and number of threads/processes. For more information on how to tune the parameters to optimise performance, please refer to these articles [8][9][10].

Also, an instance store provides temporary storage for your instance, and the data will be lost if the underlying disk fails or if the instance is stopped/terminated [11]. Therefore, if you require persistent data storage, consider a more durable option such as Amazon EBS [12].

I hope you find the information above useful. Please let me know if you have any additional technical questions.

Thank you.
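
In other words, with iodepth=1 and numjobs=1 the benchmark is latency-bound: each 4k read has to complete before the next one is submitted, so IOPS is roughly 1 / average latency. A rough back-of-the-envelope check against my original run (average latency of about 140 µs), purely illustrative:

echo $((1000000 / 140))   # ≈ 7142 I/Os per second at queue depth 1, close to the ~7.1k IOPS measured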

Score:0

So I spun up one of these instances to test for myself. My steps were only a little different:

  1. Partition the disk first using parted (a sketch of these commands follows the list)
  2. Make the filesystem
  3. Mount at /opt, as /home was already there and contained my user's home directory (ubuntu)
  4. apt update && apt upgrade, then install fio
  5. Run the same command as you: fio -direct=1 -iodepth=1 -rw=randread -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=iotest -name=Rand_Read_Testing from within /opt, with sudo.
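
Roughly, steps 1 to 3 were (device name and mount point as above; this is a sketch rather than an exact transcript, and it destroys any data on the device):

sudo parted --script /dev/nvme1n1 mklabel gpt mkpart primary xfs 0% 100%   # GPT label plus one partition spanning the disk
sudo mkfs.xfs /dev/nvme1n1p1
sudo mount /dev/nvme1n1p1 /opt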

I got similar results, with read: IOPS=7147.

I then ran another test:

/opt$ sudo fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=fiotest --filename=testfio --bs=4k --iodepth=64 --size=8G --readwrite=randrw --rwmixread=75
fiotest: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.16
Starting 1 process
fiotest: Laying out IO file (1 file / 8192MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=332MiB/s,w=109MiB/s][r=85.1k,w=28.0k IOPS][eta 00m:00s]
fiotest: (groupid=0, jobs=1): err= 0: pid=26470: Mon Jan 31 09:14:45 2022
  read: IOPS=91.5k, BW=357MiB/s (375MB/s)(6141MiB/17187msec)
   bw (  KiB/s): min=339568, max=509896, per=100.00%, avg=366195.29, stdev=59791.96, samples=34
   iops        : min=84892, max=127474, avg=91548.82, stdev=14947.99, samples=34
  write: IOPS=30.5k, BW=119MiB/s (125MB/s)(2051MiB/17187msec); 0 zone resets
   bw (  KiB/s): min=111264, max=170424, per=100.00%, avg=122280.71, stdev=20225.33, samples=34
   iops        : min=27816, max=42606, avg=30570.18, stdev=5056.32, samples=34
  cpu          : usr=19.73%, sys=41.60%, ctx=742611, majf=0, minf=8
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=1572145,525007,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=357MiB/s (375MB/s), 357MiB/s-357MiB/s (375MB/s-375MB/s), io=6141MiB (6440MB), run=17187-17187msec
  WRITE: bw=119MiB/s (125MB/s), 119MiB/s-119MiB/s (125MB/s-125MB/s), io=2051MiB (2150MB), run=17187-17187msec

Disk stats (read/write):
  nvme1n1: ios=1563986/522310, merge=0/0, ticks=927244/24031, in_queue=951275, util=99.46%

...which looks a lot better - read: IOPS=91.5k.

I suspect it's due to how the read-only test works, or some nuance of reading off the disk the test file sits on, or some other limitation?

I ran my test a couple more times and got similar results each time.

I then ran another read-only test using the command from here, and got this:

/opt$ sudo fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=fiotest --filename=testfio --bs=4k --iodepth=64 --size=8G --readwrite=randread
fiotest: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.16
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=332MiB/s][r=85.1k IOPS][eta 00m:00s]
fiotest: (groupid=0, jobs=1): err= 0: pid=26503: Mon Jan 31 09:17:57 2022
  read: IOPS=88.6k, BW=346MiB/s (363MB/s)(8192MiB/23663msec)
   bw (  KiB/s): min=339560, max=787720, per=100.00%, avg=354565.45, stdev=72963.81, samples=47
   iops        : min=84890, max=196930, avg=88641.40, stdev=18240.94, samples=47
  cpu          : usr=15.37%, sys=31.05%, ctx=844523, majf=0, minf=72
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=2097152,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=346MiB/s (363MB/s), 346MiB/s-346MiB/s (363MB/s-363MB/s), io=8192MiB (8590MB), run=23663-23663msec

Disk stats (read/write):
  nvme1n1: ios=2095751/1, merge=0/0, ticks=1468160/0, in_queue=1468159, util=99.64%

So much better read performance. I suspect the arguments you passed to your command are not allowing the test to get the best performance from the disk, maybe due to block size, file size, etc. I also noticed they were all single-dashed arguments (e.g. -bs=4k) rather than double-dashed (--bs=4k), so they might not even be parsed correctly...
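
For completeness, the same test written with GNU-style double-dash options (identical parameters, just a different flag style) would be:

sudo fio --direct=1 --iodepth=1 --rw=randread --ioengine=libaio --bs=4k --size=1G --numjobs=1 --runtime=1000 --group_reporting --filename=iotest --name=Rand_Read_Testing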

Raphael Noero
Thank you so much for this detailed reply. I think you are right, and this is similar to what AWS support told me.