I have exported a single disk as an iSCSI target using targetcli on a Linux host. On a remote machine I have connected to the device and created a ZFS pool on it (as a replication target with zrepl).
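For reference, the target was set up along these lines (reconstructed from the targetcli ls output below, so the exact invocations may have differed slightly):

/> /backstores/block create ar0 /dev/disk/by-id/ata-ST22000NT001-3LS101_ZX20XTYG
/> /iscsi create iqn.2023-07.com.example.host:sn.zx20xtyg
/> /iscsi/iqn.2023-07.com.example.host:sn.zx20xtyg/tpg1/luns create /backstores/block/ar0
/> /iscsi/iqn.2023-07.com.example.host:sn.zx20xtyg/tpg1/acls create iqn.2023-07.com.example.client:783af16f240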
However, any time I reboot the iSCSI target machine, the initiator machine running ZFS fails with write errors. After doing some research, the type of failure it reports only occurs when ZFS sends a write to a device, is told the write completed successfully, but the write is then lost.
This means the iSCSI target must be caching writes and returning success before they are actually written to the disk, and when the machine is rebooted those cached writes don't always get flushed.
Is there a way to use targetcli to disable this write caching, so that writes are not returned as successful until they have actually been written to the physical disk?
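The only related setting I have found so far is the emulate_write_cache backstore attribute, which I think is what the "write-thru activated" in the ls output below reflects, i.e. something along the lines of:

/> /backstores/block/ar0 get attribute emulate_write_cache
/> /backstores/block/ar0 set attribute emulate_write_cache=0

but if the backstore is already write-through, I don't understand what else could be caching the writes, so I may be looking at the wrong knob.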
I tried exporting the physical disk as a pscsi backstore instead, but that just makes the kernel on the iSCSI target machine crash as soon as the disk is accessed. (There are no VMs in use, just two native Linux servers, so no unusual SCSI commands should be involved.)
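That attempt was roughly the following (with /dev/sdb standing in for the disk's actual device node), followed by mapping it to a LUN the same way as the block backstore:

/> /backstores/pscsi create ar0 /dev/sdb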
/> ls
o- / .............................................................................................. [...]
o- backstores ................................................................................... [...]
| o- block ....................................................................... [Storage Objects: 1]
| | o- ar0 .......... [/dev/disk/by-id/ata-ST22000NT001-3LS101_ZX20XTYG (20.0TiB) write-thru activated]
| | o- alua ........................................................................ [ALUA Groups: 1]
| | o- default_tg_pt_gp ............................................ [ALUA state: Active/optimized]
| o- fileio ...................................................................... [Storage Objects: 0]
| o- pscsi ....................................................................... [Storage Objects: 0]
| o- ramdisk ..................................................................... [Storage Objects: 0]
o- iscsi ................................................................................. [Targets: 1]
| o- iqn.2023-07.com.example.host:sn.zx20xtyg ............................................... [TPGs: 1]
| o- tpg1 .................................................................... [no-gen-acls, no-auth]
| o- acls ............................................................................... [ACLs: 1]
| | o- iqn.2023-07.com.example.client:783af16f240 ................................ [Mapped LUNs: 1]
| | o- mapped_lun0 ........................................................ [lun0 block/ar0 (rw)]
| o- luns ............................................................................... [LUNs: 1]
| | o- lun0 ..... [block/ar0 (/dev/disk/by-id/ata-ST22000NT001-3LS101_ZX20XTYG) (default_tg_pt_gp)]
| o- portals ......................................................................... [Portals: 1]
| o- 0.0.0.0:3260 .......................................................................... [OK]
o- loopback .............................................................................. [Targets: 0]
o- vhost ................................................................................. [Targets: 0]
o- xen-pvscsi ............................................................................ [Targets: 0]
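This is what zpool status shows on the initiator after the target machine has rebooted: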
pool: iscsi
state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-JQ
config:
NAME                                      STATE     READ WRITE CKSUM
iscsi                                     ONLINE       0     0     0
  wwn-0x60014050b85f5b1d12d4875bdb68f41a  ONLINE       3   153     0
errors: List of errors unavailable: pool I/O is currently suspended
$ zpool clear iscsi
$ zpool status -v
pool: iscsi
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
scan: resilvered 360K in 00:00:02 with 0 errors on Fri Jul 28 12:58:10 2023
config:
NAME                                      STATE     READ WRITE CKSUM
iscsi                                     ONLINE       0     0     0
  wwn-0x60014050b85f5b1d12d4875bdb68f41a  ONLINE       0     0     0
errors: Permanent errors have been detected in the following files:
<metadata>:<0x0>
<metadata>:<0x101>
<metadata>:<0x104>
<metadata>:<0x105>
<metadata>:<0x33>
<metadata>:<0x34>
<metadata>:<0x3d>
<metadata>:<0xe8d>
<metadata>:<0x1e1f9>
<metadata>:<0x1e1fb>