Score:0

VMware virtualized e1000e: 2nd card failing with "Tx Unit Hang"

Ever since I upgraded my only Linux VM with more than one network adapter to Debian 11, its second adapter has been unable to do anything.

The config is fairly straightforward: two VMware network adapters, each using an emulated E1000E. Trying a simple:

   ping 10.0.26.5

where 10.0.26.201 is the VM's IP and 10.0.26.5 is the host's, fails. Checking dmesg shows that the problem is a low-level one:

[ 5112.037590] e1000 0000:02:02.0 ens34: Detected Tx Unit Hang
                 Tx Queue             <0>
                 TDH                  <0>
                 TDT                  <1>
                 next_to_use          <1>
                 next_to_clean        <0>
               buffer_info[next_to_clean]
                 time_stamp           <10012595a>
                 next_to_watch        <0>
                 jiffies              <100125b50>
                 next_to_watch.status <0>
[ 5113.895573] e1000 0000:02:02.0 ens34: Detected Tx Unit Hang
                 Tx Queue             <0>
                 TDH                  <0>
                 TDT                  <1>
                 next_to_use          <1>
                 next_to_clean        <0>
               buffer_info[next_to_clean]
                 time_stamp           <10012595a>
                 next_to_watch        <0>
                 jiffies              <100125d20>
                 next_to_watch.status <0>

This repeats roughly every 2 seconds, and only for ens34, not ens32, even though the two are identical software-emulated e1000e cards. The issue persists across reboots and network resets, and after changing various aspects of the card's configuration: setting the speed to 100 Mbit, disabling various offloading features, and re-creating the card in VMware. I have never seen anything like this before, where the exact same driver works perfectly for one card and does nothing for the other.
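To double-check that the hangs really are confined to one interface, the kernel log can be tallied per interface. This is only a minimal sketch: the function name count_tx_hangs is made up for illustration, and it assumes the log lines look like the ones quoted above (interface name with a trailing colon, immediately followed by "Detected Tx Unit Hang").

```shell
# Tally "Detected Tx Unit Hang" messages per interface.
# Reads kernel log text on stdin and prints "<interface> <count>" lines.
# count_tx_hangs is an illustrative name, not an existing tool.
count_tx_hangs() {
    awk '/Detected Tx Unit Hang/ {
        # The interface name is the field just before "Detected",
        # e.g. "ens34:"; locate it regardless of timestamp spacing.
        for (i = 2; i <= NF; i++) {
            if ($i == "Detected") {
                iface = $(i - 1)
                sub(/:$/, "", iface)
                hangs[iface]++
            }
        }
    }
    END { for (iface in hangs) print iface, hangs[iface] }' | sort
}
```

It would be run as `dmesg | count_tx_hangs`. An interface that never hangs simply does not appear in the output.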

Most googling seems to turn up old issues with actual physical versions of this chip that were resolved long ago.

Since everything is emulated, actual hardware issues can be ruled out. (Besides, ens34 happens to be mapped to the same port I use to access VMware remotely, and I can still connect.)
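The log lines above are prefixed with e1000 rather than e1000e, so it may be worth confirming which kernel driver each interface is actually bound to. The sketch below resolves the sysfs "driver" symlink for an interface. The function name iface_driver and its optional second argument (an alternative sysfs root, useful only for testing) are assumptions made for this example.

```shell
# Print the kernel driver bound to a network interface by resolving
# /sys/class/net/<iface>/device/driver.
# $1: interface name, e.g. ens34
# $2: sysfs root (hypothetical convenience for testing; defaults to /sys)
# Returns non-zero if the interface has no driver link (e.g. lo).
iface_driver() {
    link="${2:-/sys}/class/net/$1/device/driver"
    [ -L "$link" ] || return 1
    # The link target ends in the driver name, e.g. .../drivers/e1000
    basename "$(readlink "$link")"
}
```

Calling `iface_driver ens32` and `iface_driver ens34` in turn should show whether both adapters really use the same driver.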

Raising the debug level a bit gives a very verbose dump. Condensed:


[ 5625.646265] e1000: Register dump
[ 5625.646270] e1000: CTRL             00c00249
[ 5625.646272] e1000: STATUS           0000cb83
[ 5625.646273] e1000: RCTL             00008002
[ 5625.646275] e1000: RDLEN            00001000
[ 5625.646277] e1000: RDH              00000001
[ 5625.646278] e1000: RDT              000000fe
[ 5625.646280] e1000: RDTR             00000000
[ 5625.646281] e1000: TCTL             0103f0fa
[ 5625.646283] e1000: TDBAL            35d13000
[ 5625.646284] e1000: TDBAH            00000000
[ 5625.646285] e1000: TDLEN            00001000
[ 5625.646286] e1000: TDH              00000000
[ 5625.646288] e1000: TDT              00000001
[ 5625.646289] e1000: TIDV             00000008
[ 5625.646290] e1000: TXDCTL           01010000
[ 5625.646292] e1000: TADV             00000020
[ 5625.646293] e1000: TARC0            00000000
[ 5625.646294] e1000: TDBAL1           00000000
[ 5625.646296] e1000: TDBAH1           00000000
[ 5625.646297] e1000: TDLEN1           00000000
[ 5625.646298] e1000: TDH1             00000000
[ 5625.646300] e1000: TDT1             00000000
[ 5625.646301] e1000: TXDCTL1          00000000
[ 5625.646302] e1000: TARC1            00000000
[ 5625.646304] e1000: CTRL_EXT         00000cc0
[ 5625.646305] e1000: ERT              00000000
[ 5625.646306] e1000: RDBAL            35d14000
[ 5625.646308] e1000: RDBAH            00000000
[ 5625.646309] e1000: TDFH             00000000
[ 5625.646310] e1000: TDFT             00000000
[ 5625.646312] e1000: TDFHS            00000000
[ 5625.646313] e1000: TDFTS            00000000
[ 5625.646314] e1000: TDFPC            00000000
[ 5625.646315] e1000: RDFH             00000000
[ 5625.646317] e1000: RDFT             00000000
[ 5625.646318] e1000: RDFHS            00000000
[ 5625.646319] e1000: RDFTS            00000000
[ 5625.646320] e1000: RDFPC            00000000


[ 5625.646322] e1000: TX Desc ring0 dump
[ 5625.646324] e1000: Tc[desc]     [Ce CoCsIpceCoS] [MssHlRSCm0Plen] [bi->dma       ] leng  ntw timestmp         bi->skb
[ 5625.646325] e1000: Td[desc]     [address 63:0  ] [VlaPoRSCm1Dlen] [bi->dma       ] leng  ntw timestmp         bi->skb
[ 5625.646337] e1000: Tc[0x000]    00000000BCCFA800 000000008B00005A 00000000BCCFA800 005A    0 0000000100144D57 00000000fde394a8 NTC
[ 5625.646390] e1000: Tc[0x001]    0000000000000000 0000000000000000 0000000000000000 0000    0 0000000000000000 0000000000000000 NTU
[ 5625.646400] e1000: Tc[0x002]    0000000000000000 0000000000000000 0000000000000000 0000    0 0000000000000000 0000000000000000
[ 5625.646407] e1000: Tc[0x003]    0000000000000000 0000000000000000 0000000000000000 0000    0 0000000000000000 0000000000000000

(Remainder is all zeroes) 

              RX Desc ring dump
[ 5625.647449] e1000: R[desc]      [address 63:0  ] [vl er S cks ln] [bi->dma       ] [bi->skb]
[ 5625.647453] e1000: R[0x000]     00000000BCC7B000 00000003000000CA 00000000BCC7B000 00000000973ca48e NTC
[ 5625.647455] e1000: R[0x001]     00000000BCC7B800 0000000000000000 00000000BCC7B800 00000000dfca3cec
[ 5625.647457] e1000: R[0x002]     00000000BCC7C000 0000000000000000 00000000BCC7C000 0000000053a8edb6
[ 5625.647459] e1000: R[0x003]     00000000BCC7C800 0000000000000000 00000000BCC7C800 00000000a22e57ab
[ 5625.647461] e1000: R[0x004]     00000000BCC7D000 0000000000000000 00000000BCC7D000 000000002b2619b7
[ 5625.647463] e1000: R[0x005]     00000000BCC7D800 0000000000000000 00000000BCC7D800 0000000057cbfe1e
[ 5625.647465] e1000: R[0x006]     00000000BCC7E000 0000000000000000 00000000BCC7E000 00000000cc0b4642
[ 5625.647467] e1000: R[0x007]     00000000BCC7E800 0000000000000000 00000000BCC7E800 0000000005926f61
[ 5625.647469] e1000: R[0x008]     00000000BCC7F000 0000000000000000 00000000BCC7F000 0000000041b9d724
[ 5625.647471] e1000: R[0x009]     00000000BCC7F800 0000000000000000 00000000BCC7F800 000000003a463662
[ 5625.647473] e1000: R[0x00A]     00000000BCC80000 0000000000000000 00000000BCC80000 00000000cee2865a
[ 5625.647475] e1000: R[0x00B]     00000000BCC80800 0000000000000000 00000000BCC80800 00000000ce4b0575
[ 5625.647477] e1000: R[0x00C]     00000000BCC81000 0000000000000000 00000000BCC81000 00000000e1972cf2
[ 5625.647479] e1000: R[0x00D]     00000000BCC81800 0000000000000000 00000000BCC81800 000000006a14dab9
[ 5625.647481] e1000: R[0x00E]     00000000BCC82000 0000000000000000 00000000BCC82000 000000004e6d9e5b
[ 5625.647483] e1000: R[0x00F]     00000000BCC82800 0000000000000000 00000000BCC82800 00000000e4f9351b
[ 5625.647485] e1000: R[0x010]     00000000BCC83000 0000000000000000 00000000BCC83000 000000001c33caa3
[ 5625.647487] e1000: R[0x011]     00000000BCC83800 0000000000000000 00000000BCC83800 0000000045f1802c

(etc. stripped due to post size limit).

[ 5625.648320] e1000: R[0x0F0]     00000000BCCF3000 0000000000000000 00000000BCCF3000 0000000003287ca3
[ 5625.648323] e1000: R[0x0F1]     00000000BCCF3800 0000000000000000 00000000BCCF3800 00000000dec9e9d4
[ 5625.648325] e1000: R[0x0F2]     00000000BCCF4000 0000000000000000 00000000BCCF4000 00000000baa78be5
[ 5625.648327] e1000: R[0x0F3]     00000000BCCF4800 0000000000000000 00000000BCCF4800 000000002d8e36e0
[ 5625.648330] e1000: R[0x0F4]     00000000BCCF5000 0000000000000000 00000000BCCF5000 00000000d6828d9d
[ 5625.648332] e1000: R[0x0F5]     00000000BCCF5800 0000000000000000 00000000BCCF5800 000000008fd0876f
[ 5625.648334] e1000: R[0x0F6]     00000000BCCF6000 0000000000000000 00000000BCCF6000 0000000077d739be
[ 5625.648339] e1000: R[0x0F7]     00000000BCCF6800 0000000000000000 00000000BCCF6800 000000008ed1b539
[ 5625.648341] e1000: R[0x0F8]     00000000BCCF7000 0000000000000000 00000000BCCF7000 000000007f314645
[ 5625.648343] e1000: R[0x0F9]     00000000BCCF7800 0000000000000000 00000000BCCF7800 0000000042b3e5b8
[ 5625.648345] e1000: R[0x0FA]     00000000BCCF8000 0000000000000000 00000000BCCF8000 000000008e9183fc
[ 5625.648347] e1000: R[0x0FB]     00000000BCCF8800 0000000000000000 00000000BCCF8800 00000000bd73dcc4
[ 5625.648349] e1000: R[0x0FC]     00000000BCCF9000 0000000000000000 00000000BCCF9000 000000007c655b1e
[ 5625.648351] e1000: R[0x0FD]     00000000BCCF9800 0000000000000000 00000000BCCF9800 00000000f885d1d7
[ 5625.648353] e1000: R[0x0FE]     00000000BCCFA000 0000000000000000 00000000BCCFA000 0000000086bc0e6b
[ 5625.648355] e1000: R[0x0FF]     0000000000000000 0000000000000000 0000000000000000 0000000000000000 NTU

[ 5625.648356] e1000: Rx descriptor cache in 64bit format
[ 5625.648435] e1000: R6000: 00000000|00000000 00000000|00000000
[ 5625.648500] e1000: R6010: 00000000|00000000 00000000|00000000
[ 5625.648563] e1000: R6020: 00000000|00000000 00000000|00000000

(remainder is all zeroes)

[ 5625.652335] e1000: Tx descriptor cache in 64bit format
[ 5625.652397] e1000: T7000: 00000000|00000000 00000000|00000000
[ 5625.652458] e1000: T7010: 00000000|00000000 00000000|00000000
[ 5625.652518] e1000: T7020: 00000000|00000000 00000000|00000000

(remainder is all zeroes)

Score:0

I have since managed to solve this problem with the following, admittedly unsatisfying, "Microsoft solution":

  1. Shut down the virtual machine.
  2. Remove the offending network adapter from the VM.
  3. Boot the VM.
  4. Shut it down again.
  5. Add a new network adapter to the VM. Same type, same name.
  6. Boot the VM.
  7. Check that the new network adapter has the same name in Linux as the previous one. If it doesn't, adjust config files such as /etc/network/interfaces, for example: sed -i 's/ens33/ens34/g' /etc/network/interfaces
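
The check in step 7 can be scripted. The following is a sketch, assuming a Debian-style /etc/network/interfaces that uses "iface <name> ..." stanzas. The function names and the optional directory argument are illustrative only, and files pulled in through "source" lines are not followed.

```shell
# List interface names that have an "iface <name> ..." stanza in the
# given config file, skipping the loopback interface.
configured_ifaces() {
    awk '$1 == "iface" && $2 != "lo" { print $2 }' "$1" | sort -u
}

# Print configured interfaces that the kernel does not currently expose.
# $1: interfaces file
# $2: directory with one entry per present interface
#     (hypothetical parameter for testing; defaults to /sys/class/net)
missing_ifaces() {
    for name in $(configured_ifaces "$1"); do
        [ -e "${2:-/sys/class/net}/$name" ] || echo "$name"
    done
}
```

Running `missing_ifaces /etc/network/interfaces` after step 6 prints nothing if every configured name still exists. Any name it does print is a candidate for the sed adjustment above.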