
Infiniband fabric with 3 nodes - newbie


I am trying to connect 3 HP z840 workstations using:

Mellanox ConnectX-3 VPI 40 / 56GbE Dual-Port QSFP Adapter MCX354A-FCBT
Mellanox SX6005 12-port Non-blocking Unmanaged 56Gb/s

Description of machines to be connected:

oak-rd0-linux (main node where I will run things from and where opensm is running)
oak-rd1-linux
oak-rd2-linux

I have installed the latest firmware on the cards and the latest MLNX_OFED driver that supports my cards (MLNX_OFED_LINUX-4.9-4.1.7.0-ubuntu20.04-x86_64). I am running Ubuntu 20.04 (kernel 5.4.0-26-generic, as required by the MLNX_OFED driver).

How I installed MLNX_OFED (the deb line below is the content I added to mlnx_ofed.list in nano):

sudo touch /etc/apt/sources.list.d/mlnx_ofed.list
sudo nano /etc/apt/sources.list.d/mlnx_ofed.list
deb file:/home/user/infiniband/MLNX_OFED_LINUX-4.9-4.1.7.0-ubuntu20.04-x86_64/DEBS/UPSTREAM_LIBS ./
wget -qO - http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox | sudo apt-key add -
apt-key list
sudo apt-get update
sudo apt-get install mlnx-ofed-all

I also downloaded HPC-X (hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1).

I start opensm in daemon mode with:

/etc/init.d/opensmd start

I run sudo ibdiagnet and it yields a clean summary (NOTE: I cannot run ibdiagnet without sudo):

Running: ibdiagnet -r
----------
Load Plugins from:
/usr/share/ibdiagnet2.1.1/plugins/
(You can specify more paths to be looked in with "IBDIAGNET_PLUGINS_PATH" env variable)

Plugin Name Result Comment
libibdiagnet_cable_diag_plugin-2.1.1 Succeeded Plugin loaded
libibdiagnet_phy_diag_plugin-2.1.1 Succeeded Plugin loaded

---------------------------------------------
Discovery
-I- Discovering ... 4 nodes (1 Switches & 3 CA-s) discovered.
-I- Fabric Discover finished successfully

-I- Discovered 4 nodes (1 Switches & 3 CA-s).

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- VS Capability GMP finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- VS Capability SMP finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- VS ExtendedPortInfo finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Port Info Extended finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Switch Info retrieving finished successfully

-I- Duplicated GUIDs detection finished successfully

-I- Duplicated Node Description detection finished successfully

---------------------------------------------
Lids Check
-I- Lids Check finished successfully

---------------------------------------------
Links Check
-I- Links Check finished successfully

---------------------------------------------
Subnet Manager
-I- SM Info retrieving finished successfully

-I- Subnet Manager Check finished successfully

---------------------------------------------
Port Counters
-I- Retrieving PMClassPortInfo ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Retrieving PMPortSampleControl ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Ports counters retrieving finished successfully

-I- Going to sleep for 1 seconds until next counters sample
-I- Time left to sleep ... 1 seconds.

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Ports counters retrieving (second time) finished successfully

-I- Ports counters value Check finished successfully

-I- Ports counters Difference Check (during run) finished successfully

---------------------------------------------
Nodes Information
-I- Devid: 4099(0x1003), PSID: MT_1090120019, Latest FW Version:2.42.5000
-I- Devid: 51000(0xc738), PSID: EMC1260110021, Latest FW Version:9.3.8000
-I- FW Check finished successfully

---------------------------------------------
Speed / Width checks
-I- Link Speed Check (Compare to supported link speed)
-I- Links Speed Check finished successfully

-I- Link Width Check (Compare to supported link width)
-I- Links Width Check finished successfully

---------------------------------------------
Alias GUIDs
-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Alias GUIDs retrieving finished successfully

-I- Alias GUIDs finished successfully

---------------------------------------------
Virtualization
-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Virtualization finished successfully

-I- Virtual ports retrieving finished successfully

-I- Virtual ports retrieving finished successfully

---------------------------------------------
Partition Keys
-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Partition Keys retrieving finished successfully

-I- Partition Keys finished successfully

---------------------------------------------
Temperature Sensing
-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Temperature Sensing finished successfully

---------------------------------------------
Routing

-I- EXT switch info retrieving finished successfully

-I- PLFT is enabled on 0 switches.
-I- PLFT data retrieving finished successfully

-I- Adaptive Routing is enabled on 0 switches.
-I- AR data retrieving finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Unicast FDBS Info retrieving finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Multicast FDBS Info retrieving finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Dump SLVL Table finished successfully

-I- Load SLVL file.
---------------------------------------------
Summary
-I- Stage Warnings Errors Comment
-I- Discovery 0 0
-I- Lids Check 0 0
-I- Links Check 0 0
-I- Subnet Manager 0 0
-I- Port Counters 0 0
-I- Nodes Information 0 0
-I- Speed / Width checks 0 0
-I- Alias GUIDs 0 0
-I- Virtualization 0 0
-I- Partition Keys 0 0
-I- Temperature Sensing 0 0
-I- Routing 0 0

-I- You can find detailed errors/warnings in: /var/tmp/ibdiagnet2/ibdiagnet2.log


-I- ibdiagnet database file : /var/tmp/ibdiagnet2/ibdiagnet2.db_csv
-I- LST file : /var/tmp/ibdiagnet2/ibdiagnet2.lst
-I- Network dump file : /var/tmp/ibdiagnet2/ibdiagnet2.net_dump
-I- Subnet Manager file : /var/tmp/ibdiagnet2/ibdiagnet2.sm
-I- Ports Counters file : /var/tmp/ibdiagnet2/ibdiagnet2.pm
-I- Nodes Information file : /var/tmp/ibdiagnet2/ibdiagnet2.nodes_info
-I- Alias guids file : /var/tmp/ibdiagnet2/ibdiagnet2.aguid
-I- VPorts file : /var/tmp/ibdiagnet2/ibdiagnet2.vports
-I- VPorts Pkey file : /var/tmp/ibdiagnet2/ibdiagnet2.vports_pkey
-I- Partition keys file : /var/tmp/ibdiagnet2/ibdiagnet2.pkey
-I- VL2VL file : /var/tmp/ibdiagnet2/ibdiagnet2.vl2vl
-I- PLFT file : /var/tmp/ibdiagnet2/ibdiagnet2.plft
-I- AR file : /var/tmp/ibdiagnet2/ibdiagnet2.ar
-I- Full AR file : /var/tmp/ibdiagnet2/ibdiagnet2.far
-I- Unicast FDBS file : /var/tmp/ibdiagnet2/ibdiagnet2.fdbs
-I- Multicast FDBS file : /var/tmp/ibdiagnet2/ibdiagnet2.mcfdbs
-I- SLVL Table file : /var/tmp/ibdiagnet2/ibdiagnet2.slvl

ibping seems to run fine, although I am not sure whether these are good latency values.

ibstat | egrep "Port|Base"

(base) baird@oak-rd0-linux:~$ ibstat | egrep "Port|Base"
Port 1:
Base lid: 0
Port GUID: 0x0010e00001885689
Port 2:
Base lid: 1
Port GUID: 0x0010e0000188568a

server ( oak-rd0-linux )
ibping -S -P 2 -d (I know that Port2 is the active one)

I can then ibping from host1 and host2 with:
ibping -P 1 1

Host1 ( oak-rd1-linux )
baird@oak-rd1-linux:~$ sudo ibping -P 1 1
Pong from oak-rd0-linux.(none) (Lid 1): time 0.027 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.037 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.042 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.044 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.038 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.029 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.042 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.029 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.038 ms
^C
--- oak-rd0-linux.(none) (Lid 1) ibping statistics ---
9 packets transmitted, 9 received, 0% packet loss, time 8028 ms
rtt min/avg/max = 0.027/0.036/0.044 ms

Host2 ( oak-rd2-linux )
(base) baird@oak-rd2-linux:~$ sudo ibping -P 1 1
Pong from oak-rd0-linux.(none) (Lid 1): time 0.029 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.015 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.041 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.043 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.044 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.037 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.042 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.039 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.038 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.040 ms
^C
--- oak-rd0-linux.(none) (Lid 1) ibping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9055 ms
rtt min/avg/max = 0.015/0.036/0.044 ms

This seems to be working fine. Assuming all is good on the InfiniBand side, here is my problem:

I can run the tests with the Open MPI build that comes with HPC-X when everything runs on the local machine:

mpicc $HPCX_MPI_TESTS_DIR/examples/hello_c.c -o $HPCX_MPI_TESTS_DIR/examples/hello_c
mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_c

Hello, world, I am 1 of 2, (Open MPI v4.0.3rc4, package: Open MPI root@0e5a40994726 Distribution, ident: 4.0.3rc4, repo rev: v4.0.3rc4-6-g8b4a8cd34c, Unreleased developer copy, 148)
Hello, world, I am 0 of 2, (Open MPI v4.0.3rc4, package: Open MPI root@0e5a40994726 Distribution, ident: 4.0.3rc4, repo rev: v4.0.3rc4-6-g8b4a8cd34c, Unreleased developer copy, 148)

However, when I try to run with:

mpirun -x LD_LIBRARY_PATH -np 2 -H oak-rd0-linux,oak-rd1-linux $HPCX_MPI_TESTS_DIR/examples/hello_c

I get no feedback at all: no errors and no output. It seems to hang.

- Can someone please guide me on how to connect to and use the CPUs of my other hosts?
- Which utilities should I use to debug the issue?

I am a complete newbie in this and I would greatly appreciate any help/suggestion etc. I am ready to provide any additional information, test suggestions out etc. Cheers!

This looks like an MPI issue rather than an InfiniBand issue. First check whether mpirun works over the TCP interfaces. Guidance is available in question 7 of the Open MPI TCP FAQ: https://www-lb.open-mpi.org/faq/?category=tcp#tcp-selection. Also confirm that MLNX_OFED 4.9 LTS is what is actually installed and post the output of `ofed_info`.
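
As a rough illustration of the TCP check suggested above, the sketch below only builds and prints an mpirun command that forces Open MPI onto its TCP transport (`--mca pml ob1 --mca btl tcp,self`) and pins it to one Ethernet interface. The interface name `eno1` is an assumption, not something taken from the question; replace it with whatever `ip -br addr` shows on your machines. The helper name `build_tcp_cmd` is hypothetical.

```shell
#!/bin/sh
# Sketch: print an mpirun command that bypasses InfiniBand and uses plain TCP.
# ASSUMPTION: IFACE is the Ethernet interface shared by both hosts (eno1 is a guess).
IFACE="${IFACE:-eno1}"
# Hosts and test program are taken from the question.
HOSTS="${HOSTS:-oak-rd0-linux,oak-rd1-linux}"
PROG="${PROG:-$HPCX_MPI_TESTS_DIR/examples/hello_c}"

# Hypothetical helper: assembles the command line without running it.
build_tcp_cmd() {
    printf '%s\n' "mpirun -x LD_LIBRARY_PATH -np 2 -H $HOSTS --mca pml ob1 --mca btl tcp,self --mca btl_tcp_if_include $IFACE $PROG"
}

# Print the command so it can be inspected before it is run by hand.
build_tcp_cmd
```

If the printed command also hangs when you run it, the problem is probably not in the InfiniBand layer. In that case it is worth checking that `ssh oak-rd1-linux hostname` returns without asking for a password, since mpirun starts its remote processes over ssh.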