I have four brand-new, identical Dell PowerEdge R730s with BCM57406 10G NIC adapters (on the 6.7U1 HCL).
Model : BCM57406
Device Type : Network
Brand Name : DELL
Number of Ports: 2
DID : 16d2
SVID : 14e4
SSID : 4060
VID : 14e4
One of the four servers loads the bnxtnet driver and activates the NIC just fine. The other three do not, and I am stumped. I have checked every BIOS and NIC setting. All firmware is identical, the PCI slots are identical, and ESXi 6.7U1 is installed identically, yet I cannot get three of them past this error.
vmkernel.log from the server that works:
2019-01-31T12:19:13.436Z cpu1:2097664)Loading module bnxtnet ...
2019-01-31T12:19:13.437Z cpu1:2097664)Elf: 2101: module bnxtnet has license BSD
2019-01-31T12:19:13.441Z cpu1:2097664)Device: 192: Registered driver 'bnxtnet' from 22
2019-01-31T12:19:13.441Z cpu1:2097664)Mod: 4962: Initialization of bnxtnet succeeded with module ID 22.
2019-01-31T12:19:13.441Z cpu1:2097664)bnxtnet loaded successfully.
2019-01-31T12:19:13.442Z cpu6:2097620)bnxtnet: bnxtnet_initialize_devname:61: [0000:06:00.0 : 0x4309fd3bfe10] PCI device 16d2:14e4:4060:14e4 detected
2019-01-31T12:19:13.442Z cpu6:2097620)bnxtnet: bnxtnet_dev_probe:1275: [0000:06:00.0 : 0x4309fd3bfe10] Starting Cumulus device probe
2019-01-31T12:19:13.442Z cpu6:2097620)DMA: 679: DMA Engine 'cumulus-0000:06:00.0' created using mapper 'DMANull'.
2019-01-31T12:19:13.442Z cpu6:2097620)DMA: 679: DMA Engine 'cumulus-co-0000:06:00.0' created using mapper 'DMANull'.
2019-01-31T12:19:13.442Z cpu6:2097620)VMK_PCI: 914: device 0000:06:00.0 pciBar 0 bus_addr 0x91c20000 size 0x10000
2019-01-31T12:19:13.442Z cpu6:2097620)bnxtnet: bnxtnet_map_pci_mem:784: [0000:06:00.0 : 0x4309fd3bfe10] mapped pci bar 0 at vaddr 0x450196a40000
2019-01-31T12:19:13.442Z cpu6:2097620)VMK_PCI: 914: device 0000:06:00.0 pciBar 2 bus_addr 0x91c30000 size 0x10000
2019-01-31T12:19:13.442Z cpu6:2097620)bnxtnet: bnxtnet_map_pci_mem:784: [0000:06:00.0 : 0x4309fd3bfe10] mapped pci bar 2 at vaddr 0x450196a60000
2019-01-31T12:19:13.442Z cpu6:2097620)VMK_PCI: 914: device 0000:06:00.0 pciBar 4 bus_addr 0x91dc2000 size 0x2000
2019-01-31T12:19:13.442Z cpu6:2097620)bnxtnet: bnxtnet_map_pci_mem:784: [0000:06:00.0 : 0x4309fd3bfe10] mapped pci bar 4 at vaddr 0x450196468000
2019-01-31T12:19:13.443Z cpu6:2097620)bnxtnet: dev_init_device_info:1113: [0000:06:00.0 : 0x4309fd3bfe10] PHY is AutoGrEEEn capable
2019-01-31T12:19:13.479Z cpu6:2097620)WARNING: bnxtnet: bnxtnet_alloc_mem_probe:933: [0000:06:00.0 : 0x4309fd3bfe10] Disable VXLAN/Geneve RX filter due to firmware bug. Refer to VMware Compatibilit
2019-01-31T12:19:13.479Z cpu6:2097620)bnxtnet: bnxtnet_alloc_intr_resources:899: [0000:06:00.0 : 0x4309fd3bfe10] The intr type set to MSIX
2019-01-31T12:19:13.479Z cpu6:2097620)VMK_PCI: 764: device 0000:06:00.0 allocated 16 MSIX interrupts
2019-01-31T12:19:13.479Z cpu6:2097620)bnxtnet: bnxtnet_dev_probe:1352: [0000:06:00.0 : 0x4309fd3bfe10] Interrupt mode: MSIX, max fastpaths: 16 max roce irqs: 0
2019-01-31T12:19:13.479Z cpu6:2097620)bnxtnet: bnxtnet_dev_probe:1358: [0000:06:00.0 : 0x4309fd3bfe10] Ending successfully cumulus device probe
2019-01-31T12:19:13.479Z cpu6:2097620)bnxtnet: bnxtnet_attach_device:235: [0000:06:00.0 : 0x4309fd3bfe10] Driver successfully attached cumulus device (0x2d544305d9cc7d46) with Chip ID=0x16D2 Rev/Me
2019-01-31T12:19:13.480Z cpu6:2097620)Device: 327: Found driver bnxtnet for device 0x2d544305d9cc7d46
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097666 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097667 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097668 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097669 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097670 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097671 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097672 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097673 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097674 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097675 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097676 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097677 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097678 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097679 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097680 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)CpuSched: 697: user latency of 2097681 netpoll-backup 0 changed by 2097620 vmkdevmgr -6
2019-01-31T12:19:13.480Z cpu6:2097620)bnxtnet: bnxtnet_start_device:389: [0000:06:00.0 : 0x4309fd3bfe10] Driver successfully started cumulus device (0x2d544305d9cc7d46)
2019-01-31T12:19:13.480Z cpu6:2097620)Device: 1466: Registered device: 0x4305d9cc0070 pci#s00000005.00#0 com.vmware.uplink (parent=0x2d544305d9cc7d46)
2019-01-31T12:19:13.480Z cpu6:2097620)bnxtnet: bnxtnet_scan_device:559: [0000:06:00.0 : 0x4309fd3bfe10] Successfully registered uplink device
vmkernel.log from the other three servers that don't work:
2019-01-31T12:18:56.545Z cpu4:2097664)Loading module bnxtnet ...
2019-01-31T12:18:56.546Z cpu4:2097664)Elf: 2101: module bnxtnet has license BSD
2019-01-31T12:18:56.550Z cpu4:2097664)Device: 192: Registered driver 'bnxtnet' from 22
2019-01-31T12:18:56.550Z cpu4:2097664)Mod: 4962: Initialization of bnxtnet succeeded with module ID 22.
2019-01-31T12:18:56.550Z cpu4:2097664)bnxtnet loaded successfully.
2019-01-31T12:18:56.551Z cpu7:2097620)bnxtnet: bnxtnet_initialize_devname:61: [0000:05:00.0 : 0x4309fd3bfe10] PCI device 16d2:14e4:4060:14e4 detected
2019-01-31T12:18:56.552Z cpu7:2097620)bnxtnet: bnxtnet_dev_probe:1275: [0000:05:00.0 : 0x4309fd3bfe10] Starting Cumulus device probe
2019-01-31T12:18:56.552Z cpu7:2097620)DMA: 679: DMA Engine 'cumulus-0000:05:00.0' created using mapper 'DMANull'.
2019-01-31T12:18:56.552Z cpu7:2097620)DMA: 679: DMA Engine 'cumulus-co-0000:05:00.0' created using mapper 'DMANull'.
2019-01-31T12:18:56.552Z cpu7:2097620)VMK_PCI: 914: device 0000:05:00.0 pciBar 0 bus_addr 0x91c20000 size 0x10000
2019-01-31T12:18:56.552Z cpu7:2097620)bnxtnet: bnxtnet_map_pci_mem:784: [0000:05:00.0 : 0x4309fd3bfe10] mapped pci bar 0 at vaddr 0x450196540000
2019-01-31T12:18:56.552Z cpu7:2097620)VMK_PCI: 914: device 0000:05:00.0 pciBar 2 bus_addr 0x91c30000 size 0x10000
2019-01-31T12:18:56.552Z cpu7:2097620)bnxtnet: bnxtnet_map_pci_mem:784: [0000:05:00.0 : 0x4309fd3bfe10] mapped pci bar 2 at vaddr 0x450196560000
2019-01-31T12:18:56.552Z cpu7:2097620)VMK_PCI: 914: device 0000:05:00.0 pciBar 4 bus_addr 0x91c42000 size 0x2000
2019-01-31T12:18:56.552Z cpu7:2097620)bnxtnet: bnxtnet_map_pci_mem:784: [0000:05:00.0 : 0x4309fd3bfe10] mapped pci bar 4 at vaddr 0x450196468000
2019-01-31T12:18:56.552Z cpu7:2097620)bnxtnet: dev_init_device_info:1113: [0000:05:00.0 : 0x4309fd3bfe10] PHY is AutoGrEEEn capable
2019-01-31T12:18:58.068Z cpu7:2097620)WARNING: bnxtnet: hwrm_send_msg:168: [0000:05:00.0 : 0x4309fd3bfe10] HWRM cmd resp_len timeout, cmd_type 0x11(HWRM_FUNC_RESET) seq 5
2019-01-31T12:18:59.583Z cpu7:2097620)WARNING: bnxtnet: hwrm_send_msg:168: [0000:05:00.0 : 0x4309fd3bfe10] HWRM cmd resp_len timeout, cmd_type 0x11(HWRM_FUNC_RESET) seq 6
2019-01-31T12:18:59.583Z cpu7:2097620)DMA: 724: DMA Engine 'cumulus-0000:05:00.0' destroyed.
2019-01-31T12:18:59.583Z cpu7:2097620)DMA: 724: DMA Engine 'cumulus-co-0000:05:00.0' destroyed.
2019-01-31T12:18:59.583Z cpu7:2097620)WARNING: bnxtnet: bnxtnet_attach_device:208: [0000:05:00.0 : 0x4309fd3bfe10] failed to find cumulus device (status: Failure)
2019-01-31T12:18:59.583Z cpu7:2097620)Device: 2628: Module 22 did not claim device 0x1bd34305d9cc7d46.
2019-01-31T12:18:59.584Z cpu7:2097620)bnxtnet: bnxtnet_initialize_devname:61: [0000:05:00.1 : 0x4309fd3bfe10] PCI device 16d2:14e4:4060:14e4 detected
2019-01-31T12:18:59.584Z cpu7:2097620)bnxtnet: bnxtnet_dev_probe:1275: [0000:05:00.1 : 0x4309fd3bfe10] Starting Cumulus device probe
2019-01-31T12:18:59.585Z cpu7:2097620)DMA: 679: DMA Engine 'cumulus-0000:05:00.1' created using mapper 'DMANull'.
2019-01-31T12:18:59.585Z cpu7:2097620)DMA: 679: DMA Engine 'cumulus-co-0000:05:00.1' created using mapper 'DMANull'.
2019-01-31T12:18:59.585Z cpu7:2097620)VMK_PCI: 914: device 0000:05:00.1 pciBar 0 bus_addr 0x91c00000 size 0x10000
2019-01-31T12:18:59.585Z cpu7:2097620)bnxtnet: bnxtnet_map_pci_mem:784: [0000:05:00.1 : 0x4309fd3bfe10] mapped pci bar 0 at vaddr 0x450196500000
2019-01-31T12:18:59.585Z cpu7:2097620)VMK_PCI: 914: device 0000:05:00.1 pciBar 2 bus_addr 0x91c10000 size 0x10000
2019-01-31T12:18:59.585Z cpu7:2097620)bnxtnet: bnxtnet_map_pci_mem:784: [0000:05:00.1 : 0x4309fd3bfe10] mapped pci bar 2 at vaddr 0x450196520000
2019-01-31T12:18:59.585Z cpu7:2097620)VMK_PCI: 914: device 0000:05:00.1 pciBar 4 bus_addr 0x91c40000 size 0x2000
2019-01-31T12:18:59.585Z cpu7:2097620)bnxtnet: bnxtnet_map_pci_mem:784: [0000:05:00.1 : 0x4309fd3bfe10] mapped pci bar 4 at vaddr 0x45019469c000
2019-01-31T12:19:00.090Z cpu7:2097620)WARNING: bnxtnet: hwrm_send_msg:168: [0000:05:00.1 : 0x4309fd3bfe10] HWRM cmd resp_len timeout, cmd_type 0x0(HWRM_VER_GET) seq 0
2019-01-31T12:19:00.090Z cpu7:2097620)DMA: 724: DMA Engine 'cumulus-0000:05:00.1' destroyed.
2019-01-31T12:19:00.090Z cpu7:2097620)DMA: 724: DMA Engine 'cumulus-co-0000:05:00.1' destroyed.
2019-01-31T12:19:00.090Z cpu7:2097620)WARNING: bnxtnet: bnxtnet_attach_device:208: [0000:05:00.1 : 0x4309fd3bfe10] failed to find cumulus device (status: Failure)
2019-01-31T12:19:00.090Z cpu7:2097620)Device: 2628: Module 22 did not claim device 0x602e4305d9cc7eef.
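To make the failure easier to compare across hosts, here is a minimal sketch that pulls only the HWRM timeout warnings out of a log and groups them by PCI address. The regular expression is written against the exact line format shown above and is an assumption about how other bnxtnet builds log the same event; the function name `hwrm_timeouts` is hypothetical.

```python
import re
from collections import defaultdict

# Pattern for the bnxtnet HWRM timeout warnings exactly as they appear in the
# excerpt above (assumption: other driver versions use the same wording).
HWRM_TIMEOUT = re.compile(
    r"WARNING: bnxtnet: hwrm_send_msg:\d+: "
    r"\[(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.\d) : 0x[0-9a-f]+\] "
    r"HWRM cmd resp_len timeout, cmd_type (?P<code>0x[0-9a-f]+)"
    r"\((?P<name>\w+)\) seq (?P<seq>\d+)"
)


def hwrm_timeouts(lines):
    """Return {pci_address: [(command_name, sequence_number), ...]}."""
    by_device = defaultdict(list)
    for line in lines:
        match = HWRM_TIMEOUT.search(line)
        if match:
            by_device[match["pci"]].append((match["name"], int(match["seq"])))
    return dict(by_device)


# The three warning lines from the failing hosts, copied from the log above.
sample = [
    "2019-01-31T12:18:58.068Z cpu7:2097620)WARNING: bnxtnet: hwrm_send_msg:168: "
    "[0000:05:00.0 : 0x4309fd3bfe10] HWRM cmd resp_len timeout, "
    "cmd_type 0x11(HWRM_FUNC_RESET) seq 5",
    "2019-01-31T12:18:59.583Z cpu7:2097620)WARNING: bnxtnet: hwrm_send_msg:168: "
    "[0000:05:00.0 : 0x4309fd3bfe10] HWRM cmd resp_len timeout, "
    "cmd_type 0x11(HWRM_FUNC_RESET) seq 6",
    "2019-01-31T12:19:00.090Z cpu7:2097620)WARNING: bnxtnet: hwrm_send_msg:168: "
    "[0000:05:00.1 : 0x4309fd3bfe10] HWRM cmd resp_len timeout, "
    "cmd_type 0x0(HWRM_VER_GET) seq 0",
]

timeouts = hwrm_timeouts(sample)
for pci, commands in sorted(timeouts.items()):
    print(pci, commands)
```

On this excerpt it shows that port 0 answered the first few commands and then timed out twice on HWRM_FUNC_RESET, while port 1 timed out on the very first command (HWRM_VER_GET, seq 0). To run it against a whole file, pass an open file object instead of `sample`.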
The server with the working NIC is actually running the older driver:
bnxtnet 20.6.101.7-11vmw.670.0.0.8169922 VMW VMwareCertified 2019-01-16
bnxtroce 20.6.101.0-20vmw.670.1.28.10302608 VMW VMwareCertified 2019-01-16
But I have tried both the older and the newest versions on the other three:
bnxtnet 212.0.119.0-1OEM.670.0.0.8169922 BCM VMwareCertified 2019-01-31
bnxtroce 212.0.114.0-1OEM.670.0.0.8169922 BCM VMwareCertified 2019-01-31
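For checking driver versions across all four hosts, a minimal sketch like the following can diff the installed VIB rows. It assumes each row has the five whitespace-separated columns shown above (name, version, vendor, acceptance level, install date); the helper names `parse_vib_line` and `driver_versions` are hypothetical.

```python
def parse_vib_line(line):
    """Split one 'name version vendor acceptance date' row into a dict."""
    name, version, vendor, acceptance, installed = line.split()
    return {
        "name": name,
        "version": version,
        "vendor": vendor,
        "acceptance": acceptance,
        "installed": installed,
    }


def driver_versions(rows):
    """Map each VIB name to its version string."""
    return {vib["name"]: vib["version"] for vib in map(parse_vib_line, rows)}


# Rows copied from the working host (older driver).
working = driver_versions([
    "bnxtnet 20.6.101.7-11vmw.670.0.0.8169922 VMW VMwareCertified 2019-01-16",
    "bnxtroce 20.6.101.0-20vmw.670.1.28.10302608 VMW VMwareCertified 2019-01-16",
])

# Rows copied from the failing hosts (newest driver).
failing = driver_versions([
    "bnxtnet 212.0.119.0-1OEM.670.0.0.8169922 BCM VMwareCertified 2019-01-31",
    "bnxtroce 212.0.114.0-1OEM.670.0.0.8169922 BCM VMwareCertified 2019-01-31",
])

# Report every VIB whose version differs between the two hosts.
mismatched = {
    name: (working[name], failing.get(name))
    for name in working
    if failing.get(name) != working[name]
}
for name, (good, bad) in sorted(mismatched.items()):
    print(f"{name}: working={good} failing={bad}")
```

This only compares the version strings as text. It does not say which driver is correct for the adapter firmware.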
I have swapped NICs between the servers and the results are the same: the working server works with any of the NICs, and the other three servers fail with all of them, so the physical NIC cards are fine.
I don't know whether this is a VMware or a Dell issue.
Any ideas or thoughts on possible causes or other things to try? My next step is to swap the Dell PCI riser to see whether that might be the problem.