Hi Nested Community,
I have been running nested ESXi 6.7 under QEMU/KVM on an Ubuntu 18 host for a while now, and it has worked perfectly when the QEMU "e1000" network device is selected. I use the default NAT network that libvirt sets up, where it automatically adds the guests (my ESXi hosts) to the virbr0 bridge. The nested ESXi hosts are able to talk to each other and reach the outside world.
However, now that e1000 has been removed starting with ESXi 7, I tried using vmxnet3, which comes with Ubuntu 18 already compiled and available as a QEMU network device. The nested ESXi hosts boot just fine with it, and I can log in to the ESXi web UI and SSH to them from my Ubuntu 18 host. The weird thing is that they fail or time out at certain tasks, for example:
- Adding a host to vCenter hangs at 80% and never completes, with the error "A general system error occurred: Unable to push signed certificate to host". I am able to add the host that the VCSA runs on, but no other hosts.
- When I deploy a new OVA from a URL via the vCenter GUI, it asks me to verify the SSL thumbprint after I enter the URL, but then hangs and fails with the error "Unable to retrieve manifest or certificate file."
On ESXi 6.7, I simply stop the ESXi hosts, switch back to e1000, then everything works as expected. The problem is e1000 is not supported in ESXi 7, so I am out of luck running nested virt with this version.
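For reference, the model switch can be sketched roughly as below. This is only a minimal sketch, assuming the nested hosts are libvirt domains whose NIC is a standard `<interface>` element with a `<model type='...'/>` child. The domain name `esxi-nested-1`, the sample XML, and the helper `set_nic_model` are made up for illustration and are not my exact setup.

```python
import xml.etree.ElementTree as ET


def set_nic_model(domain_xml: str, model: str) -> str:
    """Return the domain XML with every NIC's <model type=...> set to `model`.

    Hypothetical helper: it only rewrites the XML text and does not talk
    to libvirt itself.
    """
    root = ET.fromstring(domain_xml)
    for iface in root.findall("./devices/interface"):
        model_el = iface.find("model")
        if model_el is None:
            # No explicit model on this interface, so add one.
            model_el = ET.SubElement(iface, "model")
        model_el.set("type", model)
    return ET.tostring(root, encoding="unicode")


# Illustrative domain definition (assumed, heavily trimmed).
SAMPLE = """<domain type='kvm'>
  <name>esxi-nested-1</name>
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='vmxnet3'/>
    </interface>
  </devices>
</domain>"""

if __name__ == "__main__":
    # Switch the sample guest from vmxnet3 back to e1000.
    print(set_nic_model(SAMPLE, "e1000"))
```

With the guest shut down, the input would come from `virsh dumpxml <domain>` and the rewritten XML would go back in through `virsh define <file>`.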
Has anyone else come across this issue before?
I tried coming up with a few workarounds, but QEMU is limited in what network cards it can emulate. Please let me know if you have any ideas!
Thanks!