
We are having a problem with some of our virtual machines intermittently losing communication with each other, and I’m at a loss as to the source.

We have about 250 VMs running on about 20 HP BL465C blades installed in two HP C7000 chassis, using the HP Virtual Connect interconnect modules. The blade chassis are connected to our core Cisco 6500 switches. The VMware hosts are at version 5.0, and the guest VMs are a mix of Windows 2003, 2008, and 2008R2.

What’s going on is that everything seems to be OK, but then out of nowhere we get communication failures between specific machines. Using ping, it works fine in one direction, but we get an “unreachable” error when going the other way, unless we ping from the target back to the source first.

For example, we have servers “A” and “B”. After pinging from “B” to “A”, we can then ping from “A” to “B”, at least for a while, until the entry falls out of the ARP cache. If we go into server “A” and set a static ARP entry (“arp -s”) for server “B”, everything works OK. Through all this, both server “A” and server “B” have no issues communicating with any other machines.

We tried using vMotion to move the servers to a different host, a different blade chassis, etc. Nothing worked except putting both VMs on the same host. When we moved one of the servers to a different host again, the problem came back.

It seems like either the ARP broadcast from the one server or the reply back from the target isn’t making it through.
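
To pin down when the entry for “B” drops out of the ARP cache on “A”, something like the following could be run on “A” at intervals. This is only a minimal sketch in Python: it assumes the usual English-language layout of Windows `arp -a` output, and the function names and sample addresses are made up for illustration, not taken from our environment.

```python
import re
import subprocess

# Matches one entry row of Windows `arp -a` output, for example:
#   10.0.0.20             00-50-56-aa-bb-cc     dynamic
ROW = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5})\s+"
    r"(\w+)\s*$"
)


def parse_arp_table(text):
    """Return {ip: (mac, entry_type)} parsed from `arp -a` output.

    If the same IP is listed under several interfaces, the last one wins.
    """
    entries = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            ip, mac, entry_type = match.groups()
            entries[ip] = (mac.lower(), entry_type.lower())
    return entries


def current_arp_table():
    """Run `arp -a` on the local machine and parse what it prints."""
    result = subprocess.run(
        ["arp", "-a"], capture_output=True, text=True, check=True
    )
    return parse_arp_table(result.stdout)


# Hypothetical sample output, used here only to show the parsing.
SAMPLE = """
Interface: 10.0.0.10 --- 0xb
  Internet Address      Physical Address      Type
  10.0.0.1              00-11-22-33-44-55     dynamic
  10.0.0.20             00-50-56-aa-bb-cc     static
"""

sample_table = parse_arp_table(SAMPLE)
```

Calling `current_arp_table()` on “A” and checking whether the address of “B” is a key in the result would show whether “A” currently has an entry for “B”, and whether that entry is dynamic or static.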
