Workaround for Multi-NIC vMotion Unicast Flooding in vSphere 5

In my previous article “The Good, The Great and the Gotcha with Multi-NIC vMotion in vSphere 5” I discussed an issue that could cause unicast port flooding. One of my large financial customers has come up with a workaround for this problem. It is unsupported, but it might do the trick until the official fix is available.

Note: This is an UNSUPPORTED workaround, but it appears to do the trick. Test it before use, and use it at your own risk. It is no longer needed once you patch your hosts to vSphere 5.0 U2.

This script should work with vSphere 5.0, 5.0 U1, and 5.1 GA, although it has not yet been tested on 5.1 GA. It is not required for vSphere 5.0 U2.

Create a shell script keepalive.sh on a shared datastore with the following contents:

#### Start keepalive.sh - Creator Justin Turver
a=0
for i in `esxcli network ip connection list | grep vmotionStreamHelper | awk '{print $5}' | grep 8000 | cut -d':' -f 1`;
do
    logger -t "ARPInvalidate" "Removing ARP Entry for $i"
    a=`expr $a + 1`
    logger -t "ARPInvalidate" $a
    vsish -e set /net/tcpip/v4/neighbor del $i
done
logger -t "ARPInvalidate" "Removed $a entries from ARP Cache"
#### End keepalive.sh

Make sure the script is executable by doing a chmod +x after you’ve saved it.
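Before letting the script delete anything, it may help to see which peer addresses its pipeline would pick up. The following is a minimal dry-run sketch of the same grep/awk/cut chain from the script above. The function name list_vmotion_peers is hypothetical, and the sample rows are made up for illustration rather than captured from a real host. It assumes, as the original script does, that field 5 of the connection list is the foreign address in ip:port form.

```shell
# Dry-run sketch: print the peer IPs the keepalive pipeline would select,
# without touching the ARP cache.
# list_vmotion_peers is a hypothetical helper name, not part of ESXi.
list_vmotion_peers() {
    # Reads connection-list text on stdin. Field 5 is assumed to be the
    # foreign address in "ip:port" form, as the original script assumes.
    grep vmotionStreamHelper | awk '{print $5}' | grep 8000 | cut -d':' -f 1
}

# Made-up sample rows for illustration only (not real host output).
sample='tcp 0 0 10.0.0.11:61234 10.0.0.12:8000 ESTABLISHED 1001 vmotionStreamHelper
tcp 0 0 10.0.0.11:22 10.0.0.50:50000 ESTABLISHED 1002 busybox'

# Only the vmotionStreamHelper row with port 8000 survives the filters.
printf '%s\n' "$sample" | list_vmotion_peers
```

On a host you would pipe the real command into the function instead, for example `esxcli network ip connection list | list_vmotion_peers`. Note that `grep 8000` matches the string anywhere in the address, so an IP or port that merely contains 8000 would also pass the filter.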

To ensure that this script runs in cron every minute and persists beyond reboots you will need to make some changes to the startup scripts.

Once you’ve created the .sh script with the lines above, place it on a shared datastore and add a command that runs it to /var/spool/cron/crontabs/root.

For example, you might place the script in /vmfs/volumes/<scratchvolume>/keepalive/keepalive.sh where <scratchvolume> is the location of your ESXi scratch location.

If you want the cron entry to persist across reboots, append something like the following to /etc/rc.local:

/bin/kill $(cat /var/run/crond.pid)
/bin/echo "*    *    *   *   * /vmfs/volumes/<scratchvolume>/keepalive/keepalive.sh 2>&1" >> /var/spool/cron/crontabs/root
/bin/busybox crond

Be sure to replace <scratchvolume> with the actual volume that holds the keepalive script.
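If you run the append by hand while testing, the same cron line can end up in the crontab more than once. As a minimal sketch, the hypothetical helper below (add_cron_entry is not part of ESXi or of Justin's script) appends a line only when an identical one is not already present. It assumes the grep on your host supports the -q, -x and -F options.

```shell
# Sketch: append a cron line only if an identical line is not already there.
# add_cron_entry is a hypothetical helper; the caller supplies both arguments.
add_cron_entry() {
    crontab_file="$1"
    entry="$2"
    # -F treats the entry as a fixed string (the leading "*" is not a regex),
    # -x requires the whole line to match, -q suppresses output.
    if ! grep -qxF -e "$entry" "$crontab_file" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$crontab_file"
    fi
}

# Example call, left commented out so nothing is modified by accident:
# add_cron_entry /var/spool/cron/crontabs/root \
#     "*    *    *   *   * /vmfs/volumes/<scratchvolume>/keepalive/keepalive.sh 2>&1"
```

As with the lines above, crond still has to be stopped and restarted for the change to take effect.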

Final Word

I would like to thank Justin Turver for providing this workaround. Again, please note that this is an unsupported workaround: ordinarily you should not be modifying cron within the ESXi shell, and you should not be modifying the startup scripts. However, until a proper fix for this issue is available, this appears to be one way to use Multi-NIC vMotion and have it work.

This post first appeared on the Long White Virtual Clouds blog at longwhiteclouds.com, by Michael Webster. Copyright © 2012 – IT Solutions 2000 Ltd and Michael Webster. All rights reserved. Not to be reproduced for commercial purposes without written permission.
