Microsoft Has Released the Following Hotfix Regarding This Issue
Please obtain the fix from Microsoft directly, as it seems to change as newer security fixes are made to the OS.
As of May 7, 2013, this issue can be fixed with the hotfix. Please be sure that your Windows 2008 R2 installation is patched to SP1 and that the most recent security updates have been applied.
We have found that the change in the Gratuitous ARP behavior under Windows 2008 prevents some routers (typically Cisco) from recognizing the Smart IP fail-over from one machine to another.
The key symptom is that the IP address fail-over DOES occur: we can ping and access the VIP address from other servers and machines that are on the same network, but across the subnet, over a router, the change appears not to have been detected. It will eventually be detected, but it can take anywhere from minutes to hours before the router becomes aware of the change.
Up to Windows Server 2003, this worked instantly because when a network interface address change happened, the server sent a Gratuitous ARP request so that other devices on the network, most importantly the router, would detect the change. When a router becomes aware of the change, it associates the IP address with the new MAC address, and new traffic is then routed accordingly. This behavior was eliminated in Windows Server 2008, most likely as a security measure.
When Windows Vista or Windows Server 2008 sends a gratuitous ARP (GARP), the SPA (Sender Protocol Address) field in the initial request is set to 0.0.0.0. When Windows Vista or Windows Server 2008 receives such a request, it deliberately does not update its cache with the incorrect 0.0.0.0 information. This way, the ARP or neighbor caches of systems receiving the request are not updated if the IP address turns out to be duplicated.
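To make the SPA difference concrete, the following Python sketch builds (but does not send) two ARP request frames for the same address: a classic gratuitous ARP, where the sender and target protocol addresses are both the announced IP, and a request whose SPA is 0.0.0.0 as described above. The field layout follows the standard ARP packet format; the MAC address is a made-up example value and the function names are illustrative only, not part of any Imorgon or Microsoft code.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Build a raw Ethernet frame carrying an ARP request. Nothing is transmitted."""
    ethernet_header = BROADCAST_MAC + sender_mac + struct.pack("!H", ETHERTYPE_ARP)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # operation: request
        sender_mac,                   # SHA: sender hardware address
        socket.inet_aton(sender_ip),  # SPA: sender protocol address
        b"\x00" * 6,                  # THA: unknown in a request
        socket.inet_aton(target_ip),  # TPA: target protocol address
    )
    return ethernet_header + arp_payload


def sender_protocol_address(frame: bytes) -> str:
    """Read the SPA field: 14-byte Ethernet header + 8 fixed ARP bytes + 6-byte SHA."""
    return socket.inet_ntoa(frame[28:32])


# Hypothetical MAC address; the IP is the example address used in this article.
EXAMPLE_MAC = bytes.fromhex("001122334455")
VIP = "172.16.16.123"

classic_garp = build_arp_request(EXAMPLE_MAC, VIP, VIP)        # SPA == TPA == VIP
zero_spa_garp = build_arp_request(EXAMPLE_MAC, "0.0.0.0", VIP)  # SPA == 0.0.0.0

print(sender_protocol_address(classic_garp))
print(sender_protocol_address(zero_spa_garp))
```

A receiver that refreshes its cache from the SPA field has something to learn from the first frame but nothing usable from the second, which is consistent with the router behavior described in this article.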
Explanations of the GARP Changes in Windows 2008 and Vista
A possible code example for a "home brew" network driver that solves this issue by sending GARP in Windows is here: http://msdn.microsoft.com/en-us/library/aa366358(VS.85).aspx
At least in our Cisco-based environment, the following was found to work without the use of Gratuitous ARP
Note that ARP operates at the link layer, resolving IP addresses to MAC addresses on the local network segment. ARP traffic does not cross routers into other networks, which is why this works only on the same subnet.
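As a quick way to predict which clients are affected, the following Python sketch checks whether a client shares a subnet with the Smart IP. Clients on the same subnet resolve the address by ARP themselves; clients on other subnets depend on the router's ARP cache. The Smart IP and mask are the example values used in this article, the client addresses are made up, and the function name is illustrative.

```python
import ipaddress


def shares_subnet(client_ip: str, smart_ip: str, netmask: str) -> bool:
    """Return True if client_ip is on the same IPv4 subnet as smart_ip."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{smart_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(client_ip) in network


SMART_IP = "172.16.16.123"
NETMASK = "255.255.255.0"

# Hypothetical client addresses for illustration.
for client in ("172.16.16.50", "172.16.17.50"):
    if shares_subnet(client, SMART_IP, NETMASK):
        print(client, "is on the same subnet and resolves the Smart IP by ARP directly")
    else:
        print(client, "is on another subnet and depends on the router's ARP cache")
```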
For most Cisco routers, the default ARP cache timeout (out of the box) appears to be 4 hours. This means that if we wait 4 hours, the Smart IP will eventually be routed correctly. It is possible to clear the ARP cache immediately by entering the following IOS command on the router.
To change the default timeout value the IOS command is:
Before considering the alternatives, please be sure that applying the HOTFIX (http://support.microsoft.com/kb/2582281) resolves the issue.
After applying the hotfix on both servers, please test the fail-over and send data to each server as the fail-over is performed.
If this technique fails, then here are the alternatives:
Windows has the netsh command, which allows configuration of the Ethernet interface. Note that this command requires an elevated permission level (i.e., "Run as Administrator").
netsh interface ip add address name="Local Area Connection" addr=172.16.16.123 mask=255.255.255.0 gateway=172.16.16.1 gwmetric=1
netsh interface ip delete address name="Local Area Connection" addr=172.16.16.123
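For scripting, the two commands above can be assembled programmatically. The Python sketch below only builds the argument lists, mirroring the parameters shown above; the function names are illustrative, and nothing is executed.

```python
from typing import List


def netsh_add_address(interface: str, addr: str, mask: str,
                      gateway: str, gwmetric: int = 1) -> List[str]:
    """Build the argument list for 'netsh interface ip add address'."""
    return [
        "netsh", "interface", "ip", "add", "address",
        f"name={interface}", f"addr={addr}", f"mask={mask}",
        f"gateway={gateway}", f"gwmetric={gwmetric}",
    ]


def netsh_delete_address(interface: str, addr: str) -> List[str]:
    """Build the argument list for 'netsh interface ip delete address'."""
    return [
        "netsh", "interface", "ip", "delete", "address",
        f"name={interface}", f"addr={addr}",
    ]


add_cmd = netsh_add_address("Local Area Connection", "172.16.16.123",
                            "255.255.255.0", "172.16.16.1")
delete_cmd = netsh_delete_address("Local Area Connection", "172.16.16.123")

print(" ".join(add_cmd))
print(" ".join(delete_cmd))
```

On the server itself, such a list could presumably be passed to subprocess.run from an elevated process. How netsh handles the quoting of an interface name containing spaces when launched this way is an assumption that should be verified on a test machine first.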
The Imorgon Server has a cluster management service called Imorgon Server Monitor Service. It automatically detects a cluster fail-over and assigns the Cluster IP Address used as the DICOM Storage SCP end-point. This address is often called a "Smart IP", "Cluster IP", or "Floating IP" address in various Imorgon literature or customer communications.
Imorgon hosts fully mirrored servers for its high availability solution (HAS). Modern client software accessing a mirrored server can automatically detect the available server and establish communication. In most modern Internet computing scenarios, this type of fail-over can be handled either by a load-balancing router or by an intelligent DNS server that transparently directs traffic to whichever server is alive. However, most modalities, especially older systems, are capable neither of automatically detecting this type of condition nor of using DNS.
To address this issue, each Imorgon server hosts multiple IP addresses, one of which is used as the DICOM endpoint address that DICOM modalities point to. Should one server need to be taken down, or otherwise fail to operate, the other servers can automatically detect the down condition, and the DICOM endpoint address is activated on a surviving server.
When this condition occurs, the Imorgon Server Monitor Service uses the Microsoft WMI object to add or remove the IP address from the server's Network Interface (e.g., the Ethernet or a Smart Machine's Network Interface.)
This IP address is configured in the Imorgon database table.
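The fail-over decision described above can be sketched in Python as follows. This is not the Imorgon Server Monitor Service's actual logic: the TCP probe, the timeout, the function names, and the peer host and port in the commented usage are all assumptions chosen for illustration.

```python
import socket


def peer_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Assumed health check: the peer counts as alive if a TCP connection succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_activate_endpoint(peer_alive: bool, holds_endpoint_ip: bool) -> bool:
    """Activate the DICOM endpoint address only if the peer is down
    and this server does not already hold the address."""
    return (not peer_alive) and (not holds_endpoint_ip)


# Hypothetical usage; the peer host name and port are placeholders.
# if should_activate_endpoint(peer_is_reachable("peer-server", 104), False):
#     ...add the endpoint IP address to the local network interface...
```

A real monitor would likely require several consecutive failed probes before acting, so that a brief network interruption does not leave both servers holding the same address.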
On some installations, the Link-Layer Topology Discovery features are turned on for the TCP/IP interfaces. We have found that when the Link-Layer Topology Discovery Mapper I/O Driver is enabled, the WMI command gets ignored.
To fix this issue, open the TCP/IP control panel properties and disable the Link-Layer Topology Discovery Mapper I/O Driver feature as illustrated below.