Recently, I was working on a database using HADR with failover handled by TSAMP. The database was configured to use a virtual IP (VIP) address. However, when I tested the failover with the application team, some of the applications were getting SQL1776N errors from the database. We quickly recognized that the applications were trying (and failing) to connect to the same physical server (which had become the standby copy) instead of the new primary, even though TSAMP had moved the VIP to the standby server. It was clear that some of the applications had not been configured to use the VIP.
In an environment with tens or hundreds of client machines that connect to a database, it’s relatively easy for someone to make a mistake when setting up a database connection on one of the machines. In a production environment, though, it’s not always feasible to wait until a failover occurs (whether scheduled or unscheduled) in order to validate that all of the applications have been configured correctly.
Fortunately, it is possible to verify, from the database server, which IP address clients are using to connect to the database.
Capturing Client Configurations from the Db2 Server
First, take a look at which IP addresses are in use on your system. The ip command is used on Linux systems to display configuration information:
$ ip address show
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:b2:70:12 brd ff:ff:ff:ff:ff:ff
inet 172.16.171.15/24 brd 172.16.171.255 scope global eth0
valid_lft forever preferred_lft forever
inet 172.16.171.25/24 brd 172.16.171.255 scope global secondary eth0:0
valid_lft forever preferred_lft forever
3: eth1: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:b2:70:1c brd ff:ff:ff:ff:ff:ff
inet 192.168.242.15/24 brd 192.168.242.255 scope global eth1
valid_lft forever preferred_lft forever
In this example output, the physical server’s IP address is 172.16.171.15, and the VIP is 172.16.171.25. Knowing both of these IP addresses is important when reviewing the list of active connections.
The netstat command lists open network and socket connections on the server, showing the source IP address and port, as well as the target IP address and port. By filtering specifically for the Db2 TCPIP port(s), we get this information:
$ netstat -tn | grep 50000
tcp 0 0 172.16.171.25:50000 172.16.171.18:62586 ESTABLISHED
tcp 0 0 172.16.171.25:50000 172.16.171.17:62380 ESTABLISHED
tcp 0 0 172.16.171.25:50000 172.16.171.18:62531 ESTABLISHED
tcp 0 0 172.16.171.25:50000 172.16.171.18:62337 ESTABLISHED
tcp 0 0 172.16.171.25:50000 172.16.171.19:62647 ESTABLISHED
tcp 0 0 172.16.171.25:50000 172.16.171.19:62311 ESTABLISHED
tcp 0 0 172.16.171.15:50000 172.16.171.22:62912 ESTABLISHED
tcp 0 0 172.16.171.25:50000 172.16.171.17:61344 ESTABLISHED
In the output above, the 4th column shows the IP address/port on the database server, and the 5th column represents the IP address/port on the client workstation.
Looking in the 4th column, note that all connections except for one are using the VIP (172.16.171.25). However, one connection is coming via the physical IP address (172.16.171.15):
tcp 0 0 172.16.171.15:50000 172.16.171.22:62912 ESTABLISHED
The fifth column indicates that the machine 172.16.171.22 has been configured to connect to Db2 using the physical IP address instead of the VIP. Using this information, the team responsible for the application on this server can review and resolve the configuration problem.
Platform Portability
The commands in this post will work on Linux systems; however, this methodology can be applied to other UNIX platforms and Windows. On UNIX forms you will need to use the ifconfig command to display the database server’s IP addresses; Windows has the ipconfig command. The netstat command is available on all platforms, the -n option is particularly useful as it displays IP addresses rather than (truncated) hostnames.
Conclusion
Fortunately, the misconfiguration I encountered occurred during a failover test, meaning that there was no real impact to the business as a result of the misconfiguration. Finding these problems is exactly the reason why testing is so important for any environment that has high availability requirements. However, using the methods from this blog post, you should be able to identify issues in your environment – before they become problems.