DB2 - Problem description
Problem IT36213 | Status: Closed
REMOVE IRRELEVANT ALERT ABOUT NETWORK INTERFACE: 'VIRBR0' WHEN USING PURESCALE ON REDHAT
product:
DB2 FOR LUW / DB2FORLUW / B50 - DB2
Problem description:
Both 'db2instance -list' and 'db2cluster -list -alert' report that the network interface 'virbr0' is not responding. The output is shown below:

    [db2sdin1@host-roce-m00 DIAG0000]$ db2instance -list
    ID   TYPE    STATE    HOME_HOST        CURRENT_HOST     ALERT  PARTITION_NUMBER  LOGICAL_PORT  NETNAME
    --   ----    -----    ---------        ------------     -----  ----------------  ------------  -------
    0    MEMBER  STARTED  host-roce-m00    host-roce-m00    NO     0                 0             db2ps-m00-ens2f0.spdbdev.com,db2ps-m00-ens3f0.spdbdev.com
    1    MEMBER  STARTED  host-roce-m01    host-roce-m01    NO     0                 0             db2ps-m01-ens2f0.spdbdev.com,db2ps-m01-ens3f0.spdbdev.com
    128  CF      PRIMARY  host-roce-cf128  host-roce-cf128  NO     -                 0             db2ps-cf128-ens2f0.spdbdev.com,db2ps-cf128-ens3f0.spdbdev.com,db2ps-cf128-ens2f1.spdbdev.com,db2ps-cf128-ens3f1.spdbdev.com
    129  CF      CATCHUP  host-roce-cf129  host-roce-cf129  NO     -                 0             db2ps-cf129-ens2f0.spdbdev.com,db2ps-cf129-ens3f0.spdbdev.com,db2ps-cf129-ens2f1.spdbdev.com,db2ps-cf129-ens3f1.spdbdev.com

    HOSTNAME         STATE   INSTANCE_STOPPED  ALERT
    --------         -----   ----------------  -----
    host-roce-cf129  ACTIVE  NO                YES
    host-roce-cf128  ACTIVE  NO                YES
    host-roce-m01    ACTIVE  NO                YES
    host-roce-m00    ACTIVE  NO                YES

    There is currently an alert for members, CFs, hosts, cluster file system or cluster configuration in the data-sharing instance. For more information on the alert, its impact, and how to clear it, run the following command: 'db2cluster -list -alert'.

    [db2sdin1@host-roce-m00 DIAG0000]$ db2cluster -list -alert
    1.
    Alert: The host is not responding on the specified network interface. This impacts members and cluster caching facilities communicating on this network interface. Host: 'host-roce-cf129'. Network interface: 'virbr0'.

    Action: Check the operating system error logs for messages related to the network interface and verify that the network cables are connected, and that the network interface is configured properly. For more information, see the 'State and alert values' and 'Troubleshooting options for the db2cluster command' topics in the DB2 Information Center. This alert will clear itself when the network interface starts to respond. This alert cannot be cleared manually.

    Impact: DB2 members on the affected host that communicate with the specified network interface will be stopped and not restarted on the host until the network interface is available. While the network interface is offline, DB2 members using the specified network interface will be in the WAITING_FOR_FALLBACK state in restart light mode on other hosts. If there is a cluster caching facility (CF) on the host and it is the public ethernet network interface that is unresponsive, the CFs on the host will not be available for CF failover and they will remain in the STOPPED state until the network interface issue is resolved. If it is a cluster interconnect network interface that has stopped responding and another cluster interconnect network interface is still responding for the CF, the CF will remain operational. If this alert was caused by running a network interface configuration change command on a CF with multiple cluster interconnect interfaces, the network interface must be re-enabled and the CF must be restarted to clear the alert. Examples of commands to change the network interface configuration are 'rmdev -l ', and 'chdev -l -a state=detach' on AIX systems, or 'ifdown detach ' on Linux systems. If all configured network interfaces on for the CF go offline, the CF is put into the ERROR state and is no longer available for failover. A CF in the STOPPED or ERROR state is restarted when connectivity is restored to a network interface.

Alerts 2, 3, and 4 repeat the same Alert, Action, and Impact text for the remaining hosts:

    2. Host: 'host-roce-cf128'. Network interface: 'virbr0'.
    3. Host: 'host-roce-m01'. Network interface: 'virbr0'.
    4. Host: 'host-roce-m00'. Network interface: 'virbr0'.

virbr0 is the bridge interface created by the Red Hat libvirtd service. It is not relevant to the pureScale cluster setup, so the alert should not be displayed, although it has no impact on the cluster. It appears that cluster creation picked up the virbr0 interfaces and included them in cluster monitoring, which leads to the alert. This happens because the IP address on virbr0 likely resolves to the host name that the customer used to create the instance. The resource listing shows the virbr0 interfaces as Offline in the public network equivalency:

    Online IBM.Equivalency:db2_public_network_db2sdin1_0
            |- Online IBM.NetworkInterface:nm-bond:host-roce-cf128
            |- Online IBM.NetworkInterface:nm-bond:host-roce-cf129
            |- Online IBM.NetworkInterface:nm-bond:host-roce-m00
            |- Online IBM.NetworkInterface:nm-bond:host-roce-m01
            |- Offline IBM.NetworkInterface:virbr0:host-roce-cf128
            |- Offline IBM.NetworkInterface:virbr0:host-roce-cf129
            |- Offline IBM.NetworkInterface:virbr0:host-roce-m00
            '- Offline IBM.NetworkInterface:virbr0:host-roce-m01
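To check whether a cluster shows the same pattern, the resource listing above can be scanned for interfaces that are reported Offline. The following is a minimal sketch, not an IBM-provided tool: the function name `offline_interfaces` is hypothetical, and it assumes the listing is available as text in the format shown above (it resembles `lssam` output, although the report does not name the command that produced it).

```python
import re
from collections import defaultdict

# Matches entries such as "Offline IBM.NetworkInterface:virbr0:host-roce-m00".
_NIC = re.compile(r"(Online|Offline)\s+IBM\.NetworkInterface:([^:\s]+):(\S+)")


def offline_interfaces(listing: str) -> dict[str, list[str]]:
    """Map each interface name to the sorted hosts on which it is Offline.

    Hypothetical helper: assumes the text format shown in the problem report.
    """
    offline = defaultdict(set)
    for state, nic, host in _NIC.findall(listing):
        if state == "Offline":
            offline[nic].add(host)
    return {nic: sorted(hosts) for nic, hosts in offline.items()}


# Listing copied from the problem description.
SAMPLE = """\
Online IBM.Equivalency:db2_public_network_db2sdin1_0
        |- Online IBM.NetworkInterface:nm-bond:host-roce-cf128
        |- Online IBM.NetworkInterface:nm-bond:host-roce-cf129
        |- Online IBM.NetworkInterface:nm-bond:host-roce-m00
        |- Online IBM.NetworkInterface:nm-bond:host-roce-m01
        |- Offline IBM.NetworkInterface:virbr0:host-roce-cf128
        |- Offline IBM.NetworkInterface:virbr0:host-roce-cf129
        |- Offline IBM.NetworkInterface:virbr0:host-roce-m00
        '- Offline IBM.NetworkInterface:virbr0:host-roce-m01
"""

print(offline_interfaces(SAMPLE))
```

An interface such as virbr0 that is Offline on every host while the real public interface (nm-bond here) is Online matches the situation described in this report.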
Problem Summary:
USERS AFFECTED: pureScale (pS) on Red Hat
PROBLEM DESCRIPTION: See Error Description
RECOMMENDATION: As a workaround, the customer should be able to add entries to the /etc/hosts file on each host for the IP addresses that belong to virbr0, and then recreate the resource model to get past this issue. Alternatively, the customer could disable or delete the 'virbr0' network interface at the OS level.
Local Fix:
As a workaround, the customer should be able to add entries to the /etc/hosts file on each host for the IP addresses that belong to virbr0, and then recreate the resource model to get past this issue. Alternatively, the customer could disable or delete the 'virbr0' network interface at the OS level.
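The first workaround can be sketched as a small helper that prints the /etc/hosts line to add on each host. This is an illustration under stated assumptions: the report does not say which name the virbr0 address should map to, so the `-virbr0` alias suffix and the function name `virbr0_hosts_line` are hypothetical, and the addresses below are placeholders to be replaced with the values found on each host (for example with `ip addr show virbr0`).

```python
import ipaddress


def virbr0_hosts_line(ip: str, hostname: str, suffix: str = "-virbr0") -> str:
    """Return one /etc/hosts line that maps a virbr0 IP to a dedicated alias.

    The alias (hostname + suffix) is a hypothetical naming choice; the intent
    is that the virbr0 address no longer resolves to the instance host name.
    """
    ipaddress.ip_address(ip)  # raises ValueError if the address is malformed
    return f"{ip}\t{hostname}{suffix}"


# Placeholder addresses (assumption): substitute the real virbr0 IP of each host.
VIRBR0_IPS = {
    "host-roce-m00": "192.168.122.1",
    "host-roce-m01": "192.168.122.1",
    "host-roce-cf128": "192.168.122.1",
    "host-roce-cf129": "192.168.122.1",
}

for host, ip in VIRBR0_IPS.items():
    # Each line is meant for the /etc/hosts file of the host named on the left.
    print(f"{host}: {virbr0_hosts_line(ip, host)}")
```

The report then calls for recreating the resource model. On pureScale this is normally done with 'db2cluster -cm -repair -resources' while the instance is stopped, but the exact procedure should be confirmed in the Db2 documentation for the installed release.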
Solution
Workaround
USERS AFFECTED: pureScale (pS) on Red Hat
PROBLEM DESCRIPTION: See Error Description
RECOMMENDATION: As a workaround, the customer should be able to add entries to the /etc/hosts file on each host for the IP addresses that belong to virbr0, and then recreate the resource model to get past this issue. Alternatively, the customer could disable or delete the 'virbr0' network interface at the OS level.
Timestamps
Date - problem reported : 15.03.2021
Date - problem closed : 16.09.2021
Date - last modified : 16.09.2021
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)