DB2 - Problem description
Problem IT03630 | Status: Closed
TAKEOVER HADR COMMAND SUCCEEDS BUT RESOURCE GROUP REPORTS FAILED OFFLINE IN LSSAM OUTPUT
Product:
DB2 FOR LUW / DB2FORLUW / A50 - DB2
Problem description:
In a TSA/HADR high availability environment configured using
db2haicu, if the public network adapters of the standby and
primary nodes are defined in two separate network
equivalency groupings, as in the following "lssam -V" output:
Online IBM.ResourceGroup:db2_db2inst1_host03_0-rg Nominal=Online
        '- Online IBM.Application:db2_db2inst1_host03_0-rs                        -.
                '- Online IBM.Application:db2_db2inst1_host03_0-rs:host03          |
Online IBM.ResourceGroup:db2_db2inst1_host04_0-rg Nominal=Online                   |
        '- Online IBM.Application:db2_db2inst1_host04_0-rs                         | -.
                '- Online IBM.Application:db2_db2inst1_host04_0-rs:host04          |  |
Online IBM.ResourceGroup:db2_db2inst1_db2inst1_HADRDB-rg Nominal=Online            |  |
        '- Online IBM.Application:db2_db2inst1_db2inst1_HADRDB-rs                  |  | -. -.
                |- Online IBM.Application:db2_db2inst1_db2inst1_HADRDB-rs:host03   |  |  |  |
                '- Offline IBM.Application:db2_db2inst1_db2inst1_HADRDB-rs:host04  |  |  |  |
Online IBM.Equivalency:db2_db2inst1_host03_0-rg_group-equ                          |  |  |  |
        '- Online IBM.PeerNode:host03:host03                                       |  |  |  |
Online IBM.Equivalency:db2_db2inst1_host04_0-rg_group-equ                          |  |  |  |
        '- Online IBM.PeerNode:host04:host04                                       |  |  |  |
Online IBM.Equivalency:db2_db2inst1_db2inst1_HADRDB-rg_group-equ                   |  |  |  |
        |- Online IBM.PeerNode:host03:host03                                       |  |  |  |
        '- Online IBM.PeerNode:host04:host04                                      DO  |  | DO
Online IBM.Equivalency:db2_public_network_0                                       <'  |  | <'
        '- Online IBM.NetworkInterface:eth0:host03                                   DO DO
Online IBM.Equivalency:db2_public_network_1                                          <' <'
        '- Online IBM.NetworkInterface:eth1:host04
then issuing a takeover HADR command on the database will
succeed, but it will leave the resource model in the following
state, as seen by "lssam":
Online IBM.ResourceGroup:db2_db2inst1_host03_0-rg Nominal=Online
        '- Online IBM.Application:db2_db2inst1_host03_0-rs
                '- Online IBM.Application:db2_db2inst1_host03_0-rs:host03
Online IBM.ResourceGroup:db2_db2inst1_host04_0-rg Nominal=Online
        '- Online IBM.Application:db2_db2inst1_host04_0-rs
                '- Online IBM.Application:db2_db2inst1_host04_0-rs:host04
Failed offline IBM.ResourceGroup:db2_db2inst1_db2inst1_HADRDB-rg Binding=Sacrificed Nominal=Online
        '- Offline IBM.Application:db2_db2inst1_db2inst1_HADRDB-rs
                |- Offline IBM.Application:db2_db2inst1_db2inst1_HADRDB-rs:host03
                '- Offline IBM.Application:db2_db2inst1_db2inst1_HADRDB-rs:host04
Online IBM.Equivalency:db2_db2inst1_host03_0-rg_group-equ
        '- Online IBM.PeerNode:host03:host03
Online IBM.Equivalency:db2_db2inst1_host04_0-rg_group-equ
        '- Online IBM.PeerNode:host04:host04
Online IBM.Equivalency:db2_db2inst1_db2inst1_HADRDB-rg_group-equ
        |- Online IBM.PeerNode:host03:host03
        '- Online IBM.PeerNode:host04:host04
Online IBM.Equivalency:db2_public_network_0
        '- Online IBM.NetworkInterface:eth0:host03
Online IBM.Equivalency:db2_public_network_1
        '- Online IBM.NetworkInterface:eth1:host04
As seen above, the HADR resource group is left in a "Failed
offline" state. To recover from this state, log in as root on
either node and issue the following sequence of commands:
1) export CT_MANAGEMENT_SCOPE=2
2) rgreq -o lock db2_db2inst1_db2inst1_HADRDB-rg
3) rmrel -s "Name like 'db2_db2inst1_db2inst1_HADRDB-rs%'"
4) Repeat steps 2 and 3 for every affected HADR database, where
HADRDB is the database name.
5) rgreq -o unlock db2_db2inst1_db2inst1_HADRDB-rg
6) Repeat step 5 for every affected HADR database, where HADRDB
is the database name.
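The sequence above can be sketched as a small dry-run helper that only prints the commands for one or more HADR databases, so they can be reviewed before being run as root. This is a minimal sketch, not an IBM-supplied script: the function name print_recovery_commands is hypothetical, and it assumes that the resource names follow the pattern seen in the example output (db2_<instance>_<instance>_<database>-rg and -rs), with the same instance name on both hosts.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the recovery commands, does not run them.
# Assumption: resource names follow the pattern from the example output,
# db2_<instance>_<instance>_<database>-rg / -rs, same instance name on both hosts.
print_recovery_commands() {
    instance=$1
    shift

    # Step 1: set the management scope
    printf '%s\n' "export CT_MANAGEMENT_SCOPE=2"

    # Steps 2-4: lock each HADR resource group, then remove the
    # relationships defined for its HADR resource
    for db in "$@"; do
        printf '%s\n' "rgreq -o lock db2_${instance}_${instance}_${db}-rg"
        printf '%s\n' "rmrel -s \"Name like 'db2_${instance}_${instance}_${db}-rs%'\""
    done

    # Steps 5-6: unlock each HADR resource group again
    for db in "$@"; do
        printf '%s\n' "rgreq -o unlock db2_${instance}_${instance}_${db}-rg"
    done
}
```

Calling print_recovery_commands db2inst1 HADRDB prints the four commands of steps 1, 2, 3 and 5 exactly as listed above. Passing several database names prints all lock and rmrel pairs first and all unlock commands last.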
After these steps, the resource model shown by "lssam -V"
should look like this:
Online IBM.ResourceGroup:db2_db2inst1_host03_0-rg Nominal=Online
        '- Online IBM.Application:db2_db2inst1_host03_0-rs                        -.
                '- Online IBM.Application:db2_db2inst1_host03_0-rs:host03          |
Online IBM.ResourceGroup:db2_db2inst1_host04_0-rg Nominal=Online                   |
        '- Online IBM.Application:db2_db2inst1_host04_0-rs                         | -.
                '- Online IBM.Application:db2_db2inst1_host04_0-rs:host04          |  |
Online IBM.ResourceGroup:db2_db2inst1_db2inst1_HADRDB-rg Nominal=Online            |  |
        '- Online IBM.Application:db2_db2inst1_db2inst1_HADRDB-rs                  |  |
                |- Online IBM.Application:db2_db2inst1_db2inst1_HADRDB-rs:host03   |  |
                '- Offline IBM.Application:db2_db2inst1_db2inst1_HADRDB-rs:host04  |  |
Online IBM.Equivalency:db2_db2inst1_host03_0-rg_group-equ                          |  |
        '- Online IBM.PeerNode:host03:host03                                       |  |
Online IBM.Equivalency:db2_db2inst1_host04_0-rg_group-equ                          |  |
        '- Online IBM.PeerNode:host04:host04                                       |  |
Online IBM.Equivalency:db2_db2inst1_db2inst1_HADRDB-rg_group-equ                   |  |
        |- Online IBM.PeerNode:host04:host04                                       |  |
        '- Online IBM.PeerNode:host03:host03                                      DO  |
Online IBM.Equivalency:db2_public_network_0                                       <'  |
        '- Online IBM.NetworkInterface:eth0:host03                                   DO
Online IBM.Equivalency:db2_public_network_1                                          <'
        '- Online IBM.NetworkInterface:eth1:host04
Issuing a takeover HADR command after this should succeed.
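To check for the "Failed offline" state described above, before or after the recovery steps, the lssam output can be filtered for affected resource groups. The following is a minimal sketch: the function name failed_offline_groups is hypothetical, and it assumes that the lssam output is plain text without terminal color codes and that each resource group is printed on a single line beginning with its state.

```shell
#!/bin/sh
# Hypothetical helper: reads lssam output on stdin and prints the name of
# every resource group reported as "Failed offline".
# Assumption: plain-text output, one resource group per line, state first.
failed_offline_groups() {
    sed -n 's/^Failed offline IBM\.ResourceGroup:\([^ ]*\).*$/\1/p'
}
```

Usage: lssam | failed_offline_groups. For the failed state shown in this description it prints db2_db2inst1_db2inst1_HADRDB-rg. It prints nothing when no resource group is in the "Failed offline" state.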
Problem Summary:
****************************************************************
* USERS AFFECTED:                                              *
* DB2 HADR/TSA users                                           *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 V10.5 FP5                                     *
****************************************************************
Local Fix:
Manually delete HADR dependencies against the public network equivalency grouping. See Error Description for more details.
Solution
Fixed in DB2 V10.5 FP5
Workaround
not known / see Local fix
Timestamps
Date - problem reported : 06.08.2014
Date - problem closed : 13.03.2015
Date - last modified : 13.03.2015
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)
10.5.0.5