DB2 - Problem description
| Problem IC95661 | Status: Closed |
IN A LARGE DPF/HA CLUSTER, IF ALL RESOURCES ARE STARTED AT ONCE IT IS POSSIBLE THAT CERTAIN RESOURCES ARE NOT BROUGHT ONLINE | |
| product: | |
DB2 FOR LUW / DB2FORLUW / A10 - DB2 | |
| Problem description: | |
In a large DB2 DPF environment configured for high availability,
if all resources are ordered online simultaneously, some DB2
resources might not start successfully. In the lssam output, the
affected DB2 partition resources are displayed as "Offline" while
the containing Resource Group (RG) is displayed as "Pending
Online". Here is an example of a DB2 partition resource group in
this state:
Example lssam output of the problem:
|- Pending online IBM.ResourceGroup:db2_db2inst1_115-rg Nominal=Online
        |- Offline IBM.Application:db2_db2inst1_115-rs
                |- Offline IBM.Application:db2_db2inst1_115-rs:bcu_node17
                '- Offline IBM.Application:db2_db2inst1_115-rs:bcu_node21
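
To spot this state across a large cluster, the lssam output can be filtered for resource groups reported as "Pending online". The following is a minimal sketch, not part of the APAR: the function name pending_groups is hypothetical, and it assumes plain-text lssam output in the layout shown above (no color escape codes), read from standard input.

```shell
# Hypothetical helper: print the names of resource groups that lssam
# reports as "Pending online". Reads lssam-style text from stdin.
# Assumes the plain-text layout shown in the example above.
pending_groups() {
    awk 'tolower($0) ~ /pending online/ {
        # Find the IBM.ResourceGroup:<name> field and print only <name>.
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^IBM\.ResourceGroup:/) {
                sub(/^IBM\.ResourceGroup:/, "", $i)
                print $i
            }
        }
    }'
}
```

Piping the lssam output into pending_groups would then list only the affected resource groups, one per line.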
Here are some examples of commands that attempt to bring all
resources online simultaneously:
db2start
chrg -o online -s 1=1
hastartdb2 (applicable only to ISAS)
When the start orders are issued, TSA-MP calls the
"/usr/sbin/rsct/sapolicies/db2/db2V97_start.ksh" script for each
partition to start its corresponding resource. As each resource
is started, a call that queries for status forces all other
nodes in the cluster to run a monitor command against all other
partition resources. As a result, the start activity causes
multiple monitors to run in parallel, which creates race
conditions that interfere with TSA-MP's ability to capture all
the needed return codes. Consequently, TSA-MP cannot send start
orders for all partitions as expected. | |
| Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* DB2 DPF                                                      *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Problem Description above.                               *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 Version 10.1 Fix Pack 4.                      *
**************************************************************** | |
| Local Fix: | |
As root, comment out (or delete) the following line in the
/usr/sbin/rsct/sapolicies/db2/db2V97_start.ksh file:
runact -s "Name like 'db2_%${DB2INSTANCE?}%'" IBM.Application refreshOpState 2> /dev/null
This call is not needed and is the source of this problem. | |
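
The local fix can be applied with a small script rather than by hand. The following is a minimal sketch, not an IBM-supplied tool: the function name comment_out_refresh and the ".orig" and ".tmp" suffixes are hypothetical, and it assumes the runact ... refreshOpState command sits on a single line of the script. It avoids "sed -i", which is not available in every sed implementation.

```shell
# Hypothetical helper: comment out the refreshOpState call in a start script.
# Assumption: the runact ... refreshOpState command is on a single line.
comment_out_refresh() {
    script=$1
    # Keep one untouched copy; never overwrite an existing backup.
    [ -f "$script.orig" ] || cp -p "$script" "$script.orig" || return 1
    # Prefix the matching line with '#' unless it is already commented out.
    sed '/^[[:space:]]*#/!s/^\(.*runact.*refreshOpState.*\)$/#\1/' \
        "$script" > "$script.tmp" || return 1
    # Write back in place so the file keeps its owner and permissions.
    cat "$script.tmp" > "$script" && rm -f "$script.tmp"
}
```

It would be run as root against /usr/sbin/rsct/sapolicies/db2/db2V97_start.ksh on each node. Running it a second time leaves the already commented line unchanged.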
| available fix packs: | |
DB2 Version 10.1 Fix Pack 4 for Linux, UNIX, and Windows | |
| Solution | |
First fixed in DB2 Version 10.1 Fix Pack 4. | |
| Workaround | |
not known / see Local fix | |
| Timestamps | |
Date - problem reported : 02.09.2013
Date - problem closed : 22.07.2014
Date - last modified : 22.07.2014
| Problem solved at the following versions (IBM BugInfos) | |
| Problem solved according to the fixlist(s) of the following version(s) | |
| 10.1.0.4 |