DB2 - Problem description
| Problem IC67509 | Status: Closed | 
| DB2STOP TAKES A LONG TIME ON HADR SYSTEM IF STANDBY IS OFFLINE AND DATABASE NOT ACTIVATED | |
| product: | |
| DB2 FOR LUW / DB2FORLUW / 950 - DB2 | |
| Problem description: | |
| On a HADR system where the standby is offline and the database on 
the primary is not activated, the first connection to the database 
is the one that activates it. If we run db2 connect to <database 
name>, or db2 start hadr on database <database name> as primary 
without the "by force" option, the connection tries to start HADR 
and connect to the standby. It eventually times out after the 
HADR_TIMEOUT interval and fails with SQL1768N Unable to start HADR. 
Reason code = "7". 
 
2010-02-03-04.29.45.699578+000 I2838967A388       LEVEL: Warning 
PID     : 606706               TID  : 1           PROC : 
db2agent (FRH) 0 
INSTANCE: db2frh               NODE : 000 
APPHDL  : 0-171                APPID: *LOCAL.db2frh.100203042357 
AUTHID  : DB2FRH 
FUNCTION: DB2 UDB, High Availability Disaster Recovery, 
hdrEduStartup, probe:21151 
MESSAGE : Info: HADR Startup has begun. 
 
 
2010-02-03-04.30.16.734635+000 I2842396A552       LEVEL: Error 
PID     : 999878               TID  : 1           PROC : 
db2hadrp (FRH) 0 
INSTANCE: db2frh               NODE : 000 
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduP, 
probe:20390 
MESSAGE : HADR primary did not establish connection with standby 
within timeout 
          and will shut down. BY FORCE option required to start 
primary without 
          standby. Timeout seconds = 
DATA #1 : Hexdump, 4 bytes 
0x07800001D52CD008 : 0000 001E 
 
2010-01-14-06.35.07.374445+000 I5172086A471       LEVEL: Error 
PID     : 1077874              TID  : 1           PROC : 
db2agent (AB7) 0 
INSTANCE: db2ab7               NODE : 000 
APPHDL  : 0-8                  APPID: *LOCAL.db2ab7.100114063322 
AUTHID  : DB2AB7 
FUNCTION: DB2 UDB, High Availability Disaster Recovery, 
hdrEduStartup, probe:21300 
MESSAGE : HADR EDU sqlcode: 
DATA #1 : Hexdump, 4 bytes 
0x000000011121526C : FFFF F918 
.... 
 
2010-01-14-06.35.07.374514+000 I5172558A419       LEVEL: Severe 
PID     : 1077874              TID  : 1           PROC : 
db2agent (AB7) 0 
INSTANCE: db2ab7               NODE : 000 
APPHDL  : 0-8                  APPID: *LOCAL.db2ab7.100114063322 
AUTHID  : DB2AB7 
FUNCTION: DB2 UDB, base sys utilities, sqledint, probe:230 
DATA #1 : Hexdump, 4 bytes 
0x000000011121526C : FFFF F918 
.... 
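
The 4-byte hex dumps in the entries above can be read as 32-bit integers. The following is a minimal sketch, assuming the bytes are printed in big-endian order (which is consistent with the roughly 30 seconds between the two FRH entries and with the SQL1768N error reported here). The helper name decode_hex32 is hypothetical and not part of DB2.

```shell
# Decode a db2diag-style 4-byte hexdump (for example "FFFF F918")
# into a signed 32-bit decimal value.
decode_hex32() {
  # Remove the space that separates the two 16-bit groups
  hex=$(printf '%s' "$1" | tr -d ' ')
  val=$((0x$hex))
  # Values with the top bit set are negative in two's complement
  if [ "$val" -ge 2147483648 ]; then
    val=$((val - 4294967296))
  fi
  printf '%d\n' "$val"
}

decode_hex32 "0000 001E"   # timeout seconds -> 30
decode_hex32 "FFFF F918"   # HADR EDU sqlcode -> -1768 (SQL1768N)
```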
 
If many connection attempts are issued, they are all serialized 
until the database is activated: 
1. The currently active connection that is trying to start HADR 
is holding the database latch. The application is waiting to 
reach the HADR timeout. 
2. All other connections that are trying to start HADR are 
queued up behind the database latch in a serialized fashion. 
 
If we run db2stop force in this scenario, it might take a long 
time, depending on how many connections are queued to activate 
the database (each of them fails with the HADR timeout error 
SQL1768N). 
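
As a rough illustration of why the wait grows with the queue, the sketch below multiplies the number of queued connections by the timeout. It assumes each queued connection waits out the full HADR_TIMEOUT in turn, which is a simplification of the serialization described above; the function name and the sample values are illustrative only.

```shell
# Rough upper bound on the wait, in seconds, if every queued
# connection has to time out one after another.
estimate_stop_wait() {
  queued=$1    # connections queued behind the database latch
  timeout=$2   # HADR_TIMEOUT in seconds
  echo $((queued * timeout))
}

estimate_stop_wait 4 30   # prints 120
```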
 
 
When "db2stop force" kicks in, it will detect the number of 
applications that need to be forced: 
 
 
FUNCTION: DB2 UDB, base sys utilities, 
sqeAppServices::ExecuteStopForce, probe:1000 
DATA #1 : String, 47 bytes 
[Force]->Number of applications to be forced : 
DATA #2 : Hexdump, 4 bytes 
0x0FFFFFFFFFFFD698 : 0000 0004 
.... 
 
It will wait until all queued-up applications respond, and only 
then is the database actually stopped. This might take a long time 
and could be perceived as db2stop force being hung. | |
| Problem Summary: | |
| USERS AFFECTED: ALL 
PROBLEM DESCRIPTION: DB2STOP TAKES A LONG TIME ON HADR SYSTEM IF STANDBY IS OFFLINE AND DATABASE NOT ACTIVATED 
The DB2 9.7 APAR is IC67515 
RECOMMENDATION: Upgrade to DB2 Version 9.5 Fixpack 6 | |
| Local Fix: | |
| When the standby is offline, issue db2 start hadr on database <database name> as primary by force to activate the database on the primary. This avoids the timeout waits, so a later db2stop force, if needed, responds more quickly. | |
| available fix packs: | |
| DB2 Version 9.5 Fix Pack 6a for Linux, UNIX, and Windows | |
| Solution | |
| Problem was first fixed in DB2 Version 9.5 Fixpack 6 | |
| Workaround | |
| not known / see Local fix | |
| Timestamps | |
| Date - problem reported : | 29.03.2010 | 
| Date - problem closed : | 13.05.2010 | 
| Date - last modified : | 21.06.2011 | 
| Problem solved at the following versions (IBM BugInfos) | |
| 9.5.FP6 | |
| Problem solved according to the fixlist(s) of the following version(s) | |








 

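The local fix described in this report can be sketched as a small shell wrapper. This is only an illustration: the function name, the DRY_RUN switch, and the database alias SAMPLE are placeholders, and the wrapper defaults to printing the command rather than running it.

```shell
# Sketch of the local fix: activate the primary without waiting
# for the offline standby. DRY_RUN and the function name are
# illustrative and not part of DB2.
start_primary_by_force() {
  dbname=$1
  cmd="db2 start hadr on database $dbname as primary by force"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Print the command instead of running it
    printf '%s\n' "$cmd"
  else
    # Run the command as the instance owner
    $cmd
  fi
}

# Dry run against a placeholder database alias
start_primary_by_force SAMPLE
```

Setting DRY_RUN=0 would execute the command, which should be done only when the standby is known to be offline.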