DB2 - Problem description

Problem IC71654	Status: Closed
TAKEOVER HADR COMMAND HANGS UP ON STANDBY WHEN A TRAP HAS BEEN PREVIOUSLY SUSTAINED IN PRIMARY DATABASE
product:
DB2 FOR LUW / DB2FORLUW / 970 - DB2
Problem description:
The hang occurs if a takeover is issued on an HADR Standby when the HADR Primary has previously sustained a trap.

On the HADR Standby: the takeover command hangs, and other commands such as 'db2stop force' either hang or do not work.
On the HADR Primary: clients are unable to connect.

If the HADR Primary has previously sustained a trap, you will see:
1) ADM14012C or ADM14013C messages in the administration notification log ({instance_name}.nfy)
AND
2) A suspended db2agent in 'db2pd -EDUs' output.

Even after the fix for APAR IC69960 is applied, the takeover command still hangs under the conditions above.

Under these conditions, the db2diag.log of the primary contains Severe messages such as ADM14013C, which indicate that db2agents have been suspended on the primary, as in the following entries:

    2010-09-27-14.35.38.415495+540 I1781400A564       LEVEL: Severe
    PID     : 1577038              TID  : 11054       PROC : db2sysc 0
    INSTANCE: db2inst1             NODE : 000         DB   : TESTDB
    APPHDL  : 0-367                APPID: 10.219.61.1.64526.100927053458
    AUTHID  : DB2INST1
    EDUID   : 11054                EDUNAME: db2agent (TESTDB) 0
    FUNCTION: DB2 UDB, RAS/PD component, pdResilienceIsSafeToSustain, probe:800
    DATA #1 : String, 37 bytes
    Trap Sustainability Criteria Checking
    DATA #2 : Hex integer, 8 bytes
    0x0000000000021000
    DATA #3 : Boolean, 1 bytes
    true
    ...
    2010-09-27-14.35.38.625896+540 E1813735A941       LEVEL: Severe
    PID     : 1577038              TID  : 11054       PROC : db2sysc 0
    INSTANCE: db2inst1             NODE : 000         DB   : TESTDB
    APPHDL  : 0-367                APPID: 10.219.61.1.64526.100927053458
    AUTHID  : DB2INST1
    EDUID   : 11054                EDUNAME: db2agent (TESTDB) 0 (suspended) 0
    FUNCTION: DB2 UDB, DRDA Application Server, sqljsTrapResilience, probe:800
    MESSAGE : ADM14013C  The following type of critical error occurred: "Trap".
              This error occurred because one or more threads that are
              associated with the current DB2 instance have been suspended,
              but the instance process is still running. First Occurrence
              Data Capture (FODC) was invoked in the following mode:
              "Automatic". FODC diagnostic information is located in the
              following directory:
              "/var/log/db2/FODC_Trap_2010-09-27-14.35.38.031284/".

For more information on sustained traps, see:
* Enhanced resilience to errors and traps reduces outages
  http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=/com.ibm.db2.luw.wn.doc/doc/c0054512.html
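The two indicators of a previously sustained trap described above can be checked mechanically before a takeover is issued. The following is a minimal sketch, not an IBM tool: it works on text that has already been collected from the notification log and from 'db2pd -EDUs', and it assumes that a suspended agent appears in that output with the literal marker "(suspended)", as it does in the EDUNAME field of the db2diag.log entry. The function names and the sample strings are hypothetical.

```python
from typing import List

# Message IDs that the problem description names as signs of a sustained trap.
SUSTAINED_TRAP_MSGS = ("ADM14012C", "ADM14013C")


def find_trap_messages(nfy_text: str) -> List[str]:
    """Return the sustained-trap message IDs present in notification-log text."""
    return [msg for msg in SUSTAINED_TRAP_MSGS if msg in nfy_text]


def find_suspended_agents(edus_text: str) -> List[str]:
    """Return lines of 'db2pd -EDUs' output that mention a suspended db2agent.

    Assumption: suspended EDUs carry the literal marker '(suspended)'.
    """
    return [
        line.strip()
        for line in edus_text.splitlines()
        if "db2agent" in line and "(suspended)" in line
    ]


def primary_sustained_trap(nfy_text: str, edus_text: str) -> bool:
    """True only if both indicators are present (the description joins them with AND)."""
    return bool(find_trap_messages(nfy_text)) and bool(find_suspended_agents(edus_text))


if __name__ == "__main__":
    # Hypothetical sample inputs, for illustration only.
    sample_nfy = 'ADM14013C The following type of critical error occurred: "Trap".'
    sample_edus = "11054 11054 db2agent (TESTDB) 0 (suspended)\n11055 11055 db2agent (idle) 0"
    print(primary_sustained_trap(sample_nfy, sample_edus))
```

If the check returns True on the primary, the takeover on the standby is at risk of hanging on levels without the fix for this APAR.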
Problem Summary:
****************************************************************
* USERS AFFECTED:                                              *
* All                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* "takeover hadr" command hangs up when a trap has been        *
* sustained.                                                   *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to db2 Version 9.7 FixPak 4                          *
****************************************************************
Local Fix:
If db2_kill is issued on the HADR primary system to break the HADR connection, the takeover hadr command should end with errors instead of hanging.

For more information on recovering from sustained traps, see:
* Recovering from sustained traps
  http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=/com.ibm.db2.luw.admin.trb.doc/doc/t0055494.html
available fix packs:
DB2 Version 9.7 Fix Pack 4 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 5 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 6 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 7 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 8 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 9 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 9a for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 10 for Linux, UNIX, and Windows
Solution
Problem was first fixed in Version 9.7 FixPak 4
Workaround
not known / see Local fix
Timestamps
Date - problem reported :	04.10.2010
Date - problem closed :	09.05.2011
Date - last modified :	09.05.2011
Problem solved at the following versions (IBM BugInfos)
9.7.
Problem solved according to the fixlist(s) of the following version(s)
9.7.0.4
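
As an illustration of how the fixlist level above might be used, the sketch below compares an installed level string against 9.7.0.4. It is a hypothetical helper, not an IBM utility: it assumes levels are written in the dotted form used on this page (for example 9.7.0.11) and that a trailing letter such as the "a" in Fix Pack 9a can be ignored for ordering. For release streams other than 9.7 it returns None, because this record only states where the fix landed in 9.7.

```python
import re
from typing import Optional, Tuple

# First level listed in the 9.7 fixlist as containing the fix for IC71654.
FIRST_FIXED = (9, 7, 0, 4)


def parse_level(level: str) -> Tuple[int, ...]:
    """Turn '9.7.0.9a' into (9, 7, 0, 9); trailing letters are dropped (assumption)."""
    parts = []
    for part in level.strip().split("."):
        match = re.match(r"\d+", part)
        if match is None:
            raise ValueError(f"unrecognised level component: {part!r}")
        parts.append(int(match.group()))
    return tuple(parts)


def has_ic71654_fix(level: str) -> Optional[bool]:
    """True or False for 9.7 levels; None for streams this record does not cover."""
    version = parse_level(level)
    if version[:2] != FIRST_FIXED[:2]:
        return None
    return version >= FIRST_FIXED


if __name__ == "__main__":
    for candidate in ("9.7.0.3", "9.7.0.4", "9.7.0.11", "10.5.0.9"):
        print(candidate, has_ic71654_fix(candidate))
```

Comparing integer tuples rather than strings keeps 9.7.0.11 ordered after 9.7.0.4, which a plain string comparison would get wrong.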