DB2 - Problem description
Problem IT18809 | Status: Closed |
POSSIBLE HANG CAUSED BY A 'DB2TASKP' EDU WHEN BRINGING DOWN A DATABASE | |
product: | |
DB2 FOR LUW / DB2FORLUW / A10 - DB2 | |
Problem description: | |
The hang occurs as follows. A 'detach partition' operation creates an ABP task to handle the index changes related to the detach. The timing is crucial, but typically three kinds of stacks are seen.

The first is the thread terminating the database, waiting for the last user to disconnect:

  0x0900000010B54CC0 sqloWaitEDUWaitPost + 0x334
  0x0900000010A1E5E4 sqloWaitEDUWaitPost@glue113 + 0x9C
  0x0900000010A1D590 sqeLocalDatabase::TermDbConnect + 0xBC4
  0x0900000010A19B28 sqeApplication::AppStopUsing + 0xE44
  0x0900000010CE7F14 sqlesrspWrp + 0xF4
  0x0900000010CE816C sqleUCagentConnectReset + 0xCC
  0x09000000109EDFA8 sqljsCleanup + 0x924
  0x09000000109EFA34 sqljsDrdaAsInnerDriver + 0x394
  0x09000000109EF32C sqljsDrdaAsDriver + 0xEC
  0x0900000010A73CD4 sqeAgent::RunEDU + 0xB4
  0x0900000010A71320 sqzEDUObj::EDUDriver + 0xDC

The second kind is optional: one or more agents for new incoming connections, waiting for the database to finish going down before bringing it up again. These agents loop, repeatedly acquiring and releasing the latch SQLO_LT_sqeDBMgr__dbMgrLatch:

  0x0900000010B11F1C sqloXlatchConflict + 0x27C
  0x0900000010B11BDC sqloXlatchConflict@glue1AC + 0x78
  0x090000000D25B5F4 lockDbMgrArgs + 0x118
  0x090000000CB1FDF0 StartUsingLocalDatabase + 0x130
  0x0900000010A4B8B0 AppStartUsing + 0x1A0
  0x0900000010CEAC28 AppLocalStart + 0x1F4
  0x0900000010C42F10 sqlelostWrp + 0x44
  0x0900000010C42FC8 sqleUCengnInit + 0x64
  0x0900000010D0AD18 sqleUCagentConnect + 0x2C0

The third is the db2taskp EDU, which holds the dbcb and prevents the shutdown from proceeding. This EDU simply loops trying to resume a task and does not respond to the interrupt:

  0x0900000010B54B50 sqloWaitEDUWaitPost + 0x1C4
  0x090000000DD19C48 sqeAgent::IntrptWaitLock + 0x680
  0x090000000DD658B4 ABPDispatcher::getTaskProAssignment + 0x74
  0x09000000109C3FA8 ABPAgent::getTaskProAssignment + 0x90
  0x09000000109C4170 ABPAgent::main + 0x110
  0x09000000109C440C sqeAgent::abpAgentEntryPoint + 0xE4
  0x0900000010D05C3C sqleIndCoordProcessRequest + 0x254
  0x0900000010A73E08 sqeAgent::RunEDU + 0x1E8
  0x0900000010A71320 sqzEDUObj::EDUDriver + 0xDC
  0x0900000010A71204 sqlzRunEDU + 0x24
  0x0900000010A7A4E4 sqloEDUEntry + 0x264

The db2taskp EDU also appears in db2diag.log with an entry like this:

  2016-12-19-10.14.19.580847+480 LEVEL: Warning
  PID     : 14811346  TID : 9923  PROC : db2sysc 0
  INSTANCE: dbuser  NODE : 000  DB : SAMPLE
  APPHDL  : 0-27124  APPID: *N0.DB2.161218213921
  AUTHID  : DBUSER
  EDUID   : 9923  EDUNAME: db2taskp (SAMPLE) 0
  FUNCTION: DB2 UDB, AIC, apdTaskProcessorCleanup, probe:194
  MESSAGE : ZRC=0x8012006D=-2146303891=SQLR_CA_BUILT
            "SQLCA has already been built"
  CALLED  : DB2 UDB, AIC, apdTaskProcessor
  RETCODE : ZRC=0x82A90066=-2102853530=ABP_SUSPEND_TASK_PRO
            "Suspend the task processor"
  DATA #1 : String, 28 bytes
  Source Table Schema and Name
  DATA #2 : String, 8 bytes
  SCHEMA1
  DATA #3 : String, 13 bytes
  TABLE1
  DATA #4 : String, 12 bytes
  Partition ID
  DATA #5 : unsigned integer, 2 bytes
  2
  DATA #6 : String, 28 bytes
  Target Table Schema and Name
  DATA #7 : String, 8 bytes
  SCHEMA1
  DATA #8 : String, 13 bytes
  TABLE1_PART2
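When checking whether a collected set of stacks matches this problem, it can help to label each stack by the marker functions described above. The following is a minimal Python sketch, not an IBM tool. It assumes each frame is written as "<address> <symbol> + <offset>", as in this record; real stack dump files may be laid out differently and would need their own parsing. The category labels and the choice of marker functions are illustrative assumptions drawn from the three stacks in this description.

```python
import re

# One frame as printed in this record: "<address> <symbol> + <offset>".
# Assumption: real stack dump files may use a different layout.
FRAME_RE = re.compile(r"0x[0-9A-Fa-f]+\s+(\S+)\s+\+\s+0x[0-9A-Fa-f]+")

# Marker symbols taken from the three stacks in the description.
# The labels on the left are hypothetical names, not DB2 terminology.
SIGNATURES = {
    "terminating_agent": {"sqeLocalDatabase::TermDbConnect"},
    "connecting_agent": {"sqloXlatchConflict", "StartUsingLocalDatabase"},
    "db2taskp_loop": {
        "sqeAgent::IntrptWaitLock",
        "ABPDispatcher::getTaskProAssignment",
    },
}


def symbols(stack_text):
    """Return the symbol names of all frames found in stack_text."""
    return FRAME_RE.findall(stack_text)


def classify(stack_text):
    """Return the first label whose marker symbols all appear in the stack."""
    found = set(symbols(stack_text))
    for label, required in SIGNATURES.items():
        if required <= found:
            return label
    return "unknown"


if __name__ == "__main__":
    sample = (
        "0x0900000010B54B50 sqloWaitEDUWaitPost + 0x1C4\n"
        "0x090000000DD19C48 sqeAgent::IntrptWaitLock + 0x680\n"
        "0x090000000DD658B4 ABPDispatcher::getTaskProAssignment + 0x74\n"
    )
    print(classify(sample))
```

A match on all three labels across the EDUs of one instance is only a hint that the pattern in this record may be present; it does not confirm the defect.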
Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* Database Partitioning Feature users                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 version 10.5.0.9 or later.                    *
****************************************************************
Local Fix: | |
Use the killdb2 command to force the instance to come down. | |
Solution | |
Workaround | |
not known / see Local fix | |
BUG-Tracking | |
forerunner :
follow-up : IT18942 IT18943 IT18944
Timestamps | |
Date - problem reported : 13.01.2017
Date - problem closed : 16.03.2017
Date - last modified : 16.03.2017
Problem solved at the following versions (IBM BugInfos) | |
Problem solved according to the fixlist(s) of the following version(s) |