DB2 - Problem description
| Problem IC73855 | Status: Closed |
When working with Federation Server, a DB2 instance may trap while handling ENOMEM (12) "There is not enough memory available now." | |
| product: | |
DB2 FOR LUW / DB2FORLUW / 910 - DB2 | |
| Problem description: | |
The instance crashed while running an SQL statement, with the
following stack:
0x09000000030DFFE0 @124@9@ossLockTestGet__FPVi + 0x18
0x09000000030E00DC sqloXlatchAIX__FP12sqloSpinLockUlCPCcCUl@glueE1 + 0x94
0x090000000308567C sqlerTakeLibLatch__FP27SQLER_LOADED_LIB_HASH_TABLEiT2 + 0x58
0x09000000030857A8 sqlerLibraryLoad + 0x64
0x090000000418F420 sqlriFedOneTimeInitRtn__FP8sqlrr_cbP10sqlri_ufob + 0x308
0x090000000418EE0C sqlriFedInvokeInvoker__FP10sqlri_ufobP14sqlqg_Fmp_Info + 0xB4
0x090000000358C0CC sqlqg_Call_FMP_Thread__FP17sqlqg_FMP_RequestPP15sqlqg_FMP_Reply + 0x394
0x0900000003C4A544 sqlqgDeleteServer__FPP14UnfencedServer16Sqlqg_DeleteType + 0x2BC
0x0900000002833D88 sqlqgFindServer__FPUc12Sqlqg_OpTypePP14UnfencedServerUi + 0x3C4
0x09000000041C378C sqlqgOpen__FP12sqlri_rquery + 0x9EC
0x090000000421E074 sqlri_djx_rta__FP8sqlrr_cb + 0x460
0x09000000030D4020 sqlriunn__FP8sqlrr_cbP10sqlri_stob + 0x24
0x09000000030D419C sqlriset__FP8sqlrr_cb + 0x78
0x09000000030CAF28 sqlriSectInvoke__FP8sqlrr_cbP12sqlri_opparm + 0x4
0x0900000003019240 sqlrr_process_fetch_request__FP14db2UCinterface + 0x110
0x0900000002E825FC sqlrr_open__FP14db2UCinterfaceP15db2UCCursorInfo + 0x2C
0x0900000002F72518 sqljs_ddm_opnqry__FP14db2UCinterfaceP14sqljsDDMObject + 0xA94
Prior to this crash, db2diag.log contained many errors
reporting "No Memory Available", e.g.:
2010-12-07-13.42.40.631290+480 E1791023315A732 LEVEL: Error (OS)
PID : 1355956 TID : 1 PROC : db2agent (SAMPLE) 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-525 APPID: GA990B88.A410.11C700223017
AUTHID : HUJINPEI
FUNCTION: DB2 UDB, oper system services, sqloAIXLoadModuleTryShr, probe:130
CALLED : OS, -, dlopen
OSERR : ENOMEM (12) "There is not enough memory available now."
MESSAGE : Attempt to load specified library failed.
DATA #1 : Library name or path, 41 bytes
/hujinpei/db2inst1/sqllib/lib64/libdb2qgstp.a
DATA #2 : shared library load flags, PD_TYPE_LOAD_FLAGS, 4 bytes
2010-12-07-13.42.40.649557+480 I1791025558A485 LEVEL: Error
PID : 1355956 TID : 1 PROC : db2agent (SAMPLE) 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-525
AUTHID : HUJINPEI
FUNCTION: DB2 UDB, runtime interpreter, sqlriFedOneTimeInitRtn, probe:40
RETCODE : ZRC=0x8B0F0000=-1961951232=SQLO_NOMEM
"No Memory Available (reason code is id of requested heap)"
DIA8300C A memory heap error has occurred.
2010-12-07-13.43.37.708714+480 E1791083624A1533 LEVEL: Warning (OS)
PID : 1355956 TID : 1 PROC : db2agent (SAMPLE) 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-1417 APPID: GA990B88.CE10.11C700223137
<app 1417 saw NOMEM>
AUTHID : HUJINPEI
FUNCTION: DB2 UDB, SQO Memory Management, sqloLogMemoryCondition, probe:100
CALLED : OS, -, malloc
OSERR : ENOMEM (12) "There is not enough memory available now."
MESSAGE : Private memory and/or virtual address space exhausted, or data ulimit exceeded
DATA #1 : Soft data resource limit, PD_TYPE_RLIM_DATA_CUR, 8 bytes
251657728
DATA #2 : Requested size, PD_TYPE_MEM_REQUESTED_SIZE, 8 bytes
266240
DATA #3 : Current set size, PD_TYPE_SET_SIZE, 8 bytes
215678976
CALLSTCK:
[0] 0x0900000002DE3340 sqloLogMemoryCondition + 0x26C
[1] 0x0900000002DE3018 sqloLogMemoryCondition@glue236 + 0x74
[2] 0x09000000031A1F3C sqlogmblkEx + 0xC
[3] 0x09000000041E9600 allocMgr__13sqlqg_memPoolFP19sqlqg_memEntityInfo + 0x7C
[4] 0x09000000041E8F68 getMgrs__13sqlqg_memPoolFP19sqlqg_memEntityInfoP17sqlqg_memMgrsInfo + 0x1A0
[5] 0x09000000041E7E54 initialize_server__14UnfencedServerFP11Server_Infoi + 0x260
[6] 0x09000000041E77BC initialize_server__25UnfencedRelational_ServerFP11Server_Infoi + 0x670
[7] 0x0900000003C4AB88 get_server__7WrapperFPUcP11Server_InfoPi + 0x14C
[8] 0x09000000028340D4 sqlqgFindServer__FPUc12Sqlqg_OpTypePP14UnfencedServerUi + 0x710
[9] 0x09000000041C378C sqlqgOpen__FP12sqlri_rquery + 0x9EC | |
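The last db2diag.log entry reports a soft data resource limit (PD_TYPE_RLIM_DATA_CUR) of 251657728 bytes, and its message names an exceeded data ulimit as one possible cause. As a rough diagnostic aid, the sketch below reads the data-segment limit of the calling process with the POSIX getrlimit call. It assumes that PD_TYPE_RLIM_DATA_CUR corresponds to the soft RLIMIT_DATA value, which the record does not state outright. The helper names readDataLimit and formatLimit are hypothetical and are not part of DB2. The code reports the limit of whichever process runs it, so it would have to run under the instance owner's environment to approximate what a db2agent process sees.

```cpp
#include <sys/resource.h>
#include <string>

// Hypothetical helper: renders a limit as text, mapping RLIM_INFINITY
// to "unlimited" the way `ulimit -d` does.
std::string formatLimit(rlim_t value) {
    if (value == RLIM_INFINITY) {
        return "unlimited";
    }
    return std::to_string(static_cast<unsigned long long>(value));
}

// Hypothetical helper: reads the soft and hard data-segment limits of the
// calling process. Returns false if getrlimit fails.
bool readDataLimit(rlim_t* softOut, rlim_t* hardOut) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_DATA, &rl) != 0) {
        return false;
    }
    *softOut = rl.rlim_cur;  // assumed counterpart of PD_TYPE_RLIM_DATA_CUR
    *hardOut = rl.rlim_max;
    return true;
}
```

Calling readDataLimit and printing both values through formatLimit gives a quick comparison against the 251657728 bytes logged above.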
| Problem Summary: | |
USERS AFFECTED: ALL
PROBLEM DESCRIPTION: See the problem description above (instance crash with the listed stack, preceded by "No Memory Available" errors in db2diag.log).
RECOMMENDATION: Upgrade to DB2 Version 9.1 Fix Pack 11. | |
| Local Fix: | |
| available fix packs: | |
DB2 Version 9.1 Fix Pack 11 for Linux, UNIX and Windows | |
| Solution | |
In sqlriFedOneTimeInitRtn, add a check in the else block for the
case where librariesTable is null, as shown below. This can avoid
the trap.
if (djfmp_appCB->dj_handle == NULL)
{
    ......
} else {
    <Check librariesTable here and throw 901 if null>
    // Look up the datajoiner library for the invokercb (it's already
    // in the hash table)
    rc = sqlerLibraryLoad ("", // don't care about the routinename
                           "", // or specific name
                           libPath,
                           (char *)QUERY_GATEWAY_STP_LIBRARY,
                           "", // don't care about the func name
                           l_sqlr_rcb->agent_private_cbp->librariesTable,
                           &invptr->pLoadedLib,
                           FALSE,
                           invptr,
                           NULL,
                           l_sqlr_rcb->sqlca);
    ......
}
//@ed242033tjv | |
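The shape of that fix, a null check placed ahead of any call that would latch the table, can be sketched in isolation. The code below is only an illustration and is not DB2 source. LoadedLibTable, lookupLibrary and the RC_* constants are hypothetical stand-ins, and the value -901 merely echoes the "throw 901" wording in the note above.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a per-agent table of loaded libraries.
struct LoadedLibTable {
    std::unordered_map<std::string, int> handles;
};

constexpr int RC_OK = 0;
constexpr int RC_NOT_FOUND = -1;
constexpr int RC_INTERNAL_ERROR = -901;  // assumption: echoes "throw 901"

// Looks up a library handle by name. The table pointer is validated first,
// so a table that was never allocated (for example after an earlier
// out-of-memory failure) yields an error code instead of a dereference
// of a null pointer.
int lookupLibrary(const LoadedLibTable* table,
                  const std::string& name,
                  int* handleOut) {
    if (table == nullptr) {
        return RC_INTERNAL_ERROR;
    }
    auto it = table->handles.find(name);
    if (it == table->handles.end()) {
        return RC_NOT_FOUND;
    }
    *handleOut = it->second;
    return RC_OK;
}
```

Returning an error code at this point lets the caller fail the statement while the process keeps running, which is the behavior the Solution text aims for.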
| Workaround | |
not known / see Local fix | |
| BUG-Tracking | |
forerunner : APAR is sysrouted TO one or more of the following: IC75480 follow-up : | |
| Timestamps | |
Date - problem reported : 12.01.2011
Date - problem closed : 07.03.2011
Date - last modified : 26.04.2011 | |
| Problem solved at the following versions (IBM BugInfos) | |
9.0.1, 9.1.FP11 | |
| Problem solved according to the fixlist(s) of the following version(s) | |
| 9.1.0.11 |
|