DB2 - Problem description
| Problem IT07864 | Status: Closed |
DB2 CRASHES AFTER SETTING DB2_RESOURCE_POLICY. | |
| product: | |
DB2 FOR LUW / DB2FORLUW / A10 - DB2 | |
| Problem description: | |
In some configurations, setting DB2_RESOURCE_POLICY enables
memory affinity for NUMA exploitation. As part of this
exploitation, various internal data structures are split into
multiple regions that are allocated from different nodes.
However, these regions are initialized incorrectly, which
results in memory corruption.
Affected configurations have more than 5 resources configured
or automatically detected. The number of resources is recorded
in db2diag.log after running db2start (see the "Number of
resource bindings" entry):
2015-03-12-12.12.24.271249-240 E6184A2642 LEVEL: Event
PID : 56819936 TID : 258 PROC : db2wdog_DB2NAME
INSTANCE: memmerto NODE : 000
HOSTNAME: p7302
EDUID : 258 EDUNAME: db2wdog_DB2NAME
FUNCTION: DB2 UDB, oper system services, sqloInitializeResourcePolicy, probe:20
DATA #1 : String, 2279 bytes
RESOURCE POLICY
Number of DB resource policies = 1
DATABASE RESOURCE POLICY
Database name = !GLBPOL!
Method = 1
Number of resource groups = 1
Number of resource bindings = 32
Round robin resource binding = 1
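The check described above can be scripted. The following is a minimal Python sketch, assuming the "Number of resource bindings" entry appears in db2diag.log exactly as in the excerpt above; the function names and the sample string are hypothetical and are not part of any DB2 tool.

```python
import re

# Matches the entry written by sqloInitializeResourcePolicy at db2start.
# Assumption: the entry always has the form "Number of resource bindings = N".
BINDINGS_RE = re.compile(r"Number of resource bindings\s*=\s*(\d+)")

# Threshold from the problem description: more than 5 resources.
AFFECTED_THRESHOLD = 5


def resource_binding_counts(diag_text):
    """Return every 'Number of resource bindings' value found in the text."""
    return [int(m.group(1)) for m in BINDINGS_RE.finditer(diag_text)]


def is_at_risk(diag_text):
    """Return True if any logged resource policy reports more than 5 bindings."""
    return any(n > AFFECTED_THRESHOLD for n in resource_binding_counts(diag_text))


# Sample text copied from the db2diag.log excerpt above.
sample = """RESOURCE POLICY
Number of DB resource policies = 1
DATABASE RESOURCE POLICY
Database name = !GLBPOL!
Method = 1
Number of resource groups = 1
Number of resource bindings = 32
Round robin resource binding = 1
"""

if __name__ == "__main__":
    print(resource_binding_counts(sample), is_at_risk(sample))
```

In practice the text would be read from the instance's db2diag.log rather than from a literal string. A count above 5 only indicates that the configuration matches the affected pattern, not that corruption has occurred.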
Two different behaviours may be seen as a result of this
corruption:
1) messages in db2diag.log regarding memory corruption during
database deactivation
2015-03-06-11.25.39.902046-300 E541790A1016 LEVEL: Critical
PID : 6292082 TID : 2058 KTID : 96010371
PROC : db2sysc
INSTANCE: memmerto NODE : 000 DB : SAMPLE
APPHDL : 0-7 APPID: *LOCAL.memmerto.150306162453
AUTHID : MEMMERTO HOSTNAME: p7302
EDUID : 2058 EDUNAME: db2agent (PMR56607)
FUNCTION: DB2 UDB, SQO Memory Management, sqloDiagnoseFreeBlockFailure, probe:10
MESSAGE : ADM14001C An unexpected and critical error has occurred: "Panic".
The instance may have been shutdown as a result. "Automatic" FODC
(First Occurrence Data Capture) has been invoked and diagnostic
information has been recorded in directory
"/home/memmerto/sqllib/db2dump/FODC_Panic_2015-03-06-11.25.39.890878_0000/".
Please look in this directory for detailed evidence about what happened
and contact IBM support if necessary to diagnose the problem.
-------Frame------ ------Function + Offset------
0x090000004D94261C sqle_panic__Fi + 0xA7C
0x090000004D958010 sqloCrashOnCriticalMemoryValidationFailure + 0x50
0x090000004D97E298 diagnoseMemoryCorruptionAndCrash__13SQLO_MEM_POOLFUlCPCcCb + 0x3F8
0x090000005823CB74 markAllAllocatedBlocksInvalid__17SqloChunkSubgroupCFv + 0x134
0x090000004D982A28 markAllAllocatedBlocksInvalid__13SQLO_MEM_POOLCFv + 0x88
0x090000004D96BBA0 sqloPurgeMemoryInSubPool + 0x400
0x090000004D96C334 sqloFreeMemorySubPool + 0x134
0x090000005004BBAC sqldTermDBCB__FP16sqeLocalDatabaseUl + 0x58C
0x090000004E3078B8 CleanDB__16sqeLocalDatabaseFbP5sqlca + 0x838
0x090000004E30FBD0 ExecuteDBShutdown__16sqeLocalDatabaseFP8sqeAgentPbP5sqlcai + 0x830
0x090000004E322320 TermDbConnect__16sqeLocalDatabaseFP8sqeAgentP5sqlcai + 0x3060
0x090000004E3A77AC AppStopUsing__14sqeApplicationFP8sqeAgentUcP5sqlca + 0x1A2C
0x09000000558BE774 sqleStartDb__FsP8SQLE_BWAP10sqledbdescP13sqledbdescextT1PcT2iT1lUl + 0x1414
2) runtime traps due to corrupted latches
2015-03-05-16.24.57.464727-300 E33801A3679 LEVEL: Severe (OS)
PID : 58917198 TID : 2402 PROC : db2sysc
INSTANCE: memmerto NODE : 000 DB : SAMPLE
APPHDL : 0-8 APPID: *LOCAL.memmerto.150305212457
AUTHID : MEMMERTO HOSTNAME: p7302
EDUID : 2402 EDUNAME: db2agent (SAMPLE)
FUNCTION: DB2 UDB, SQO Latch Tracing, SQLO_SLATCH_CAS64::releaseConflict, probe:300
MESSAGE : ZRC=0x870F00FB=-2029059845=SQLO_SLATCH_ERROR_HELDX_WITH_SHARED_HOLDERS
"shared latch found with both exclusive and shared holders. Latch likely corrupt."
CALLED : OS, -, unspecified_system_function
-------Frame------ ------Function + Offset------
0x090000000E6BB9D4 sqle_panic__Fi + 0x520
0x090000000B3661B4 dumpDiagInfoAndPanic__17SQLO_SLATCH_CAS64CFCPCcCUiCUlT3ClT3CiT1T3T7 + 0x308
0x090000000B365E44 releaseConflict__17SQLO_SLATCH_CAS64Fv@OL@21943 + 0x60
0x090000000CF2C9CC releaseConflict__17SQLO_SLATCH_CAS64Fv + 0x44
0x090000000CCF8DEC sqldScanOpen__FP8sqeAgentP14SQLD_SCANINFO1P14SQLD_SCANINFO2PPv + 0x1C0
0x090000000C6C8808 sqlrlini__FP8sqlrr_cbb + 0x380
0x090000000DB36804 sqlrr_appl_init__FP8sqeAgentP5sqlca + 0x5474
0x090000000DB05348 InitEngineComponents__14sqeApplicationFcP8sqeAgentP8SQLE_BWAP5sqlcaP22SQLESRSU_STATUS_VECTORT1 + 0x2CC4
0x090000000DAF9608 AppStartUsing__14sqeApplicationFP8SQLE_BWAP8sqeAgentcT3P5sqlcaPc + 0x820
0x090000000D680DBC AppLocalStart__14sqeApplicationFP14db2UCinterface + 0x540 | |
| Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* ALL                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 Version 10.1 and Fix Pack 5                   *
**************************************************************** | |
| Local Fix: | |
Discontinue the use of DB2_RESOURCE_POLICY if your configuration is at risk of being affected. | |
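Before applying the local fix, it can help to confirm whether the registry variable is set at all. The following is a minimal Python sketch that scans the text output of `db2set -all`, assuming each line has the form `[level] NAME=value`; the function name and the sample output are hypothetical illustrations, not captured from a real system.

```python
import re

# Assumption: `db2set -all` prints one variable per line as "[level] NAME=value",
# where level is a single letter such as e, u, n, i, or g.
VAR_RE = re.compile(
    r"^[ \t]*\[(\w)\][ \t]*DB2_RESOURCE_POLICY=(.*)$", re.MULTILINE
)


def resource_policy_settings(db2set_output):
    """Return (level, value) pairs for each DB2_RESOURCE_POLICY line found."""
    return [(m.group(1), m.group(2).strip()) for m in VAR_RE.finditer(db2set_output)]


# Hypothetical sample of `db2set -all` output, for illustration only.
sample_output = (
    "[i] DB2COMM=TCPIP\n"
    "[i] DB2_RESOURCE_POLICY=AUTOMATIC\n"
    "[g] DB2SYSTEM=p7302\n"
)

if __name__ == "__main__":
    print(resource_policy_settings(sample_output))
```

An empty result suggests the variable is not set at any registry level. If it is set, it is typically cleared with `db2set DB2_RESOURCE_POLICY=` followed by an instance restart; confirm the exact procedure against the DB2 documentation for your release.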
| Solution | |
The problem was first fixed in DB2 Version 10.1 Fix Pack 5 | |
| Workaround | |
not known / see Local fix | |
| Timestamps | |
Date - problem reported : 23.03.2015
Date - problem closed : 13.07.2015
Date - last modified : 13.07.2015 | |
| Problem solved at the following versions (IBM BugInfos) | |
| Problem solved according to the fixlist(s) of the following version(s) | |
| 10.1.0.5 |