DB2 - Problem description
| Problem IT02770 | Status: Closed |
CRASH DUE TO MEMORY CORRUPTION DURING PARALLEL HSJN | |
| product: | |
DB2 FOR LUW / DB2FORLUW / A10 - DB2 | |
| Problem description: | |
If SMP mode is enabled (the database manager configuration
setting INTRA_PARALLEL is set to ON) and a query that contains
a hash join is run, the hash join logic runs in parallel using
multiple DB2 agents.
A timing issue exists in this setup: if the first agent to start
initializing the hash join is still doing that work when one of
the other agents working on the hash join is interrupted, a
logic problem in the error handling and cleanup processing frees
memory while the first agent is still initializing.
As a result, the agents make incorrect changes to memory.
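The sequence described above can be illustrated with a small deterministic simulation. This is only a sketch of the general hazard, not DB2 code: the names `SharedHashJoinState`, `init_step`, `cleanup` and `run_interleaving` are hypothetical, and the "guarded" variant shows one common way to avoid this kind of race (deferring the free until initialization has finished), which is not necessarily how the fix was implemented.

```python
class SharedHashJoinState:
    """Hypothetical stand-in for memory shared by the hash join agents."""

    def __init__(self, guarded):
        self.guarded = guarded        # True: cleanup defers to the initializer
        self.buffer = [None] * 4      # models the shared allocation
        self.freed = False
        self.initializing = False
        self.cleanup_pending = False
        self.invalid_writes = 0       # writes attempted after the free

    def begin_init(self):
        self.initializing = True

    def init_step(self, slot):
        # The first agent fills in one slot of the shared structure.
        if self.freed:
            self.invalid_writes += 1  # real code would corrupt memory here
            return
        self.buffer[slot] = "ready"

    def end_init(self):
        self.initializing = False
        if self.cleanup_pending:      # run the deferred cleanup now
            self.cleanup_pending = False
            self._free()

    def cleanup(self):
        # Called on behalf of another agent after it has been interrupted.
        if self.guarded and self.initializing:
            self.cleanup_pending = True   # defer until initialization is done
            return
        self._free()

    def _free(self):
        self.buffer = None
        self.freed = True


def run_interleaving(guarded):
    """Replay the ordering: init starts, cleanup runs, init continues."""
    state = SharedHashJoinState(guarded)
    state.begin_init()
    state.init_step(0)
    state.cleanup()        # second agent is interrupted mid-initialization
    state.init_step(1)     # first agent keeps working
    state.end_init()
    return state.invalid_writes, state.freed
```

In the unguarded run the second initialization step lands after the free, which is the situation the description calls out. In the guarded run the memory is still released, but only after the first agent has finished.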
This issue is a regression of IC92447, IC92731, and IC95345 at
the following DB2 levels:
- v9.7 fp9
- v10.1 fp3 and fp4
- v10.5 fp3 and fp4
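To check an installation against the list above, the levels can be encoded as a small lookup. This is a minimal sketch: `AFFECTED_LEVELS` and `is_affected` are hypothetical names, the table contains only the levels listed in this report, and obtaining the release and fix pack of an instance (for example from `db2level` output) is left to the caller.

```python
# Affected levels as listed in this report: release -> fix pack numbers.
AFFECTED_LEVELS = {
    "9.7": {9},
    "10.1": {3, 4},
    "10.5": {3, 4},
}


def is_affected(release, fix_pack):
    """Return True if the release/fix pack pair is listed as affected."""
    return fix_pack in AFFECTED_LEVELS.get(release, set())
```

For example, `is_affected("10.1", 4)` is true, while `is_affected("10.1", 5)` is false, matching the fix level named in this report.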
Because the problem is memory corruption, a variety of
crash symptoms may result. The following
symptoms were seen when the problem was discovered:
Sample 1:
sqloCrashOnCriticalMemoryValidationFailure
diagnoseMemoryCorruptionAndCrash
diagnoseMemoryCorruptionAndCrash
.MemTreePut
sqlofmblkEx
sqlofmblkEx
sqleIntrptWaitPostTerm
sqleIntrptWaitPostTerm
sqlri_hsjnClose
sqlrihsjn
sqlriSectInvoke
sqlrr_smp_route
sqlrr_subagent_router
sqleSubRequestRouter
sqleProcessSubRequest
RunEDU
EDUDriver
sqloEDUEntry
Sample 2:
sqloxltc_track
sqlri_hsjnNewTuple
sqlrihsjnpd
.
sqlifnxt
.sqlirdk
.sqldIndexFetch
sqldRowFetch
sqlritaSimplePerm
sqlriExecThread
sqlrihsjn
sqlriExecThread
sqlrihsjn
sqlriExecThread
sqlrihsjn
sqlriSectInvoke
sqlrr_smp_route
sqlrr_subagent_router
sqleSubRequestRouter
sqleProcessSubRequest
RunEDU
EDUDriver
sqloEDUEntry | |
| Problem Summary: | |
****************************************************************
* USERS AFFECTED:
* v10.1 fp3 and fp4 users
****************************************************************
* PROBLEM DESCRIPTION:
* See Error Description
****************************************************************
* RECOMMENDATION:
* Upgrade to DB2 Version 10.1 FixPack 5 or higher
**************************************************************** | |
| Local Fix: | |
Any of the following changes can avoid the problem:
1) Turn off intra_parallel or set the query degree to 1
2) Disable hash join (query optimization level 3, or db2set DB2_HASH_JOIN=OFF) | |
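The options above map onto DB2 command line processor and `db2set` commands. The helper below is a hedged sketch that only builds the command strings and does not run them: `workaround_commands` and its option keys are hypothetical names, and using the `CURRENT DEGREE` and `CURRENT QUERY OPTIMIZATION` special registers for "query degree" and "query optimization level" is an assumption about which settings are meant. The `DB2_HASH_JOIN=OFF` value is taken as written in this report.

```python
def workaround_commands(option):
    """Return the command strings for one of the workaround options.

    The option keys are illustrative names, not DB2 terminology.
    """
    commands = {
        # Option 1: turn off intra-partition parallelism for the instance.
        "disable_intra_parallel": [
            "db2 update dbm cfg using INTRA_PARALLEL NO",
        ],
        # Option 1 (alternative): run queries with degree 1 in this session.
        "degree_1": [
            "db2 \"SET CURRENT DEGREE = '1'\"",
        ],
        # Option 2: lower the optimization level for this session.
        "query_opt_3": [
            "db2 \"SET CURRENT QUERY OPTIMIZATION = 3\"",
        ],
        # Option 2 (alternative): registry variable, as given in the report.
        "disable_hash_join": [
            "db2set DB2_HASH_JOIN=OFF",
        ],
    }
    return commands[option]
```

The two `SET` statements apply to the current connection only, whereas the `update dbm cfg` and `db2set` changes apply to the whole instance.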
| Solution | |
Problem was first fixed in Version 10.1 FixPack 5 | |
| Workaround | |
not known / see Local fix | |
| Timestamps | |
| Date - problem reported : | 24.06.2014 |
| Date - problem closed : | 14.07.2015 |
| Date - last modified : | 14.07.2015 |
| Problem solved at the following versions (IBM BugInfos) | |
| Problem solved according to the fixlist(s) of the following version(s) | |
| 10.1.0.5 |