DB2 - Problem description
| Problem IC76505 | Status: Closed |
DATABASE CRASHED WHILE CLOSING THE TEMP FILE HANDLE DURING AN UNDO OPERATION. | |
| product: | |
DB2 FOR LUW / DB2FORLUW / 950 - DB2 | |
| Problem description: | |
During an UNDO operation, DB2 hit a disk full condition on the
temporary tablespace while trying to close a file handle.
As a result, the database was marked bad and subsequently came down.
Here is an example from the db2diag.log:
2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Error
(OS)
PID : xxxx TID : xxxxxxxxxxxx PROC :
db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID:
xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, oper system services, sqlobufreset, probe:10
MESSAGE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
DIA8312C Disk was full.
CALLED : OS, -, fsync OSERR: ENOSPC
(28)
DATA #1 : File handle, PD_TYPE_SQO_FILE_HDL, 8 bytes
0x0000002AA2BFB328 : 8A00 0000 8002 0000
........
DATA #2 : String, 105 bytes
Search for ossError*Analysis probe point after this log entry
for further
self-diagnosis of this problem.
2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Error
(OS)
PID : xxxx TID : xxxxxxxxxxxx PROC :
db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID:
xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 Common, OSSe, ossErrorIOAnalysis, probe:100
CALLED : OS, -, fsync OSERR: ENOSPC
(28)
DATA #1 : String, 132 bytes
A total of 4 analysis will be performed :
- User info
- ulimit info
- Target file info
- File system
Target file handle = 138
DATA #2 : String, 190 bytes
Real user ID of current process = xxxxx
Effective user ID of current process = xxxxx
Real group ID of current process = xxxx
Effective group ID of current process = xxxx
DATA #3 : String, 362 bytes
Current process limits (unit in bytes except for nofiles) :
mem (S/H) = unlimited / unlimited
core (S/H) = 0 / unlimited
cpu (S/H) = unlimited / unlimited
data (S/H) = unlimited / unlimited
fsize (S/H) = unlimited / unlimited
nofiles (S/H) = 65534 / 65534
stack (S/H) = 10485760 / unlimited
rss (S/H) = unlimited / unlimited
DATA #4 : String, 261 bytes
Target File Information :
Size = 16384
Link = No
Reference path = N/A
Type = 0x8000
Permissions = rw-------
UID = xxxxx
GID = xxxx
Last modified time = 1298765603
DATA #5 : String, 432 bytes
File System Information of the target file :
Block size = 32768 bytes
Total size = 402653184000 bytes
Free size = 32768 bytes
Total # of inodes = 24802560
FS name = xxxxxx:/xxx/xxxxxxxx/xxxxxxxxx
Mount point = /x/xxxxxx/xxx/xxxxxxxx/xxxxxxxxxxxxxxx
FSID = 27
FS type name = nfs
DIO/CIO mount opt = None
Device type = N/A
FS type = 0x6969
CALLSTCK:
[0] 0x0000002A9684F555 pdOSSeLoggingCallback + 0x91
[1] 0x0000002A9BB37C0B /xxx/.exec/x86_64.linux.2.6.glibc.2.3
/lib64/libdb2osse.so.1 + 0x1AEC0B
[2] 0x0000002A9BB38FEB ossLogSysRC + 0xBF
[3] 0x0000002A9BB2C6E0 /xxx/.exec/x86_64.linux.2.6.glibc.2.3
/lib64/libdb2osse.so.1 + 0x1A36E0
[4] 0x0000002A9BB2ACF1 ossErrorAnalysis + 0x25
[5] 0x0000002A97EBFAC4 sqloSystemErrorHandler + 0x6C0
[6] 0x0000002A96C2A3A7 sqlobufreset + 0xFB
[7] 0x0000002A968B2A94 sqlbWritePage + 0x1E0
[8] 0x0000002A98A3A187 _Z19sqlbGetPageFromDiskP11SQLB_FIX_CBi
+ 0x173D
[9] 0x0000002A98A2B843 _Z7sqlbfixP11SQLB_FIX_CB + 0xB73
2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC :
db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID:
xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, buffer pool services,
sqlbForceNewPagesToDisk, probe:1235
MESSAGE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
DIA8312C Disk was full.
[...]
2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC :
db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID:
xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, buffer pool services,
SqlbFhdlTbl::closeOneFile, probe:1000
DATA #1 : String, 38 bytes
Obj={pool:15;obj:2;type:16} State=x45
2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC :
db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID:
xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, buffer pool services,
SqlbFhdlTbl::closeOneFile, probe:0
DATA #1 : Object descriptor, PD_TYPE_SQLB_OBJECT_DESC, 72 bytes
Obj: {pool:15;obj:2;type:16} Parent={9;24}
lifeLSN: 01EA0C9C96B2
tid: 0 0 0
extentAnchor: 0
initEmpPages: 0
poolPage0: 0
poolflags: 111
objectState: 45
lastSMP: 0
pageSize: 16384
extentSize: 32
bufferPoolID: 4
partialHash: 268566543
bufferPool: 0x0000002ace3a3800
[...]
2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC :
db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID:
xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, data management, sqldReorgCleanup, probe:10
2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC :
db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID:
xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, data management, sqldmund, probe:719
RETCODE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
DIA8312C Disk was full.
[...]
2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC :
db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : x-xxxx APPID:
xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, data management, sqldmund, probe:719
MESSAGE : Error during UNDO of LSN:
DATA #1 : Hexdump, 6 bytes
0x0000002AA2BFBE12 : 01EA 0C9C 96B2
2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC :
db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : x-xxxx APPID:
xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, data management, sqldmund, probe:719
RETCODE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
DIA8312C Disk was full.
2011-0x-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC :
db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : x-xxxx APPID:
xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, data management, sqldmund, probe:719
MESSAGE : Error during UNDO of log record:
DATA #1 : Dumped object of size 5000 bytes at offset 0, 59 bytes
/xxx/xxxxxx/xxxxxxxx/sqllib/db2dump/xxxx.xxxxx.xxx.dump.bin
This APAR prevents the database from being marked bad when
a disk full condition is hit on a temporary tablespace during an
UNDO operation. | |
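The ZRC lines in the db2diag.log entries above print the same return code three ways: as a hex value, as a signed 32-bit integer, and as a symbolic name. The following is a minimal Python sketch, not part of the APAR, of how the hex and decimal forms relate. The helper names `zrc_to_signed` and `parse_zrc` are hypothetical and exist only for this illustration.

```python
import re

# Matches the "ZRC=<hex>=<signed decimal>=<symbolic name>" triple as it
# appears in the db2diag.log excerpt above.
ZRC_PATTERN = re.compile(r"ZRC=(0x[0-9A-Fa-f]+)=(-?\d+)=(\w+)")


def zrc_to_signed(zrc_hex: str) -> int:
    """Interpret a 32-bit hex value as a signed two's-complement integer."""
    unsigned = int(zrc_hex, 16)
    if not 0 <= unsigned <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit value, got " + zrc_hex)
    # If the top bit is set, the value is negative in two's complement.
    if unsigned & 0x80000000:
        return unsigned - (1 << 32)
    return unsigned


def parse_zrc(line: str):
    """Return (hex, signed decimal, name) from a ZRC line, or None if absent."""
    match = ZRC_PATTERN.search(line)
    if match is None:
        return None
    zrc_hex, decimal_text, name = match.groups()
    return zrc_hex, int(decimal_text), name


line = 'MESSAGE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."'
zrc_hex, logged_decimal, name = parse_zrc(line)
# The decimal printed in the log is the signed reading of the hex value.
print(name, zrc_to_signed(zrc_hex) == logged_decimal)
```

Searching a db2diag.log for the symbolic name (here `SQLO_DISK`) is usually easier than matching on either numeric form, since both numbers encode the same code.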
| Problem Summary: | |
****************************************************************
* USERS AFFECTED:
* Users with insufficient TEMP disk space.
****************************************************************
* PROBLEM DESCRIPTION:
* Without this APAR, customer is exposed to the issue as
* described in the "ERROR DESCRIPTION" section.
****************************************************************
* RECOMMENDATION:
* Upgrade to DB2 Version 9.5, Fixpack 9.
**************************************************************** | |
| Local Fix: | |
Be certain to have enough TEMP disk space to eliminate the possibility of hitting a 'disk full' condition. | |
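One way to act on this advice is to check the free space on the file system that holds the temporary tablespace before it runs out. The following is a minimal Python sketch under stated assumptions, not an IBM-provided tool: the function name `has_enough_temp_space` and the threshold are hypothetical, and the path passed in must be whatever directory actually backs the TEMP containers on the system in question.

```python
import shutil


def has_enough_temp_space(path: str, min_free_bytes: int) -> bool:
    """Return True if the file system containing `path` has at least
    `min_free_bytes` free.

    `path` is assumed to be a directory on the file system that backs the
    temporary tablespace; `min_free_bytes` is a site-specific threshold.
    """
    if min_free_bytes < 0:
        raise ValueError("min_free_bytes must be non-negative")
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes


def free_fraction(path: str) -> float:
    """Return the free share of the file system containing `path` (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total if usage.total else 0.0
```

In the log above, the file system reported `Free size = 32768 bytes` out of `Total size = 402653184000 bytes`, so any reasonable threshold passed to `has_enough_temp_space` would have flagged it well before the `fsync` call failed with `ENOSPC`.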
| available fix packs: | |
DB2 Version 9.5 Fix Pack 9 for Linux, UNIX, and Windows | |
| Solution | |
First fixed in DB2 Version 9.5, Fixpack 9. | |
| Workaround | |
Be certain to have enough TEMP disk space to eliminate the possibility of hitting a 'disk full' condition. | |
| BUG-Tracking | |
forerunner : APAR is sysrouted TO one or more of the following: IC79958 IC84087 IC84653
follow-up : | |
| Timestamps | |
Date - problem reported : 20.05.2011
Date - problem closed : 09.03.2012
Date - last modified : 09.03.2012 | |
| Problem solved at the following versions (IBM BugInfos) | |
9.5.FP9 | |
| Problem solved according to the fixlist(s) of the following version(s) | |
9.5.0.9 | |