DB2 - Problem description
| Problem IC61620 | Status: Closed |
A LOAD OPERATION THAT HAS BEEN TERMINATED UNEXPECTEDLY HANGS AND CANNOT BE FORCED OFF. | |
| product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
| Problem description: | |
A LOAD operation may hang and fail to respond to a FORCE
APPLICATION command.
This happens when the LOAD is trying to terminate with an error
but gets stuck during termination, so forcing the application
does not help.
The symptoms to look for are:
1) In the db2diag.log of the data nodes, you will notice the
following entry for the db2lload EDU:
2009-02-27-10.35.23.875795-300 I457104614E566 LEVEL: Error
PID : 28356 TID : 183127501152 PROC : db2sysc 2
INSTANCE: nytxt370 NODE : 002 DB : WISE
APPHDL : 1-342 APPID: 10.163.158.207.28662.0902271504
AUTHID : WISEETLT
EDUID : 5321 EDUNAME: db2lload 2
FUNCTION: DB2 UDB, database utilities, DIAG_ERROR, probe:0
DATA #1 : String, 143 bytes
LOADID: 251.2009-02-27-10.05.00.581420.1 (7;336)
Error requesting identity values. , -2141716477, (nil), Detected in file:sqluident.C, Line:559
Similar messages appear on the other data nodes as well.
2) From the LOAD MPP coordinator (the main db2agent serving
the load), you will see "client termination" messages:
a rollback, error -1224, and an interrupt being sent to all nodes.
2009-02-27-10.35.23.759797-300 I127307718E460 LEVEL: Severe
PID : 22844 TID : 191545469280 PROC : db2sysc 1
INSTANCE: nytxt370 NODE : 001 DB : WISE
APPHDL : 1-342 APPID: 10.163.158.207.28662.0902271504
AUTHID : WISEETLT
EDUID : 251 EDUNAME: db2agent (WISE) 1
FUNCTION: DB2 UDB, relation data serv, sqlrrbck, probe:150
MESSAGE : SQLEU_STATE2_LOAD_ROLLBACK_PENDING state is set

2009-02-27-10.35.23.760287-300 I127308179E553 LEVEL: Error
PID : 22844 TID : 191545469280 PROC : db2sysc 1
INSTANCE: nytxt370 NODE : 001 DB : WISE
APPHDL : 1-342 APPID: 10.163.158.207.28662.0902271504
AUTHID : WISEETLT
EDUID : 251 EDUNAME: db2agent (WISE) 1
FUNCTION: DB2 UDB, database utilities, DIAG_ERROR, probe:0
DATA #1 : String, 123 bytes
LOADID: 251.2009-02-27-10.05.00.581420.1 (7;336)
, -1224, 0xffffffff8012006d, Detected in file:sqlulxld_fetch.C, Line:1121

2009-02-27-10.35.23.792229-300 I127310927E561 LEVEL: Error
PID : 22844 TID : 191545469280 PROC : db2sysc 1
INSTANCE: nytxt370 NODE : 001 DB : WISE
APPHDL : 1-342 APPID: 10.163.158.207.28662.0902271504
AUTHID : WISEETLT
EDUID : 251 EDUNAME: db2agent (WISE) 1
FUNCTION: DB2 UDB, database utilities, DIAG_ERROR, probe:0
DATA #1 : String, 131 bytes
LOADID: 251.2009-02-27-10.05.00.581420.1 (7;336)
Interrupting all SAs , 0, (nil), Detected in file:sqlusMPPCoordinator.C, Line:3839

2009-02-27-10.35.23.792389-300 I127311489E509 LEVEL: Warning
PID : 22844 TID : 191545469280 PROC : db2sysc 1
INSTANCE: nytxt370 NODE : 001 DB : WISE
APPHDL : 1-342 APPID: 10.163.158.207.28662.0902271504
AUTHID : WISEETLT
EDUID : 251 EDUNAME: db2agent (WISE) 1
FUNCTION: DB2 UDB, database utilities, DIAG_NOTE, probe:0
DATA #1 : String, 79 bytes
LOADID: 251.2009-02-27-10.05.00.581420.1 (7;336)
Sending interrupt to node 1, 0

2009-02-27-10.35.23.793104-300 I127311999E509 LEVEL: Warning
PID : 22844 TID : 191545469280 PROC : db2sysc 1
INSTANCE: nytxt370 NODE : 001 DB : WISE
APPHDL : 1-342 APPID: 10.163.158.207.28662.0902271504
AUTHID : WISEETLT
EDUID : 251 EDUNAME: db2agent (WISE) 1
FUNCTION: DB2 UDB, database utilities, DIAG_NOTE, probe:0
DATA #1 : String, 79 bytes
LOADID: 251.2009-02-27-10.05.00.581420.1 (7;336)
Sending interrupt to node 2, 0
3) The stack traceback for the coordinator agent looks as
follows:
0000002A9718F900 _Z17sqlkdReceiveReplyP16sqlkdRqstRplyFmt + 0x0310
(/ms/dist/ibmdb2/PROJ/ds/9.5.2.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1)
0000002A968BE01A _Z25sqlkdReceiveIntrptRepliesP22SQLKD_INTERRUPT_FORMATP16sqlkdRqstRplyFmtP8SQLKD_CBb + 0x00e6
(/ms/dist/ibmdb2/PROJ/ds/9.5.2.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1)
0000002A9718DDEB _Z14sqlkdInterruptP22SQLKD_INTERRUPT_FORMATP5sqlcaP8sqlrr_cb + 0x1439
(/ms/dist/ibmdb2/PROJ/ds/9.5.2.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1)
0000002A986BF9F8 _ZN16sqlusCBDSChannel10iInterruptEPhS0_jP5sqlca + 0x0578
(/ms/dist/ibmdb2/PROJ/ds/9.5.2.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1)
...
4) From db2lload's stack trace we see that it is looping in
sqlulTermEDU(), waiting for all its child processes to terminate.
...
0000002A97CA604D sqlorest + 0x006b
(/ms/dist/ibmdb2/PROJ/ds/9.5.3.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1)
0000002A985323AC _Z12sqlulTermEDUP13SQLUCACB_TYPEiPi + 0x00b2
(/ms/dist/ibmdb2/PROJ/ds/9.5.3.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1)
...
From db2lmr's stack trace we see that it is in
sqloPdbSelectSocket() / select() functions:
...
0000002A9C247176 __select + 0x0066
(/lib64/tls/libc.so.6)
0000002A97CBA454 sqloPdbSelectSocket + 0x01be
(/ms/dist/ibmdb2/PROJ/ds/9.5.3.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1)
0000002A9870C111 _ZN23sqluCSerializableSocket18iSelectSocketForIOEv + 0x01b1
(/ms/dist/ibmdb2/PROJ/ds/9.5.3.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1)
0000002A9870B8DD _ZN23sqluCSerializableSocket5iNextEii + 0x01b3
(/ms/dist/ibmdb2/PROJ/ds/9.5.3.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1)
0000002A9870A898 _ZThn8_N23sqluCSerializableSocket5iNextEii + 0x000a
(/ms/dist/ibmdb2/PROJ/ds/9.5.3.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1)
0000002A987100F4 _ZN29sqlusCFormattedUserDataBuffer5iFillEP16sqluIMediaListIOPm + 0x015a
(/ms/dist/ibmdb2/PROJ/ds/9.5.3.2/.exec/x86_64.linux.2.6.glibc.2.3/lib64/libdb2e.so.1) | |
| Problem Summary: | |
****************************************************************
* USERS AFFECTED:
* All users of the LOAD command in DB2 9.7 eGA
****************************************************************
* PROBLEM DESCRIPTION:
* A DB2 LOAD operation attempts to terminate with an
* error but does not complete.
****************************************************************
* RECOMMENDATION:
* Upgrade to DB2 Version 9.7 fix pack 1 or higher.
*
* Use the links at the top of this page to download
* the fix pack.
**************************************************************** | |
| Local Fix: | |
| available fix packs: | |
DB2 Version 9.7 Fix Pack 1 for Linux, UNIX, and Windows | |
| Solution | |
First Fixed in DB2 Version 9.7 fix pack 1 | |
| Workaround | |
not known / see Local fix | |
| Timestamps | |
Date - problem reported : 20.06.2009
Date - problem closed : 11.02.2010
Date - last modified : 04.03.2010 | |
| Problem solved at the following versions (IBM BugInfos) | |
9.7.FP1 | |
| Problem solved according to the fixlist(s) of the following version(s) | |
9.7.0.1 | |
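
As a rough illustration of the "first fixed" information above, the sketch below compares a product level against 9.7.0.1. It assumes the level is already available as a dotted V.R.M.F string; it does not query DB2 itself. The names `FIRST_FIXED`, `parse_level`, and `has_fix` are hypothetical. Because this report lists a fix only for the 9.7 stream, the sketch declines to answer for any other release.

```python
# First fixed level taken from this report (9.7.0.1, Version 9.7 Fix Pack 1).
FIRST_FIXED = (9, 7, 0, 1)


def parse_level(level):
    """Turn a dotted level string such as '9.7.0.1' into a tuple of four ints."""
    parts = tuple(int(p) for p in level.strip().split("."))
    if len(parts) != 4:
        raise ValueError(f"expected V.R.M.F, got {level!r}")
    return parts


def has_fix(level):
    """True/False for a 9.7 level; None when the level is not in the 9.7 stream."""
    parts = parse_level(level)
    if parts[:2] != FIRST_FIXED[:2]:
        # Assumption of this sketch: other streams are not covered by this report.
        return None
    return parts >= FIRST_FIXED
```

For example, `has_fix("9.7.0.0")` evaluates to `False` and `has_fix("9.7.0.1")` to `True`.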