DB2 - Problem description
| Problem IC77640 | Status: Closed |
STANDBY SHUTDOWN AFTER LOG RETRIEVE ATTEMPT FAILURE | |
| product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
| Problem description: | |
When a storage manager is set up on the standby, database activation
fails if it cannot retrieve a log file required for
recovery. This behavior is expected on a standard database, but
not on a HADR standby. The standby should instead move from local catchup to
remote catchup in order to fetch all the log files.
When the problem is hit, the return code from the userexit
program is usually 4 or 8.
Here are the related db2diag.log entries:
2011-05-21-15.22.48.976692-300 I227718A364 LEVEL: Warning
PID : 26476788 TID : 5142 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 5142 EDUNAME: db2logmgr (SAMPLE) 0
FUNCTION: DB2 UDB, data protection services, sqlpgRetrieveLogFile, probe:4130
MESSAGE : Started retrieve for log file S0249205.LOG.
2011-05-21-15.27.09.530887-300 E229599A511 LEVEL: Error
PID : 26476788 TID : 5142 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 5142 EDUNAME: db2logmgr (SAMPLE) 0
FUNCTION: DB2 UDB, data protection services, sqlpgUserexitLogAdminMsg, probe:1180
MESSAGE : ADM1835E The user exit program returned an error when retrieving log
file "S0249205.LOG" to "/db2/SAMPLE/log_dir/NODE0000/" for database
"SAMPLE". The error code was "8".
2011-05-21-15.27.09.544174-300 E230111A431 LEVEL: Warning
PID : 26476788 TID : 5142 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 5142 EDUNAME: db2logmgr (SAMPLE) 0
FUNCTION: DB2 UDB, data protection services, sqlpgRetrieveLogFile, probe:4165
MESSAGE : ADM1847W Failed to retrieve log file "S0249205.LOG" on chain "23" to
"/db2/SAMPLE/log_dir/NODE0000/".
2011-05-21-15.27.10.055514-300 I230543A469 LEVEL: Error
PID : 26476788 TID : 5913 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 5913 EDUNAME: db2lfr (SAMPLE) 0
FUNCTION: DB2 UDB, recovery manager, sqlplfrOpenExtentRetrieve, probe:225
MESSAGE : Received error from db2logmgr on retrieve of log 249205, rc:
DATA #1 : Hexdump, 4 bytes
0x0700000019FFD890 : 0000 0008 ....
2011-05-21-15.27.10.057751-300 I231013A478 LEVEL: Error
PID : 26476788 TID : 16193 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-8 APPID: *LOCAL.DB2.110521202116
EDUID : 16193 EDUNAME: db2redom (SAMPLE) 0
FUNCTION: DB2 UDB, recovery manager, sqlpPRecReadLog, probe:1275
RETCODE : ZRC=0x82100016=-2112880618=SQLPLFR_RC_RETRIEVE_FAILED
"Log could not be retrieved"
2011-05-21-15.27.10.437046-300 E233417A922 LEVEL: Critical
PID : 26476788 TID : 4370 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-8 APPID: *LOCAL.DB2.110521202116
EDUID : 4370 EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, base sys utilities, sqeLocalDatabase::MarkDBBad, probe:10
MESSAGE : ADM14001C An unexpected and critical error has occurred:
"DBMarkedBad". The instance may have been shutdown as a result.
"Automatic" FODC (First Occurrence Data Capture) has been invoked and
diagnostic information has been recorded in directory
"/db2/SAMPLE/db2dump/FODC_DBMarkedBad_2011-05-21-15.27.10.432900/".
Please look in this directory for detailed evidence about what
happened and contact IBM support if necessary to diagnose the
problem. | |
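The ADM1835E message above carries both the log file name and the userexit return code, so a retrieve failure of this kind can be spotted by scanning db2diag.log. The following is a minimal Python sketch, not an IBM tool: the function name `find_retrieve_failures` is hypothetical, and it assumes the message wording matches the excerpt shown here.

```python
import re

# Pattern for the ADM1835E message once line wrapping has been collapsed.
# Assumption: the wording matches the db2diag.log excerpt in this record.
ADM1835E = re.compile(
    r'ADM1835E .*?log\s+file "(?P<log>S\d+\.LOG)"'
    r'.*?error code was "(?P<rc>\d+)"'
)


def find_retrieve_failures(diag_text):
    """Return (log_file, return_code) pairs, one per ADM1835E message."""
    # db2diag.log wraps long messages, so collapse all whitespace first.
    flat = " ".join(diag_text.split())
    return [(m.group("log"), int(m.group("rc")))
            for m in ADM1835E.finditer(flat)]
```

Applied to the entries above, a call such as `find_retrieve_failures(open(path).read())`, where `path` points at the instance's db2diag.log, would report the failed log file together with return code 8.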
| Problem Summary: | |
****************************************************************
* USERS AFFECTED:
* ALL
****************************************************************
* PROBLEM DESCRIPTION:
* See Error Description
****************************************************************
* RECOMMENDATION:
* Upgrade to 9.7 FP6
**************************************************************** | |
| Local Fix: | |
1. Move the userexit program (the file should be
sqllib/bin/db2uext2) to another path, or rename it.
2. Start the standby. Because the userexit program is not available,
a message similar to this one is logged:
2011-07-05-20.49.52.653962-420 E123698E549 LEVEL: Error
PID : 12496 TID : 47739391961408 PROC : db2sysc
INSTANCE: sfbao NODE : 000
EDUID : 60 EDUNAME: db2logmgr (HADRDB)
FUNCTION: DB2 UDB, data protection services, sqlpgUserexitLogAdminMsg, probe:1170
MESSAGE : ADM1834E DB2 was unable to find the user exit program when
retrieving log file "S0003169.LOG" to
"/u/sfbao/sfbao/NODE0000/SQL00001/SQLOGDIR/" for database "HADRDB".
The error code was "24".
However, the log manager should return to lfr immediately after it
detects this error, and the standby will eventually start up, i.e.,
move into remote catchup state.
3. Once HADR reaches peer state, move or rename the userexit
program back. | |
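Steps 1 and 3 of the local fix amount to renaming the userexit program and later renaming it back. The sketch below illustrates this in Python under stated assumptions: the helper names and the `.disabled` suffix are hypothetical, and starting the standby and confirming peer state (step 2) remain manual.

```python
from pathlib import Path

SUFFIX = ".disabled"  # hypothetical suffix chosen for this sketch


def disable_userexit(program: Path) -> Path:
    """Step 1: rename the userexit program so DB2 cannot find it."""
    parked = program.with_name(program.name + SUFFIX)
    program.rename(parked)
    return parked


def restore_userexit(parked: Path) -> Path:
    """Step 3: rename the program back once HADR has reached peer state."""
    if not parked.name.endswith(SUFFIX):
        raise ValueError(f"{parked} was not parked by disable_userexit")
    original = parked.with_name(parked.name[: -len(SUFFIX)])
    parked.rename(original)
    return original
```

For example, `parked = disable_userexit(Path("sqllib/bin/db2uext2"))` before starting the standby, then `restore_userexit(parked)` after peer state is reached. The relative path is taken from step 1 and would need to be resolved against the instance home directory.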
| available fix packs: | |
DB2 Version 9.7 Fix Pack 6 for Linux, UNIX, and Windows | |
| Solution | |
Problem first fixed on DB2 Version 9.7 Fix Pack 6 | |
| Workaround | |
not known / see Local fix | |
| Timestamps | |
Date - problem reported : 20.07.2011
Date - problem closed : 05.12.2012
Date - last modified : 05.12.2012 | |
| Problem solved at the following versions (IBM BugInfos) | |
9.7.FP6 | |
| Problem solved according to the fixlist(s) of the following version(s) | |
| 9.7.0.6 |
|