DB2 - Problem description
| Problem IC85354 | Status: Closed |
logs could be mistakenly removed on new primary after takeover on AIX 64 OS | |
| product: | |
DB2 FOR LUW / DB2FORLUW / A10 - DB2 | |
| Problem description: | |
By design, during an HADR takeover the new primary deletes only
those logs that have already been archived on the old primary.
If logretain is enabled, no logs should be deleted.
However, when this APAR is hit, DB2 removes logs up to the first
active log even when logretain is enabled.
For a forced takeover, this APAR can cause reintegration to fail,
or prevent an auxiliary standby from synchronizing with the new
primary, because the logs at the takeover position have been
removed.
When this happens, messages similar to the following appear in
the new primary's db2diag.log:
2012-06-21-11.16.25.029452+540 I101122A414 LEVEL: Info
PID : 12582958 TID : 7327 PROC :
db2sysc 0
INSTANCE: db2inst1 NODE : 000
HOSTNAME: host1
EDUID : 7327 EDUNAME: db2hadrp.0.1 (SAMPLE) 0
FUNCTION: DB2 UDB, High Availability Disaster Recovery,
hdrSDoTakeover, probe:47262
DATA #1 : <preformatted>
Deleting old log files starting from 0 up to 3.
......
2012-06-21-11.16.25.681051+540 I109591A645 LEVEL: Error
PID : 12582958 TID : 5271 PROC :
db2sysc 0
INSTANCE: db2inst1 NODE : 000
HOSTNAME: host1
EDUID : 5271 EDUNAME: db2lfr.0 (SAMPLE) 0
FUNCTION: DB2 UDB, recovery manager, sqlplfrFMOpenLog, probe:25
MESSAGE : ZRC=0x860F000A=-2045837302=SQLO_FNEX "File not found."
DIA8411C A file "" could not be found.
DATA #1 : SQLPLFR_SCAN_ID, PD_TYPE_SQLPLFR_SCAN_ID, 8 bytes
LFR Scan Num = 10
LFR Scan Caller's EDUID = 7592
DATA #2 : String, 25 bytes
Problem opening log file:
DATA #3 : String, 12 bytes
S0000002.LOG
2012-06-21-11.16.25.681448+540 I110237A394 LEVEL:
Warning
PID : 12582958 TID : 5271 PROC :
db2sysc 0
INSTANCE: db2inst1 NODE : 000
HOSTNAME: host1
EDUID : 5271 EDUNAME: db2lfr.0 (SAMPLE) 0
FUNCTION: DB2 UDB, recovery manager, sqlplfrFMReadLog,
probe:5120
MESSAGE : Return code for LFR opening file S0000002.LOG was
-2045837302
2012-06-21-11.16.25.681665+540 I110632A599 LEVEL: Error
PID : 12582958 TID : 7592 PROC :
db2sysc 0
INSTANCE: db2inst1 NODE : 000
HOSTNAME: host1
EDUID : 7592 EDUNAME: db2hadrp.0.3 (SAMPLE) 0
FUNCTION: DB2 UDB, High Availability Disaster Recovery,
hdrEdu::hdrEduP, probe:20591
MESSAGE : ZRC=0x860F000A=-2045837302=SQLO_FNEX "File not found."
DIA8411C A file "" could not be found.
DATA #1 : <preformatted>
HADR primary database failed to read log pages for remote
catchup. sqlplfrScanNext scanPages = 0, scanFlagsOut = 0x2
The standby could also fail, with messages similar to the
following in its db2diag.log:
2012-06-21-11.16.25.857278-240 I3134549A594 LEVEL:
Warning
PID : 4784150 TID : 6427 PROC :
db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB :
SAMPLE
HOSTNAME: host1
EDUID : 6427 EDUNAME: db2hadrs.0.0 (SAMPLE) 0
FUNCTION: DB2 UDB, High Availability Disaster Recovery,
hdrEdu::hdrEduS, probe:21580
MESSAGE : ZRC=0x87800148=-2021654200=HDR_ZRC_BAD_LOG
"HADR standby found bad log"
DATA #1 : String, 99 bytes
HADR standby error handling: will close connection to primary,
then reconnect, and perform a retry. | |
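As an illustration only, the symptom shown above can be checked for mechanically. The following is a minimal Python sketch, not an IBM tool: it searches db2diag.log text for the hdrSDoTakeover deletion message and for log files that db2lfr failed to open, then reports the files whose number falls inside a deletion range. The function names are hypothetical, and treating the "starting from ... up to ..." bounds as inclusive is an assumption based on the example above.

```python
import re

# Message written by hdrSDoTakeover when old log files are deleted.
DELETE_RE = re.compile(r"Deleting old log files starting from (\d+) up to (\d+)")
# Message written by sqlplfrFMReadLog when a log file cannot be opened.
OPEN_FAIL_RE = re.compile(r"LFR opening file (S\d{7}\.LOG)")


def scan_diag(text):
    """Return (deletion_ranges, failed_log_names) found in db2diag.log text."""
    ranges = [(int(lo), int(hi)) for lo, hi in DELETE_RE.findall(text)]
    failed = sorted(set(OPEN_FAIL_RE.findall(text)))
    return ranges, failed


def log_number(name):
    """Extract the numeric part of a log file name such as S0000002.LOG."""
    return int(name[1:8])


def logs_lost_at_takeover(ranges, failed):
    """Failed log files whose number lies in a deletion range.

    Assumption: both bounds of the reported range are treated as inclusive.
    """
    return [
        name
        for name in failed
        if any(lo <= log_number(name) <= hi for lo, hi in ranges)
    ]
```

A non-empty result from logs_lost_at_takeover suggests, but does not prove, that the missing files were removed during takeover; the db2diag.log records themselves remain the authoritative evidence.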
| Problem Summary: | |
****************************************************************
* USERS AFFECTED:
* AIX
****************************************************************
* PROBLEM DESCRIPTION:
* See Error Description
****************************************************************
* RECOMMENDATION:
* Upgrade to DB2 UDB version 10.1 fix pack 2.
**************************************************************** | |
| Local Fix: | |
If the removed logs are still available on the old primary, copy them to the new primary's active log path, then restart the standby. | |
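The copy step of the local fix could be sketched as follows. This is a hedged Python example, not a supported procedure: the function name and both directory arguments are hypothetical, it assumes the old primary's log directory is reachable as a local path (for example over a shared or mounted file system), and it deliberately never overwrites a file that already exists on the new primary.

```python
import shutil
from pathlib import Path


def copy_missing_logs(old_primary_dir, new_primary_dir, first, last):
    """Copy log files S<first>..S<last> that exist on the old primary
    but are absent from the new primary's active log path.

    Returns the names of the files that were copied.
    """
    src = Path(old_primary_dir)
    dst = Path(new_primary_dir)
    copied = []
    for number in range(first, last + 1):
        name = f"S{number:07d}.LOG"  # e.g. S0000002.LOG
        source_file = src / name
        target_file = dst / name
        # Skip logs the old primary no longer has, and never
        # overwrite a log that is already present on the new primary.
        if source_file.is_file() and not target_file.exists():
            shutil.copy2(source_file, target_file)
            copied.append(name)
    return copied
```

For the example above, copy_missing_logs(old_dir, new_dir, 0, 3) would attempt S0000000.LOG through S0000003.LOG. Restarting the standby afterwards is still a manual step.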
| available fix packs: | |
DB2 Version 10.1 Fix Pack 2 for Linux, UNIX, and Windows | |
| Solution | |
The problem was first fixed in DB2 UDB Version 10.1 Fix Pack 2. | |
| Workaround | |
Not known; see Local Fix. | |
| Timestamps | |
Date - problem reported : | 17.07.2012 |
Date - problem closed : | 31.12.2012 |
Date - last modified : | 31.12.2012 |
| Problem solved at the following versions (IBM BugInfos) | |
| Problem solved according to the fixlist(s) of the following version(s) | |
| 10.1.0.2 |
| 10.5.0.2 |