DB2 - Problem description
Problem IT29822 | Status: Closed |
HADR STANDBY WITH ROS CAN HANG DURING ENDING OF REPLAY ONLY WINDOW | |
product: | |
DB2 FOR LUW / DB2FORLUW / B10 - DB2 | |
Problem description: | |
This problem can be observed when the HADR standby log replay position is not moving and db2diag.log contains the "Replay only window is active" message without a matching "Replay only window is inactive, connections to Active Standby are allowed" message. If a stack trace is collected, a db2redow thread can be seen stuck in the function sqlprHADRROSRedoWorkerWaitForTCBRefresh() with a stack similar to the following:

0x00002ACE8B832625 _Z25ossDumpStackTraceInternalmR11OSSTrapFileiP7siginfoPvmm + 0x0385
0x00002ACE8B83222C ossDumpStackTraceV98 + 0x002c
0x00002ACE8B82D32D _ZN11OSSTrapFile6dumpExEmiP7siginfoPvm + 0x00fd
0x00002ACE85E855CF sqlo_trce + 0x03ef
0x00002ACE85EDB905 sqloDumpDiagInfoHandler + 0x0105
address: 0x00002ACE7ECF45E0 ; dladdress: 0x00002ACE7ECE5000 ; offset in lib: 0x000000000000F5E0 ;
0x00002ACE7ECF3E4D __nanosleep + 0x002d
0x00002ACE8B818D11 ossSleep + 0x0051
0x00002ACE83A5F9F5 sqlorest + 0x00e5
0x00002ACE85FDDDD5 _Z39sqlprHADRROSRedoWorkerWaitForTCBRefreshP8sqeAgentP10SQLPR_PRCB + 0x01f5
0x00002ACE85FE4CCD _Z15sqlpPRecProcLogP8sqeAgentP8SQLP_ACBP14sqlpMasterDbcb + 0x0d7d
0x00002ACE85FC0AB0 _Z20sqlpParallelRecoveryP8sqeAgentP5sqlca + 0x06f0
0x00002ACE851D4FFE _Z26sqleSubCoordProcessRequestP8sqeAgent + 0x00de
0x00002ACE82CD7E04 _ZN8sqeAgent6RunEDUEv + 0x0824
0x00002ACE84279CA4 _ZN9sqzEDUObj9EDUDriverEv + 0x00f4
0x00002ACE83ACD617 sqloEDUEntry + 0x02f7
address: 0x00002ACE7ECECE25 ; dladdress: 0x00002ACE7ECE5000 ; offset in lib: 0x0000000000007E25 ;
0x00002ACE8C65E34D clone + 0x006d

This problem is due to an extreme timing scenario where the first replay only window is started before all the db2redow threads are fully initialized. Recycling the instance and activating the standby database can usually avoid the problem. | |
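The db2diag.log symptom described above (an "active" message with no later "inactive" message) can be checked mechanically. The following is a minimal Python sketch, not an IBM tool: the function name `replay_only_window_open` is hypothetical, and it assumes the two message texts appear in the log exactly as quoted in this description. A result of True only means the last window message seen was "active"; a window that is still legitimately in progress looks the same, so the replay position should also be checked.

```python
# Message fragments as quoted in the problem description (assumed to appear verbatim).
ACTIVE_MSG = "Replay only window is active"
INACTIVE_MSG = "Replay only window is inactive"


def replay_only_window_open(lines):
    """Return True if the most recent replay-only-window message is 'active'.

    `lines` is any iterable of text lines, for example an open db2diag.log file.
    """
    window_open = False
    for line in lines:
        if INACTIVE_MSG in line:
            window_open = False   # window was closed, connections allowed again
        elif ACTIVE_MSG in line:
            window_open = True    # window opened, waiting for a matching close
    return window_open


if __name__ == "__main__":
    import sys

    # Usage: python check_ros_window.py /path/to/db2diag.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as log_file:
            print(replay_only_window_open(log_file))
```

The scan keeps only the state of the last message seen, so it works on logs of any size without loading them into memory.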
Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* ALL                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to Db2 11.1 Mod 4 Fixpack 5 or higher                *
****************************************************************
Local Fix: | |
The root cause is an extreme timing scenario during startup of the HADR standby database. Recycling the standby instance and activating the standby database again can usually avoid the problem. | |
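The local fix above can be sketched as a short command sequence. This is a hedged illustration in Python, not an IBM-provided procedure: the helper names `recycle_commands` and `run_recycle` are hypothetical, the use of `db2stop force` (rather than a plain `db2stop`) is an assumption made because the hung standby database is still active, and the script is assumed to run as the instance owner on the standby host with the Db2 environment already set up. By default it only prints the commands.

```python
import subprocess


def recycle_commands(db_name):
    """Build the command sequence: stop instance, start instance, activate standby."""
    return [
        ["db2stop", "force"],               # assumption: force is needed while the standby is hung
        ["db2start"],
        ["db2", "activate", "db", db_name],  # activating the database restarts the HADR standby
    ]


def run_recycle(db_name, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in recycle_commands(db_name):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing command


if __name__ == "__main__":
    # "SAMPLE" is a placeholder database name.
    run_recycle("SAMPLE", dry_run=True)
```

Keeping the command list separate from its execution makes it possible to review the exact sequence before running it against a production standby.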
Solution | |
Workaround | |
NA | |
Timestamps | |
Date - problem reported : 24.07.2019
Date - problem closed : 16.01.2020
Date - last modified : 16.01.2020
Problem solved at the following versions (IBM BugInfos) | |
Problem solved according to the fixlist(s) of the following version(s) |