DB2 - Problem description
Problem IT29256 | Status: Closed
db2instance -list may show the secondary CF in transient error state during catch up on busy systems.
Product:
DB2 FOR LUW / DB2FORLUW / B10 - DB2
Problem description:
When the CFs are very busy during secondary catch up, db2instance may show that the secondary CF is in error state due to a timeout.

Example db2instance -list output:

    ID    TYPE     STATE     HOME_HOST   CURRENT_HOST   ALERT
    --    ----     -----     ---------   ------------   -----
    0     MEMBER   STARTED   hostdb01    hostdb01       NO
    1     MEMBER   STARTED   hostdb02    hostdb02       NO
    2     MEMBER   STARTED   hostdb03    hostdb03       NO
    128   CF       PRIMARY   hostcf01    hostcf01       NO
    129   CF       ERROR     hostcf02    hostcf02       NO

Example db2diag.log entries:

    2019-03-05-18.33.07.795218+540 I1040042A511         LEVEL: Error
    PID     : 51970510             TID : 1              PROC : db2instance
    INSTANCE: instanceName         NODE : 000
    HOSTNAME: hostname
    EDUID   : 1
    FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
    DATA #1 : MgmntPort::send_cmd failed. status: 8006000a
    DATA #1 : If a CF return code is displayed above and you wish to get
              more information then please run the following command:
              db2diag -cfrc

    2019-03-05-18.33.07.795960+540 I1040554A1528        LEVEL: Severe
    PID     : 51970510             TID : 1              PROC : db2instance
    INSTANCE: instanceName         NODE : 000
    HOSTNAME: hostname
    EDUID   : 1
    FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for CF,
              SAL_MANAGEMENT_PORT_HANDLE::SAL_ManagementDuplexingFailed, probe:11582
    MESSAGE : ZRC=0x820001C7=-2113928761=SQLZ_RC_CA_DUPLEXING_STATUS_CHECK_FAILED
              "An attempt to determine CF duplexing status failed."
    DATA #1 : CF RC, PD_TYPE_SD_CF_RC, 4 bytes
    2147876874

Running 'db2diag -cfrc 8006000a', as suggested by the Error log entry above, shows that there was a timeout communicating with the CF:

    Input ZRC string '8006000a' parsed as 0x8006000A (-2147090422).

    Attempting to lookup value 0x8006000A (-2147090422) as a ZRC

    ZRC value to map: 0x94C6000A (-1798963190)

    ZRC class :
        Cluster caching facility management port errors and warnings (Class Index: 20)
    Component:
        CF_MGMNT ; cluster caching facility management (Component Index: 198)
    Reason Code:
        10 (0x000A)

    Identifer:
        CF_MGMNT_RECV_MRB_TIMEOUT
    Identifer (without component):
        SQLZ_RC_CF_MGMNT_RECV_MRB_TIMEOUT

    Description:
        Recv MRB timeout

    Associated information:
        Sqlcode -902
    SQL0902C  A system error occurred. Subsequent SQL statements cannot be
    processed. IBM software support reason code: "".
        Number of sqlca tokens : 1
        Diaglog message number: 1
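The log entries above show the same CF return code in three forms: the hexadecimal status 8006000a, the unsigned decimal 2147876874 in the "CF RC" data field, and the signed decimal -2147090422 printed by db2diag -cfrc. The following is a minimal sketch of that conversion, not part of any Db2 tooling; the helper name to_signed32 is hypothetical, and it assumes the codes are 32-bit two's-complement values, which is consistent with the figures in the output above.

```python
def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    return value - 0x100000000 if value & 0x80000000 else value


# Status string as printed by MgmntPort::send_cmd in db2diag.log
status = int("8006000a", 16)

print(status)                    # unsigned form, as in the "CF RC" data field
print(to_signed32(status))       # signed form, as printed by db2diag -cfrc
print(to_signed32(0x820001C7))   # signed form of the ZRC in the Severe entry
```

This only helps to match one representation of a code against another when searching db2diag.log; looking up the meaning of a code still requires db2diag -cfrc.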
Problem Summary:
****************************************************************
* USERS AFFECTED:                                              *
* Db2 pureScale environment                                    *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to Db2 Version 11.1 Modification 4 Fix Pack 6.       *
****************************************************************
Local Fix:
Run db2instance -list again at a later time; the error state shown for the secondary CF is transient.
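The local fix can be scripted as a simple re-check loop. The sketch below is illustrative only: the function names cfs_in_error and wait_for_cf, the retry count, and the delay are assumptions, and the parser assumes data rows follow the column layout of the example output above (ID, TYPE, STATE, ...). It does not diagnose a CF that stays in ERROR state.

```python
import subprocess
import time
from typing import List


def cfs_in_error(listing: str) -> List[str]:
    """Return the IDs of CF rows whose STATE column is ERROR."""
    ids = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows are assumed to look like:
        # ID TYPE STATE HOME_HOST CURRENT_HOST ALERT
        if (len(fields) >= 3 and fields[0].isdigit()
                and fields[1] == "CF" and fields[2] == "ERROR"):
            ids.append(fields[0])
    return ids


def wait_for_cf(retries: int = 5, delay_seconds: float = 60.0) -> bool:
    """Re-run db2instance -list until no CF reports ERROR (hypothetical helper).

    Returns True as soon as a listing shows no CF in ERROR state, and
    False if every attempt still shows one.
    """
    for attempt in range(retries):
        result = subprocess.run(
            ["db2instance", "-list"], capture_output=True, text=True
        )
        if not cfs_in_error(result.stdout):
            return True
        if attempt < retries - 1:
            time.sleep(delay_seconds)  # give the secondary CF time to catch up
    return False
```

Calling wait_for_cf() as the instance owner re-runs the command up to five times, one minute apart, under these assumed defaults.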
Solution
Workaround
not known / see Local fix
Timestamps
Date - problem reported : 23.05.2019
Date - problem closed   : 24.04.2020
Date - last modified    : 24.04.2020
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)