
DB2 - Problem description

Problem IT29277 Status: Closed

IN PURESCALE, WHILE USING I/O DRAWER, DB2 MEMBER MAY GO DOWN WHEN THERE IS
A PROBLEM WITH A ROCE PORT

product:
DB2 FOR LUW / DB2FORLUW / B10 - DB2
Problem description:
When a RoCE port that is configured for HA encounters a problem,
one of the Db2 members may go down.

In this case, the db2diag.log shows the following entries:

2018-09-07-05.27.43.138008+540 I2379A709 LEVEL: Severe
PID : 15597810 TID : 139862 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : DUIT
APPHDL : 0-10933 APPID: *N0.db2inst1.180905095247
AUTHID : DB2IRS HOSTNAME: host21
EDUID : 139862 EDUNAME: db2agent (DUIT) 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : 
xport_send: dat_ep_post_rdma_write of the MCB failed:
0x80040000. EP: 0x1111177d0
DATA #1 : 
If a CF return code is displayed above and you wish to get
more information then please run the following command:
...
2018-09-07-05.27.43.152685+540 I8731A746 LEVEL: Error
PID : 15597810 TID : 102875 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : DUIT
HOSTNAME: host21
EDUID : 102875 EDUNAME: db2XInot GBP 2-0 (DUIT) 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : 
link_status_write: do_dequeue for link status Buffer FAILED dest
Address: 0x111b86f68 RKEY = 0x4ee00 len = 4, src Address: 0x
121146ac LKEY = 0x36700 len = 4 status = 0x80090020, ep =
0x12114c50
DATA #1 : 
If a CF return code is displayed above and you wish to get
more information then please run the following command:
db2diag -cfrc 
...
2018-09-07-05.27.43.154096+540 I10195A6128 LEVEL: Event
PID : 15597810 TID : 102875 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : DUIT
HOSTNAME: host21
EDUID : 102875 EDUNAME: db2XInot GBP 2-0 (DUIT) 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for
CF, SAL_GBP_HANDLE::SAL_CheckXiLink, probe:204
MESSAGE : CA RC= 2148073504
DATA #1 : String, 59 bytes
Detected broken XI connection, attempt reset operation now.
DATA #2 : Codepath, 8 bytes
7:15
DATA #3 : unsigned integer, 8 bytes
1
DATA #4 : SAL CF Index, PD_TYPE_SAL_CF_INDEX, 8 bytes
2
DATA #5 : SAL CF Node Number, PD_TYPE_SAL_CF_NODE_NUM, 2 bytes
129
DATA #6 : String, 49 bytes
current xi cf-server/member-devname/adapter-index
DATA #7 : SAL CF Server Name, PD_TYPE_SAL_CF_SERVER_NAME, 13
bytes
host22-en1
DATA #8 : SAL Member Device Name,
PD_TYPE_SAL_MEMBER_DEVICE_NAME, 4 bytes
hca0
DATA #9 : Connection pool link adapter number,
PD_TYPE_SAL_ADAPTER_NUMBER, 8 bytes
0
...
2018-09-07-05.27.43.156303+540 I17603A738 LEVEL: Error
PID : 15597810 TID : 101309 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : DUIT
HOSTNAME: host21
EDUID : 101309 EDUNAME: db2LLMn2 (DUIT) 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : 
link_status_write: do_dequeue for link status Buffer FAILED dest
Address: 0x111b882e8 RKEY = 0x10500 len = 4, src Address: 0x
185ac29c LKEY = 0x16800 len = 4 status = 0x80090020, ep =
0x185bd5d0
DATA #1 : 
If a CF return code is displayed above and you wish to get
more information then please run the following command:
db2diag -cfrc 
...
2018-09-07-05.27.43.161216+540 I21396A630 LEVEL: Error
PID : 15597810 TID : 101309 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : DUIT
HOSTNAME: host21
EDUID : 101309 EDUNAME: db2LLMn2 (DUIT) 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : 
notify_disconnect(close): dat_ep_disconnect failed: 0x80030000,
EP: 0x1185bd5d0 Token: 0x1a000
DATA #1 : 
If a CF return code is displayed above and you wish to get
more information then please run the following command:
db2diag -cfrc 
...
2018-09-07-05.27.43.167388+540 I30439A4907 LEVEL: Event
PID : 15597810 TID : 102106 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : DUIT
HOSTNAME: host21
EDUID : 102106 EDUNAME: db2XInot SCA 2-0 (DUIT) 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for
CF, SAL_GBP_HANDLE::SAL_CheckXiLink, probe:204
MESSAGE : CA RC= 2148073504
DATA #1 : String, 59 bytes
Detected broken XI connection, attempt reset operation now.
...
2018-09-07-05.27.43.185042+540 E53804A4857 LEVEL: Error
PID : 15597810 TID : 139862 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : DUIT
APPHDL : 0-10933 APPID: *N0.db2inst1.180905095247
AUTHID : DB2IRS HOSTNAME: host21
EDUID : 139862 EDUNAME: db2agent (DUIT) 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for
CF,
SAL_MANAGEMENT_PORT_HANDLE::SAL_ManagementQueryKillConnection,
probe:12678
MESSAGE : ECF=0x94C6004D=-1798963123
DATA #1 : CF RC, PD_TYPE_SD_CF_RC, 4 bytes
2147876941
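
The decimal return codes in the SAL entries can be compared with the hexadecimal status values in the pdLogCaPrintf entries by converting between the two forms. The following Python sketch is an illustration only and is not part of the original problem report; the helper names to_hex32 and to_signed32 are hypothetical, and the input values are copied from the log entries above.

```python
def to_hex32(value: int) -> str:
    """Format a 32-bit code (signed or unsigned) as 0x-prefixed hex."""
    return f"0x{value & 0xFFFFFFFF:08X}"


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# Values copied from the db2diag.log excerpt above.
ca_rc = 2148073504   # "MESSAGE : CA RC= 2148073504"
cf_rc = 2147876941   # "DATA #1 : CF RC, PD_TYPE_SD_CF_RC"
ecf = 0x94C6004D     # "MESSAGE : ECF=0x94C6004D=-1798963123"

print(to_hex32(ca_rc))    # 0x80090020
print(to_hex32(cf_rc))    # 0x8006004D
print(to_signed32(ecf))   # -1798963123
```

The first result equals the "status = 0x80090020" value in the link_status_write entries, which suggests that the CA RC reported by SAL_CheckXiLink is the same code in decimal form.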

The stack file shows the following function call stack:


-------Frame------ ------Function + Offset------
0x090000000057FF14 pthread_kill + 0xD4
0x090000000057F764 _p_raise + 0x44
0x0900000000039E68 raise + 0x48
0x0900000000056864 abort + 0xC4
0x0900000004A59CF8 sqloExitEDU + 0x298
0x0900000004ABE0DC sqle_panic__Fi + 0x71C
0x090000000534DC54
SAL_ResetXiConnection__14SAL_GBP_HANDLEFR17SAL_XI_RECONN_EDU +
0x3D54
0x090000000B4C985C
SAL_CheckXiLink__14SAL_GBP_HANDLEFR17SAL_XI_RECONN_EDU + 0xC9C
0x090000000B4C9CF4 RunEDU__17SAL_XI_RECONN_EDUFv + 0x34
0x0900000004B5EFA0 EDUDriver__9sqzEDUObjFv + 0x2E0
0x0900000004A53694 sqloEDUEntry + 0x374
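
To check whether a stack file from another incident resembles this one, the function names can be compared in order. The sketch below is a hedged illustration, not an IBM tool: it assumes the stack file is plain text with mangled names as shown above, and both the helper name matches_signature and the choice of marker frames are assumptions made for this example.

```python
# Marker substrings taken from the stack above, innermost frame first.
SIGNATURE = (
    "sqle_panic",
    "SAL_ResetXiConnection",
    "SAL_CheckXiLink",
    "RunEDU__17SAL_XI_RECONN_EDU",
)


def matches_signature(stack_text: str) -> bool:
    """Return True if every marker occurs in stack_text in the given order."""
    pos = 0
    for marker in SIGNATURE:
        idx = stack_text.find(marker, pos)
        if idx < 0:
            return False
        pos = idx + len(marker)
    return True


# Frames copied from this problem report.
sample = """
0x0900000004ABE0DC sqle_panic__Fi + 0x71C
0x090000000534DC54 SAL_ResetXiConnection__14SAL_GBP_HANDLEFR17SAL_XI_RECONN_EDU + 0x3D54
0x090000000B4C985C SAL_CheckXiLink__14SAL_GBP_HANDLEFR17SAL_XI_RECONN_EDU + 0xC9C
0x090000000B4C9CF4 RunEDU__17SAL_XI_RECONN_EDUFv + 0x34
"""

print(matches_signature(sample))  # True
```

A match only indicates a similar code path; the db2diag.log entries should still be compared before concluding that the same problem applies.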
Problem Summary:
****************************************************************
* USERS AFFECTED:                                              *
* ALL                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to Db2 11.1 Mod 4 Fixpack 5 or higher                *
****************************************************************
Local Fix:
Solution
Workaround
not known / see Local fix
Timestamps
Date  - problem reported    : 28.05.2019
Date  - problem closed      : 16.01.2020
Date  - last modified       : 16.01.2020