
DB2 - Problem description

Problem IT37136 Status: Closed

SQL1659 is not returned even when an RDMA device cannot be opened
during db2start, and db2diag.log is spammed with error messages

product:
DB2 FOR LUW / DB2FORLUW / B50 - DB2
Problem description:
When a member fails to open a specific RDMA device that is used
for communication with the CF during db2start, the db2start
command does not return an SQL1659N warning to the user.

The open device failure looks similar to this:

2021-04-30-10.38.19.631990+540 I14510040A757        LEVEL:
Warning
PID     : 11076474             TID : 772            PROC :
db2sysc 0
INSTANCE: db2inst1             NODE : 000
HOSTNAME: MEMBER01
EDUID   : 772                  EDUNAME: db2castructevent 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for
CF, SQLE_SINGLE_CA_HANDLE::sqleSingleCfOpenAndConnect,
probe:1316
MESSAGE : CA RC= 2148204567
DATA #1 : 
PsOpen FAILURE: hostname:CF-host(member#: 128, cfIndex: 1) ;
device:hca1 ; caport:56001 ; transport: UDAPL
Connection pool target size = 16 ;
 Tolerate this PsOpen failure, connections will be restricted to
use the successful opened device(s). conn (seq #: 251 node #: 1
connectTimeoutForLink: 10 maxTimeoutForLink: 20)
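
Because db2start does not report the failure, the PsOpen FAILURE
text in db2diag.log is the main clue to which device could not be
opened. The following Python sketch pulls the fields out of such an
entry. The regular expression is modeled only on the single sample
above and is an assumption, not a documented format; other Db2
levels may word the message differently. The function name
parse_psopen_failure is hypothetical.

```python
import re

# Pattern derived from the one sample entry in this report (assumption).
PSOPEN_RE = re.compile(
    r"PsOpen FAILURE:\s*hostname:(?P<hostname>\S+?)\s*"
    r"\(member#:\s*(?P<member>\d+),\s*cfIndex:\s*(?P<cf_index>\d+)\)\s*;\s*"
    r"device:(?P<device>\S+)\s*;\s*caport:(?P<caport>\d+)\s*;\s*"
    r"transport:\s*(?P<transport>\S+)"
)


def parse_psopen_failure(text):
    """Return the fields of the first PsOpen FAILURE in text, or None."""
    # The excerpt is line-wrapped, so collapse all whitespace before matching.
    flat = " ".join(text.split())
    match = PSOPEN_RE.search(flat)
    return match.groupdict() if match else None


sample = """PsOpen FAILURE: hostname:CF-host(member#: 128, cfIndex: 1) ;
device:hca1 ; caport:56001 ; transport: UDAPL"""

fields = parse_psopen_failure(sample)
print(fields["device"], fields["hostname"], fields["transport"])
```

All values are returned as strings; convert caport or cfIndex to
integers if they are needed for comparison.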


The db2diag.log is also flooded repeatedly with the following
messages:

2021-04-30-10.38.19.636533+540 I14513438A1711       LEVEL:
Warning
PID     : 11076474             TID : 772            PROC :
db2sysc 0
INSTANCE: db2inst1             NODE : 000
HOSTNAME: MEMBER01
EDUID   : 772                  EDUNAME: db2castructevent 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for
CF,
SQLE_SINGLE_CA_HANDLE::sqleCaCpGetTokenIndexByCfServerNetNameAnd
DeviceName, probe:1737
MESSAGE : Could not find a match for the specified netname among
CA tokens.
DATA #1 : SAL CF Server Name, PD_TYPE_SAL_CF_SERVER_NAME, 8
bytes
CF-host
DATA #2 : SAL Member Device Name,
PD_TYPE_SAL_MEMBER_DEVICE_NAME, 4 bytes
hca1
DATA #3 : SAL CF Index, PD_TYPE_SAL_CF_INDEX, 8 bytes
1
DATA #4 : SAL CF Node Number, PD_TYPE_SAL_CF_NODE_NUM, 2 bytes
128
CALLSTCK: (Static functions may not be resolved correctly, as
they are resolved to the nearest symbol)
  [0] 0x090000000762EE58
sqleCaCpGetTokenIndexByCfServerNetNameAndDeviceName__21SQLE_SING
LE_CA_HANDLEFRC27SQLE_CF_MEMBER_ADAPTER_LINKCP17SAL_ADAPTER_IND
+ 0x318
  [1] 0x0900000007627014
sqleSingleCaRefreshAdapterStatus__21SQLE_SINGLE_CA_HANDLEFCb +
0x944
  [2] 0x09000000076257FC
sqleSingleCfOpenAndConnect__21SQLE_SINGLE_CA_HANDLEFCUi + 0x6DC
  [3] 0x090000000762D060
sqleSingleCaInitialize__21SQLE_SINGLE_CA_HANDLEFCUlCUi + 0x4F0
  [4] 0x09000000076144B0
sqleCaCpAddCa__17SQLE_CA_CONN_POOLFsCUiCPUl + 0x7A0
  [5] 0x0900000007679CE4
ROCM_StateCaInitMonitor__16sqleRocmNotifEduFv + 0xCA4
  [6] 0x0900000007677BC0 RunEDU__16sqleRocmNotifEduFv + 0xCD0
  [7] 0x0900000006EC1890 EDUDriver__9sqzEDUObjFv + 0x2F0
  [8] 0x0900000006DD84D4 sqloEDUEntry + 0x364
  [9] 0x09000000009E7FE8 _pthread_body + 0xE8
  [10] 0xFFFFFFFFFFFFFFFC ?unknown + 0xFFFFFFFF

2021-04-30-10.38.19.636936+540 I14515150A1630       LEVEL: Error
PID     : 11076474             TID : 772            PROC :
db2sysc 0
INSTANCE: db2inst1             NODE : 000
HOSTNAME: MEMBER01
EDUID   : 772                  EDUNAME: db2castructevent 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for
CF, SQLE_SINGLE_CA_HANDLE::sqleSingleCaRefreshAdapterStatus,
probe:7358
MESSAGE : ZRC=0x802700FC=-2144927492=SQLE_SAL_INV_PARM "Invalid
input parameter"
DATA #1 : String, 69 bytes
Could not match this member's HCA with a device. Skip to the
next HCA
DATA #2 : SAL CF Index, PD_TYPE_SAL_CF_INDEX, 8 bytes
1
DATA #3 : SAL CF Node Number, PD_TYPE_SAL_CF_NODE_NUM, 2 bytes
128
DATA #4 : SAL CF Server Name, PD_TYPE_SAL_CF_SERVER_NAME, 8
bytes
CF-host
DATA #5 : SAL Member Device Name,
PD_TYPE_SAL_MEMBER_DEVICE_NAME, 4 bytes
hca1
CALLSTCK: (Static functions may not be resolved correctly, as
they are resolved to the nearest symbol)
  [0] 0x0900000007627104
sqleSingleCaRefreshAdapterStatus__21SQLE_SINGLE_CA_HANDLEFCb +
0xA34
  [1] 0x09000000076257FC
sqleSingleCfOpenAndConnect__21SQLE_SINGLE_CA_HANDLEFCUi + 0x6DC
  [2] 0x090000000762D060
sqleSingleCaInitialize__21SQLE_SINGLE_CA_HANDLEFCUlCUi + 0x4F0
  [3] 0x09000000076144B0
sqleCaCpAddCa__17SQLE_CA_CONN_POOLFsCUiCPUl + 0x7A0
  [4] 0x0900000007679CE4
ROCM_StateCaInitMonitor__16sqleRocmNotifEduFv + 0xCA4
  [5] 0x0900000007677BC0 RunEDU__16sqleRocmNotifEduFv + 0xCD0
  [6] 0x0900000006EC1890 EDUDriver__9sqzEDUObjFv + 0x2F0
  [7] 0x0900000006DD84D4 sqloEDUEntry + 0x364
  [8] 0x09000000009E7FE8 _pthread_body + 0xE8
  [9] 0xFFFFFFFFFFFFFFFC ?unknown + 0xFFFFFFFF
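
The MESSAGE line above shows the same return code in two forms,
ZRC=0x802700FC and -2144927492. The second is the first read as a
signed 32-bit integer. A small Python sketch of that conversion,
which can help when searching logs for either form (the helper name
to_signed32 is hypothetical):

```python
def to_signed32(value):
    """Interpret a 32-bit unsigned value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# ZRC from the sqleSingleCaRefreshAdapterStatus entry.
print(to_signed32(0x802700FC))   # -2144927492

# The "CA RC" of the PsOpen entry is printed in decimal; its hex form:
print(hex(2148204567))           # 0x800b0017
```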


After applying the fix for this problem, db2start will return
SQL1659N when a member fails to open a specific RDMA device
during db2start, and the error messages above will be reduced so
that they don't spam the db2diag.log.
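
On a level without the fix, one way to gauge how often these
messages recur is to count db2diag.log records per function and
probe. The Python sketch below assumes that each record begins with
a timestamp line and carries a "FUNCTION: ..., probe:N" field, as in
the excerpts above; the function name count_by_probe is hypothetical,
and the sample text is an abbreviated copy of the two entries from
this report, repeated three times to simulate the flooding.

```python
import re
from collections import Counter

# A record starts with a timestamp such as 2021-04-30-10.38.19.636533+540.
RECORD_START = re.compile(
    r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d+\s", re.M
)
FUNCTION_RE = re.compile(r"FUNCTION:\s*(.*?),\s*probe:(\d+)", re.S)


def count_by_probe(log_text):
    """Count records per (function name, probe number)."""
    starts = [m.start() for m in RECORD_START.finditer(log_text)]
    counts = Counter()
    for begin, end in zip(starts, starts[1:] + [len(log_text)]):
        found = FUNCTION_RE.search(log_text[begin:end])
        if found:
            # Keep only the last comma-separated part: the function name.
            name = " ".join(found.group(1).split()).rsplit(",", 1)[-1].strip()
            counts[(name, int(found.group(2)))] += 1
    return counts


sample = """\
2021-04-30-10.38.19.636533+540 I14513438A1711       LEVEL: Warning
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for CF, SQLE_SINGLE_CA_HANDLE::sqleCaCpGetTokenIndexByCfServerNetNameAndDeviceName, probe:1737

2021-04-30-10.38.19.636936+540 I14515150A1630       LEVEL: Error
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for CF, SQLE_SINGLE_CA_HANDLE::sqleSingleCaRefreshAdapterStatus, probe:7358
"""

for (name, probe), n in count_by_probe(sample * 3).most_common():
    print(n, probe, name)
```

To run it against a real log, read the file contents into a string
and pass that to count_by_probe.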
Problem Summary:
****************************************************************
* USERS AFFECTED:                                              *
* pureScale users                                              *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to Db2 11.5m7fp0 or higher                           *
****************************************************************
Local Fix:
Solution
Workaround
Comment
First fixed in Db2 11.5m7fp0
Timestamps
Date  - problem reported    : 06.06.2021
Date  - problem closed      : 01.12.2021
Date  - last modified       : 01.12.2021