Informix - Problem description

Problem IT45632 Status: Closed

NEW ER NIF CONNECTION OVER SMX MIGHT HANG IN 'CONNECTING' STATE WITH
UNRELIABLE NETWORK

Product:
INFORMIX SERVER / 5725A3900 / E10
Problem description:
An ER connection to a remote server might remain in state
'Connecting' indefinitely in "cdr list server" output, and the
connection is never fully established.

The cause is delayed detection of a no longer viable SMX pipe
underneath the ER NIF connection that is about to be
established.

The two connected ER nodes might detect the non-viability of an
SMX pipe (due to packet loss or other network errors) at rather
different points in time. The first node to detect it tears
down the ER NIF connection (over SMX) and soon after might
initiate a new one, over a remaining viable SMX pipe or over a
new pipe. The initial exchange establishes whether SMX can and
should be used between the two servers at all: the initiating
side sends a specific SMX message, and the other side answers
OK or not OK (to use SMX). With some bad luck, the bad SMX pipe
that triggered all this, and that has already been dismantled on
the initiating side, still looks good to the other side, which
might then place its answer on that pipe. Only at that point
does the other side detect the pipe's non-viability, and the
net result is that the answer never arrives.
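
The sequence above can be pictured with a toy model. The
following Python sketch is not Informix code: the "Pipe" class,
the pipe names and the "handshake" function are hypothetical,
and the timeout exists only so that the example terminates. It
shows the initiator waiting on the pipe it still considers
viable, while the peer answers on whichever pipe it still
trusts.

```python
import queue


class Pipe:
    """Toy stand-in for one SMX pipe (hypothetical, not an Informix API)."""

    def __init__(self, name, delivers):
        self.name = name
        self.delivers = delivers    # False models packet loss on this pipe
        self.inbox = queue.Queue()  # answers that actually reach the initiator

    def send_to_initiator(self, message):
        # On a broken pipe the message is silently lost.
        if self.delivers:
            self.inbox.put(message)


def handshake(peer_detected_failure, timeout=0.1):
    """Return the peer's answer, or None if it never arrives."""
    bad = Pipe("smx-1", delivers=False)
    good = Pipe("smx-2", delivers=True)

    # The initiator has already dismantled `bad` and sent its
    # "may we use SMX?" message over `good`; it waits there for the answer.
    # The peer answers on `bad` as long as that pipe still looks good to it.
    answer_pipe = good if peer_detected_failure else bad
    answer_pipe.send_to_initiator("OK")

    try:
        return good.inbox.get(timeout=timeout)
    except queue.Empty:
        # A wait without a timeout would block here forever, which
        # corresponds to the connection staying in 'Connecting'.
        return None


print(handshake(peer_detected_failure=True))   # OK
print(handshake(peer_detected_failure=False))  # None
```

The model says nothing about how the fix resolves the wait; it
only illustrates why the answer can be lost when the two sides
disagree about which pipe is still usable.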

The typical NIF send thread stack for this situation, on the
initiating side, is:

Stack for thread: 1257 CDRNsT3
 base: 0x0000000059ac8000
  len:   69632
   pc: 0x000000000151167b
  tos: 0x0000000059ad8630
state: sleeping
   vp: 9

0x000000000151167b (oninit) yield_processor_mvp
0x0000000001642814 (oninit) smx_recv
0x0000000001237213 (oninit) isPeerSupportERSmxCon
0x000000000122d601 (oninit) nifiGenericStart
0x00000000014dd5d3 (oninit) th_init_initgls
0x00000000015264cf (oninit) startup
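
When many thread stacks have to be checked for this pattern, a
small script can help. The sketch below assumes the stack was
saved as text in the layout shown above (address, module in
parentheses, function name). The names "frame_names" and
"looks_like_smx_handshake_wait" are made up for this example,
and a match is only a hint, not proof of this defect.

```python
import re

# One frame per line: address, module in parentheses, function name.
FRAME = re.compile(r"^[ \t]*0x[0-9a-fA-F]+[ \t]+\((\w+)\)[ \t]+(\w+)", re.M)


def frame_names(stack_text):
    """Return the function names of a stack dump, innermost frame first."""
    return [m.group(2) for m in FRAME.finditer(stack_text)]


def looks_like_smx_handshake_wait(stack_text):
    """True if smx_recv was called directly from isPeerSupportERSmxCon."""
    names = frame_names(stack_text)
    return any(
        callee == "smx_recv" and caller == "isPeerSupportERSmxCon"
        for callee, caller in zip(names, names[1:])
    )


# Frames copied from the stack shown above.
SAMPLE = """\
0x000000000151167b (oninit) yield_processor_mvp
0x0000000001642814 (oninit) smx_recv
0x0000000001237213 (oninit) isPeerSupportERSmxCon
0x000000000122d601 (oninit) nifiGenericStart
"""

print(looks_like_smx_handshake_wait(SAMPLE))  # True
```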

In "onstat -g nif all" output, the connection is shown either
with no state or, if a disconnect has been attempted, in state
"INTR,SHUT".
A telling detail is the Connection Start time, which typically
is the same as, or very shortly before, the time of one delayed
"SMX thread is exiting" message. The corresponding messages on
the two servers, possibly for multiple terminating SMX pipes,
appear some seconds apart between the sites.
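
The timing relationship described above can be checked
mechanically once the relevant times have been collected by
hand. The sketch below is an illustration only: the function
name, the 5-second window and the sample times are assumptions,
and it does not parse any Informix output.

```python
from datetime import datetime, timedelta


def start_matches_smx_exit(connection_start, smx_exit_times,
                           window=timedelta(seconds=5)):
    """True if an SMX exit message falls at, or shortly after, the start time.

    `window` is an assumed tolerance for "very shortly"; adjust as needed.
    """
    return any(
        timedelta(0) <= exit_time - connection_start <= window
        for exit_time in smx_exit_times
    )


# Made-up sample times, for illustration only.
start = datetime(2024, 1, 1, 10, 15, 30)
exits = [datetime(2024, 1, 1, 10, 15, 24), datetime(2024, 1, 1, 10, 15, 31)]

print(start_matches_smx_exit(start, exits))      # True
print(start_matches_smx_exit(start, exits[:1]))  # False
```
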
Problem Summary:
****************************************************************
* USERS AFFECTED:                                              *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
The fix also includes changes to the error message in order to
make it better for diagnostics.
Local Fix:
Solution
Workaround
Comment
Fixed in Informix Server 14.10.xC11.
Timestamps
Date  - problem reported    : 04.03.2024
Date  - problem closed      : 11.06.2024
Date  - last modified       : 24.09.2024