Informix - Problem description

Problem IT16807 Status: Closed

HDR PRIMARY BLOCKS AT CHECKPOINT AFTER PING TIMEOUT AND FAILURE TO
SUCCESSFULLY RECONNECT

product:
INFORMIX SERVER / 5725A3900 / C10 - IDS 12.10
Problem description:
The primary server can hang at a checkpoint, unable to proceed.
The hang occurs because no thread can flush the logical log
buffers. You typically see the main_loop thread waiting for
threads to leave their critical sections. Some threads that are
trying to flush log buffers wait for the previous log buffers to
flush, and ultimately one or more threads that are trying to
flush a logical log buffer wait on the drcb_lock mutex.

Before the hang, the primary server encounters a ping timeout
and then does not properly reconnect to the secondary to make
HDR operational again.

Here is the stack for the main_loop thread, which is waiting for
threads to leave their critical sections:

Stack for thread: 7 main_loop():
yield_processor_mvp
wait4critex
checkpoint
main_loop
th_init_initgls
startup

From onstat -u you can see one or more threads that are in a
critical section and also waiting on a logical log buffer (flags
X and G), and many threads waiting on the checkpoint (flag C):

Userthreads
address          flags   sessid   user     tty      wait      tout locks nreads   nwrites
14d0326a8        G-BPX-- 37072    garpac   -        1207419b8 0    54    8159649  48299
11f3f63e8        C--P--- 41922    danielal -        10a45b850 0    1     592209   203
11f3f6ca8        C--P--- 43376    informix -        10a45b850 0    1     1        0
11f3f7568        C--P--- 43252    informix -        10a45b850 0    0     15       0
11f3f7e28        C--P--- 43238    andreasl 107      10a45b850 0    2     2200     0
11f3f86e8        C--P--- 43128    kasiak   131      10a45b850 0    1     17       0
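
As a rough illustration of how the onstat -u output above can be read, the following Python sketch sorts rows by their flags. It assumes each row sits on a single line and that, as in the sample row G-BPX--, the wait indicator is the first flag character and the critical-section X is the fifth. The function name classify_userthreads and the category names are hypothetical and not part of any Informix tool.

```python
# Sample rows copied from the onstat -u output in this report.
SAMPLE = """\
address          flags   sessid   user     tty      wait      tout locks nreads   nwrites
14d0326a8        G-BPX-- 37072    garpac   -        1207419b8 0    54    8159649  48299
11f3f63e8        C--P--- 41922    danielal -        10a45b850 0    1     592209   203
11f3f7e28        C--P--- 43238    andreasl 107      10a45b850 0    2     2200     0
"""


def classify_userthreads(text):
    """Group session ids by the wait pattern described in this report.

    Assumption: flags are 7 characters, position 1 is the wait
    indicator (G = log buffer, C = checkpoint) and position 5 is X
    when the thread is in a critical section.
    """
    result = {"log_wait_in_critical_section": [], "checkpoint_wait": []}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a 10-column data row.
        if len(fields) != 10 or len(fields[1]) != 7:
            continue
        flags, sessid = fields[1], int(fields[2])
        if flags[0] == "G" and flags[4] == "X":
            result["log_wait_in_critical_section"].append(sessid)
        elif flags[0] == "C":
            result["checkpoint_wait"].append(sessid)
    return result


if __name__ == "__main__":
    print(classify_userthreads(SAMPLE))
```

A single session in the first group together with many sessions in the second group matches the pattern described in this report; on its own it is only a hint and does not confirm the defect.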

Finally, the onstat -g lmx output shows a thread that has been
waiting a long time for the drcb_lock mutex:

Locked mutexes:
mid      addr             name               holder   lkcnt  waiter   waittime
9687     120372ad0        drcb_lock          139      0      111      5797
9688     120372b78        drcb_node_count_lo 139      0
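
The check for a long wait on drcb_lock can be sketched in Python as follows. This is a minimal sketch that assumes each mutex row is on one line with the seven columns shown in the header. The function name long_mutex_waits and the threshold parameter min_waittime are hypothetical; the report does not state the unit of waittime, so the threshold is left to the caller.

```python
# Rows copied from the onstat -g lmx output in this report.
SAMPLE = """\
Locked mutexes:
mid      addr             name               holder   lkcnt  waiter   waittime
9687     120372ad0        drcb_lock          139      0      111      5797
9688     120372b78        drcb_node_count_lo 139      0
"""


def long_mutex_waits(text, min_waittime):
    """Return (name, holder, waiter, waittime) for mutexes with a waiter
    whose waittime is at least min_waittime (hypothetical threshold)."""
    waits = []
    for line in text.splitlines():
        fields = line.split()
        # Rows without a waiter have only 5 columns; the header row
        # has 7 columns but does not start with a numeric mutex id.
        if len(fields) != 7 or not fields[0].isdigit():
            continue
        _mid, _addr, name, holder, _lkcnt, waiter, waittime = fields
        if int(waittime) >= min_waittime:
            waits.append((name, int(holder), int(waiter), int(waittime)))
    return waits


if __name__ == "__main__":
    for name, holder, waiter, waittime in long_mutex_waits(SAMPLE, 1000):
        print(f"{name}: thread {waiter} waits on holder {holder} ({waittime})")
```

With the sample above, the only row reported is drcb_lock, held by thread 139 with thread 111 waiting.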


The owner of the drcb_lock mutex is the dr_prsend thread. Its
stack looks like this, showing that it is waiting for an SMX
response from the secondary server:

yield_processor_mvp
smx_listWait
smx_recv
GetServerVersionInfo
dr_state_change
dr_session_thread
startup

The stack of the thread waiting for the drcb_lock mutex shows
that it is trying to flush a logical log buffer:

yield_processor_mvp
mt_lock_wait
mt_lock
dr_logcopy
logwrite
log_flushtolsn
logm_flush
rmiLogFlush
rmiMonitor
cdrMonitorThread
cdrTrampolineThread
th_init_initgls
startup

(In this particular case the thread was used for ER, but it does
not have to be; it could be any thread that called logwrite().)
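
The chain of waits described above can be summarized in a small Python model. It is only an illustration of the reasoning, not server code: the dictionary WAITS_ON restates the dependencies from this report in words, and the helper wait_chain is a hypothetical name.

```python
# Each entry reads "key waits on value", as described in this report.
WAITS_ON = {
    "main_loop (checkpoint)": "threads in critical sections",
    "threads in critical sections": "logical log buffer flush",
    "logical log buffer flush": "drcb_lock mutex",
    "drcb_lock mutex": "dr_prsend thread",
    "dr_prsend thread": "SMX response from secondary",
}


def wait_chain(start, waits_on):
    """Follow the wait-for edges from start until nothing more is
    waited on, or until a node repeats (which would be a cycle)."""
    chain = [start]
    seen = {start}
    while chain[-1] in waits_on:
        nxt = waits_on[chain[-1]]
        chain.append(nxt)
        if nxt in seen:
            break  # cycle detected; stop instead of looping forever
        seen.add(nxt)
    return chain


if __name__ == "__main__":
    print(" -> ".join(wait_chain("main_loop (checkpoint)", WAITS_ON)))
```

The last element of the chain is the SMX response from the secondary, which is why the checkpoint cannot complete while that response is outstanding.
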
Problem Summary:
****************************************************************
* USERS AFFECTED:                                              *
* All users                                                    *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Update to IBM Informix Server 12.10.xC8                      *
****************************************************************
Local Fix:
Solution
Workaround
not known / see Local fix
Timestamps
Date  - problem reported    : 29.08.2016
Date  - problem closed      : 09.12.2016
Date  - last modified       : 09.12.2016
Problem solved at the following versions (IBM BugInfos)
12.10.xC8
Problem solved according to the fixlist(s) of the following version(s)