Re: Hanging queries on dual CPU windows

From: Magnus Hagander
Subject: Re: Hanging queries on dual CPU windows
Msg-id: 6BCB9D8A16AC4241919521715F4D8BCEA35109@algol.sollentuna.se
Responses: Re: Hanging queries on dual CPU windows
List: pgsql-performance
> > > >  I dunno
> > > >
> > > > > if you've got anything gdb-equivalent under Windows, but that's
> > > > > the first thing I'd be interested in ...
> > > >
> > > > Here ya go:
> > > >
> > > > http://www.devisser-siderius.com/stack1.jpg
> > > > http://www.devisser-siderius.com/stack2.jpg
> > > > http://www.devisser-siderius.com/stack3.jpg
> > > >
> > > > There are three threads in the process. I guess thread 1
> > > > (stack1.jpg) is the most interesting.
> > > >
> > > > I also noted that cranking up concurrency in my app reproduces the
> > > > problem in about 4 minutes ;-)
> >
> > Just reproduced again.
> >
> > > Actually, stack2 looks very interesting. Does it "stay stuck" in
> > > pg_queue_signal? That's really not supposed to happen.
> >
> > Yes it does.
>
> An update on that: There are actually *two* processes in this
> state, both hanging in pg_queue_signal. I've looked at the
> source of that, and the obvious candidate for hanging is
> EnterCriticalSection. I also found this:
>
> http://blogs.msdn.com/larryosterman/archive/2005/03/02/383685.aspx
>
> where they say:
>
> "
> In addition, for Windows 2003, SP1, the EnterCriticalSection
> API has a subtle change that's intended to resolve many of
> the lock convoy issues.  Before Win2003 SP1, if 10 threads
> were blocked on EnterCriticalSection and all 10 threads had
> the same priority, then EnterCriticalSection would service
> those threads in a FIFO (first-in, first-out) basis.  Starting
> in Windows 2003 SP1, the EnterCriticalSection will wake up a
> random thread from the waiting threads.  If all the threads
> are doing the same thing (like a thread pool) this won't make
> much of a difference, but if the different threads are doing
> different work (like the critical section protecting a widely
> accessed object), this will go a long way towards removing
> lock convoy semantics.
> "
>
> Could it be they broke it when they did that????

In theory, yes, but it still seems a bit far-fetched :-(

If you have an environment to rebuild in, can you try swapping the order of
these two lines in backend/port/win32/signal.c:

    ResetEvent(pgwin32_signal_event);
    LeaveCriticalSection(&pg_signal_crit_sec);


And if not, can you also try disabling the stats collector and see if that makes a difference? (Could be a
workaround.)


//Magnus
