Re: cost-based vacuum

Поиск
Список
Период
Сортировка
От Simon Riggs
Тема Re: cost-based vacuum
Дата
Msg-id 1121287188.3970.296.camel@localhost.localdomain
обсуждение исходный текст
Ответ на Re: cost-based vacuum  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: cost-based vacuum
Список pgsql-performance
On Wed, 2005-07-13 at 14:58 -0400, Tom Lane wrote:
> Ian Westmacott <ianw@intellivid.com> writes:
> > On Wed, 2005-07-13 at 11:55, Simon Riggs wrote:
> >> On Tue, 2005-07-12 at 13:50 -0400, Ian Westmacott wrote:
> >>> It appears not to matter whether it is one of the tables
> >>> being written to that is ANALYZEd.  I can ANALYZE an old,
> >>> quiescent table, or a system table and see this effect.
> >>
> >> Can you confirm that this effect is still seen even when the ANALYZE
> >> doesn't touch *any* of the tables being accessed?
>
> > Yes.
>
> This really isn't making any sense at all.

Agreed. I think all of this indicates that some wierdness (technical
term) is happening at a different level in the computing stack. I think
all of this points fairly strongly to it *not* being a PostgreSQL
algorithm problem, i.e. if the code was executed by an idealised Knuth-
like CPU then we would not get this problem. Plus, I have faith that if
it was a problem in that "plane" then you or another would have
uncovered it by now.

> However, these certainly do not explain Ian's problem, because (a) these
> only apply to VACUUM, not ANALYZE; (b) they would only lock the table
> being VACUUMed, not other ones; (c) if these locks were to block the
> reader or writer thread, it'd manifest as blocking on a semaphore, not
> as a surge in LWLock thrashing.

I've seen enough circumstantial evidence to connect the time spent
inside LWLockAcquire/Release as being connected to the Semaphore ops
within them, not the other aspects of the code.

Months ago we discussed the problem of false sharing on closely packed
arrays of shared variables because of the large cache line size of the
Xeon MP. When last we touched on that thought, I focused on the thought
that the LWLock array was too tightly packed for the predefined locks.
What we didn't discuss (because I was too focused on the other array)
was the PGPROC shared array is equally tightly packed, which could give
problems on the semaphores in LWLock.

Intel says fairly clearly that this would be an issue.

> >> Is that Xeon MP then?
>
> > Yes.
>
> The LWLock activity is certainly suggestive of prior reports of
> excessive buffer manager lock contention, but it makes *no* sense that
> that would be higher with vacuum cost delay than without.  I'd have
> expected the other way around.
>
> I'd really like to see a test case for this...

My feeling is that a "micro-architecture" test would be more likely to
reveal some interesting information.

Best Regards, Simon Riggs


В списке pgsql-performance по дате отправления:

Предыдущее
От: Dan Harris
Дата:
Сообщение: Re: Quad Opteron stuck in the mud
Следующее
От: Vivek Khera
Дата:
Сообщение: Re: Quad Opteron stuck in the mud