Re: [Patch] Optimize dropping of relation buffers using dlist

From: Tom Lane
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date:
Msg-id: 1801920.1596217177@sss.pgh.pa.us
In reply to: Re: [Patch] Optimize dropping of relation buffers using dlist  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: [Patch] Optimize dropping of relation buffers using dlist  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> Unfortunately, I don't have time for detailed review of this. I am
> suspicious that there are substantial performance regressions that you
> just haven't found yet. I would not take the position that this is a
> completely hopeless approach, or anything like that, but neither would
> I conclude that the tests shown so far are anywhere near enough to be
> confident that there are no problems.

I took a quick look through the v8 patch, since it's marked RFC, and
my feeling is about the same as Robert's: it is just about impossible
to believe that doubling (or more) the amount of hashtable manipulation
involved in allocating a buffer won't hurt common workloads.  The
offered pgbench results don't reassure me; we've so often found that
pgbench fails to expose performance problems, except maybe when it's
used just so.
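
For reference, BufferAlloc() already does up to three hash-table
operations per buffer replacement, all under the buffer-mapping
partition locks, so a second, independently maintained table roughly
doubles the work on that hot path.  A rough sketch of the existing
traffic -- the CachedBufTable* calls at the end are placeholders for
whatever the patch adds, not its actual API:

    newHash = BufTableHashCode(&newTag);
    newPartitionLock = BufMappingPartitionLock(newHash);

    LWLockAcquire(newPartitionLock, LW_SHARED);
    buf_id = BufTableLookup(&newTag, newHash);              /* 1: lookup */
    ...
    /* on a miss, once a victim buffer has been chosen: */
    buf_id = BufTableInsert(&newTag, newHash, buf->buf_id); /* 2: insert */
    if (oldFlags & BM_TAG_VALID)
        BufTableDelete(&oldTag, oldHash);                   /* 3: delete old mapping */

    /* the patch maintains its own table in parallel about here,
     * e.g. CachedBufTableInsert(...) / CachedBufTableDelete(...) */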

But aside from that, I noted a number of things I didn't like a bit:

* The amount of new shared memory this needs seems several orders
of magnitude higher than what I'd call acceptable: according to my
measurements it's over 10KB per shared buffer!  Most of that is going
into the CachedBufTableLock data structure, which seems fundamentally
misdesigned --- how could we be needing a lock per map partition *per
buffer*?  For comparison, the space used by buf_table.c is about 56
bytes per shared buffer; I think this needs to stay at least within
hailing distance of there.
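
To put that in perspective: at shared_buffers = 8GB, i.e. roughly a
million 8KB buffers, 10KB of bookkeeping per buffer works out to
something like 10GB of extra shared memory, versus on the order of
56MB for buf_table.c.  And a lock per map partition per buffer
multiplies out badly: if the new table's partition count is anywhere
near the existing NUM_BUFFER_PARTITIONS (128), that is over a hundred
lock structures for every single buffer.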

* It is fairly suspicious that the new data structure is manipulated
while holding per-partition locks for the existing buffer hashtable.
At best that seems bad for concurrency, and at worst it could result
in deadlocks, because I doubt we can assume that the new hash table
has partition boundaries identical to the old one.
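
Concretely, the hazard is the usual out-of-order acquisition: if the
allocation path takes a buffer-mapping partition lock and then the new
table's partition lock, while the drop path walks the new table first
and then reaches for the buffer-mapping partition, two backends can
block each other.  The sketch below is illustrative only, and
CachedBufPartitionLock is a stand-in name, not the patch's actual API:

    /* Backend A, in BufferAlloc(): buffer-mapping partition first */
    LWLockAcquire(BufMappingPartitionLock(newHash), LW_EXCLUSIVE);
    LWLockAcquire(CachedBufPartitionLock(relHash), LW_EXCLUSIVE);   /* waits for B */

    /* Backend B, in DropRelFileNodeBuffers(): new structure first */
    LWLockAcquire(CachedBufPartitionLock(relHash), LW_EXCLUSIVE);
    LWLockAcquire(BufMappingPartitionLock(newHash), LW_EXCLUSIVE);  /* waits for A: deadlock */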

* More generally, it seems like really poor design that this has been
written completely independently of the existing buffer hash table.
Can't we get any benefit by merging them somehow?

* I do not like much of anything in the code details.  "CachedBuf"
is as unhelpful as could be as a data structure identifier --- what
exactly is not "cached" about shared buffers already?  "CombinedLock"
is not too helpful either, nor could I find any documentation explaining
why you need to invent new locking technology in the first place.
At best, CombinedLockAcquireSpinLock seems like a brute-force approach
to an undocumented problem.

* The commentary overall is far too sparse to be of any value ---
basically, any reader will have to reverse-engineer your entire design.
That's not how we do things around here.  There should be either a README,
or a long file header comment, explaining what's going on, how the data
structure is organized, and what the locking requirements are.
See src/backend/storage/buffer/README for the sort of documentation
that I think this needs.

Even if I were convinced that there's no performance gotchas,
I wouldn't commit this in anything like its current form.

Robert again:
> Also, systems with very large shared_buffers settings are becoming
> more common, and probably will continue to become more common, so I
> don't think we can dismiss that as an edge case any more. People don't
> want to run with an 8GB cache on a 1TB server.

I do agree that it'd be great to improve this area.  Just not convinced
that this is how.

            regards, tom lane


