Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

From: Jim Jarvie
Subject: Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED
Date:
Msg-id: 0385e562-19a9-dffe-79e0-64b89142f6a2@talentstack.to
In reply to: Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED  (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-performance
Merging a couple of emails:

On 20-Aug.-2020 19:37, David Rowley wrote:

> When updates occur in a non-partitioned table we can follow item
> pointer chains to find the live row and check if the WHERE clause
> still matches to determine if the row should be updated, or in this
> case just locked since it's a SELECT FOR UPDATE. However, with a
> partitioned table, a concurrent UPDATE may have caused the row to be
> moved off to another partition, in which case the tuple's item
> pointer cannot point to it, since we don't have enough address space;
> we only have 6 bytes for a TID. To get around the fact that we can't
> follow these update chains, we just throw the serialization error,
> which is what you're getting. Ideally, we'd figure out where the live
> version of the tuple is and check if it matches the WHERE clause and
> lock it if it does, but we've no means to do that with the current
> design.
This is the absolute best description of what causes the "tuple to be
locked was already moved to another partition due to concurrent update"
message I have ever seen.  Given your explanation, it makes complete
sense why this happens.  Thank you for the detail.

I am backing off the query with a time delay and then retrying, and
that seems to be the correct approach, although I only stumbled upon it
by accident.  Hopefully, when the Google gnomes index the mailing list,
this message will come up and save others the time and worry the
message caused me.
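
For anyone who lands here from a search: the error is reported as a
serialization failure (SQLSTATE 40001), so it can be handled like any
other serialization failure.  A minimal sketch of the claim query,
assuming an illustrative queue table with id and status columns (the
names are mine, not from this thread):

    -- Claim up to 250 pending jobs; rows locked by other workers are
    -- skipped.  Under concurrent UPDATEs that move rows across
    -- partitions, this can fail with SQLSTATE 40001 ("tuple to be
    -- locked was already moved to another partition due to concurrent
    -- update"); the client catches that, sleeps briefly with some
    -- jitter, and re-issues the query.
    SELECT id
      FROM queue
     WHERE status = 'pending'
     ORDER BY id
     LIMIT 250
    FOR UPDATE SKIP LOCKED;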

On 21-Aug.-2020 00:34, Thomas Munro wrote:
>
> Maybe it's just getting algorithmically ugly.  To claim some job rows,
> you have to skip all dead/non-matching tuples left behind so far at
> the start of the table by all the other sessions, and then also all
> currently locked tuples, and you have to do update-chain walks on some
> of them too.  It all gets a bit explosive once you have such high
> numbers of workers.
Yes, fundamentally it seems to come down to traffic volume.  Once I get
over 128 connections all selecting and locking that one table, lock
acquisition seems to struggle and the problem grows exponentially.  I'm
an extreme edge case, so not really a scenario that locking could ever
have been expected to handle, but it's pretty good that it gets up to
128 at all.
> I think I'd experiment with splitting the job table up into N tables
> and feed jobs into all of them about evenly (by hashing, at random,
> whatever), and then I'd assign each consumer a "preferred" table where
> it looks for jobs first (perhaps my_worker_id % ntables), before
> trying the others in round robin order.  Then they won't trample on
> each other's toes so much.
I think this is a good idea, but (in my case) I suspect this is where
it will need v13, which is going to give that via "Allow ROW values to
be used as partitioning expressions"?  (e.g. will v13 permit queueid
mod 200 as the partition expression, making 200 partitions for 200
[contention-free] consumers?)
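
Looking at this further, declarative hash partitioning has been in
core since v11 and already splits rows by a modulus and remainder, so
this may not need to wait for v13 at all.  A minimal sketch, with
table and column names of my own invention:

    -- 200 hash partitions, so ~200 consumers can each prefer a
    -- different partition (e.g. the one matching worker_id % 200).
    CREATE TABLE queue (
        id      bigserial,
        status  text NOT NULL,
        payload jsonb
    ) PARTITION BY HASH (id);

    CREATE TABLE queue_p000 PARTITION OF queue
        FOR VALUES WITH (MODULUS 200, REMAINDER 0);
    CREATE TABLE queue_p001 PARTITION OF queue
        FOR VALUES WITH (MODULUS 200, REMAINDER 1);
    -- ...and so on through REMAINDER 199, most easily generated in a
    -- loop.

Each consumer would then run its SELECT ... FOR UPDATE SKIP LOCKED
against its preferred partition first and fall back to the parent
table when that comes up empty.  A side benefit: with the partition
key on the immutable id rather than a mutable status column, UPDATEs
never move rows between partitions, so the serialization error above
should disappear as well.
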
> In the past I've wondered about a hypothetical circular_seqscan
> option, which would cause table scans to start where they left off
> last time in each backend, so SELECT * FROM t LIMIT 1 repeated would
> show you a different row each time until you get all the way around to
> the start again (as we're entirely within our rights to do for a query
> with no ORDER BY).  That'd give the system a chance to vacuum and
> start refilling the start of the table before you get around to it
> again, instead of repeatedly having to step over the same useless
> pages every time you need a new job.
I like the sound of this; if I understand correctly, it would
essentially walk in insertion(-ish) order, which would be OK for me, as
long as it was documented clearly.  Perhaps I should put a page online
about high-traffic message queues with PostgreSQL for people to find
when they try the same thing.
>
> Hmm.  I guess another way to avoid colliding with others' work would
> be to try to use SELECT * FROM t TABLESAMPLE SYSTEM (10) WHERE ... FOR
> UPDATE SKIP LOCKED LIMIT ....  It's less cache-friendly, and less
> order-preserving, but way more contention-friendly.  That has another
> complication though; how do you pick 10?  And if it doesn't return any
> or enough rows, it doesn't mean there isn't enough, so you may need to
> be ready to fall back to the plain approach if having 250 rows is
> really important to you and TABLESAMPLE doesn't give you enough.  Or
> something.

For me, this is an acceptable compromise.  I just need to consume
something on each pass, up to 250 items; getting fewer than 250 would
still allow progress, and if it removed wait time it could even mean
greater overall throughput.  I may try this just to see how it
performs.
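
A minimal sketch of how I read the suggestion, again with my
illustrative table and column names:

    -- SYSTEM sampling is block-based: visit roughly 10% of the
    -- table's pages at random, then lock whatever matches.  Cheap and
    -- unordered, and two workers will rarely sample the same blocks.
    SELECT id
      FROM queue TABLESAMPLE SYSTEM (10)
     WHERE status = 'pending'
     LIMIT 250
    FOR UPDATE SKIP LOCKED;

If this returns fewer than 250 rows, that does not prove the queue is
nearly empty, so the consumer would fall back to the plain
(non-sampled) query before deciding there is no work left.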

If someone has a strictly ordered queue, they will never really have
this issue, since they must start at the beginning and go sequentially
with only one consumer.  Because I only need loose/approximate ordering
and throughput is the objective, all the locking and contention come
into play.

> By the way, when working with around 64 consumer processes I was also
> annoyed by the thundering herd problem when using NOTIFY.  I found
> various workaround solutions to that, but ultimately I think we need
> more precise wakeups for that sort of thing, which I hope to revisit
> one day.

I'm making a note of this because I also happen to have a different
scenario with NOTIFY and well over 100 LISTEN consumers.  That hasn't
given me problems - yet - but I'm now aware of it should problems
arise.
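
If they do, one mitigation I would try first (my own guess, not
something tested here) is sharding the wakeups across several
channels, so each NOTIFY wakes only a fraction of the listeners:

    -- Channel names are illustrative.  Each consumer listens on one
    -- of, say, 8 shards chosen by worker_id % 8:
    LISTEN queue_ch_3;

    -- The producer notifies only the shard for the new job's id:
    SELECT pg_notify('queue_ch_' || (12345 % 8), 'new job');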



