Re: Delete/update with limit

From: Chris Browne
Subject: Re: Delete/update with limit
Date:
Msg-id: 60644bymua.fsf@dba2.int.libertyrms.com
In reply to: Delete/update with limit (Csaba Nagy <nagy@ecircle-ag.com>)
List: pgsql-general
nagy@ecircle-ag.com (Csaba Nagy) writes:
> In Postgres we're currently not chunking, because the code to do it
> is simply too contorted and inefficient compared to the other DBs we
> use. At least, all the solutions we could think of to do the chunking
> safely while the inserts are running in parallel, without disturbing
> them, have invariably resulted in overly complicated code compared to
> the simple delete-with-limit + delete-trigger solution we have for
> the other DBs.
>
> Now, I don't hold out much hope of convincing anybody that a limit on
> the delete/update commands has valid usage scenarios, but can anybody
> help me find a good solution for chunk-wise processing of such a
> buffer table, where insert speed is the highest priority (thus no
> indexes and a minimum of fields), batch processing should still work
> fine at big table sizes without impacting the inserts at all, and
> each batch should finish quickly to avoid long-running transactions?
> Because I can't really think of one... other than our scheme with the
> delete with limit + trigger + private temp table.

All that comes to mind is to put a SERIAL primary key on the table,
which shouldn't be *too* terribly expensive an overhead, assuming
there is reasonably complex processing going on; you then do something
like:

- select ID from the incoming table, order by ID, limit 500, to grab a
  list of IDs;

- delete from the table for that set of IDs.
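
A minimal sketch of that two-step approach in SQL (the table and column
names here are hypothetical, not from the original discussion):

  -- Hypothetical buffer table; the SERIAL column is the added overhead.
  CREATE TABLE incoming_table (
      id   serial PRIMARY KEY,
      data text NOT NULL
  );

  -- Step 1: grab a batch of at most 500 IDs and process those rows.
  SELECT id, data FROM incoming_table ORDER BY id LIMIT 500;

  -- Step 2: delete exactly that batch.  Concurrent inserts get higher
  -- IDs, so ordering by id keeps the two statements consistent, as
  -- long as there is only one batch consumer.
  DELETE FROM incoming_table
   WHERE id IN (SELECT id FROM incoming_table ORDER BY id LIMIT 500);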

Actually, is there any particular reason why you couldn't simply have
your "batch processing" loop look like:

Loop Forever
  DELETE from incoming_table;
  VACUUM incoming_table;
End Loop;

???

The alternative that I suggested amounts to:

Loop Forever
  DELETE from incoming_table where id in (select id from incoming_table limit 500);
  VACUUM incoming_table;
End Loop;
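
One practical note: VACUUM cannot run inside a transaction block, so
the two statements must be issued as separate top-level commands.  A
single pass of that loop, assuming the hypothetical incoming_table
sketched above, would look like the following, re-run by an external
driver (a shell loop, cron job, or the application itself) until the
table is drained:

  -- One pass of the chunked loop.
  DELETE FROM incoming_table
   WHERE id IN (SELECT id FROM incoming_table ORDER BY id LIMIT 500);
  -- Reclaim the dead tuples before the next pass.
  VACUUM incoming_table;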

I realize you're concerned that maintaining the index will be too
costly; but without actual benchmarking, I don't think it is obvious
that it is *in fact* too costly.
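
One quick way to get a feel for it, again assuming the hypothetical
incoming_table above, is to time bulk inserts against an index-free
twin of the table in psql:

  -- Index-free twin of incoming_table, for comparison.
  CREATE TABLE incoming_noindex (data text NOT NULL);

  \timing on
  -- Load 100,000 rows into each and compare the reported times.
  INSERT INTO incoming_table (data)
      SELECT 'payload' FROM generate_series(1, 100000);
  INSERT INTO incoming_noindex (data)
      SELECT 'payload' FROM generate_series(1, 100000);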

I'm pretty sure of one countervailing consideration: VACUUMing the
table imposes costs of its own, and it is possible that the cost of
the index would be noise against that.
--
"cbbrowne","@","acm.org"
http://cbbrowne.com/info/lisp.html
When a man talks dirty to a woman, it's sexual harassment. When a woman
talks dirty to a man, it's 3.95 per minute.
