Re: SELECT ... FOR UPDATE performance costs? alternatives?

Поиск
Список
Период
Сортировка
От Erik Jones
Тема Re: SELECT ... FOR UPDATE performance costs? alternatives?
Дата
Msg-id 8662E8FF-BBA6-4D15-81A5-3F618B270DD0@myemma.com
обсуждение исходный текст
Ответ на Re: SELECT ... FOR UPDATE performance costs? alternatives?  ("D. Dante Lorenso" <dante@lorenso.com>)
Список pgsql-general
On Aug 15, 2007, at 9:21 PM, D. Dante Lorenso wrote:

> Erik Jones wrote:
>> On Aug 15, 2007, at 2:39 PM, btober@ct.metrocast.net wrote:
>>> Erik Jones wrote:
>>>> On Aug 15, 2007, at 1:09 PM, D. Dante Lorenso wrote:
>>>>
>>>>> ...to ensure that only one server is processing the queue item,
>>>>> so inside PGSQL, use SELECT ... FOR UPDATE to lock the row...
>>>>> When my server is under severe load, however, this function
>>>>> begins to take a long time to execute and I begin to suspect
>>>>> that the FOR UPDATE lock might be locking the whole table and
>>>>> not just the row.  How do I know if this is the case, how can I
>>>>> optimize this procedure, and how should I be doing this
>>>>> differently?  ...
>>>>>
>>>>> Thoughts?
>>>>
>>>> SELECT ... FOR UPDATE should only be locking the rows returned
>>>> by your the select statement, in this case the one row.  You can
>>>> check what locks exist on a table (and their type) with the
>>>> pg_locks system view.
>>>>
>>>
>>> Is that correct? Documentation section 12.3.1. Table-Level Locks
>>> states 'The list below shows the available lock modes ...Remember
>>> that all of these lock modes are table-level locks, even if the
>>> name contains the word "row"'.
>> You will notice that SELECT ... FOR UPDATE is not in that list.
>> It's covered in the next section on row level locks.
>>>
>>> I wonder why bother with the SELECT statement at all. Why not
>>> just go straight to the UPDATE statement with something like
>>>
>>>    UPDATE queue SET
>>>      status = in_new_status,
>>>      ts_start = NOW(),
>>>      ts_end = NULL,
>>>      hostname = COALESCE(in_hostname, hostname)
>>>    WHERE tcq_id = (SELECT tcq_id  FROM queue q WHERE q.status =
>>> in_status ORDER BY tcq_id ASC LIMIT 1);
>>>
>>> He may need to trap an exception for the "not found" case, but
>>> what's the big deal with that?
>>>
>>> UPDATE statements acquire a ROW EXCLUSIVE on the table, which
>>> conflicts, among other things, with ROW EXCLUSIVE, so it will
>>> block other UPDATE statements initiated by other transactions.
>> That won't work because the update won't lock the row until the
>> select returns.  So, if two process execute that at the same time
>> they will both execute the subquery and return the same result,
>> the first will update it and the second will then (redundantly)
>> update it.
>
> It also won't work because I need to change AND read the row.  If I
> only do the update, I don't know what was updated.  I still need to
> return the tcq_id to my application.
>
> Maybe the update could look like this:
>
> UPDATE queue SET
>   status = in_new_status,
>   ts_start = NOW(),
>   ts_end = NULL,
>   hostname = COALESCE(in_hostname, hostname),
> WHERE status = in_status;
>
> But there I don't have the LIMIT 1, and I also don't know which
> rows got updated.  I supposed there might be some magic to find the
> OID of the affected rows, but I don't know how what would be done.
>
> I need logic like "atomic test and set" or pop 1 item off the queue
> atomically and tell me what that item was.

If you're using 8.2.x there is the RETURNING clause that can be used
with UPDATE and INSERT queries.  So, you could alter his version to
add the FOR UPDATE to the subquery and tack on the RETURNING clause
to the UPDATE and you'd have semantically equivalent.

>
> In my situation, there are a dozen or so machines polling this
> queue periodically looking for work to do.  As more polling is
> occurring, the locks seem to be taking longer so I was worried
> table-level locks might be occurring.

As I said, watch pg_locks.  Also, make sure that this function call
isn't happening inside a larger transaction that's not committing
right away when the function returns.  The lock from the SELECT ...
FOR UPDATE isn't released until the enclosing transaction commits or
rolls back.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



В списке pgsql-general по дате отправления:

Предыдущее
От: Douglas McNaught
Дата:
Сообщение: Re: PostgreSQL clustering (shared disk)
Следующее
От: Rainer Bauer
Дата:
Сообщение: Re: Yet Another COUNT(*)...WHERE...question