Обсуждение: Need LIMIT and ORDER BY for UPDATE
All, I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE commands. Is this possible? UPDATE invoice i SET reserve_ts = NOW() + '1 hour'::timestamp FROM account a WHERE a.acct_id = i.acct_id AND i.reserve_ts < NOW() AND a.status = 'A' AND i.is_paid IS FALSE ORDER BY i.create_ts ASC LIMIT 1 RETURNING invoice_id; This query would find JUST ONE invoice record which is not paid and reserve the right to operate on the row using the 'reserve_ts' column for all active accounts. The one row would be the oldest invoice matching the criteria. Only that one row would be updated and the invoice_id of the updated row (if any) would be returned. Running a query like this over and over would pop just one record off the queue and would guarantee an atomic reservation. Similar syntax would be very useful for DELETE operations. The idea is that doing an UPDATE with RETURNING allows a get/set operation in a single query. Without the LIMIT and ORDER BY, I'm forced to reserve all rows at once which my application doesn't want to handle like that. Can something like what I want be added in a future version? Ideas or alternatives? I don't see how I can rewrite this query as a single statement any other way and get the same expectations. -- Dante
"D. Dante Lorenso" <dante@lorenso.com> wrote: > > All, > > I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE > commands. Is this possible? > > UPDATE invoice i > SET reserve_ts = NOW() + '1 hour'::timestamp > FROM account a > WHERE a.acct_id = i.acct_id > AND i.reserve_ts < NOW() > AND a.status = 'A' > AND i.is_paid IS FALSE > ORDER BY i.create_ts ASC > LIMIT 1 > RETURNING invoice_id; > > This query would find JUST ONE invoice record which is not paid and > reserve the right to operate on the row using the 'reserve_ts' column > for all active accounts. The one row would be the oldest invoice > matching the criteria. Only that one row would be updated and the > invoice_id of the updated row (if any) would be returned. > > Running a query like this over and over would pop just one record off > the queue and would guarantee an atomic reservation. While I'm not going to argue as to whether your suggestion would be a good idea or not, I will suggest you look at SELECT FOR UPDATE, which will allow you to do what you desire. -- Bill Moran http://www.potentialtech.com
D. Dante Lorenso wrote: > I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE > commands. Is this possible? > > UPDATE invoice i > SET reserve_ts = NOW() + '1 hour'::timestamp > FROM account a > WHERE a.acct_id = i.acct_id > AND i.reserve_ts < NOW() > AND a.status = 'A' > AND i.is_paid IS FALSE > ORDER BY i.create_ts ASC > LIMIT 1 > RETURNING invoice_id; > > This query would find JUST ONE invoice record which is not paid and > reserve the right to operate on the row using the 'reserve_ts' > column for all active accounts. The one row would be the oldest > invoice matching the criteria. Only that one row would be updated > and the invoice_id of the updated row (if any) would be returned. > Can something like what I want be added in a future version? Ideas > or alternatives? I don't see how I can rewrite this query as a > single statement any other way and get the same expectations. Doesn't this do it, assuming invoice_id is unique? UPDATE invoice SET reserve_ts = NOW() + '1 hour'::timestamp where invoice_id = (select invoice_id from invoice i, account a WHERE a.acct_id = i.acct_id AND i.reserve_ts < NOW() AND a.status = 'A' AND i.is_paid IS FALSE ORDER BY i.create_ts ASC LIMIT 1) RETURNING invoice_id; - John Burger MITRE
John D. Burger wrote: > D. Dante Lorenso wrote: > >> I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE >> commands. Is this possible? >> >> UPDATE invoice i >> SET reserve_ts = NOW() + '1 hour'::timestamp >> FROM account a >> WHERE a.acct_id = i.acct_id >> AND i.reserve_ts < NOW() >> AND a.status = 'A' >> AND i.is_paid IS FALSE >> ORDER BY i.create_ts ASC >> LIMIT 1 >> RETURNING invoice_id; >> >> This query would find JUST ONE invoice record which is not paid and >> reserve the right to operate on the row using the 'reserve_ts' column >> for all active accounts. The one row would be the oldest invoice >> matching the criteria. Only that one row would be updated and the >> invoice_id of the updated row (if any) would be returned. > >> Can something like what I want be added in a future version? Ideas or >> alternatives? I don't see how I can rewrite this query as a single >> statement any other way and get the same expectations. > > Doesn't this do it, assuming invoice_id is unique? > > UPDATE invoice > SET reserve_ts = NOW() + '1 hour'::timestamp > where invoice_id = > (select invoice_id from invoice i, > account a > WHERE a.acct_id = i.acct_id > AND i.reserve_ts < NOW() > AND a.status = 'A' > AND i.is_paid IS FALSE > ORDER BY i.create_ts ASC > LIMIT 1) > RETURNING invoice_id; Doesn't this create race condition in the query where multiple processes might find the same invoice_id while executing the inner select. The update would then update the same record more than once during the update step and 2 processes might get the same invoice_id returned. In otherwords, moving the select criteria into a sub-query breaks the atomic nature of the update. Right? I have been trying to doing something like this, though: UPDATE invoice SET reserve_ts = NOW() + '1 hour'::timestamp WHERE reserve_ts < NOW() AND invoice_id = ( SELECT invoice_id FROM invoice i, account a WHERE a.acct_id = i.acct_id AND i.reserve_ts < NOW() AND a.status = 'A' AND i.is_paid IS FALSE ORDER BY i.create_ts ASC LIMIT 1 ) RETURNING invoice_id; By checking the reserve_ts inside the SELECT and again inside the UPDATE this should catch the race condition and only allow one process to perform the update on a given match. If the other process has updated the reserve_ts already, the reserve_ts would not pass the second check. However, the new side-effect is that one process would receive a NULL return result when the race condition occurs rather than just picking up the next queue invoice_id. Unless I can get what I really want, this will have to do, I suppose. -- Dante
Bill Moran wrote: > "D. Dante Lorenso" <dante@lorenso.com> wrote: >> All, >> >> I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE >> commands. Is this possible? >> >> UPDATE invoice i >> SET reserve_ts = NOW() + '1 hour'::timestamp >> FROM account a >> WHERE a.acct_id = i.acct_id >> AND i.reserve_ts < NOW() >> AND a.status = 'A' >> AND i.is_paid IS FALSE >> ORDER BY i.create_ts ASC >> LIMIT 1 >> RETURNING invoice_id; >> >> This query would find JUST ONE invoice record which is not paid and >> reserve the right to operate on the row using the 'reserve_ts' column >> for all active accounts. The one row would be the oldest invoice >> matching the criteria. Only that one row would be updated and the >> invoice_id of the updated row (if any) would be returned. >> >> Running a query like this over and over would pop just one record off >> the queue and would guarantee an atomic reservation. > > While I'm not going to argue as to whether your suggestion would be > a good idea or not, I will suggest you look at SELECT FOR UPDATE, which > will allow you to do what you desire. UPDATE invoice SET reserve_ts = NOW() + '1 hour'::interval WHERE invoice_id = ( SELECT invoice_id FROM invoice i, account a WHERE a.acct_id = i.acct_id AND i.reserve_ts < NOW() AND a.status = 'A' AND i.is_paid IS FALSE ORDER BY i.create_ts ASC LIMIT 1 FOR UPDATE ) RETURNING invoice_id; Does this do the same thing while still remaining a single atomic query that will guarantee no race conditions during the inner select/update? ERROR: SELECT FOR UPDATE/SHARE is not allowed in subqueries Guess not. -- Dante
In response to "D. Dante Lorenso" <dante@lorenso.com>: > Bill Moran wrote: > > "D. Dante Lorenso" <dante@lorenso.com> wrote: > >> All, > >> > >> I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE > >> commands. Is this possible? > >> > >> UPDATE invoice i > >> SET reserve_ts = NOW() + '1 hour'::timestamp > >> FROM account a > >> WHERE a.acct_id = i.acct_id > >> AND i.reserve_ts < NOW() > >> AND a.status = 'A' > >> AND i.is_paid IS FALSE > >> ORDER BY i.create_ts ASC > >> LIMIT 1 > >> RETURNING invoice_id; > >> > >> This query would find JUST ONE invoice record which is not paid and > >> reserve the right to operate on the row using the 'reserve_ts' column > >> for all active accounts. The one row would be the oldest invoice > >> matching the criteria. Only that one row would be updated and the > >> invoice_id of the updated row (if any) would be returned. > >> > >> Running a query like this over and over would pop just one record off > >> the queue and would guarantee an atomic reservation. > > > > While I'm not going to argue as to whether your suggestion would be > > a good idea or not, I will suggest you look at SELECT FOR UPDATE, which > > will allow you to do what you desire. > > UPDATE invoice > SET reserve_ts = NOW() + '1 hour'::interval > WHERE invoice_id = ( > SELECT invoice_id > FROM invoice i, account a > WHERE a.acct_id = i.acct_id > AND i.reserve_ts < NOW() > AND a.status = 'A' > AND i.is_paid IS FALSE > ORDER BY i.create_ts ASC > LIMIT 1 > FOR UPDATE > ) > RETURNING invoice_id; > > Does this do the same thing while still remaining a single atomic query > that will guarantee no race conditions during the inner select/update? > > ERROR: SELECT FOR UPDATE/SHARE is not allowed in subqueries > > Guess not. BEGIN; SELECT invoice_id FROM invoice i, account a WHERE a.acct_id = i.acct_id AND i.reserve_ts < NOW() AND a.status = 'A' AND i.is_paid IS FALSE ORDER BY i.create_ts ASC LIMIT 1 FOR UPDATE; UPDATE invoice SET reserve_ts = NOW() + '1 hour'::interval WHERE invoice_id = [previously selected value]; COMMIT; And before you start asking a lot of "won't this x or y", please read the docs: http://www.postgresql.org/docs/8.1/static/sql-select.html#SQL-FOR-UPDATE-SHARE Then feel free to ask more questions. -- Bill Moran http://www.potentialtech.com
D. Dante Lorenso wrote: > Doesn't this create race condition in the query where multiple > processes might find the same invoice_id while executing the inner > select. The update would then update the same record more than > once during the update step and 2 processes might get the same > invoice_id returned. In otherwords, moving the select criteria > into a sub-query breaks the atomic nature of the update. Right? Hmm, dunno. Sorry, my grasp of concurrency issues is still infantile. > I have been trying to doing something like this, though: ... > By checking the reserve_ts inside the SELECT and again inside the > UPDATE this should catch the race condition and only allow one > process to perform the update on a given match. If the other > process has updated the reserve_ts already, the reserve_ts would > not pass the second check. However, the new side-effect is that > one process would receive a NULL return result when the race > condition occurs rather than just picking up the next queue > invoice_id. But this could happen in any event, if there are no more invoices to process, yes? I'm picturing a set of queue consumers, each of which is already looping around such issues, anyway. - John D. Burger MITRE