Обсуждение: Need advice to avoid ORDER BY
Hello, I have one query in my postgresql 9.2.3 that took 137 ms to me executed and looking a way what I can do to optimize it. I have one table generated numbers from 1 to 1 000 000 and I need to get first free id, meanwhile id's when is taken can be free (deleted data and id is free for next job). Table is simple: id serial, jobid text, valids int default 0 (Yes, I have index). my query is: SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC LIMIT 1 I need the first id only. My question is: Is there a way how I can avoid using ORDER BY to receive the first free id from mytable ? Cheers, Condor
On Thu, Apr 4, 2013 at 4:32 PM, Condor <condor@stz-bg.com> wrote: > Hello, > > I have one query in my postgresql 9.2.3 that took 137 ms to me executed and > looking a way > what I can do to optimize it. I have one table generated numbers from 1 to 1 > 000 000 and > I need to get first free id, meanwhile id's when is taken can be free > (deleted data and id > is free for next job). Table is simple: > > > id serial, > jobid text, > valids int default 0 > > (Yes, I have index). > > > my query is: SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC > LIMIT 1 > > I need the first id only. > > My question is: Is there a way how I can avoid using ORDER BY to receive the > first > free id from mytable ? well, you can (via EXISTS()), but you can really optimize this with partial index. CREATE INDEX ON mytable (id) WHERE valids = 0; then, SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC LIMIT 1; should return in zero time since btree indexes can optimize order by expressions and the partial index will bypass having to wade through the rows you don't want. merlin
On 2013-04-05 00:38, Merlin Moncure wrote: > On Thu, Apr 4, 2013 at 4:32 PM, Condor <condor@stz-bg.com> wrote: >> Hello, >> >> I have one query in my postgresql 9.2.3 that took 137 ms to me >> executed and >> looking a way >> what I can do to optimize it. I have one table generated numbers from >> 1 to 1 >> 000 000 and >> I need to get first free id, meanwhile id's when is taken can be free >> (deleted data and id >> is free for next job). Table is simple: >> >> >> id serial, >> jobid text, >> valids int default 0 >> >> (Yes, I have index). >> >> >> my query is: SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id >> ASC >> LIMIT 1 >> >> I need the first id only. >> >> My question is: Is there a way how I can avoid using ORDER BY to >> receive the >> first >> free id from mytable ? > > well, you can (via EXISTS()), but you can really optimize this with > partial index. > > CREATE INDEX ON mytable (id) WHERE valids = 0; > > then, > > SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC LIMIT 1; > > should return in zero time since btree indexes can optimize order by > expressions and the partial index will bypass having to wade through > the rows you don't want. > > merlin Hm, I only can say: Thank You! Your solution is work, but Im now a little confused. I has a index CREATE INDEX ON mytable (valids) USING BTREE (valids) and the query to find valids = 0 tooks 137 ms. Why, your solution is worked ? Yes, it's worked. Cheers, Condor
On Thu, Apr 4, 2013 at 4:49 PM, Condor <condor@stz-bg.com> wrote: >> SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC LIMIT 1; >> >> should return in zero time since btree indexes can optimize order by >> expressions and the partial index will bypass having to wade through >> the rows you don't want. >> >> merlin > > > > Hm, > I only can say: Thank You! > Your solution is work, but Im now a little confused. I has a index > CREATE INDEX ON mytable (valids) USING BTREE (valids) and the > query to find valids = 0 tooks 137 ms. problem is that you are looking for needles (valids = 0) in the haystack. the problem wasn't really the order, but the fact that you had to scan an arbitrary amount of rows before finding a candidate record. so the partial index manages this problem by creating index entries *only for records that match a criteria*, and the planner recognizes this and prefers that index when the criteria is also present in the query. In other words, index only the needles. merlin
On 4/4/2013 2:49 PM, Condor wrote: > Your solution is work, but Im now a little confused. I has a index > CREATE INDEX ON mytable (valids) USING BTREE (valids) and the > query to find valids = 0 tooks 137 ms. the query can't use that index, and the separate index on id at the same time, it has to pick one. Merlin's partial index is how you get around this. -- john r pierce 37N 122W somewhere on the middle of the left coast
Merlin Moncure <mmoncure@gmail.com> writes: > problem is that you are looking for needles (valids = 0) in the > haystack. the problem wasn't really the order, but the fact that you > had to scan an arbitrary amount of rows before finding a candidate > record. so the partial index manages this problem by creating index > entries *only for records that match a criteria*, and the planner > recognizes this and prefers that index when the criteria is also > present in the query. In other words, index only the needles. The other way to fix it is a two-column index on (valids, id), which will be more useful if sometimes you need the minimum/maximum id for some nonzero value of valids. The real point here is that you want the index to contain consecutive entries for the rows with the particular valids value you want, *in order by id*. Then the planner knows the first/last such index entry contains the answer. When you index only valids, it has to collect all the matching rows and sort them by id. regards, tom lane
On Thu, Apr 4, 2013 at 5:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> problem is that you are looking for needles (valids = 0) in the >> haystack. the problem wasn't really the order, but the fact that you >> had to scan an arbitrary amount of rows before finding a candidate >> record. so the partial index manages this problem by creating index >> entries *only for records that match a criteria*, and the planner >> recognizes this and prefers that index when the criteria is also >> present in the query. In other words, index only the needles. > > The other way to fix it is a two-column index on (valids, id), which > will be more useful if sometimes you need the minimum/maximum id > for some nonzero value of valids. right -- that's a more general solution -- here we are exploiting that A: the OP only needs access to "=0" rows and especially B: "=0" rows are a tiny fraction of the overall set (we know this because otherwise the query would have returned quickly anyways). So we get to squeak out with a tiny index pointing to only the candidate rows. Partial indexes are an underutilized trick -- the efficiency savings can be enormous. They are often useful when coding ad hoc queue operations in the database where the queued items are intermixed with items that have been resolved. merlin
Try an index like:
create index yada on mytable (id) where valids=0;
then
select max(jobid) from mytable where valids=0;
On Thu, Apr 4, 2013 at 3:32 PM, Condor <condor@stz-bg.com> wrote:
Hello,
I have one query in my postgresql 9.2.3 that took 137 ms to me executed and looking a way
what I can do to optimize it. I have one table generated numbers from 1 to 1 000 000 and
I need to get first free id, meanwhile id's when is taken can be free (deleted data and id
is free for next job). Table is simple:
id serial,
jobid text,
valids int default 0
(Yes, I have index).
my query is: SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC LIMIT 1
I need the first id only.
My question is: Is there a way how I can avoid using ORDER BY to receive the first
free id from mytable ?
Cheers,
Condor
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
To understand recursion, one must first understand recursion.
On 2013-04-05 01:54, Merlin Moncure wrote: > On Thu, Apr 4, 2013 at 5:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Merlin Moncure <mmoncure@gmail.com> writes: >>> problem is that you are looking for needles (valids = 0) in the >>> haystack. the problem wasn't really the order, but the fact that >>> you >>> had to scan an arbitrary amount of rows before finding a candidate >>> record. so the partial index manages this problem by creating index >>> entries *only for records that match a criteria*, and the planner >>> recognizes this and prefers that index when the criteria is also >>> present in the query. In other words, index only the needles. >> >> The other way to fix it is a two-column index on (valids, id), which >> will be more useful if sometimes you need the minimum/maximum id >> for some nonzero value of valids. > > right -- that's a more general solution -- here we are exploiting that > A: the OP only needs access to "=0" rows and especially B: "=0" rows > are a tiny fraction of the overall set (we know this because otherwise > the query would have returned quickly anyways). So we get to squeak > out with a tiny index pointing to only the candidate rows. > > Partial indexes are an underutilized trick -- the efficiency savings > can be enormous. They are often useful when coding ad hoc queue > operations in the database where the queued items are intermixed with > items that have been resolved. > > merlin Thank you for every one for suggestions. I'll try to make changes tomorrow night to see what will be happened. Cheers, Condor
On 2013-04-04, Condor <condor@stz-bg.com> wrote: > Hello, > > I have one query in my postgresql 9.2.3 that took 137 ms to me executed > and looking a way > what I can do to optimize it. I have one table generated numbers from 1 > to 1 000 000 and > I need to get first free id, meanwhile id's when is taken can be free > (deleted data and id > is free for next job). Table is simple: > > > id serial, > jobid text, > valids int default 0 > > (Yes, I have index). > > > my query is: SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC > LIMIT 1 > > I need the first id only. > > My question is: Is there a way how I can avoid using ORDER BY to > receive the first > free id from mytable ? create index freejobs on mytable(id) where valids = 0 ; retry the same query. -- ⚂⚃ 100% natural