Обсуждение: SELECT TOP %d PERCENT, or SELECT ... LIMIT %d PERCENT ?
so I wonder, quite few times ppl asked me about my-word database TOP % PERCENT (and I guess ms-word db has it too). Now that postgres has limit(subselect) - postgresql can do the same thing. But, using a count(*) in subselects isn't very efficient. The backend gets information from limit X, so I figure it shouldn't be a very big deal to add LIMIT %d PERCENT , and pass it on as percentage, instead of just number of rows. I can give it a go and try to implement it myself, but wanted to ask here first for opinion on syntax (I have no idea what would be right syntax by sql standards), and how keen would you guys be on accepting such patch. thanks.
2008/10/12 Grzegorz Jaskiewicz <gj@pointblue.com.pl>: > so I wonder, quite few times ppl asked me about my-word database TOP % > PERCENT (and I guess ms-word db has it too). > Now that postgres has limit(subselect) - postgresql can do the same thing. > But, using a count(*) in subselects isn't very efficient. The backend gets > information from limit X, so I figure it shouldn't be a very big deal to add > LIMIT %d PERCENT , and pass it on as percentage, instead of just number of > rows. > I can give it a go and try to implement it myself, but wanted to ask here > first for opinion on syntax (I have no idea what would be right syntax by > sql standards), and how keen would you guys be on accepting such patch. > > thanks. > > I'm afraid you will not succeed it because PostgreSQL doesn't know how many rows are totally returned until it returns the last row, which is as ineficient as count(*) as you point out. Regards, -- Hitoshi Harada
that's a shame. I figure, with the "WITH ... AS .." you can kind of speed up counts, just by doing it once - now. But still, it would be great to be able to use PERCENT, against estimate. After all, imagine a table FOO with 5 rows, and something like this: WITH c AS (select count(*)/4 as n from FOO) select * from FOO limit (select n from c) offset (select n*X from c); now X is for every 25%, but you wan't get to see last row if you use offset 75 percent (X=3). Which kind of puts me off the idea of using PERCENT anyway, because it isn't quite accurate. So thanks for answer, I'll pass on this one. Hopefully there would be another chance for me to write some patch in future. ta.
I wonder if this could be implemented using the window-function infrastructure... ...Robert On Sun, Oct 12, 2008 at 9:07 AM, Grzegorz Jaskiewicz <gj@pointblue.com.pl> wrote: > that's a shame. > I figure, with the "WITH ... AS .." you can kind of speed up counts, just by > doing it once - now. But still, it would be great to be able to use PERCENT, > against estimate. > After all, imagine a table FOO with 5 rows, and something like this: > > WITH c AS (select count(*)/4 as n from FOO) select * from FOO limit (select > n from c) offset (select n*X from c); > > now X is for every 25%, but you wan't get to see last row if you use offset > 75 percent (X=3). Which kind of puts me off the idea of using PERCENT > anyway, because it isn't quite accurate. > > So thanks for answer, I'll pass on this one. Hopefully there would be > another chance for me to write some patch in future. > > ta. > >
2008/10/12 Robert Haas <robertmhaas@gmail.com>: > I wonder if this could be implemented using the window-function > infrastructure... > Yeah, actually window functions I am working now has percent_rank() or something he wants. That is better than WITH clauses but still we must see the last row before determine the percent. Better than count(*) of course, but I'm not sure it meets his needs. Regards, -- Hitoshi Harada
On 2008-10-12, at 16:22, Hitoshi Harada wrote: > 2008/10/12 Robert Haas <robertmhaas@gmail.com>: >> I wonder if this could be implemented using the window-function >> infrastructure... >> > > Yeah, actually window functions I am working now has percent_rank() or > something he wants. That is better than WITH clauses but still we must > see the last row before determine the percent. Better than count(*) of > course, but I'm not sure it meets his needs. > excuse my ignorance, apparently I don't recall that patches discussion in here. If it pops out, and gets into cvs, can someone remind me of that - please ? Also, will that patch also be able to speed up count(*) , or is it for different purpose ? Sorry if I am repeating the subject, if so - please just respond in private. thanks folks.
2008/10/13 Grzegorz Jaskiewicz <gj@pointblue.com.pl>: > > On 2008-10-12, at 16:22, Hitoshi Harada wrote: > >> 2008/10/12 Robert Haas <robertmhaas@gmail.com>: >>> >>> I wonder if this could be implemented using the window-function >>> infrastructure... >>> >> >> Yeah, actually window functions I am working now has percent_rank() or >> something he wants. That is better than WITH clauses but still we must >> see the last row before determine the percent. Better than count(*) of >> course, but I'm not sure it meets his needs. >> > excuse my ignorance, apparently I don't recall that patches discussion in > here. If it pops out, and gets into cvs, can someone remind me of that - > please ? You can track pgsql news on David Fetter's blog: http://people.planetpostgresql.org/dfetter/ > Also, will that patch also be able to speed up count(*) , or is it for > different purpose ? Sorry if I am repeating the subject, if so - please just > respond in private. I guess cume_dist() will do for you: SELECT id, cume_dist FROM( SELECT id, cume_dist() OVER (ORDER BY id) FROM foo )s WHERE cume_dist < 0.3 will produce a result with id of 30% ascending order. I am working on this window functions infrastructure to get into 8.4 but we're still not sure if it will. You can google "window functions" or my design docs on http://umitanuki.net/pgsql/wfv06/design.html to see what it is for. Regards, -- Hitoshi Harada