Discussion: Query optimization using order by and limit


Query optimization using order by and limit

From:
Michael Viscuso
Date:
First of all, thank you for taking the time to review my question.  After attending the PostgresOpen conference in Chicago last week, I've been poring over explain logs for hours on end, and although my system is MUCH better, I still can't resolve a few issues.  Luckily my data is pretty well structured, so solving one issue will likely solve many more; I'll start with this one.

Version: PostgreSQL 9.1rc1, compiled by Visual C++ build 1500, 64-bit
OS: Windows 7 64-bit
ORM: SQLAlchemy
Postgres table structure: I have daily partitioned tables for each of 4 "core tables" (the tables with the majority of my application's data).  Each daily table inherits from its parent.  I do not explicitly define a REFERENCES constraint between these tables because I cannot guarantee the order in which the events are inserted into the database, but where there are references, the referenced row should exist in the other table's daily partition.  The reason I partitioned the data this way is to increase query speed and make it easy to archive old data.  (I'm new to high-end Postgres performance, so there are likely several fundamental flaws in my assumptions.  I won't turn down any recommendation.)

An example of a daily partitioned table follows:

cb=# \d osmoduleloads_2011_09_14;
                                     Table "public.osmoduleloads_2011_09_14"
        Column         |            Type             |                         Modifiers
-----------------------+-----------------------------+------------------------------------------------------------
 guid                  | numeric(20,0)               | not null
 osprocess_guid        | numeric(20,0)               | not null
 filepath_guid         | numeric(20,0)               | not null
 firstloadtime         | numeric(20,0)               | not null
 md5hash               | bytea                       | not null
 host_guid             | numeric(20,0)               | default NULL::numeric
 process_create_time   | numeric(20,0)               | default NULL::numeric
 process_filepath_guid | numeric(20,0)               | default NULL::numeric
 event_time            | timestamp without time zone | default '2011-09-14 00:00:00'::timestamp without time zone
Indexes:
    "osmoduleloads_2011_09_14_pkey" PRIMARY KEY, btree (guid)
    "idx_osmoduleloads_2011_09_14_filepath_guid" btree (filepath_guid)
    "idx_osmoduleloads_2011_09_14_firstload_time" btree (firstloadtime)
    "idx_osmoduleloads_2011_09_14_host_guid" btree (host_guid)
    "idx_osmoduleloads_2011_09_14_md5hash" btree (md5hash)
    "idx_osmoduleloads_2011_09_14_osprocess_guid" btree (osprocess_guid)
Check constraints:
    "osmoduleloads_2011_09_14_event_time_check" CHECK (event_time = '2011-09-14 00:00:00'::timestamp without time zone)
    "osmoduleloads_2011_09_14_firstloadtime_check" CHECK (firstloadtime >= 129604464000000000::bigint::numeric AND firstloadtime < 129605328000000000::bigint::numeric)
Inherits: osmoduleloads
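
For reference, each daily child table is created with roughly the following DDL (a sketch reconstructed from the \d output above rather than our exact script; the index list is abbreviated):

CREATE TABLE osmoduleloads_2011_09_14 (
    CHECK (event_time = '2011-09-14 00:00:00'::timestamp without time zone),
    CHECK (firstloadtime >= 129604464000000000::bigint::numeric
       AND firstloadtime <  129605328000000000::bigint::numeric)
) INHERITS (osmoduleloads);

ALTER TABLE osmoduleloads_2011_09_14
    ALTER COLUMN event_time SET DEFAULT '2011-09-14 00:00:00';
ALTER TABLE osmoduleloads_2011_09_14 ADD PRIMARY KEY (guid);
CREATE INDEX idx_osmoduleloads_2011_09_14_firstload_time
    ON osmoduleloads_2011_09_14 (firstloadtime);
-- ...plus the other single-column indexes shown above.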

Objective:  The firstloadtime check constraint ensures that the record is applicable to that daily table.  (In case you were wondering, the large numerics are Windows timestamps: 100-nanosecond intervals since the Windows epoch.)  I'm inserting millions of records into each daily table, so "query slowness" is quite easy to spot.  Given that there is so much data per daily table, I was hoping to use the order by and limit clauses to "stop out" a query once it satisfied the limit clause and not be forced to visit each daily table.  However, the query is spending far more time in the older tables than I'd like, which leads me to believe that I'm doing something wrong.  For ease of viewing, my explain analyze can be found at http://explain.depesz.com/s/tot
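
As a sanity check, assuming these really are standard Windows FILETIME values (100-nanosecond intervals since 1601-01-01 UTC, which is 11644473600 seconds before the Unix epoch), they can be turned into readable timestamps like this:

-- Spot-check the raw numerics against human-readable timestamps.
SELECT guid,
       to_timestamp(firstloadtime::double precision / 10000000 - 11644473600)
           AS firstload_ts
FROM osmoduleloads_2011_09_14
LIMIT 5;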

I'm still very new to this so I'm not sure if explain.depesz.com saves the original query.  It wasn't readily apparent that it did, so here is the original query:

SELECT osm_1.*, storefiles_1.*, filepaths_1.*, filepaths_2.*
FROM (SELECT * FROM osmoduleloads
      JOIN hosts ON hosts.guid = osmoduleloads.host_guid
      WHERE hosts.guid = '2007075705813916178'
        AND osmoduleloads.firstloadtime >= 129604320000000000
        AND osmoduleloads.firstloadtime < 129610367990000000
        AND hosts.enabled = true
        AND hosts.user_id = 111
      ORDER BY osmoduleloads.firstloadtime DESC
      LIMIT 251) AS osm_1
LEFT OUTER JOIN storefiles AS storefiles_1
       ON osm_1.md5hash = storefiles_1.md5hash
LEFT OUTER JOIN filepaths AS filepaths_1
       ON osm_1.process_filepath_guid = filepaths_1.guid
      AND osm_1.event_time = filepaths_1.event_time
LEFT OUTER JOIN filepaths AS filepaths_2
       ON osm_1.filepath_guid = filepaths_2.guid
      AND osm_1.event_time = filepaths_2.event_time
ORDER BY osm_1.firstloadtime DESC;

Hopefully my assumptions about order by and limit are correct and this query can be optimized.  

Again, appreciate any help you can lend.  Thanks in advance.

Mike

Re: Query optimization using order by and limit

From:
Greg Smith
Date:
On 09/21/2011 07:14 PM, Michael Viscuso wrote:
Check constraints:
    "osmoduleloads_2011_09_14_event_time_check" CHECK (event_time = '2011-09-14 00:00:00'::timestamp without time zone)
    "osmoduleloads_2011_09_14_firstloadtime_check" CHECK (firstloadtime >= 129604464000000000::bigint::numeric AND firstloadtime < 129605328000000000::bigint::numeric)
Inherits: osmoduleloads

That weird casting can't be helping.  I'm not sure if it's your problem here, but the constraint exclusion code is pretty picky about matching the thing you're looking for against the CHECK constraint, and this is a messy one.  The bigint conversion in the middle there isn't doing anything useful for you anyway; you really should simplify this to just look like this:

firstloadtime >= 129604464000000000::numeric
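
On an existing child table, that means dropping and re-adding the constraint, roughly like this (a sketch using the constraint name from your \d output; it would have to be repeated for every child table):

ALTER TABLE osmoduleloads_2011_09_14
    DROP CONSTRAINT osmoduleloads_2011_09_14_firstloadtime_check;
ALTER TABLE osmoduleloads_2011_09_14
    ADD CONSTRAINT osmoduleloads_2011_09_14_firstloadtime_check
    CHECK (firstloadtime >= 129604464000000000::numeric
       AND firstloadtime <  129605328000000000::numeric);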

SELECT osm_1.*, storefiles_1.*, filepaths_1.*, filepaths_2.* FROM (SELECT * FROM osmoduleloads JOIN hosts ON hosts.guid = osmoduleloads.host_guid WHERE hosts.guid = '2007075705813916178' AND osmoduleloads.firstloadtime >= 129604320000000000 AND osmoduleloads.firstloadtime < 129610367990000000 AND hosts.enabled = true AND hosts.user_id = 111 ORDER BY osmoduleloads.firstloadtime DESC LIMIT 251) AS osm_1 LEFT OUTER JOIN storefiles AS storefiles_1 ON osm_1.md5hash = storefiles_1.md5hash LEFT OUTER JOIN filepaths AS filepaths_1 ON osm_1.process_filepath_guid = filepaths_1.guid AND osm_1.event_time = filepaths_1.event_time LEFT OUTER JOIN filepaths AS filepaths_2 ON osm_1.filepath_guid = filepaths_2.guid AND osm_1.event_time= filepaths_2.event_time ORDER BY osm_1.firstloadtime DESC;


What you should start with here is confirming whether or not a simpler query touches all of the partitions or just the ones you expect it to.  A simpler one like this:

SELECT * FROM osmoduleloads WHERE osmoduleloads.firstloadtime >= 129604320000000000 AND osmoduleloads.firstloadtime < 129610367990000000;

Would be the place to begin.  Once you've got that working, you can build up more pieces and see whether one of them causes the query to stop excluding partitions.  I can't figure out if you're running into a basic error here, where constraint exclusion just isn't working at all, or if you are only having this problem because the query is too complicated.  Figuring that out will narrow the potential solutions.
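
A quick way to watch that is plain EXPLAIN (a sketch; only the child tables whose CHECK ranges overlap the WHERE clause should show up as scans in the plan):

SHOW constraint_exclusion;   -- should be 'partition' (the default) or 'on'

EXPLAIN
SELECT * FROM osmoduleloads
WHERE firstloadtime >= 129604320000000000
  AND firstloadtime <  129610367990000000;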

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

Re: Query optimization using order by and limit

From:
Tom Lane
Date:
Greg Smith <greg@2ndQuadrant.com> writes:
> That weird casting can't be helping.  I'm not sure if it's your problem
> here, but the constraint exclusion code is pretty picky about matching
> the thing you're looking for against the CHECK constraint, and this is a
> messy one.  The bigint conversion in the middle there isn't doing
> anything useful for you anyway; you really should simplify this to just
> look like this:
> firstloadtime >= 129604464000000000::numeric

I have a more aggressive suggestion: change all the numeric(20,0) fields
to bigint.  Unless the OP actually needs values wider than 64 bits,
the choice to use numeric is a significant performance penalty for
nothing.
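
For one column on one child table, that conversion would look something like this (a sketch; it rewrites the table, so it takes an exclusive lock, and it would have to be repeated for each numeric column on each child):

ALTER TABLE osmoduleloads_2011_09_14
    ALTER COLUMN firstloadtime TYPE bigint
    USING firstloadtime::bigint;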

            regards, tom lane

Re: Query optimization using order by and limit

From:
Michael Viscuso
Date:
Thanks guys,

First of all, I should have included my postgres.conf file with the
original submission.  Sorry about that.  It is now attached.

Based on a recommendation, I also should have shown the parent/child
relationship between osmoduleloads and its daily partitioned tables.  To
reduce clutter, it is at the end of this message.

Taking this one step at a time and taking Greg's second suggestion
first, issuing

select * from osmoduleloads WHERE osmoduleloads.firstloadtime >=
129604320000000000 AND osmoduleloads.firstloadtime < 129610367990000000;

appears to only query the appropriate daily tables (2011_09_13 through
2011_09_20 - http://explain.depesz.com/s/QCG).  So it appears that
constraint_exclusion is working properly.  Putting a limit on the query
like:

select * from osmoduleloads WHERE osmoduleloads.firstloadtime >=
129604320000000000 AND osmoduleloads.firstloadtime < 129610367990000000
limit 251;

has the result that I'd expect to see: http://explain.depesz.com/s/O7fZ.
Ordering by firstloadtime AND limiting like:

select * from osmoduleloads WHERE osmoduleloads.firstloadtime >=
129604320000000000 AND osmoduleloads.firstloadtime < 129610367990000000
order by firstloadtime desc limit 251;

also has the result that I'd expect to see:
http://explain.depesz.com/s/RDh.

Adding the hosts join condition to the mix was still OK:
http://explain.depesz.com/s/2Ns.

Adding the hosts.enabled condition was still OK:
http://explain.depesz.com/s/UYN.

Adding the hosts.user_id = 111 condition started the descent, but it
appears to still be obeying the constraint exclusion that I'd expect, just
with a ton of rows returned from the most recent daily tables:
http://explain.depesz.com/s/4WE.

Adding the final condition hosts_guid = '2007075705813916178' is what
ultimately kills it: http://explain.depesz.com/s/8zy.  By adding the
host_guid, the query spends considerably more time in the older tables than
without this condition, and I'm not sure why.
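
For clarity, the query at that final step is essentially the inner query from my original post:

SELECT * FROM osmoduleloads
JOIN hosts ON hosts.guid = osmoduleloads.host_guid
WHERE hosts.guid = '2007075705813916178'
  AND osmoduleloads.firstloadtime >= 129604320000000000
  AND osmoduleloads.firstloadtime < 129610367990000000
  AND hosts.enabled = true
  AND hosts.user_id = 111
ORDER BY osmoduleloads.firstloadtime DESC
LIMIT 251;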

Thanks Greg for the recommendation to step through it like that -
hopefully this helps get us closer to a resolution.

Greg/Tom, you are correct, these columns should be modified to whatever
is easiest for Postgres to recognize 64-bit unsigned integers.  Would
you still recommend bigint for unsigned integers?  I likely read the
wrong documentation that suggested bigint for signed 64-bit integers and
numeric(20) for unsigned 64-bit integers.

Thanks again for all your help!  Perhaps 15 hours of poring over
explain logs will finally pan out!

Mike

cb=# \d+ osmoduleloads;
                                       Table "public.osmoduleloads"
        Column         |            Type             |       Modifiers       | Storage  | Description
-----------------------+-----------------------------+-----------------------+----------+-------------
 guid                  | numeric(20,0)               | not null              | main     |
 osprocess_guid        | numeric(20,0)               | not null              | main     |
 filepath_guid         | numeric(20,0)               | not null              | main     |
 firstloadtime         | numeric(20,0)               | not null              | main     |
 md5hash               | bytea                       | not null              | extended |
 host_guid             | numeric(20,0)               | default NULL::numeric | main     |
 process_create_time   | numeric(20,0)               | default NULL::numeric | main     |
 process_filepath_guid | numeric(20,0)               | default NULL::numeric | main     |
 event_time            | timestamp without time zone |                       | plain    |
Indexes:
    "osmoduleloads_pkey" PRIMARY KEY, btree (guid)
Child tables: osmoduleloads_2001_12_31,
              osmoduleloads_2010_10_11,
              osmoduleloads_2010_10_12,
              osmoduleloads_2010_10_13,
              osmoduleloads_2011_07_27,
              osmoduleloads_2011_08_04,
              osmoduleloads_2011_08_05,
              osmoduleloads_2011_08_06,
              osmoduleloads_2011_08_07,
              osmoduleloads_2011_08_08,
              osmoduleloads_2011_08_09,
              osmoduleloads_2011_08_10,
              osmoduleloads_2011_08_11,
              osmoduleloads_2011_08_12,
              osmoduleloads_2011_08_13,
              osmoduleloads_2011_08_14,
              osmoduleloads_2011_08_15,
              osmoduleloads_2011_08_16,
              osmoduleloads_2011_08_17,
              osmoduleloads_2011_08_18,
              osmoduleloads_2011_08_19,
              osmoduleloads_2011_08_20,
              osmoduleloads_2011_08_21,
              osmoduleloads_2011_08_22,
              osmoduleloads_2011_08_23,
              osmoduleloads_2011_08_24,
              osmoduleloads_2011_08_25,
              osmoduleloads_2011_08_26,
              osmoduleloads_2011_08_27,
              osmoduleloads_2011_08_28,
              osmoduleloads_2011_08_29,
              osmoduleloads_2011_08_30,
              osmoduleloads_2011_08_31,
              osmoduleloads_2011_09_01,
              osmoduleloads_2011_09_02,
              osmoduleloads_2011_09_03,
              osmoduleloads_2011_09_04,
              osmoduleloads_2011_09_05,
              osmoduleloads_2011_09_06,
              osmoduleloads_2011_09_07,
              osmoduleloads_2011_09_08,
              osmoduleloads_2011_09_09,
              osmoduleloads_2011_09_10,
              osmoduleloads_2011_09_11,
              osmoduleloads_2011_09_12,
              osmoduleloads_2011_09_13,
              osmoduleloads_2011_09_14,
              osmoduleloads_2011_09_15,
              osmoduleloads_2011_09_16,
              osmoduleloads_2011_09_17,
              osmoduleloads_2011_09_18,
              osmoduleloads_2011_09_19,
              osmoduleloads_2011_09_20,
              osmoduleloads_2011_12_01
Has OIDs: no

On 9/21/2011 10:09 PM, Tom Lane wrote:
> Greg Smith <greg@2ndQuadrant.com> writes:
>> That weird casting can't be helping.  I'm not sure if it's your problem
>> here, but the constraint exclusion code is pretty picky about matching
>> the thing you're looking for against the CHECK constraint, and this is a
>> messy one.  The bigint conversion in the middle there isn't doing
>> anything useful for you anyway; you really should simplify this to just
>> look like this:
>> firstloadtime >= 129604464000000000::numeric
> I have a more aggressive suggestion: change all the numeric(20,0) fields
> to bigint.  Unless the OP actually needs values wider than 64 bits,
> the choice to use numeric is a significant performance penalty for
> nothing.
>
>             regards, tom lane
>


Attachments

Re: Query optimization using order by and limit

From:
Tom Lane
Date:
Michael Viscuso <michael.viscuso@getcarbonblack.com> writes:
> Greg/Tom, you are correct, these columns should be modified to whatever
> is easiest for Postgres to recognize 64-bit unsigned integers.  Would
> you still recommend bigint for unsigned integers?  I likely read the
> wrong documentation that suggested bigint for signed 64-bit integers and
> numeric(20) for unsigned 64-bit integers.

Unsigned?  Oh, hm, that's a bit of a problem because we don't have any
unsigned types.  If you really need to go to 2^64 and not 2^63 then
you're stuck with numeric ... but that last bit is costing ya a lot.

            regards, tom lane

Re: Query optimization using order by and limit

From:
"ktm@rice.edu"
Date:
On Wed, Sep 21, 2011 at 11:22:53PM -0400, Tom Lane wrote:
> Michael Viscuso <michael.viscuso@getcarbonblack.com> writes:
> > Greg/Tom, you are correct, these columns should be modified to whatever
> > is easiest for Postgres to recognize 64-bit unsigned integers.  Would
> > you still recommend bigint for unsigned integers?  I likely read the
> > wrong documentation that suggested bigint for signed 64-bit integers and
> > numeric(20) for unsigned 64-bit integers.
>
> Unsigned?  Oh, hm, that's a bit of a problem because we don't have any
> unsigned types.  If you really need to go to 2^64 and not 2^63 then
> you're stuck with numeric ... but that last bit is costing ya a lot.
>
>             regards, tom lane
>

Hi Michael,

If you have access to the application, you can map the unsigned 64-bit
values to the PostgreSQL signed 64-bit type with a simple subtraction.  That
will allow you to drop all the numeric use.  Also, if the guid is a 64-bit
value stuffed into a numeric(20), you can do the same there.  I achieved
a hefty performance boost by making those application-level changes in a
similar situation.
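
As a sketch of what I mean (assuming the columns really are plain 64-bit quantities stored in numeric(20,0)): subtracting 2^63 maps [0, 2^64) onto bigint's [-2^63, 2^63) range and preserves ordering, so indexes and range constraints keep working.  For a column like guid:

-- One-time migration per column and child table; the application must apply
-- the same offset when writing and interpreting values at or above 2^63.
ALTER TABLE osmoduleloads_2011_09_14
    ALTER COLUMN guid TYPE bigint
    USING (guid - 9223372036854775808::numeric)::bigint;

-- Note: the firstloadtime values in this thread (~1.3e17) are well below
-- 2^63, so that column could simply be cast to bigint with no offset.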

Regards,
Ken

Re: Query optimization using order by and limit

From:
Michael Viscuso
Date:
Thanks Ken,

I'm discussing with my coworker how to best make that change *as we
speak*.  Do you think this will also resolve the original issue I'm
seeing where the query doesn't "limit out properly" and spends time in
child tables that won't yield any results?  I was hoping that by using
the check constraints, I could query over a week or month's worth of
partitioned tables and the combination of order by and limit would
eliminate any time searching unnecessary tables but that doesn't appear
to be true. (I'm still very new to high-end Postgres performance so I
could be mistaken.)

Regardless, in the meantime, I'll switch those columns to bigint instead
of numeric and have an update as soon as possible.

Thanks for your help!

Mike

On 9/22/2011 9:41 AM, ktm@rice.edu wrote:
> On Wed, Sep 21, 2011 at 11:22:53PM -0400, Tom Lane wrote:
>> Michael Viscuso <michael.viscuso@getcarbonblack.com> writes:
>>> Greg/Tom, you are correct, these columns should be modified to whatever
>>> is easiest for Postgres to recognize 64-bit unsigned integers.  Would
>>> you still recommend bigint for unsigned integers?  I likely read the
>>> wrong documentation that suggested bigint for signed 64-bit integers and
>>> numeric(20) for unsigned 64-bit integers.
>> Unsigned?  Oh, hm, that's a bit of a problem because we don't have any
>> unsigned types.  If you really need to go to 2^64 and not 2^63 then
>> you're stuck with numeric ... but that last bit is costing ya a lot.
>>
>>             regards, tom lane
>>
> Hi Michael,
>
> If you have access to the application, you can map the unsigned 64-bits
> to the PostgreSQL signed 64-bit type with a simple subtraction. That will
> allow you to drop all the numeric use. Also if the guid is a 64-bit
> values stuffed into a numeric(20), you can do it there as well. I achieved
> a hefty performance boost by making those application level changes in a
> similar situation.
>
> Regards,
> Ken


Re: Query optimization using order by and limit

From:
Stephen Frost
Date:
* Michael Viscuso (michael.viscuso@getcarbonblack.com) wrote:
> Adding the final condition hosts_guid = '2007075705813916178' is what
> ultimately kills it http://explain.depesz.com/s/8zy.  By adding the
> host_guid, it spends considerably more time in the older tables than
> without this condition and I'm not sure why.

What I think is happening here is that PG is pushing down that filter
(not typically a bad thing..), but with that condition, it's going to
scan the index until it finds a match for that filter before returning
back up only to have that result cut out due to the limit.  Having it as
numerics isn't helping here, but the bigger issue is having to check all
those tuples for a match to the filter.

Mike, the filter has to be applied before the order by/limit, since
those clauses come after the filter has been applied (you wouldn't want
a 'where x = 2 limit 10' to return early just because it found 10
records where x didn't equal 2).

What would be great is if PG would realize that the CHECK constraints
prevent earlier records from being in these earlier tables, so it
shouldn't need to consider them at all once the records from the
'latest' table has been found and the limit reached (reverse all this
for an 'ascending' query, of course), which we can do when there's no
order by.  I don't believe we have that kind of logic or that
information available at this late stage- the CHECK constraints are used
to eliminate the impossible-to-match tables, but that's it.

One option, which isn't great of course, would be to implement your own
'nested loop' construct (something I typically despise..) in the
application which just walks backwards from the latest and pulls
whatever records it can from each day and then stops once it hits the
limit.
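
Expressed as the queries the application would issue, the idea is roughly this (a sketch using the child table names and conditions from earlier in the thread; the looping and bookkeeping live in your application code):

-- Iteration 1: newest child table in the requested range.
SELECT * FROM osmoduleloads_2011_09_20 o
JOIN hosts h ON h.guid = o.host_guid
WHERE h.guid = '2007075705813916178'
  AND h.enabled = true
  AND h.user_id = 111
ORDER BY o.firstloadtime DESC
LIMIT 251;

-- If fewer than 251 rows came back, run the same query against
-- osmoduleloads_2011_09_19 with the LIMIT reduced by the rows already
-- collected, and keep walking backwards until the limit is reached or the
-- date range is exhausted.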

    Thanks,

        Stephen

Attachments

Re: Query optimization using order by and limit

From:
Michael Viscuso
Date:

Stephen,

I spent the better part of the day implementing an application layer
nested loop and it seems to be working well.  Of course it's a little
slower than a Postgres only solution because it has to pass data back
and forth for each daily table query until it reaches the limit, but at
least I don't have "runaway" queries like I was seeing before.  That
should be a pretty good stopgap solution for the time being.

I was really hoping there was a Postgres exclusive answer though! :)  If
there are any other suggestions, it's a simple flag in my application to
query the other way again...

Thanks for all your help - and I'm still looking to change those
numerics to bigints, just haven't figured out the best way yet.

Mike

On 9/22/2011 10:53 AM, Stephen Frost wrote:
> * Michael Viscuso (michael.viscuso@getcarbonblack.com) wrote:
>> Adding the final condition hosts_guid = '2007075705813916178' is what
>> ultimately kills it http://explain.depesz.com/s/8zy. By adding the
>> host_guid, it spends considerably more time in the older tables than
>> without this condition and I'm not sure why.
>
> What I think is happening here is that PG is pushing down that filter
> (not typically a bad thing..), but with that condition, it's going to
> scan the index until it finds a match for that filter before returning
> back up only to have that result cut out due to the limit. Having it as
> numerics isn't helping here, but the bigger issue is having to check all
> those tuples for a match to the filter.
>
> Mike, the filter has to be applied before the order by/limit, since
> those clauses come after the filter has been applied (you wouldn't want
> a 'where x = 2 limit 10' to return early just because it found 10
> records where x didn't equal 2).
>
> What would be great is if PG would realize that the CHECK constraints
> prevent earlier records from being in these earlier tables, so it
> shouldn't need to consider them at all once the records from the
> 'latest' table has been found and the limit reached (reverse all this
> for an 'ascending' query, of course), which we can do when there's no
> order by. I don't believe we have that kind of logic or that
> information available at this late stage- the CHECK constraints are used
> to eliminate the impossible-to-match tables, but that's it.
>
> One option, which isn't great of course, would be to implement your own
> 'nested loop' construct (something I typically despise..) in the
> application which just walks backwards from the latest and pulls
> whatever records it can from each day and then stops once it hits the
> limit.
>
> Thanks,
>
> Stephen



Re: Query optimization using order by and limit

From:
Stephen Frost
Date:
Mike,

* Michael Viscuso (michael.viscuso@getcarbonblack.com) wrote:
> I spent the better part of the day implementing an application layer
> nested loop and it seems to be working well.  Of course it's a little
> slower than a Postgres only solution because it has to pass data back
> and forth for each daily table query until it reaches the limit, but at
> least I don't have "runaway" queries like I was seeing before.  That
> should be a pretty good stopgap solution for the time being.

Glad to hear that you were able to get something going which worked for
you.

> I was really hoping there was a Postgres exclusive answer though! :)  If
> there are any other suggestions, it's a simple flag in my application to
> query the other way again...

I continue to wonder if some combination of multi-column indexes might
have made the task of finding the 'lowest' record from each of the
tables fast enough that it wouldn't be an issue.

> Thanks for all your help - and I'm still looking to change those
> numerics to bigints, just haven't figured out the best way yet.

Our timestamps are also implemented using 64bit integers and would allow
you to use all the PG date/time functions and operators.  Just a
thought.
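
A sketch of that conversion for one child table, assuming the values are standard Windows FILETIME (100-nanosecond intervals since 1601-01-01 UTC, 11644473600 seconds before the Unix epoch); the numeric-based CHECK constraint has to be replaced with a timestamp-based one:

ALTER TABLE osmoduleloads_2011_09_14
    DROP CONSTRAINT osmoduleloads_2011_09_14_firstloadtime_check;
ALTER TABLE osmoduleloads_2011_09_14
    ALTER COLUMN firstloadtime TYPE timestamptz
    USING to_timestamp(firstloadtime::double precision / 10000000 - 11644473600);
-- ...then re-add the CHECK constraint as a timestamp range matching the old
-- numeric boundaries.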

    Thanks,

        Stephen

Attachments

Re: Query optimization using order by and limit

From:
Michael Viscuso
Date:
Stephen,

Yes, I couldn't agree more.  The next two things I will be looking at very carefully are the timestamps and indexes.  I will reply to this post if either dramatically helps.

Thanks again for all your help.  My eyes were starting to bleed from staring at explain logs!

Mike

On Thu, Sep 22, 2011 at 7:14 PM, Stephen Frost <sfrost@snowman.net> wrote:
Mike,

* Michael Viscuso (michael.viscuso@getcarbonblack.com) wrote:
> I spent the better part of the day implementing an application layer
> nested loop and it seems to be working well.  Of course it's a little
> slower than a Postgres only solution because it has to pass data back
> and forth for each daily table query until it reaches the limit, but at
> least I don't have "runaway" queries like I was seeing before.  That
> should be a pretty good stopgap solution for the time being.

Glad to hear that you were able to get something going which worked for
you.

> I was really hoping there was a Postgres exclusive answer though! :)  If
> there are any other suggestions, it's a simple flag in my application to
> query the other way again...

I continue to wonder if some combination of multi-column indexes might
have made the task of finding the 'lowest' record from each of the
tables fast enough that it wouldn't be an issue.

> Thanks for all your help - and I'm still looking to change those
> numerics to bigints, just haven't figured out the best way yet.

Our timestamps are also implemented using 64bit integers and would allow
you to use all the PG date/time functions and operators.  Just a
thought.

       Thanks,

               Stephen



Re: Query optimization using order by and limit

From:
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> What I think is happening here is that PG is pushing down that filter
> (not typically a bad thing..), but with that condition, it's going to
> scan the index until it finds a match for that filter before returning
> back up only to have that result cut out due to the limit.

Yeah, it's spending quite a lot of time finding the first matching row
in each child table.  I'm curious why that is though; are the child
tables not set up with nonoverlapping firstloadtime ranges?

> What would be great is if PG would realize that the CHECK constraints
> prevent earlier records from being in these earlier tables,

The explain shows that that isn't the case, because it *is* finding at
least one candidate row in each table.  It's just running quite far into
the firstloadtime sequence to do it.

If you're stuck with this table arrangement, one thing that would help
is a two-column index on (host_guid, firstloadtime) on each child table.
That would match the search condition exactly, and so reduce the cost
to find the first matching row to nearly nil.  Whether this query's
speed is important enough to justify maintaining such an index is a
question I can't answer for you.
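
For one child table that would be something like this (the index name is just illustrative, and it would need to be repeated for every child; the existing single-column host_guid index then becomes largely redundant):

CREATE INDEX osmoduleloads_2011_09_14_host_guid_firstloadtime
    ON osmoduleloads_2011_09_14 (host_guid, firstloadtime);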

            regards, tom lane

Re: Query optimization using order by and limit

From:
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Yeah, it's spending quite a lot of time finding the first matching row
> in each child table.  I'm curious why that is though; are the child
> tables not set up with nonoverlapping firstloadtime ranges?

They are set up w/ nonoverlapping firstloadtime ranges, using CHECK
constraints such as:

"osmoduleloads_2011_09_14_firstloadtime_check" CHECK (firstloadtime >=
129604464000000000::bigint::numeric AND firstloadtime <
129605328000000000::bigint::numeric)

The issue here is that the query is saying "Give me the first 150
records with this host_id in this week-long range".  PG happily
eliminates all the tables that are outside of the week-long range during
constraint exclusion.  After that, however, it hunts down the earliest
records (which matches 'host_id') from each child table.  Sure, from
each table there's a record in the week-long range with the host_id that
matches.  What PG doesn't realize is that it can stop after pulling the
150 records from the most recent table (and flipping the direction of
the query or the tables doesn't help- PG still pulls a record from each
table).

> > What would be great is if PG would realize that the CHECK constraints
> > prevent earlier records from being in these earlier tables,
>
> The explain shows that that isn't the case, because it *is* finding at
> least one candidate row in each table.  It's just running quite far into
> the firstloadtime sequence to do it.

My point above is that the CHECK constraints ensure an ordering which
could be leveraged to use the latest table first and then stop if enough
tuples are returned (or immediately go to the next table), without ever
considering the other tables.  I'm not looking for PG to eliminate those
other tables for consideration in all cases- if the limit is large
enough, it may get all the way down to them.  I'm pretty sure this isn't
something which PG does today and I don't expect teaching it to do this
to be trivial, but it certainly would be nice as this strikes me as a
very common use-case.

> If you're stuck with this table arrangement, one thing that would help
> is a two-column index on (host_guid, firstloadtime) on each child table.

Agreed, I mentioned this to the OP previously and it's on his list of
things to try.

    Thanks,

        Stephen

Attachments

Re: Query optimization using order by and limit

From:
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Yeah, it's spending quite a lot of time finding the first matching row
>> in each child table.  I'm curious why that is though; are the child
>> tables not set up with nonoverlapping firstloadtime ranges?

> The issue here is that the query is saying "Give me the first 150
> records with this host_id in this week-long range".

Oh, I see.  So the query range overlaps multiple child tables, even
after constraint exclusion eliminates a lot of them.

> My point above is that the CHECK constraints ensure an ordering which
> could be leveraged to use the latest table first and then stop if enough
> tuples are returned (or immediately go to the next table), without ever
> considering the other tables.

Yeah.  My opinion is that trying to reverse-engineer that from the CHECK
constraints would cost a lot more than it's worth.  What we need, and
will hopefully have sooner or later, is an abstract concept of
"partitioned table" in which this kind of relationship is known a-priori
instead of having to be laboriously re-deduced every time we plan a
query.

>> If you're stuck with this table arrangement, one thing that would help
>> is a two-column index on (host_guid, firstloadtime) on each child table.

> Agreed, I mentioned this to the OP previously and it's on his list of
> things to try.

AFAICS the fact that this example would be fast if we were only paying
attention to the newest table is mere luck.  If it can take a long time
to find the first matching host_guid record in several of the child
tables, why might it not take just as long to find said record in the
other one?  I think you really need the two-column indexes, if keeping
this query's runtime to a minimum is critical.

            regards, tom lane