Thread: Slow query: select * order by XXX desc offset 10 limit 10

Slow query: select * order by XXX desc offset 10 limit 10

From: Alexander Farber
Date:
Hello,

I use PostgreSQL 8.4.7 on a CentOS 6.0 / 64-bit machine
with a Quad-Core AMD Opteron(tm) 2352 processor and
16 GB RAM, and use it for one PHP script which selects
and displays data in jQuery DataTables (i.e. an
HTML table that can be viewed page by page).

I select records from one view, which combines two identical tables:

quincy=> \d quincyview
               View "public.quincyview"
   Column    |            Type             | Modifiers
-------------+-----------------------------+-----------
 qdatetime   | timestamp without time zone |
 id          | character varying(20)       |
 name        | character varying(20)       |
 category    | character varying(120)      |
 appsversion | character varying(30)       |
 osversion   | character varying(30)       |
 beta_prog   | character varying(20)       |
 catinfo     | character varying(120)      |
 details     | character varying(50)       |
 devinfo     | character varying(4000)     |
 email       | character varying(320)      |
 emailid     | character varying(16)       |
 imei        | character varying(25)       |
 pin         | character varying(12)       |
 formfactor  | character varying(10)       |
 copied      | timestamp without time zone |
View definition:
         SELECT quincynoreset.qdatetime, quincynoreset.id,
quincynoreset.name, quincynoreset.category, quincynoreset.appsversion,
quincynoreset.osversion, quincynoreset.beta_prog,
quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo,
quincynoreset.email, quincynoreset.emailid, quincynoreset.imei,
quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied
           FROM quincynoreset
UNION
         SELECT quincytrack.qdatetime, quincytrack.id,
quincytrack.name, quincytrack.category, quincytrack.appsversion,
quincytrack.osversion, quincytrack.beta_prog, quincytrack.catinfo,
quincytrack.details, quincytrack.devinfo, quincytrack.email,
quincytrack.emailid, quincytrack.imei, quincytrack.pin,
quincytrack.formfactor, quincytrack.copied
           FROM quincytrack;

And here is one of the two tables (the other is the same, except for its name):

quincy=> \d quincytrack;
                Table "public.quincytrack"
   Column    |            Type             |   Modifiers
-------------+-----------------------------+---------------
 appsversion | character varying(30)       |
 beta_prog   | character varying(20)       |
 category    | character varying(120)      |
 catinfo     | character varying(120)      |
 details     | character varying(50)       |
 devinfo     | character varying(4000)     |
 emailid     | character varying(16)       |
 email       | character varying(320)      |
 formfactor  | character varying(10)       |
 id          | character varying(20)       | not null
 imei        | character varying(25)       |
 name        | character varying(20)       |
 osversion   | character varying(30)       |
 pin         | character varying(12)       |
 qdatetime   | timestamp without time zone |
 copied      | timestamp without time zone | default now()
Indexes:
    "quincytrack_pkey" PRIMARY KEY, btree (id)

There are around 1 million records in the view:

quincy=> select count(*) from quincyview ;
 count
--------
 950476
(1 row)

My problem is that the selects are very slow, and
using my script is no fun despite all the AJAX stuff,
which only tries to retrieve "offset X limit Y" records:

quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as
QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
quincyview where qdatetime <= now() order by QDATETIME desc offset 10
limit 10;


                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=600344.67..600344.70 rows=10 width=1172)
   ->  Sort  (cost=600344.65..602859.16 rows=1005804 width=1172)
         Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text))
         ->  Subquery Scan quincyview  (cost=518261.35..573580.57 rows=1005804 width=1172)
               ->  Unique  (cost=518261.35..561008.02 rows=1005804 width=252)
                     ->  Sort  (cost=518261.35..520775.86 rows=1005804 width=252)
                           Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.category, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied
                           ->  Append  (cost=0.00..57003.60 rows=1005804 width=252)
                                 ->  Seq Scan on quincynoreset  (cost=0.00..40011.20 rows=863394 width=242)
                                       Filter: (qdatetime <= now())
                                 ->  Seq Scan on quincytrack  (cost=0.00..6934.36 rows=142410 width=312)
                                       Filter: (qdatetime <= now())

Does anybody please have an idea
how to speed up my select statements?

Regards
Alex

Re: Slow query: select * order by XXX desc offset 10 limit 10

From: David Johnston
Date:
On Oct 13, 2011, at 9:41, Alexander Farber <alexander.farber@gmail.com> wrote:


Does anybody please have an idea,
how to speed up my select statements?


Create one or more indexes.
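
For example, something along these lines for the qdatetime filter and sort (the index names here are just examples):

CREATE INDEX quincynoreset_qdatetime_idx ON quincynoreset (qdatetime);
CREATE INDEX quincytrack_qdatetime_idx ON quincytrack (qdatetime);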

David J.

Re: Slow query: select * order by XXX desc offset 10 limit 10

From: Bill Moran
Date:
In response to Alexander Farber <alexander.farber@gmail.com>:

> Hello,
>
> I use PostgreSQL 8.4.7 on CentOS 6.0 / 64 bit machine
> with Quad-Core AMD Opteron(tm) Processor 2352 and
> 16 GB RAM and use it for 1 PHP script - which selects
> and displays data in jQuery DataTables (i.e. an
> HTML-table which can be viewed page by page).
>
> I select records from 1 view which unites 2 identical tables:
>
> quincy=> \d quincyview
>                View "public.quincyview"
>    Column    |            Type             | Modifiers
> -------------+-----------------------------+-----------
>  qdatetime   | timestamp without time zone |
>  id          | character varying(20)       |
>  name        | character varying(20)       |
>  category    | character varying(120)      |
>  appsversion | character varying(30)       |
>  osversion   | character varying(30)       |
>  beta_prog   | character varying(20)       |
>  catinfo     | character varying(120)      |
>  details     | character varying(50)       |
>  devinfo     | character varying(4000)     |
>  email       | character varying(320)      |
>  emailid     | character varying(16)       |
>  imei        | character varying(25)       |
>  pin         | character varying(12)       |
>  formfactor  | character varying(10)       |
>  copied      | timestamp without time zone |
> View definition:
>          SELECT quincynoreset.qdatetime, quincynoreset.id,
> quincynoreset.name, quincynoreset.category, quincynoreset.appsversion,
> quincynoreset.osversion, quincynoreset.beta_prog,
> quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo,
> quincynoreset.email, quincynoreset.emailid, quincynoreset.imei,
> quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied
>            FROM quincynoreset
> UNION
>          SELECT quincytrack.qdatetime, quincytrack.id,
> quincytrack.name, quincytrack.category, quincytrack.appsversion,
> quincytrack.osversion, quincytrack.beta_prog, quincytrack.catinfo,
> quincytrack.details, quincytrack.devinfo, quincytrack.email,
> quincytrack.emailid, quincytrack.imei, quincytrack.pin,
> quincytrack.formfactor, quincytrack.copied
>            FROM quincytrack;
>
> And here is 1 of the 2 tables (the other is same, except its name):
>
> quincy=> \d quincytrack;
>                 Table "public.quincytrack"
>    Column    |            Type             |   Modifiers
> -------------+-----------------------------+---------------
>  appsversion | character varying(30)       |
>  beta_prog   | character varying(20)       |
>  category    | character varying(120)      |
>  catinfo     | character varying(120)      |
>  details     | character varying(50)       |
>  devinfo     | character varying(4000)     |
>  emailid     | character varying(16)       |
>  email       | character varying(320)      |
>  formfactor  | character varying(10)       |
>  id          | character varying(20)       | not null
>  imei        | character varying(25)       |
>  name        | character varying(20)       |
>  osversion   | character varying(30)       |
>  pin         | character varying(12)       |
>  qdatetime   | timestamp without time zone |
>  copied      | timestamp without time zone | default now()
> Indexes:
>     "quincytrack_pkey" PRIMARY KEY, btree (id)
>
> There are around 1 mio records in the view:
>
> quincy=> select count(*) from quincyview ;
>  count
> --------
>  950476
> (1 row)
>
> My problem is, that select's are very slow and
> using my script is no fun despite all the AJAX stuff -
> which only tries to retrieve "offset X limit Y" records:
>
> quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as
> QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
> quincyview where qdatetime <= now() order by QDATETIME desc offset 10
> limit 10;
>
>
>                                 QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=600344.67..600344.70 rows=10 width=1172)
>    ->  Sort  (cost=600344.65..602859.16 rows=1005804 width=1172)
>          Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text))
>          ->  Subquery Scan quincyview  (cost=518261.35..573580.57
> rows=1005804 width=1172)
>                ->  Unique  (cost=518261.35..561008.02 rows=1005804 width=252)
>                      ->  Sort  (cost=518261.35..520775.86 rows=1005804
> width=252)
>                            Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.category, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied
>                            ->  Append  (cost=0.00..57003.60
> rows=1005804 width=252)
>                                  ->  Seq Scan on quincynoreset
> (cost=0.00..40011.20 rows=863394 width=242)
>                                        Filter: (qdatetime <= now())
>                                  ->  Seq Scan on quincytrack
> (cost=0.00..6934.36 rows=142410 width=312)
>                                        Filter: (qdatetime <= now())
>
> Does anybody please have an idea,
> how to speed up my select statements?

#1 Add indexes on qdatetime on both tables.
#2 Don't try to order/filter by a calculated value.  Instead, modify the
   query to order and filter by the raw timestamp column, which will
   allow that to be done without converting it all to text first.  This
   will require you to change the aliasing in your query.

A possible solution to #2:
select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIMEFORMATTED,
 ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO
from quincyview
where qdatetime <= now()
order by QDATETIME desc
offset 10 limit 10;

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Slow query: select * order by XXX desc offset 10 limit 10

From: Alexander Farber
Date:
Hello Bill and others,

On Thu, Oct 13, 2011 at 4:09 PM, Bill Moran <wmoran@potentialtech.com> wrote:
> In response to Alexander Farber <alexander.farber@gmail.com>:
>> I use PostgreSQL 8.4.7 on CentOS 6.0 / 64 bit machine
>> with Quad-Core AMD Opteron(tm) Processor 2352 and
>> 16 GB RAM and use it for 1 PHP script - which selects
>> and displays data in jQuery DataTables (i.e. an
>> HTML-table which can be viewed page by page).
>>
>> My problem is, that select's are very slow and
>> using my script is no fun despite all the AJAX stuff -
>> which only tries to retrieve "offset X limit Y" records:
>>
>> quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as
>> QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
>> quincyview where qdatetime <= now() order by QDATETIME desc offset 10
>> limit 10;
>>
>>
>>                                 QUERY PLAN
>> ------------------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=600344.67..600344.70 rows=10 width=1172)
>>    ->  Sort  (cost=600344.65..602859.16 rows=1005804 width=1172)
>>          Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text))
>>          ->  Subquery Scan quincyview  (cost=518261.35..573580.57
>> rows=1005804 width=1172)
>>                ->  Unique  (cost=518261.35..561008.02 rows=1005804 width=252)
>>                      ->  Sort  (cost=518261.35..520775.86 rows=1005804
>> width=252)
>>                            Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.category, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied
>>                            ->  Append  (cost=0.00..57003.60
>> rows=1005804 width=252)
>>                                  ->  Seq Scan on quincynoreset
>> (cost=0.00..40011.20 rows=863394 width=242)
>>                                        Filter: (qdatetime <= now())
>>                                  ->  Seq Scan on quincytrack
>> (cost=0.00..6934.36 rows=142410 width=312)
>>                                        Filter: (qdatetime <= now())
>>
>
> #1 Add indexes on qdatetime on both tables
> #2 don't try to order/filter by a calculated value.  Instead modify the
>   query to order and filter by the raw timestamptz column, which will
>   allow that to be done without converting it all to text first.  This
>   will require you to change your aliasing in your query.
>
> A possible solution to #2:
> select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIMEFORMATTED,
>  ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO
> from quincyview
> where qdatetime <= now()
> order by QDATETIME desc
> offset 10 limit 10;

I've added 3 new indices on both tables:


quincy=> \d quincynoreset
               Table "public.quincynoreset"
   Column    |            Type             |   Modifiers
-------------+-----------------------------+---------------
 appsversion | character varying(30)       |
 beta_prog   | character varying(20)       |
 category    | character varying(120)      |
 catinfo     | character varying(120)      |
 details     | character varying(50)       |
 devinfo     | character varying(4000)     |
 emailid     | character varying(16)       |
 email       | character varying(320)      |
 formfactor  | character varying(10)       |
 id          | character varying(20)       | not null
 imei        | character varying(25)       |
 name        | character varying(20)       |
 osversion   | character varying(30)       |
 pin         | character varying(12)       |
 qdatetime   | timestamp without time zone |
 copied      | timestamp without time zone | default now()
Indexes:
    "quincynoreset_pkey" PRIMARY KEY, btree (id)
    "quincynoreset_appsversion_index" btree (appsversion)
    "quincynoreset_osversion_index" btree (osversion)
    "quincynoreset_qdatetime_index" btree (qdatetime)

And in my query I've renamed the string column to
QDATETIME_2 (if I've understood your suggestion #2 correctly),
but there is still no visible improvement:


quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as
QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
quincyview where qdatetime <= now() order by QDATETIME desc offset 10
limit 10;


                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=558551.88..558551.91 rows=10 width=1172)
   ->  Sort  (cost=558551.86..560883.79 rows=932773 width=1172)
         Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text))
         ->  Subquery Scan quincyview  (cost=482428.59..533731.10 rows=932773 width=1172)
               ->  Unique  (cost=482428.59..522071.44 rows=932773 width=252)
                     ->  Sort  (cost=482428.59..484760.52 rows=932773 width=252)
                           Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.category, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied
                           ->  Append  (cost=0.00..55177.71 rows=932773 width=252)
                                 ->  Seq Scan on quincynoreset  (cost=0.00..39171.89 rows=807446 width=242)
                                       Filter: (qdatetime <= now())
                                 ->  Seq Scan on quincytrack  (cost=0.00..6678.09 rows=125327 width=315)
                                       Filter: (qdatetime <= now())
(12 rows)

(The same query again below, but with QDATETIME_2 as the column alias):

quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as
QDATETIME_2,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO
from quincyview where qdatetime <= now() order by QDATETIME desc
offset 10 limit 10;


                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=558551.88..558551.91 rows=10 width=1172)
   ->  Sort  (cost=558551.86..560883.79 rows=932773 width=1172)
         Sort Key: quincyview.qdatetime
         ->  Subquery Scan quincyview  (cost=482428.59..533731.10 rows=932773 width=1172)
               ->  Unique  (cost=482428.59..522071.44 rows=932773 width=252)
                     ->  Sort  (cost=482428.59..484760.52 rows=932773 width=252)
                           Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.category, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied
                           ->  Append  (cost=0.00..55177.71 rows=932773 width=252)
                                 ->  Seq Scan on quincynoreset  (cost=0.00..39171.89 rows=807446 width=242)
                                       Filter: (qdatetime <= now())
                                 ->  Seq Scan on quincytrack  (cost=0.00..6678.09 rows=125327 width=315)
                                       Filter: (qdatetime <= now())
(12 rows)

Regards
Alex

Re: Slow query: select * order by XXX desc offset 10 limit 10

From: Alexander Farber
Date:
I've also tried opening a cursor:

quincy=> open ref for select to_char(qdatetime, 'YYYY-MM-DD') as
QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
quincyview where qdatetime <= now() order by QDATETIME desc ;
ERROR:  syntax error at or near "open"
LINE 1: open ref for select to_char(qdatetime, 'YYYY-MM-DD') as QDAT...
        ^

Re: Slow query: select * order by XXX desc offset 10 limit 10

From: Pavel Stehule
Date:
Hello

You should use a DECLARE statement

http://www.postgresql.org/docs/9.1/interactive/sql-declare.html

and a FETCH statement

http://www.postgresql.org/docs/9.1/interactive/sql-fetch.html
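
A minimal sketch (the cursor name is just an example; the query mirrors yours):

begin;

declare qcur cursor for
    select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIME,
           ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO
    from quincyview
    where qdatetime <= now()
    order by QDATETIME desc;

fetch 10 from qcur;   -- first 10 rows
fetch 10 from qcur;   -- next 10 rows

close qcur;
commit;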

Regards

Pavel Stehule


2011/10/14 Alexander Farber <alexander.farber@gmail.com>:
> I've also tried opening cursor:
>
> quincy=> open ref for select to_char(qdatetime, 'YYYY-MM-DD') as
> QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
> quincyview where qdatetime <= now() order by QDATETIME desc ;
> ERROR:  syntax error at or near "open"
> LINE 1: open ref for select to_char(qdatetime, 'YYYY-MM-DD') as QDAT...
>        ^

Re: Slow query: select * order by XXX desc offset 10 limit 10

From: Alexander Farber
Date:
Thank you -

On Fri, Oct 14, 2011 at 11:30 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> you should to use a DECLARE statement
> http://www.postgresql.org/docs/9.1/interactive/sql-declare.html
> and fetch statement
> http://www.postgresql.org/docs/9.1/interactive/sql-fetch.html

I've managed to create a cursor
and can fetch the data row by row:

quincy=> start TRANSACTION;
quincy=> declare XXX cursor for select to_char(qdatetime,
'YYYY-MM-DD') as
QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
quincyview where qdatetime <= now() order by QDATETIME desc ;
quincy=> fetch XXX;
.....
quincy=> fetch XXX;
.....

But how do I "go back"?

For my jQuery HTML table (DataTables.net)
I need to be able to go back and forth.

Regards
Alex

Re: Slow query: select * order by XXX desc offset 10 limit 10

From: Pavel Stehule
Date:
2011/10/14 Alexander Farber <alexander.farber@gmail.com>:
> Thank you -
>
> On Fri, Oct 14, 2011 at 11:30 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> you should to use a DECLARE statement
>> http://www.postgresql.org/docs/9.1/interactive/sql-declare.html
>> and fetch statement
>> http://www.postgresql.org/docs/9.1/interactive/sql-fetch.html
>
> I've managed to create a cursor
> and can fetch the data row by row:
>
> quincy=> start TRANSACTION;
> quincy=> declare XXX cursor for select to_char(qdatetime,
> 'YYYY-MM-DD') as
> QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
> quincyview where qdatetime <= now() order by QDATETIME desc ;
> quincy=> fetch XXX;
> .....
> quincy=> fetch XXX;
> .....
>
> But how do I "go back"?
>
> For my jQuery HTML table (DataTables.net)
> I need to be able to go back and forth.
>
> Regards
> Alex

You can use a scrollable cursor.


BEGIN WORK;

-- Set up a cursor:
DECLARE liahona SCROLL CURSOR FOR SELECT * FROM films;

-- Fetch the first 5 rows in the cursor liahona:
FETCH FORWARD 5 FROM liahona;

 code  |          title          | did | date_prod  |   kind   |  len
-------+-------------------------+-----+------------+----------+-------
 BL101 | The Third Man           | 101 | 1949-12-23 | Drama    | 01:44
 BL102 | The African Queen       | 101 | 1951-08-11 | Romantic | 01:43
 JL201 | Une Femme est une Femme | 102 | 1961-03-12 | Romantic | 01:25
 P_301 | Vertigo                 | 103 | 1958-11-14 | Action   | 02:08
 P_302 | Becket                  | 103 | 1964-02-03 | Drama    | 02:28

-- Fetch the previous row:
FETCH PRIOR FROM liahona;

 code  |  title  | did | date_prod  |  kind  |  len
-------+---------+-----+------------+--------+-------
 P_301 | Vertigo | 103 | 1958-11-14 | Action | 02:08

-- Close the cursor and end the transaction:
CLOSE liahona;
COMMIT WORK;

This example is from the docs:
http://www.postgresql.org/docs/9.1/interactive/sql-fetch.html
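
With a SCROLL cursor you can also jump straight to a given page using MOVE and FETCH, e.g. (just a sketch, using the cursor from the example above before it is closed):

-- Rows 21-30, i.e. page 3 with 10 rows per page:
MOVE ABSOLUTE 20 IN liahona;
FETCH FORWARD 10 FROM liahona;

-- Back to the previous page: step back over the page just shown plus the one before it:
MOVE BACKWARD 20 IN liahona;
FETCH FORWARD 10 FROM liahona;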

Regards

Pavel


Re: Slow query: select * order by XXX desc offset 10 limit 10

From: Alban Hertroys
Date:
On 14 Oct 2011, at 11:14, Alexander Farber wrote:

> I've added 3 new indices on both tables:
>
>
> quincy=> \d quincynoreset
>               Table "public.quincynoreset"
>   Column    |            Type             |   Modifiers
> -------------+-----------------------------+---------------
> appsversion | character varying(30)       |
> beta_prog   | character varying(20)       |
> category    | character varying(120)      |
> catinfo     | character varying(120)      |
> details     | character varying(50)       |
> devinfo     | character varying(4000)     |
> emailid     | character varying(16)       |
> email       | character varying(320)      |
> formfactor  | character varying(10)       |
> id          | character varying(20)       | not null
> imei        | character varying(25)       |
> name        | character varying(20)       |
> osversion   | character varying(30)       |
> pin         | character varying(12)       |
> qdatetime   | timestamp without time zone |
> copied      | timestamp without time zone | default now()
> Indexes:
>    "quincynoreset_pkey" PRIMARY KEY, btree (id)
>    "quincynoreset_appsversion_index" btree (appsversion)
>    "quincynoreset_osversion_index" btree (osversion)
>    "quincynoreset_qdatetime_index" btree (qdatetime)

(...)

> quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as
> QDATETIME_2,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO
> from quincyview where qdatetime <= now() order by QDATETIME desc
> offset 10 limit 10;
>
>
>                                 QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=558551.88..558551.91 rows=10 width=1172)
>   ->  Sort  (cost=558551.86..560883.79 rows=932773 width=1172)
>         Sort Key: quincyview.qdatetime
>         ->  Subquery Scan quincyview  (cost=482428.59..533731.10
> rows=932773 width=1172)
>               ->  Unique  (cost=482428.59..522071.44 rows=932773 width=252)
>                     ->  Sort  (cost=482428.59..484760.52 rows=932773 width=252)
>                            Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.category, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied
>                           ->  Append  (cost=0.00..55177.71
> rows=932773 width=252)
>                                 ->  Seq Scan on quincynoreset
> (cost=0.00..39171.89 rows=807446 width=242)
>                                       Filter: (qdatetime <= now())
>                                 ->  Seq Scan on quincytrack
> (cost=0.00..6678.09 rows=125327 width=315)
>                                       Filter: (qdatetime <= now())
> (12 rows)


An explain analyse would have been a bit more informative.

Anyway, I think you get the sequential scans because the UNION requires sorting all the data from both tables to
guarantee that the results are unique (hence that long Sort Key at the 7th line of explain output).
For that reason, an index on qdatetime alone won't help much, especially when most of your data has qdatetime <= now(),
which is probably the case.

It doesn't matter that you only want 10 results from that set; the database will first have to figure out which rows
those are. That gets more complicated because they can come from two different tables, due to the UNION.


Do you really need unique results from that view, or are duplicates acceptable (one from each table)? In that case, try
UNION ALL instead of UNION.

If you do need unique results, then you could create an index on the combination of all those fields. That should take
out the need for those sequential scans.

Alban Hertroys

--
The scale of a problem often equals the size of an ego.



Re: Slow query: select * order by XXX desc offset 10 limit 10

From: Alexander Farber
Date:
Hi Alban and others -

On Fri, Oct 14, 2011 at 1:34 PM, Alban Hertroys <haramrae@gmail.com> wrote:
> Anyway, I think you get the sequential scans because the UNION requires sorting all the data from both tables to
> guarantee that the results are unique (hence that long Sort Key at the 7th line of explain output).
> For that reason, an index on qdatetime alone won't help much, especially when most of your data has qdatetime <= now(),
> which is probably the case.
>
> It doesn't matter that you only want 10 results from that set; the database will first have to figure out which rows
> those are. That gets more complicated because they can come from two different tables, due to the UNION.
>
> Do you really need unique results from that view, or are duplicates acceptable (one from each table)? In that case, try
> UNION ALL instead of UNION.

I don't need unique at all!

So I've run "explain analyse" on the old view:

quincy=> explain analyse select to_char(qdatetime, 'YYYY-MM-DD') as
QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
quincyview where qdatetime <= now() order by QDATETIME desc offset 10
limit 10;


                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=559200.14..559200.16 rows=10 width=1172) (actual time=11311.537..11311.541 rows=10 loops=1)
   ->  Sort  (cost=559200.11..561534.85 rows=933894 width=1172) (actual time=11311.532..11311.536 rows=20 loops=1)
         Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text))
         Sort Method:  top-N heapsort  Memory: 27kB
         ->  Subquery Scan quincyview  (cost=482985.36..534349.53 rows=933894 width=1172) (actual time=5778.592..9004.663 rows=934084 loops=1)
               ->  Unique  (cost=482985.36..522675.85 rows=933894 width=254) (actual time=5777.972..7320.816 rows=934084 loops=1)
                     ->  Sort  (cost=482985.36..485320.09 rows=933894 width=254) (actual time=5777.969..6557.012 rows=934084 loops=1)
                           Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.category, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied
                           Sort Method:  external merge  Disk: 180992kB
                           ->  Append  (cost=0.00..55205.73 rows=933894 width=254) (actual time=11.592..2242.501 rows=934084 loops=1)
                                 ->  Seq Scan on quincynoreset  (cost=0.00..39188.71 rows=808567 width=244) (actual time=11.591..1739.695 rows=808647 loops=1)
                                       Filter: (qdatetime <= now())
                                 ->  Seq Scan on quincytrack  (cost=0.00..6678.09 rows=125327 width=315) (actual time=6.801..298.642 rows=125437 loops=1)
                                       Filter: (qdatetime <= now())
 Total runtime: 11363.393 ms
(15 rows)

Then I've dropped and recreated the view with "union all":

quincy=> drop view quincyview ;
DROP VIEW
quincy=> create view quincyview as
quincy->                 select
quincy->                         qdatetime,
quincy->                         id,
quincy->                         name,
quincy->                         category,
quincy->                         appsversion,
quincy->                         osversion,
quincy->                         beta_prog,
quincy->                         catinfo,
quincy->                         details,
quincy->                         devinfo,
quincy->                         email,
quincy->                         emailid,
quincy->                         imei,
quincy->                         pin,
quincy->                         formfactor,
quincy->                         copied
quincy->                 from quincynoreset
quincy->         union all
quincy->                 select
quincy->                         qdatetime,
quincy->                         id,
quincy->                         name,
quincy->                         category,
quincy->                         appsversion,
quincy->                         osversion,
quincy->                         beta_prog,
quincy->                         catinfo,
quincy->                         details,
quincy->                         devinfo,
quincy->                         email,
quincy->                         emailid,
quincy->                         imei,
quincy->                         pin,
quincy->                         formfactor,
quincy->                         copied
quincy->                 from quincytrack
quincy-> ;
CREATE VIEW

Then "explain analyse" on the same select query again:

quincy=> explain analyse select to_char(qdatetime, 'YYYY-MM-DD') as
QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
quincyview where qdatetime <= now() order by QDATETIME desc offset 10
limit 10;
                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=73052.13..73052.16 rows=10 width=111) (actual time=3782.645..3782.649 rows=10 loops=1)
   ->  Sort  (cost=73052.11..75386.84 rows=933894 width=111) (actual time=3782.640..3782.643 rows=20 loops=1)
         Sort Key: (to_char(quincynoreset.qdatetime, 'YYYY-MM-DD'::text))
         Sort Method:  top-N heapsort  Memory: 27kB
         ->  Result  (cost=0.00..48201.53 rows=933894 width=111) (actual time=0.039..2660.561 rows=934084 loops=1)
               ->  Append  (cost=0.00..45866.79 rows=933894 width=111) (actual time=0.021..1239.916 rows=934084 loops=1)
                     ->  Seq Scan on quincynoreset  (cost=0.00..39188.71 rows=808567 width=95) (actual time=0.020..916.249 rows=808647 loops=1)
                           Filter: (qdatetime <= now())
                     ->  Seq Scan on quincytrack  (cost=0.00..6678.09 rows=125327 width=215) (actual time=0.030..125.649 rows=125437 loops=1)
                           Filter: (qdatetime <= now())
 Total runtime: 3782.759 ms
(11 rows)

Now the script is noticeably more enjoyable, thank you!

Do I still need to add indices over the whole union
and what's the syntax please?

I'm also thinking about adding some "pipelining"
(i.e. prefetching 5-10 pages for the HTML-table):
http://datatables.net/release-datatables/examples/server_side/pipeline.html

Regards
Alex




>
> If you do need unique results, then you could create an index on the combination of all those fields. That should
> take out the need for those sequential scans.

Re: Slow query: select * order by XXX desc offset 10 limit 10

From: Alban Hertroys
Date:
On 14 Oct 2011, at 13:58, Alexander Farber wrote:

> Hi Alban and others -
>
> On Fri, Oct 14, 2011 at 1:34 PM, Alban Hertroys <haramrae@gmail.com> wrote:
>> Anyway, I think you get the sequential scans because the UNION requires sorting all the data from both tables to
>> guarantee that the results are unique (hence that long Sort Key at the 7th line of explain output).
>> For that reason, an index on qdatetime alone won't help much, especially when most of your data has qdatetime <= now(),
>> which is probably the case.
>>
>> It doesn't matter that you only want 10 results from that set; the database will first have to figure out which rows
>> those are. That gets more complicated because they can come from two different tables, due to the UNION.
>>
>> Do you really need unique results from that view, or are duplicates acceptable (one from each table)? In that case, try
>> UNION ALL instead of UNION.
>
> I don't need unique at all!
>
> So I've run "explain analyse" on the old view:
>
> quincy=> explain analyse select to_char(qdatetime, 'YYYY-MM-DD') as
> QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
> quincyview where qdatetime <= now() order by QDATETIME desc offset 10
> limit 10;
>
>
>                                 QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=559200.14..559200.16 rows=10 width=1172) (actual time=11311.537..11311.541 rows=10 loops=1)
>    ->  Sort  (cost=559200.11..561534.85 rows=933894 width=1172) (actual time=11311.532..11311.536 rows=20 loops=1)
>          Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text))
>          Sort Method:  top-N heapsort  Memory: 27kB
>          ->  Subquery Scan quincyview  (cost=482985.36..534349.53 rows=933894 width=1172) (actual time=5778.592..9004.663 rows=934084 loops=1)
>                ->  Unique  (cost=482985.36..522675.85 rows=933894 width=254) (actual time=5777.972..7320.816 rows=934084 loops=1)
>                      ->  Sort  (cost=482985.36..485320.09 rows=933894 width=254) (actual time=5777.969..6557.012 rows=934084 loops=1)
>                            Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.category, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied
>                            Sort Method:  external merge  Disk: 180992kB
>                            ->  Append  (cost=0.00..55205.73 rows=933894 width=254) (actual time=11.592..2242.501 rows=934084 loops=1)
>                                  ->  Seq Scan on quincynoreset  (cost=0.00..39188.71 rows=808567 width=244) (actual time=11.591..1739.695 rows=808647 loops=1)
>                                        Filter: (qdatetime <= now())
>                                  ->  Seq Scan on quincytrack  (cost=0.00..6678.09 rows=125327 width=315) (actual time=6.801..298.642 rows=125437 loops=1)
>                                        Filter: (qdatetime <= now())
>  Total runtime: 11363.393 ms
> (15 rows)

You can paste those into http://explain.depesz.com/ to get an analysis of where the most time gets spent in your query.
That's often a bit quicker at pointing you to the sore points.

It shows that much of the time in the query is spent on those sorts, and we just eliminated the worst offender ;)

> Then I've dropped and recreated the view with "union all":

> Then "explain analyse" on the same select query again:
>
> quincy=> explain analyse select to_char(qdatetime, 'YYYY-MM-DD') as
> QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
> quincyview where qdatetime <= now() order by QDATETIME desc offset 10
> limit 10;
>                                 QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=73052.13..73052.16 rows=10 width=111) (actual time=3782.645..3782.649 rows=10 loops=1)
>    ->  Sort  (cost=73052.11..75386.84 rows=933894 width=111) (actual time=3782.640..3782.643 rows=20 loops=1)
>          Sort Key: (to_char(quincynoreset.qdatetime, 'YYYY-MM-DD'::text))
>          Sort Method:  top-N heapsort  Memory: 27kB
>          ->  Result  (cost=0.00..48201.53 rows=933894 width=111) (actual time=0.039..2660.561 rows=934084 loops=1)
>                ->  Append  (cost=0.00..45866.79 rows=933894 width=111) (actual time=0.021..1239.916 rows=934084 loops=1)
>                      ->  Seq Scan on quincynoreset  (cost=0.00..39188.71 rows=808567 width=95) (actual time=0.020..916.249 rows=808647 loops=1)
>                            Filter: (qdatetime <= now())
>                      ->  Seq Scan on quincytrack  (cost=0.00..6678.09 rows=125327 width=215) (actual time=0.030..125.649 rows=125437 loops=1)
>                            Filter: (qdatetime <= now())
>  Total runtime: 3782.759 ms
> (11 rows)
>
> Now the script is noticeably more enjoyable, thank you!

This time it's spending a large portion of its time sorting on that to_char() result. As Bill mentioned, aliasing the
to_char() output as QDATETIME in the SELECT list causes the QDATETIME in the ORDER BY to refer to that computed text
value. It doesn't need to do that; the actual timestamp is just as good for ordering, and on that column you have an index!
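
For example, something like this (just a sketch; the point is that the output column alias must not shadow the raw column name, so the ORDER BY resolves to the indexed timestamp):

select to_char(qdatetime, 'YYYY-MM-DD') as qdate,
       id, name, category, appsversion, osversion, details, devinfo
from quincyview
where qdatetime <= now()
order by qdatetime desc
offset 10 limit 10;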

> Do I still need to add indices over the whole union
> and what's the syntax please?

Nope, you only needed that because the query was sorting on all those columns.

For the record, an index like that is called a multi-column index, and the definition would be:
CREATE INDEX quincynoreset_full_idx ON quincynoreset (qdatetime, id, name, category, appsversion, osversion, beta_prog,
catinfo, details, devinfo, email, emailid, imei, pin, formfactor, copied);

(Don't pay too much attention to the order of columns there; I just took the columns and their respective order from
the earlier query plan.)

> I'm also thinking about adding some "pipelining"
> (i.e. prefetching 5-10 pages for the HTML-table):
> http://datatables.net/release-datatables/examples/server_side/pipeline.html


Firing more selective queries at the database would help as well.

It looks like you're implementing some kind of paging through the result set, but do you really think people will want
to page through 100,000 pages of results?

I think they'd be much happier if you would rank the results somehow, so that they get what they're looking for
relatively quickly.
Or you could divide up the information - if they have some idea of when the event they're looking for occurred, that
helps narrow down the data set a lot and actually increases their chances of finding it.
Those are just a few examples; much depends on the data and the users you're working with.
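
(And if you do keep paging, a keyset-style condition is usually cheaper than a large OFFSET for deep pages. Just a sketch - here $1 stands for the qdatetime of the last row on the previous page, which your PHP script would have to remember:

select to_char(qdatetime, 'YYYY-MM-DD') as qdate,
       id, name, category, appsversion, osversion, details, devinfo
from quincyview
where qdatetime < $1          -- qdatetime of the last row already shown
order by qdatetime desc
limit 10;

Ties on qdatetime can make rows slip between pages; adding a unique column such as id to both the ORDER BY and the condition avoids that.)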

Alban Hertroys

--
The scale of a problem often equals the size of an ego.