Обсуждение: Optimizer showing wrong rows in plan

Поиск
Список
Период
Сортировка

Optimizer showing wrong rows in plan

От
Tadipathri Raghu
Дата:
Hi All,
 
Example on optimizer
===============
postgres=# create table test(id int);
CREATE TABLE
postgres=# insert into test VALUES (1);
INSERT 0 1
postgres=# select * from test;
 id
----
  1
(1 row)
postgres=# explain select * from test;
                       QUERY PLAN
--------------------------------------------------------
 Seq Scan on test  (cost=0.00..34.00 rows=2400 width=4)
(1 row)
In the above, example the optimizer is retreiving those many rows where there is only one row in that table. If i analyze am geting one row.
 
postgres=# ANALYZE test;
ANALYZE
postgres=# explain select * from test;
                     QUERY PLAN
----------------------------------------------------
 Seq Scan on test  (cost=0.00..1.01 rows=1 width=4)
(1 row)
 
My question here is, what it retreiving as rows when there is no such. One more thing, if i wont do analyze and run the explain plan for three or more times, then catalogs getting updated automatically and resulting the correct row as 1.
 
Q2. Does explain , will update the catalogs automatically.
 
Regards
Raghavendra
 

Re: Optimizer showing wrong rows in plan

От
Szymon Guz
Дата:
2010/3/28 Tadipathri Raghu <traghu.dba@gmail.com>
Hi All,
 
Example on optimizer
===============
postgres=# create table test(id int);
CREATE TABLE
postgres=# insert into test VALUES (1);
INSERT 0 1
postgres=# select * from test;
 id
----
  1
(1 row)
postgres=# explain select * from test;
                       QUERY PLAN
--------------------------------------------------------
 Seq Scan on test  (cost=0.00..34.00 rows=2400 width=4)
(1 row)
In the above, example the optimizer is retreiving those many rows where there is only one row in that table. If i analyze am geting one row.

No, the optimizer is not retrieving anything, it just assumes that there are 2400 rows because that is the number of rows that exists in the statictics for this table. The optimizer just tries to find the best plan and to optimize the query plan for execution taking into consideration all information that can be found for this table (it also looks in the statistics information about rows from this table).
 
 
postgres=# ANALYZE test;
ANALYZE
postgres=# explain select * from test;
                     QUERY PLAN
----------------------------------------------------
 Seq Scan on test  (cost=0.00..1.01 rows=1 width=4)
(1 row)
 
My question here is, what it retreiving as rows when there is no such. One more thing, if i wont do analyze and run the explain plan for three or more times, then catalogs getting updated automatically and resulting the correct row as 1.
 

Now ANALYZE changed the statistics for this table and now the planner knows that there is just one row. In the background there can work autovacuum so it changes rows automatically (the autovacuum work characteristic depends on the settings for the database).
 
Q2. Does explain , will update the catalogs automatically.
 

No, explain doesn't update table's statistics.


regards
Szymon Guz

Re: Optimizer showing wrong rows in plan

От
Tadipathri Raghu
Дата:
Hi Guz,
 
Thank you for the prompt reply.
 
No, the optimizer is not retrieving anything, it just assumes that there are 2400 rows because that is the number of rows that exists in the statictics for this table. The optimizer just tries to find the best plan and to optimize the query plan for execution taking into consideration all information that can be found for this table (it also looks in the statistics information about rows from this table).
 
So, whats it assuming here as rows(2400).  Could you explain this.
 
Regards
Raghavendra
On Sun, Mar 28, 2010 at 12:32 PM, Szymon Guz <mabewlun@gmail.com> wrote:
2010/3/28 Tadipathri Raghu <traghu.dba@gmail.com>

Hi All,
 
Example on optimizer
===============
postgres=# create table test(id int);
CREATE TABLE
postgres=# insert into test VALUES (1);
INSERT 0 1
postgres=# select * from test;
 id
----
  1
(1 row)
postgres=# explain select * from test;
                       QUERY PLAN
--------------------------------------------------------
 Seq Scan on test  (cost=0.00..34.00 rows=2400 width=4)
(1 row)
In the above, example the optimizer is retreiving those many rows where there is only one row in that table. If i analyze am geting one row.

No, the optimizer is not retrieving anything, it just assumes that there are 2400 rows because that is the number of rows that exists in the statictics for this table. The optimizer just tries to find the best plan and to optimize the query plan for execution taking into consideration all information that can be found for this table (it also looks in the statistics information about rows from this table).
 
 
postgres=# ANALYZE test;
ANALYZE
postgres=# explain select * from test;
                     QUERY PLAN
----------------------------------------------------
 Seq Scan on test  (cost=0.00..1.01 rows=1 width=4)
(1 row)
 
My question here is, what it retreiving as rows when there is no such. One more thing, if i wont do analyze and run the explain plan for three or more times, then catalogs getting updated automatically and resulting the correct row as 1.
 

Now ANALYZE changed the statistics for this table and now the planner knows that there is just one row. In the background there can work autovacuum so it changes rows automatically (the autovacuum work characteristic depends on the settings for the database).
 
Q2. Does explain , will update the catalogs automatically.
 

No, explain doesn't update table's statistics.


regards
Szymon Guz

Re: Optimizer showing wrong rows in plan

От
Szymon Guz
Дата:


2010/3/28 Tadipathri Raghu <traghu.dba@gmail.com>
Hi Guz,
 
Thank you for the prompt reply.
 
No, the optimizer is not retrieving anything, it just assumes that there are 2400 rows because that is the number of rows that exists in the statictics for this table. The optimizer just tries to find the best plan and to optimize the query plan for execution taking into consideration all information that can be found for this table (it also looks in the statistics information about rows from this table).
 
So, whats it assuming here as rows(2400).  Could you explain this.
 


It is assuming that there are 2400 rows in this table. Probably you've deleted some rows from the table leaving just one.

regards
Szymon Guz 

Re: Optimizer showing wrong rows in plan

От
Tadipathri Raghu
Дата:
Hi Guz,
 
It is assuming that there are 2400 rows in this table. Probably you've deleted some rows from the table leaving just one.
 
Frankly speaking its a newly created table without any operation on it as you have seen the example. Then how come it showing those many rows where we have only one in it.
Thanks if we have proper explination on this..
 
Regards
Raghavendra

 
On Sun, Mar 28, 2010 at 12:59 PM, Szymon Guz <mabewlun@gmail.com> wrote:


2010/3/28 Tadipathri Raghu <traghu.dba@gmail.com>
Hi Guz,
 
Thank you for the prompt reply.
 
No, the optimizer is not retrieving anything, it just assumes that there are 2400 rows because that is the number of rows that exists in the statictics for this table. The optimizer just tries to find the best plan and to optimize the query plan for execution taking into consideration all information that can be found for this table (it also looks in the statistics information about rows from this table).
 
So, whats it assuming here as rows(2400).  Could you explain this.
 


It is assuming that there are 2400 rows in this table. Probably you've deleted some rows from the table leaving just one.

regards
Szymon Guz 


Re: Optimizer showing wrong rows in plan

От
Tadipathri Raghu
Дата:
Hi All,
 
I want to give some more light on this by analysing more like this
 
1. In my example I have created a table with one column as INT( which occupies 4 bytes)
2. Initially it occupies one page of  space on the file that is (8kb).
 
So, here is it assuming these many rows may fit in this page. Clarify me on this Please.
 
Regards
Raghavendra

 
On Sun, Mar 28, 2010 at 2:06 PM, Gary Doades <gpd@gpdnet.co.uk> wrote:
On 28/03/2010 8:33 AM, Tadipathri Raghu wrote:
Hi Guz,
 
It is assuming that there are 2400 rows in this table. Probably you've deleted some rows from the table leaving just one.
 
Frankly speaking its a newly created table without any operation on it as you have seen the example. Then how come it showing those many rows where we have only one in it.
Thanks if we have proper explination on this..
It's not *showing* any rows at all, it's *guessing* 2400 rows because you've never analyzed the table. Without any statistics at all, postgres will use some form of in-built guess for a table that produces reasonable plans under average conditions. As you've already seen, once you analyze the table, the guess get's much  better and therefore would give you a more appropriate plan.

Regards,
Gary.


Re: Optimizer showing wrong rows in plan

От
Frank Heikens
Дата:

Op 28 mrt 2010, om 11:07 heeft Tadipathri Raghu het volgende geschreven:

Hi All,
 
I want to give some more light on this by analysing more like this
 
1. In my example I have created a table with one column as INT( which occupies 4 bytes)
2. Initially it occupies one page of  space on the file that is (8kb).
 
So, here is it assuming these many rows may fit in this page. Clarify me on this Please.


The minimum size of a file depends on the block size, by default 8kb: http://www.postgresql.org/docs/8.4/interactive/install-procedure.html

Regards,
Frank


 
Regards
Raghavendra

 
On Sun, Mar 28, 2010 at 2:06 PM, Gary Doades <gpd@gpdnet.co.uk> wrote:
On 28/03/2010 8:33 AM, Tadipathri Raghu wrote:
Hi Guz,
 
It is assuming that there are 2400 rows in this table. Probably you've deleted some rows from the table leaving just one.
 
Frankly speaking its a newly created table without any operation on it as you have seen the example. Then how come it showing those many rows where we have only one in it.
Thanks if we have proper explination on this..
It's not *showing* any rows at all, it's *guessing* 2400 rows because you've never analyzed the table. Without any statistics at all, postgres will use some form of in-built guess for a table that produces reasonable plans under average conditions. As you've already seen, once you analyze the table, the guess get's much  better and therefore would give you a more appropriate plan.

Regards,
Gary.





Re: Optimizer showing wrong rows in plan

От
Gary Doades
Дата:
On 28/03/2010 10:07 AM, Tadipathri Raghu wrote:
> Hi All,
> I want to give some more light on this by analysing more like this
> 1. In my example I have created a table with one column as INT( which
> occupies 4 bytes)
> 2. Initially it occupies one page of  space on the file that is (8kb).
> So, here is it assuming these many rows may fit in this page. Clarify
> me on this Please.

Like I said, it's just a guess. With no statistics all postgres can do
is guess, or in this case use the in-built default for a newly created
table. It could guess 1 or it could guess 10,000,000. What it does is
produce a reasonable guess in the absence of any other information.

You should read the postgres documentation for further information about
statistics and how the optimizer uses them.


Regards,
Gary.



Re: Optimizer showing wrong rows in plan

От
Tom Lane
Дата:
Tadipathri Raghu <traghu.dba@gmail.com> writes:
> Frankly speaking its a newly created table without any operation on it as
> you have seen the example. Then how come it showing those many rows where we
> have only one in it.

Yes.  This is intentional: the size estimates for a never-yet-analyzed
table are *not* zero.  This is because people frequently create and load
up a table and then immediately query it without an explicit ANALYZE.
The quality of the plans you'd get at that point (especially for joins)
would be spectacularly bad if the default assumption were that the table
was very small.

            regards, tom lane

Re: Optimizer showing wrong rows in plan

От
Jeremy Harris
Дата:
On 03/28/2010 05:27 PM, Tom Lane wrote:
>   This is intentional: the size estimates for a never-yet-analyzed
> table are *not* zero.  This is because people frequently create and load
> up a table and then immediately query it without an explicit ANALYZE.

Does the creation of an index also populate statistics?

Thanks,
     Jeremy

Re: Optimizer showing wrong rows in plan

От
Tom Lane
Дата:
Jeremy Harris <jgh@wizmail.org> writes:
> On 03/28/2010 05:27 PM, Tom Lane wrote:
>> This is intentional: the size estimates for a never-yet-analyzed
>> table are *not* zero.  This is because people frequently create and load
>> up a table and then immediately query it without an explicit ANALYZE.

> Does the creation of an index also populate statistics?

IIRC, it will set the relpages/reltuples counts (though not any
more-complex statistics); but only if the table is found to not be
completely empty.  Again, this is a behavior designed with common
usage patterns in mind, to not set relpages/reltuples to zero on a
table that's likely to get populated shortly.

            regards, tom lane

Re: Optimizer showing wrong rows in plan

От
Tadipathri Raghu
Дата:
Hi Tom,
 
Thank for the update.
 
IIRC, it will set the relpages/reltuples counts (though not any
more-complex statistics); but only if the table is found to not be
completely empty.  Again, this is a behavior designed with common
usage patterns in mind, to not set relpages/reltuples to zero on a
table that's likely to get populated shortly.
As Harris, asked about creation of index will update the statistics. Yes indexes are updating the statistics, so indexes will analyze the table on the backend and update the statistics too, before it creating the index or after creating the index.
 
Example
======
postgres=# create table test(id int);
CREATE TABLE
postgres=# insert into test  VALUES (1);
INSERT 0 1
postgres=# select relname,reltuples,relpages from pg_class where relname='test';
 relname | reltuples | relpages
---------+-----------+----------
 test    |         0 |        0
(1 row)
postgres=# create INDEX itest on test (id);
CREATE INDEX
postgres=# select relname,reltuples,relpages from pg_class where relname='test';
 relname | reltuples | relpages
---------+-----------+----------
 test    |         1 |        1
(1 row)
 
Adding one more thing to this thread
==========================
As per the documentation, one page is 8kb, when i create a table with int as one column its 4 bytes. If i insert 2000 rows, it should be in one page only as its 8kb, but its extending vastly as expected. Example shown below, taking the previous example table test with one column.
 
postgres=# insert into test VALUES (generate_series(2,2000));
INSERT 0 1999
postgres=# \dt+
                    List of relations
  Schema  | Name | Type  |  Owner   | Size  | Description
----------+------+-------+----------+-------+-------------
 edbstore | test | table | postgres | 64 kB |
(1 row)
postgres=# select count(*) from test ;
 count
-------
  2000
(1 row)
 
Why the its extending so many pages, where it can fit in one page. Is there any particular reason in behaving this type of paging.
 
Thanks for all in advance
 
Regards
Raghavendra
 
 
 
On Sun, Mar 28, 2010 at 11:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jeremy Harris <jgh@wizmail.org> writes:
> On 03/28/2010 05:27 PM, Tom Lane wrote:
>> This is intentional: the size estimates for a never-yet-analyzed
>> table are *not* zero.  This is because people frequently create and load
>> up a table and then immediately query it without an explicit ANALYZE.

> Does the creation of an index also populate statistics?

IIRC, it will set the relpages/reltuples counts (though not any
more-complex statistics); but only if the table is found to not be
completely empty.  Again, this is a behavior designed with common
usage patterns in mind, to not set relpages/reltuples to zero on a
table that's likely to get populated shortly.

                       regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Optimizer showing wrong rows in plan

От
Matthew Wakeling
Дата:
On Mon, 29 Mar 2010, Tadipathri Raghu wrote:
> As per the documentation, one page is 8kb, when i create a table with int as
> one column its 4 bytes. If i insert 2000 rows, it should be in one page only
> as its 8kb, but its extending vastly as expected. Example shown below,
> taking the previous example table test with one column.

There is more to a row than just the single int column. The space used by
a column will include a column start marker (data length), transaction
ids, hint bits, an oid, a description of the types of the columns, and
finally your data columns. That takes a bit more space.

Matthew

--
 If you let your happiness depend upon how somebody else feels about you,
 now you have to control how somebody else feels about you. -- Abraham Hicks

Re: Optimizer showing wrong rows in plan

От
raghavendra t
Дата:
Hi Mattew,
 
Thank you for the information.
 
Once again, I like to thank each and everyone in this thread for there ultimate support.
 
Regards
Raghavendra

On Mon, Mar 29, 2010 at 4:47 PM, Matthew Wakeling <matthew@flymine.org> wrote:
On Mon, 29 Mar 2010, Tadipathri Raghu wrote:
As per the documentation, one page is 8kb, when i create a table with int as
one column its 4 bytes. If i insert 2000 rows, it should be in one page only
as its 8kb, but its extending vastly as expected. Example shown below,
taking the previous example table test with one column.

There is more to a row than just the single int column. The space used by a column will include a column start marker (data length), transaction ids, hint bits, an oid, a description of the types of the columns, and finally your data columns. That takes a bit more space.

Matthew

--
If you let your happiness depend upon how somebody else feels about you,
now you have to control how somebody else feels about you. -- Abraham Hicks

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Optimizer showing wrong rows in plan

От
Nikolas Everett
Дата:
See http://www.postgresql.org/docs/current/static/storage-page-layout.html for all of what is taking up the space.  Short version: 
 Per block overhead is > 24 bytes
 Per row overhead is 23 bytes + some alignment loss + the null bitmap if you have nullable columns

On Mon, Mar 29, 2010 at 8:24 AM, raghavendra t <raagavendra.rao@gmail.com> wrote:
Hi Mattew,
 
Thank you for the information.
 
Once again, I like to thank each and everyone in this thread for there ultimate support.
 
Regards
Raghavendra

On Mon, Mar 29, 2010 at 4:47 PM, Matthew Wakeling <matthew@flymine.org> wrote:
On Mon, 29 Mar 2010, Tadipathri Raghu wrote:
As per the documentation, one page is 8kb, when i create a table with int as
one column its 4 bytes. If i insert 2000 rows, it should be in one page only
as its 8kb, but its extending vastly as expected. Example shown below,
taking the previous example table test with one column.

There is more to a row than just the single int column. The space used by a column will include a column start marker (data length), transaction ids, hint bits, an oid, a description of the types of the columns, and finally your data columns. That takes a bit more space.

Matthew

--
If you let your happiness depend upon how somebody else feels about you,
now you have to control how somebody else feels about you. -- Abraham Hicks

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance