Обсуждение: Statistics Injection

Поиск
Список
Период
Сортировка

Statistics Injection

От
Victor Giannakouris - Salalidis
Дата:

Hello,

For some research purposes, I am trying to modify the existing statistics of some tables in the catalogs in order to change the execution plan, experiment with the EXPLAIN call etc.

Concretely, what I'd like to do is to create a "fake" table with a schema of my choice (that's the easy part) and then modify the statistics(particularly, the number of tuples and the number of pages).

Firstly, I create an empty table (CREATE TABLE newTable(....)) and then I update the pg_class table as well (UPDATE pg_class SET relpages = #pages WHERE relname='newTable'). 

The problem is that, even if I set the reltuples and relpages of my choice, when I run the EXPLAIN clause for a query in which the 'newTable'  is involved in (e.g. EXPLAIN SELECT * FROM newTable), I get the same cost and row estimation.

Could anyone help me with that?

Thank you in advance,

Victor Giannakouris

Re: Statistics Injection

От
Tom Lane
Дата:
Victor Giannakouris - Salalidis <victorasgs@gmail.com> writes:
> For some research purposes, I am trying to modify the existing statistics
> of some tables in the catalogs in order to change the execution plan,
> experiment with the EXPLAIN call etc.

> Concretely, what I'd like to do is to create a "fake" table with a schema
> of my choice (that's the easy part) and then modify the
> statistics(particularly, the number of tuples and the number of pages).

> Firstly, I create an empty table (CREATE TABLE newTable(....)) and then I
> update the pg_class table as well (UPDATE pg_class SET relpages = #pages
> WHERE relname='newTable').

> The problem is that, even if I set the reltuples and relpages of my choice,
> when I run the EXPLAIN clause for a query in which the 'newTable'  is
> involved in (e.g. EXPLAIN SELECT * FROM newTable), I get the same cost and
> row estimation.

You can't really do it like that, because the planner always looks at
the true relation size (RelationGetNumberOfBlocks()).  It uses
reltuples/relpages as an estimate of tuple density, not as hard numbers.
The reason for this is to cope with any table growth that may have
occurred since the last VACUUM/ANALYZE.

You could modify the code in plancat.c to change that, or you could
plug into the get_relation_info_hook to tweak the constructed
RelOptInfo before anything is done with it.
        regards, tom lane



Re: Statistics Injection

От
Vladimir Sitnikov
Дата:
<div style="width: 100%; font-size: initial; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255,
255,255);"><span style="font-family: Calibri, 'Slate Pro', sans-serif, sans-serif;"><br /></span></div><div
style="width:100%; font-size: initial; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255,
255);"><br/></div><div style="font-family: sans-serif; font-size: 12.8px;">> The problem is that, even if I set the
reltuplesand relpages of my choice, when I run the EXPLAIN clause for a query in which the 'newTable'  is involved in
(e.g.EXPLAIN SELECT * FROM newTable), I get the same cost and row estimation.</div><div style="font-family: sans-serif;
font-size:12.8px;"><br /></div><div style="font-family: sans-serif; font-size: 12.8px;">>Could anyone help me with
that?</div><divstyle="width: 100%; font-size: initial; color: rgb(31, 73, 125); text-align: initial; background-color:
rgb(255,255, 255);"><span style="font-family: Calibri, 'Slate Pro', sans-serif, sans-serif;"><br /></span></div><div
style="width:100%; font-size: initial; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255,
255);"><spanstyle="font-family: Calibri, 'Slate Pro', sans-serif, sans-serif;"><br /></span></div><div style="width:
100%;font-size: initial; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><span
style="font-family:Calibri, 'Slate Pro', sans-serif, sans-serif;">There's a pg_dbms_stats extension that enables you to
override/freezethe statistics: https://github.com/ossc-db/pg_dbms_stats</span></div><div style="width: 100%; font-size:
initial;color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><span style="font-family:
Calibri,'Slate Pro', sans-serif, sans-serif;"><br /></span></div><div style="width: 100%; font-size: initial; color:
rgb(31,73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><span style="font-family: Calibri, 'Slate
Pro',sans-serif, sans-serif;">Vladimir</span></div> ‎ 

Re: Statistics Injection

От
Victor Giannakouris - Salalidis
Дата:
Hello,

pg_dbms_stats worked well for my case, thank you all for your quick responses and for the support!

Victor

On Sat, Jul 2, 2016 at 8:28 PM, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:


> The problem is that, even if I set the reltuples and relpages of my choice, when I run the EXPLAIN clause for a query in which the 'newTable'  is involved in (e.g. EXPLAIN SELECT * FROM newTable), I get the same cost and row estimation.

>Could anyone help me with that?


There's a pg_dbms_stats extension that enables you to override/freeze the statistics: https://github.com/ossc-db/pg_dbms_stats

Vladimir



--
Victor Giannakouris - Salalidis

Researcher
Computing Systems Laboratory (CSLab),
National Technical University of Athens