Обсуждение: Statistics Injection
Hello,
Firstly, I create an empty table (CREATE TABLE newTable(....)) and then I update the pg_class table as well (UPDATE pg_class SET relpages = #pages WHERE relname='newTable').
For some research purposes, I am trying to modify the existing statistics of some tables in the catalogs in order to change the execution plan, experiment with the EXPLAIN call etc.
Concretely, what I'd like to do is to create a "fake" table with a schema of my choice (that's the easy part) and then modify the statistics(particularly, the number of tuples and the number of pages).
Firstly, I create an empty table (CREATE TABLE newTable(....)) and then I update the pg_class table as well (UPDATE pg_class SET relpages = #pages WHERE relname='newTable').
The problem is that, even if I set the reltuples and relpages of my choice, when I run the EXPLAIN clause for a query in which the 'newTable' is involved in (e.g. EXPLAIN SELECT * FROM newTable), I get the same cost and row estimation.
Could anyone help me with that?
Thank you in advance,
Victor Giannakouris
Victor Giannakouris - Salalidis <victorasgs@gmail.com> writes: > For some research purposes, I am trying to modify the existing statistics > of some tables in the catalogs in order to change the execution plan, > experiment with the EXPLAIN call etc. > Concretely, what I'd like to do is to create a "fake" table with a schema > of my choice (that's the easy part) and then modify the > statistics(particularly, the number of tuples and the number of pages). > Firstly, I create an empty table (CREATE TABLE newTable(....)) and then I > update the pg_class table as well (UPDATE pg_class SET relpages = #pages > WHERE relname='newTable'). > The problem is that, even if I set the reltuples and relpages of my choice, > when I run the EXPLAIN clause for a query in which the 'newTable' is > involved in (e.g. EXPLAIN SELECT * FROM newTable), I get the same cost and > row estimation. You can't really do it like that, because the planner always looks at the true relation size (RelationGetNumberOfBlocks()). It uses reltuples/relpages as an estimate of tuple density, not as hard numbers. The reason for this is to cope with any table growth that may have occurred since the last VACUUM/ANALYZE. You could modify the code in plancat.c to change that, or you could plug into the get_relation_info_hook to tweak the constructed RelOptInfo before anything is done with it. regards, tom lane
<div style="width: 100%; font-size: initial; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255,255);"><span style="font-family: Calibri, 'Slate Pro', sans-serif, sans-serif;"><br /></span></div><div style="width:100%; font-size: initial; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><br/></div><div style="font-family: sans-serif; font-size: 12.8px;">> The problem is that, even if I set the reltuplesand relpages of my choice, when I run the EXPLAIN clause for a query in which the 'newTable' is involved in (e.g.EXPLAIN SELECT * FROM newTable), I get the same cost and row estimation.</div><div style="font-family: sans-serif; font-size:12.8px;"><br /></div><div style="font-family: sans-serif; font-size: 12.8px;">>Could anyone help me with that?</div><divstyle="width: 100%; font-size: initial; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255,255, 255);"><span style="font-family: Calibri, 'Slate Pro', sans-serif, sans-serif;"><br /></span></div><div style="width:100%; font-size: initial; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><spanstyle="font-family: Calibri, 'Slate Pro', sans-serif, sans-serif;"><br /></span></div><div style="width: 100%;font-size: initial; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><span style="font-family:Calibri, 'Slate Pro', sans-serif, sans-serif;">There's a pg_dbms_stats extension that enables you to override/freezethe statistics: https://github.com/ossc-db/pg_dbms_stats</span></div><div style="width: 100%; font-size: initial;color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><span style="font-family: Calibri,'Slate Pro', sans-serif, sans-serif;"><br /></span></div><div style="width: 100%; font-size: initial; color: rgb(31,73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><span style="font-family: Calibri, 'Slate Pro',sans-serif, sans-serif;">Vladimir</span></div>
Hello,
pg_dbms_stats worked well for my case, thank you all for your quick responses and for the support!
Victor
On Sat, Jul 2, 2016 at 8:28 PM, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
> The problem is that, even if I set the reltuples and relpages of my choice, when I run the EXPLAIN clause for a query in which the 'newTable' is involved in (e.g. EXPLAIN SELECT * FROM newTable), I get the same cost and row estimation.>Could anyone help me with that?There's a pg_dbms_stats extension that enables you to override/freeze the statistics: https://github.com/ossc-db/pg_dbms_statsVladimir
Victor Giannakouris - Salalidis
Researcher
Computing Systems Laboratory (CSLab),
National Technical University of Athens
Personal Page: http://gsvic.github.io