Обсуждение: FDW handling count(*) through AnalyzeForeignTable or other constant time push-down
FDW handling count(*) through AnalyzeForeignTable or other constant time push-down
<div class="WordSection1"><p class="MsoNormal">Hey all,<p class="MsoNormal"> <p class="MsoNormal">I’m building a FDW arounda column-store backend (similar to CStore but for genomic data!).<p class="MsoNormal"> <p class="MsoNormal">I havetables in the billions of rows, and have a common query pattern of asking for the table size (i.e. SELECT COUNT(*) FROMbig_fdw_table; ).<p class="MsoNormal"> <p class="MsoNormal">This is a read-optimized system in which I know in constanttime the exact dimensions of the table.<p class="MsoNormal"> <p class="MsoNormal">Is there any way to convince PostgresFDW to leverage the analyze row counts or even the “double* totalRowCount” returned from the AcquireSampleRows callbackfrom my AnalyzeForeignTable function so that it does not do a full-table scan for a COUNT(*) etc?<p class="MsoNormal"> <pclass="MsoNormal">My current fallback is to export a specialized function that returns the table rowcount for a given FDW table, but that then leaks into the user-application driving these queries.<p class="MsoNormal"> <pclass="MsoNormal">Thanks in advance!<p class="MsoNormal">Gabe<p class="MsoNormal"> <p class="MsoNormal"style="background:white"><b><span style="font-size:10.0pt;color:#262626">Gabe Rudy</span></b><span style="font-size:10.0pt;color:#595959">| VP Product & Engineering | Golden Helix, Inc.</span><span style="color:black"></span><pclass="MsoNormal"> </div>
"Gabe F. Rudy" <rudy@goldenhelix.com> writes: > Is there any way to convince Postgres FDW to leverage the analyze row counts or even the "double* totalRowCount" returnedfrom the AcquireSampleRows callback from my AnalyzeForeignTable function so that it does not do a full-table scanfor a COUNT(*) etc? No. In PG's view, ANALYZE-based row counts are imprecise by definition. regards, tom lane
Hey all,
I’m building a FDW around a column-store backend (similar to CStore but for genomic data!).
I have tables in the billions of rows, and have a common query pattern of asking for the table size (i.e. SELECT COUNT(*) FROM big_fdw_table; ).
This is a read-optimized system in which I know in constant time the exact dimensions of the table.
Is there any way to convince Postgres FDW to leverage the analyze row counts or even the “double* totalRowCount” returned from the AcquireSampleRows callback from my AnalyzeForeignTable function so that it does not do a full-table scan for a COUNT(*) etc?
My current fallback is to export a specialized function that returns the table row count for a given FDW table, but that then leaks into the user-application driving these queries.
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: FDW handling count(*) through AnalyzeForeignTable or other constant time push-down
Ok, I get that. Really what I am *rooting* for is Aggregate (and Sort By) Push-Down to FDW plugins. I can already internalize conditional filters for most cases, and doing a count on the filtered results would be considerablyfaster in my FDW back-end before all the records and Datums have to be constructed for postgres to do the counting. Similarly, I'm very excited about the potential for FDW to advertise a-priori sort states, so things like external merge-sortscan pass-through the request for sorted data for fields in which sorting is a no-op in my backend. Importantly my IDs are sorted by definition since they are essentially array indexes into the column-store, so joining onthem with merge-sort should be blazing fast, but currently time is wasted sorting these pre-sorted fields. Just my 2c, and I'll be tracking the 9.6 progress that includes some of these proposals. Gabe -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Thursday, February 25, 2016 11:21 PM To: Gabe F. Rudy <rudy@goldenhelix.com> Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] FDW handling count(*) through AnalyzeForeignTable or other constant time push-down "Gabe F. Rudy" <rudy@goldenhelix.com> writes: > Is there any way to convince Postgres FDW to leverage the analyze row counts or even the "double* totalRowCount" returnedfrom the AcquireSampleRows callback from my AnalyzeForeignTable function so that it does not do a full-table scanfor a COUNT(*) etc? No. In PG's view, ANALYZE-based row counts are imprecise by definition. regards, tom lane
Hi Gabe, Did you get Aggregate Pushdown FDW plugin? Would be really helpful if you can share some insight on your investigation. Regards Anant -- View this message in context: http://postgresql.nabble.com/FDW-handling-count-through-AnalyzeForeignTable-or-other-constant-time-push-down-tp5889291p5912699.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.