Re: Import Statistics in postgres_fdw before resorting to sampling.
| От | Michael Paquier |
|---|---|
| Тема | Re: Import Statistics in postgres_fdw before resorting to sampling. |
| Дата | |
| Msg-id | aPXaeowZ6RBPlsrD@paquier.xyz обсуждение исходный текст |
| Ответ на | Re: Import Statistics in postgres_fdw before resorting to sampling. (Corey Huinker <corey.huinker@gmail.com>) |
| Ответы |
Re: Import Statistics in postgres_fdw before resorting to sampling.
|
| Список | pgsql-hackers |
On Sat, Oct 18, 2025 at 08:32:24PM -0400, Corey Huinker wrote: > Rebased again. Hearing an opinion from Fujita-san would be interesting here, so I am adding him in CC. I have been looking a little bit at this patch. + ImportStatistics_function ImportStatistics; All FDW callbacks are documented in fdwhandler.sgml. This new one is not. I am a bit uncomfortable regarding the design you are using here, where the ImportStatistics callback, if defined, takes priority over the existing AnalyzeForeignTable, especially regarding the fact that both callbacks attempt to retrieve the same data, except that the existing callback has a different idea of the timing to use to retrieve reltuples and relpages. The original callback AnalyzeForeignTable retrieves immediately the total number of pages via SQL, to feed ANALYZE. reltuples is then fed to ANALYZE from a callback function that that's defined in AnalyzeForeignTable(). My point is: rather than trying to implement a second solution with a new callback, shouldn't we try to rework the existing callback so as it would fit better with what you are trying to do here: feed data that ANALYZE would then be in charge of inserting? Relying on pg_restore_relation_stats() for the job feels incorrect to me knowing that ANALYZE is the code path in charge of updating the stats of a relation. The new callback is a shortcut that bypasses what ANALYZE does, so the problem, at least it seems to me, is that we want to retrieve all the data in a single step, like your new callback, not in two steps, something that only the existing callback allows. Hence, wouldn't it be more natural to retrieve the total number of pages and reltuples from one callback, meaning that for local relations we should delay RelationGetNumberOfBlocks() inside the existing "acquire_sample_rows" callback (renaming it would make sense)? This requires some redesign of AnalyzeForeignTable and the internals of analyze.c, but it would let FDW extensions know about the potential efficiency gain. There is also a performance concern to be made with the patch, but as it's an ANALYZE path that may be acceptable: if we fail the first callback, then we may call the second callback. Fujita-san, what do you think? -- Michael
Вложения
В списке pgsql-hackers по дате отправления: