Re: [HACKERS] postgres_fdw bug in 9.6
От | Robert Haas |
---|---|
Тема | Re: [HACKERS] postgres_fdw bug in 9.6 |
Дата | |
Msg-id | CA+TgmoZ0DAVcoN8i3o6=26+GkaqG6C3Zxqk-4AG1KZt0JDARSQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] postgres_fdw bug in 9.6 (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: [HACKERS] postgres_fdw bug in 9.6
|
Список | pgsql-hackers |
On Mon, Jan 15, 2018 at 4:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > There *is* a problem with GetExistingLocalJoinPath not honoring its > API spec: it will sometimes return join nests that include remote > joins at levels below the top, as I'd speculated to begin with. > This turns out not to be a problem for postgres_fdw, because any > such remote-join node will just recursively fob off EPQ checks > onto its own outerpath, so that eventually you get down to base > scan relations anyway, and everything works. However, because > GetExistingLocalJoinPath is claimed to be a general purpose function > useful for other FDWs, the fact that its API contract is a lie > seems to me to be a problem anyway. Yeah, that's not good. The idea, as the comments inside GetExistingLocalJoinPath say, is that if we find an existing join whose inner or outer side is a ForeignPath for a join, we fish out the outer path which should be that subpath's EPQ path. I'm not 100% sure why that's not happening in your example, but my first guess is that it's because the inner path has a Materialize node on top of the join-ForeignPath, and GetExistingLocalJoinPath just sees the Materialize node and concludes that it doesn't need to look further. If that's correct, bet we could fix that by telling it to drill through any MaterialPath it sees and use the sub-path instead; given that there is only one tuple per relation, the Materialize step can't be important for performance. > I'm also still pretty unhappy with the amount of useless planning work > caused by doing GetExistingLocalJoinPath during path creation. It strikes > me that we could likely replace the entire thing with some code that just > reconstructs the join node's output tuple during EPQ using the rowmark > data for all the base relations. Outer joins aren't really a problem: > we could tell which relations were replaced by nulls because the rowmark > values that bubbled up to the top went to nulls themselves. However, > that's a nontrivial amount of work and probably wouldn't result in > something we cared to back-patch, especially since it's not really a bug > fix. Yes, I believe something like this would be possible. I did consider that approach, but this seemed easier to implement and, as it was, it took two release cycles to get join pushdown committed. And multiple hundreds of emails, most of them about EPQ, if I remember correctly. I'm happy to have somebody implement something better, but I will probably get a bit hypoxic if I hold my breath waiting for it to happen. > Your patch seems OK codewise, but I think the comment is not nearly > adequate. Maybe something like "The EPQ path must be at least as well > sorted as the path itself, in case it gets used as input to a mergejoin." > > Also, a regression test case would be appropriate perhaps. I tried > to reproduce Jeff's complaint in the postgres_fdw regression database, > and it actually failed on the very first multi-way join I tried: > > contrib_regression=# explain select * from ft1,ft2,ft4,ft5 where ft1.c1=ft2.c1 and ft1.c1=ft4.c1 and ft1.c1=ft5.c1 forupdate; > ERROR: outer pathkeys do not match mergeclauses Thanks for the review. Updated patch attached. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Claudio FreireДата:
Сообщение: Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently
Следующее
От: Claudio FreireДата:
Сообщение: Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem