Re: [HACKERS] Discussion on missing optimizations
From | Robert Haas
---|---
Subject | Re: [HACKERS] Discussion on missing optimizations
Msg-id | CA+TgmoZfUUuTk+AdfD-cwW-b8dOLRPsvbg6Kqj4QGGT9qBS4iQ@mail.gmail.com
In reply to | Re: [HACKERS] Discussion on missing optimizations (Tom Lane <tgl@sss.pgh.pa.us>)
Responses | Re: [HACKERS] Discussion on missing optimizations (Laurenz Albe <laurenz.albe@cybertec.at>), Re: [HACKERS] Discussion on missing optimizations (David Rowley <david.rowley@2ndquadrant.com>)
List | pgsql-hackers
On Fri, Oct 6, 2017 at 10:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 9. Unneeded Self JOIN
>
>> Can't remember discussions of this.
>
> I can't get very excited about that one either.

My memories of being a PostgreSQL user rather than a developer are getting a bit distant by now, but I definitely remember hitting this problem, especially in cases involving DELETE FROM and UPDATE USING. You can't specify a left join between the result relation and some other table directly, so you end up having to do DELETE FROM x USING x LEFT JOIN y ..., which then has lousy performance. I think you could hit it in other cases, too, e.g. a join between two views which share a common base table. Of course, maybe you wouldn't have two views in the first place if join removal worked better than it currently does, but without at least inner join removal you can't really rely on the query planner to prune things down to what's really needed.

> In the end, what the article fails to consider is that all of these are
> tradeoffs, not unalloyed goods. If you spend planner cycles on every
> query to look for cases that only the most unabashedly brain-dead ORMs
> ever generate, you're not really doing your users a favor on balance.

I think what you're failing to consider is that the author of the article is a user. When he says these optimizations are cool, he means that they would benefit his use cases (and presumably that he's willing to pay some number of additional cycles to get them). We haven't really done a very good job figuring out what to do about optimizations that some people (mostly you) think are marginal. It's obviously true that we can't just burn infinite planner cycles on things that don't usually help, but at the same time, we can't just keep ignoring requests by users for the same features and saying "you'll be sad if we give this to you". Those people don't give up on wanting the optimization; they just don't use PostgreSQL.
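To make the pattern concrete, here is a sketch of the two shapes of unneeded self join being discussed; table and column names are invented for illustration, and the examples assume id is the table's primary key:

```sql
-- A self join on the primary key where the second copy contributes no
-- columns; a smarter planner could drop it entirely.
SELECT a.id, a.payload
FROM t AS a
JOIN t AS b ON a.id = b.id;

-- The DELETE pattern described above: the result relation cannot take a
-- LEFT JOIN directly, so the same table is repeated via USING, producing
-- a self join that the planner does not remove.
DELETE FROM x
USING x AS x2
      LEFT JOIN y ON y.x_id = x2.id
WHERE x.id = x2.id
  AND y.x_id IS NULL;
```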
I think we need to stop just saying "no" and start thinking about what we could do that would let us say "yes". One trick that some systems use to avoid replanning as much as we do is, for example, saving plans in a shared cache and reusing them even in other sessions. That's hard to do in our architecture because the controlling GUCs can be different in every session, and there's not even any explicit labeling of which GUCs control planner behavior. But if you had it, then extra planning cycles would perhaps be more tolerable.

Another trick is to plan harder when the cost or complexity of the query exceeds some threshold, but that's hard to do because we don't really know what the cost or complexity is until after we get done planning, by which time it's too late to (cheaply) go back and apply those optimizations. This problem of not really knowing whether we've got an expensive query is also a problem in terms of being smart about how many parallel workers to pick, or whether to consider parallelism at all; we'd like to consider more options for expensive queries than for cheap ones. But there's not really any good way to do this as things stand today.

Maybe one way to attack this problem is to make it more general: what's the next big thing for the planner? It wasn't altogether clear to me that the path->plan conversion was going to be a huge improvement, but it definitely has been. Nothing really got much faster as a direct result of that work, but it made the code simpler and more understandable, and thus easier to enhance, unblocking improvements like postgres_fdw aggregate pushdown. I think we're a long way from reaching the end of the benefits of that commit - there is a lot more that can usefully be done. But what's next? You told me a few years back, when I was thinking about what to do next, that there were still a lot of gains to be had from improving the planner, and I wasn't sure I quite believed it.
But it's becoming more clear to me that this is true, and that the kinds of things this article is talking about are one approach. I don't really have the answers here, but I don't think our planner is anywhere close to as good as it can be, even if in certain respects it is stuck at some kind of local maximum.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company