Re: Removing useless DISTINCT clauses

From Stephen Frost
Subject Re: Removing useless DISTINCT clauses
Date
Msg-id 20180824021214.GJ3326@tamriel.snowman.net
In response to Re: Removing useless DISTINCT clauses  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Removing useless DISTINCT clauses
List pgsql-hackers
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * David Rowley (david.rowley@2ndquadrant.com) wrote:
> >> On 24 August 2018 at 11:34, Stephen Frost <sfrost@snowman.net> wrote:
> >>> * David Rowley (david.rowley@2ndquadrant.com) wrote:
> >>>> My personal opinion of only being able to completely remove the
> >>>> DISTINCT when there's a single item in the rtable (or a single base
> >>>> table) is that it's just too poor to bother with.
>
> > Hm, so you're suggesting that this isn't the right place for this
> > optimization to be implemented, even now, with the single-relation
> > caveat?
>
> There is no case where planner optimizations should depend on the length
> of the rtable.  Full stop.
>
> It could make sense to optimize if there is just one baserel in the join
> tree --- although even that is best checked only after join removal.

Hm, that's certainly a fair point.

> As an example of the difference, such an optimization should be able to
> optimize "select * from view" if the view contains just one base table.
> The rtable will list both the view and the base table, but the view
> is only hanging around for permissions-checking purposes; it should not
> affect the planner's behavior.
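
To make the distinction above concrete, here is a toy model in Python. It is not PostgreSQL code: the `RangeTblEntry` and `Query` classes, their fields, and both helper functions are hypothetical stand-ins that only follow the description quoted above, where the rtable lists both the view and its base table while the join tree holds just the base table.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RangeTblEntry:
    # Hypothetical, simplified stand-in for a range table entry.
    name: str
    kind: str  # "relation" for a base table, "view" for a leftover view entry


@dataclass
class Query:
    # Hypothetical, simplified stand-in for a parsed query.
    rtable: List[RangeTblEntry]  # every entry the query mentions
    jointree: List[int]          # 1-based rtable indexes actually being joined


def rtable_length(query: Query) -> int:
    """The test argued against: it also counts entries kept only for permission checks."""
    return len(query.rtable)


def baserels_in_jointree(query: Query) -> List[str]:
    """The test suggested instead: base relations actually present in the join tree."""
    entries = (query.rtable[index - 1] for index in query.jointree)
    return [entry.name for entry in entries if entry.kind == "relation"]


# Models "select * from v", where view v reads from the single base table t.
q = Query(
    rtable=[RangeTblEntry("v", "view"), RangeTblEntry("t", "relation")],
    jointree=[2],
)

print(rtable_length(q))         # 2
print(baserels_in_jointree(q))  # ['t']
```

In this sketch a check on rtable length would see two entries and skip the optimization, while a check on the join tree would see the single base table and could apply it.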

This is happening at the same time as some optimizations around GROUP
BY.  Is there something different about what's happening there that I
didn't appreciate, or does that optimization suffer from a similar
issue?

> I've not read the patch, but David's reaction makes it sound like its
> processing is done too early.  There are right places and wrong places
> to do most everything in the planner, and I do not wish to accept a
> patch that does something in the wrong place.

Right, I definitely agree with you there.  This seemed like a reasonable
place given the similar optimization (at least in appearance to me)
being done there for the GROUP BY case.  I'm happy to admit that I
haven't looked at it in very much depth (hence my question to David) and
I'm not an expert in this area, but I did want to bring up that the
general idea and the relative trade-offs at least sounded reasonable.

I'll also note that I didn't see these concerns raised earlier on the
thread when I re-read your remarks on it, so I'm a bit concerned that
either this isn't an actual concern or it was missed previously.

Thanks!

Stephen
