Re: Avoid full GIN index scan when possible
От | Alexander Korotkov |
---|---|
Тема | Re: Avoid full GIN index scan when possible |
Дата | |
Msg-id | CAPpHfdvfK+Zk=MjjC1M_66+ALD+vdR6BG3quoFAZzjL6hNv-NQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Avoid full GIN index scan when possible (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Avoid full GIN index scan when possible
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-hackers |
On Thu, Aug 1, 2019 at 9:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Julien Rouhaud <rjuju123@gmail.com> writes: > > On Thu, Aug 1, 2019 at 4:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Eyeing this a bit further ... doesn't scanPendingInsert also need > >> to honor so->forcedRecheck? Something along the lines of > > > I think so. > > Yeah, it does --- the updated pg_trgm test attached fails if it doesn't. > > Also, I found that Alexander's concern upthread: > > >> What would happen when two-columns index have GIN_SEARCH_MODE_DEFAULT > >> scan on first column and GIN_SEARCH_MODE_ALL on second? I think even > >> if triconsistent() for second column returns GIN_TRUE, we still need > >> to recheck to verify second columns is not NULL. > > is entirely on-point. This patch generates the wrong answer in the > case I added to gin.sql below. (The expected output was generated > with HEAD and seems correct, but with these code changes, we incorrectly > report the row with NULL as matching. So I expect the cfbot is going > to complain about the patch in this state.) > > While I've not attempted to fix that here, I wonder whether we shouldn't > fix it by just forcing forcedRecheck to true in any case where we discard > an ALL qualifier. That would get rid of all the ugliness around > ginFillScanKey, which I'd otherwise really want to refactor to avoid > this business of adding and then removing a scan key. It would also > get rid of the bit about "XXX Need to use ALL mode instead of EVERYTHING > to skip NULLs if ALL mode has been seen", which aside from being ugly > seems to be dead wrong for multi-column-index cases. +1 for setting forcedRecheck in any case we discard ALL qualifier. ISTM, real life number of cases we can skip recheck here is negligible. And it doesn't justify complexity. > BTW, it's not particularly the fault of this patch, but: what does it > even mean to specify GIN_SEARCH_MODE_ALL with a nonzero number of keys? It might mean we would like to see all the results, which don't contain given key. > Should we decide to treat that as an error? It doesn't look to me like > any of the in-tree opclasses will return such a case, and I'm not at > all convinced what the GIN scan code would actually do with it, except > that I doubt it matches the documentation. I think tsvector_ops behaves so. See gin_extract_tsquery(). /* * If the query doesn't have any required positive matches (for * instance, it's something like '! foo'), we have to do a full index * scan. */ if (tsquery_requires_match(item)) *searchMode = GIN_SEARCH_MODE_DEFAULT; else *searchMode = GIN_SEARCH_MODE_ALL; ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Tom LaneДата:
Сообщение: Re: UCT (Re: pgsql: Update time zone data files to tzdata release 2019a.)