Re: How to know referenced sub-fields of a composite type?
From | Kohei KaiGai |
---|---|
Subject | Re: How to know referenced sub-fields of a composite type? |
Date | |
Msg-id | CAOP8fzY4s9rgXwtZsEfbL9fqmmTbXEXGzZO3RDMC_GyFugAS6w@mail.gmail.com |
In response to | Re: How to know referenced sub-fields of a composite type? (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>) |
Responses |
Re: How to know referenced sub-fields of a composite type?
Re: How to know referenced sub-fields of a composite type? |
List | pgsql-hackers |
Hi Amit,

On Wed, May 29, 2019 at 13:26, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> Kaigai-san,
>
> On 2019/05/29 12:13, Kohei KaiGai wrote:
> > One interesting data type in Apache Arrow is the "Struct" data type.
> > It is equivalent to a composite type in PostgreSQL. The "Struct" type
> > has sub-fields, and each sub-field has its own values array.
> >
> > It means we can skip loading the unreferenced sub-fields, if the
> > query planner can handle referenced and unreferenced sub-fields
> > correctly.
> > On the other hand, it looks to me like RelOptInfo and the other
> > optimizer-related structures don't have this kind of information.
> > RelOptInfo->attr_needed tells an extension which attributes are
> > referenced by other relations, however, its granularity is not
> > sufficient for sub-fields.
>
> Isn't that true for some other cases as well, like when a query accesses
> only some sub-fields of a json(b) column? In that case too, planner
> itself can't optimize away access to other sub-fields. What it can do
> though is match a suitable index to the operator used to access the
> individual sub-fields, so that the index (if one is matched and chosen)
> can optimize away accessing unnecessary sub-fields. IOW, it seems to me
> that the optimizer leaves it up to the indexes (and plan nodes) to further
> optimize access to within a field. How is this case any different?
>
I think it is a slightly different scenario.
Even if an index on sub-fields can identify the tuples to be fetched, each
fetched tuple contains all the sub-fields, because a heap tuple is
row-oriented data.
For example, if the WHERE clause checks sub-field "x" and an aggregate
function then references another sub-field "y", the Scan/Join node has to
return a tuple that contains both "x" and "y". IndexScan also returns a
tuple with the full composite value, so there is no problem even if we
cannot know which sub-fields are referenced in the later stages.
Maybe, if IndexOnlyScan were to support returning a partial composite
value, it would need similar infrastructure, which could also be used for
better composite type support on columnar storage.

> > Probably, all we can do right now is walk the RelOptInfo list to
> > look up FieldSelect nodes to see the referenced sub-fields. Do we
> > have a better idea than this expensive way?
> > # Right now, PG-Strom loads all the sub-fields of a Struct column from
> > # an arrow_fdw foreign table regardless of referenced / unreferenced
> > # sub-fields. Just a second best.
>
> I'm missing something, but if PG-Strom/arrow_fdw does look at the
> FieldSelect nodes to see which sub-fields are referenced, why doesn't it
> generate a plan that will only access those sub-fields or why can't it?
>
Most likely it is not a technical problem, just not a smart implementation.
If I missed some existing infrastructure we can apply, it may be more
suitable than walking the query/expression tree.

Best regards,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>
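
As a rough illustration of the expression-tree walk discussed in this message, the following toy model collects, per column, the set of sub-fields reached through a field selection. It is only a sketch under stated assumptions: the `Var`, `FieldSelect`, `OpExpr` and `Const` classes are hypothetical Python stand-ins for the PostgreSQL node types of the same names, and `collect_subfields` is an invented helper, not an existing PostgreSQL or PG-Strom function. A real extension would presumably do the equivalent in C, for example with `expression_tree_walker()`. A bare reference to the composite column is treated as "all sub-fields needed".

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple, Union


# Hypothetical stand-ins for PostgreSQL expression nodes (not the real structs).
@dataclass(frozen=True)
class Var:
    attno: int            # column number within the relation


@dataclass(frozen=True)
class FieldSelect:
    arg: "Expr"           # expression yielding a composite value
    fieldnum: int         # 1-based sub-field number being extracted


@dataclass(frozen=True)
class OpExpr:
    opname: str
    args: Tuple["Expr", ...]


@dataclass(frozen=True)
class Const:
    value: object


Expr = Union[Var, FieldSelect, OpExpr, Const]

# attno -> set of referenced sub-field numbers, or None for "whole column"
Needed = Dict[int, Optional[Set[int]]]


def collect_subfields(expr: Expr, needed: Needed) -> None:
    """Record which sub-fields of each column are referenced by expr."""
    if isinstance(expr, FieldSelect) and isinstance(expr.arg, Var):
        attno = expr.arg.attno
        if attno not in needed:
            needed[attno] = {expr.fieldnum}
        elif needed[attno] is not None:
            needed[attno].add(expr.fieldnum)
        # if already None, the whole column is needed anyway
    elif isinstance(expr, Var):
        # Bare column reference: every sub-field must be loaded.
        needed[expr.attno] = None
    elif isinstance(expr, FieldSelect):
        # Nested selection: only the first-level sub-field is tracked here.
        collect_subfields(expr.arg, needed)
    elif isinstance(expr, OpExpr):
        for arg in expr.args:
            collect_subfields(arg, needed)
    # Const nodes reference no columns.


# WHERE (s).x > 10, with an aggregate over (s).y; "s" is column 2,
# and x, y are its sub-fields 1 and 2. A third sub-field is never touched.
qual = OpExpr(">", (FieldSelect(Var(2), 1), Const(10)))
aggarg = OpExpr("sum", (FieldSelect(Var(2), 2),))

needed: Needed = {}
for e in (qual, aggarg):
    collect_subfields(e, needed)
print(needed)
```

In this model only sub-fields 1 and 2 of column 2 are recorded, so a columnar reader could skip the values array of any other sub-field. The cost concern raised above still applies: every qualifier and target expression has to be visited to build the map.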