New WIP patch for cross column statistics Re: TEXT vs PG_NODE_TREE in system columns (cross column and expression statistics patch)

Поиск
Список
Период
Сортировка
От Boszormenyi Zoltan
Тема New WIP patch for cross column statistics Re: TEXT vs PG_NODE_TREE in system columns (cross column and expression statistics patch)
Дата
Msg-id 4E3A8CDE.7020409@cybertec.at
обсуждение исходный текст
Ответ на Re: TEXT vs PG_NODE_TREE in system columns (cross column and expression statistics patch)  (Alvaro Herrera <alvherre@commandprompt.com>)
Список pgsql-hackers
Hi,

2011-04-28 17:20 keltezéssel, Alvaro Herrera írta:
> Excerpts from Boszormenyi Zoltan's message of jue abr 28 11:03:56 -0300 2011:
>> Hi,
>>
>> attached is the WIP patch for cross-column statistics and
>> extra expression statistics.
>>
>> My question is that why pg_node_tree is unusable as
>> syscache attribute? I attempted to alias it as text in the patch
>> but I get the following error if I try to use it by setting
>> USE_SYSCACHE_FOR_SEARCH to 1 in selfuncs.c.
>> Directly using the underlying pg_statistic3 doesn't cause an error.
> Two comments:
> 1. it seems that expression stats are mostly separate from cross-column
> stats; does it really make sense to submit the two in the same patch?
>
> 2. there are almost no code comments anywhere
>
> 3. (bonus) if you're going to copy/paste pg_attribute.h verbatim into
> the new files, please remove the bits you currently have in "#if 0".
> (Not to mention the fact that the new catalogs seem rather poorly
> named).

OK, we went to a different route this time. Here is what we came
up with. Attached are two patches.

attnum-int2vector.patch implements:

- int2vector support routines and catalog entries for them
- pg_statistic is modified so "staattnum int2" it converted to
  "staattnums int2vector". RemoveStatistics() is modified to take
  an array of AttrNumber and the length of it.
- pg_attribute.attstattarget is moved to pg_statistic.statarget,
  pg_statistic gains a new "stavalid" bool field. Two support routines
  are added: AddStatistics() and InvalidateStatistics(). Entries
  in pg_statistic for table columns are created upon table creation
  and ALTER TABLE ADD COLUMN and maintained for the lifetime
  of the column. Exceptions are system tables: calling AddStatistics()
  for them during initdb is a Catch-22 when pg_statistic doesn't yet
  exist. For these, ANALYZE creates the pg_statistic record just
  as before. ALTER TABLE ALTER COLUMN SET DATA TYPE
  only invalidates the record by setting "stavalid" to false.
- Factor out common code for getting the statistics tuple into a
  new function called validate_statistics().

cross-col-syntax.patch builds on the first patch and implements:

CREATE CROSS COLUMN STATISTICS ON TABLE tabname (col, ...)
     [ WITH ( statistics_target ) ] ;

DROP CROSS COLUMN STATISTICS ON TABLE tabname (col, ...) ;

CREATE CROSS COLUMN STATISTICS ON INDEX idxname
     [ WITH ( statistics_target ) ] ;

DROP CROSS COLUMN STATISTICS ON INDEX idxname ;

and puts new records into pg_statistic with array_length(staattnums, 1) > 1.
Note: this patch should record dependencies on the respective table or
index and the fields but doesn't.

The data structure for storing the N-dimension histogram is not yet decided.

Comments?

Best regards,
Zoltán böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Boszormenyi Zoltan
Дата:
Сообщение: TRUE/FALSE vs true/false
Следующее
От: Robert Haas
Дата:
Сообщение: Re: TRUE/FALSE vs true/false