Re: Bootstrap DATA is a pita

From: Caleb Welton
Subject: Re: Bootstrap DATA is a pita
Msg-id: CAOjayEfFrRvvvXOcJngCNqeGAW-HaS+=h1U_LAtb+Hmuuqj70Q@mail.gmail.com
In reply to: Re: Bootstrap DATA is a pita  (Caleb Welton <cwelton@pivotal.io>)
List: pgsql-hackers
I took a look at a few of the most recent bulk edit cases for pg_proc.h:

There were two this year:
* The addition of proparallel  [1]
* The addition of protransform [2]

And prior to that the most recent seems to be from 2012:
* The addition of proleakproof [3]

Quick TL;DR: these changes are very simple to reflect when generating SQL for CREATE FUNCTION statements.

Attached are the SQL that would generate function definitions prior to proleakproof, and the diffs that would be required after adding support for proleakproof, protransform, and proparallel.

Each diff shows the changes that would be needed after the new column is added. How to populate default values for a new column cannot easily be expressed in general terms; it depends entirely on the nature of the column.
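To give a feel for how small such a change is, here is a minimal sketch in Python (not the attached SQL). It assumes each function is available as a plain dict of pg_proc-style fields; the dict layout, the `lang`, `argtypes` and `rettype` keys, and the helper name `create_function_sql` are all hypothetical. It covers only proleakproof and proparallel, since those map directly onto CREATE FUNCTION clauses.

```python
# Hypothetical sketch: rows are plain dicts holding a few pg_proc-style fields.
VOLATILITY = {"i": "IMMUTABLE", "s": "STABLE", "v": "VOLATILE"}
PARALLEL = {"s": "SAFE", "r": "RESTRICTED", "u": "UNSAFE"}


def create_function_sql(row):
    """Render one CREATE FUNCTION statement from a pg_proc-like dict."""
    clauses = ["LANGUAGE " + row["lang"], VOLATILITY[row["provolatile"]]]
    if row["proisstrict"]:
        clauses.append("STRICT")
    # Change needed once proleakproof exists: one optional clause.
    if row.get("proleakproof"):
        clauses.append("LEAKPROOF")
    # Change needed once proparallel exists: one more optional clause.
    if "proparallel" in row:
        clauses.append("PARALLEL " + PARALLEL[row["proparallel"]])
    return "CREATE FUNCTION {name}({args}) RETURNS {ret} {clauses} AS '{src}';".format(
        name=row["proname"],
        args=", ".join(row["argtypes"]),
        ret=row["rettype"],
        clauses=" ".join(clauses),
        src=row["prosrc"].replace("'", "''"),  # escape embedded single quotes
    )


# Illustrative values only, not copied from the real catalog.
example = {
    "proname": "int4pl", "argtypes": ["int4", "int4"], "rettype": "int4",
    "lang": "internal", "provolatile": "i", "proisstrict": True,
    "proleakproof": False, "proparallel": "s", "prosrc": "int4pl",
}
print(create_function_sql(example))
```

Each new column costs one extra branch in the generator; which value each existing function should get for that column is a separate decision.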

Note: So far I have focused on the 'pure' functions, i.e. not the functions that drive type serialization, language validation, operators, or other object types.  I would want to deal with each of those while handling the conversion for the corresponding object type.  Additional modifications would likely be needed for other types of functions.



On Fri, Dec 11, 2015 at 12:55 PM, Caleb Welton <cwelton@pivotal.io> wrote:
Makes sense.

During my own prototyping I generated the SQL statements by running SQL queries against the existing catalog.  That is far easier than hand-writing 1000+ function definitions, and not difficult to modify for future changes.  It was also very easy to adapt my existing SQL to account for some of the newer features in master.

The biggest challenge was establishing a sort order that is both unique and ensures that the dependencies needed by SQL functions have been processed before trying to define them.  With a natural OID ordering, this affects about 4 in 1000 functions.
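One way to get such an ordering, sketched in Python rather than in the SQL I actually used: a topological sort that always emits the lowest-numbered function whose dependencies are already defined. The `deps` mapping, the function name `ordered_oids`, and the OIDs in the example are hypothetical.

```python
import heapq


def ordered_oids(deps):
    """Order OIDs so dependencies come first, lowest ready OID first.

    deps maps each function OID to the set of OIDs it depends on.
    Dependencies on OIDs outside the mapping are ignored.
    """
    pending = {oid: {d for d in needed if d in deps} for oid, needed in deps.items()}
    dependents = {oid: [] for oid in deps}
    for oid, needed in pending.items():
        for dep in needed:
            dependents[dep].append(oid)
    # Min-heap of functions whose dependencies are all satisfied.
    ready = [oid for oid, needed in pending.items() if not needed]
    heapq.heapify(ready)
    order = []
    while ready:
        oid = heapq.heappop(ready)
        order.append(oid)
        for child in dependents[oid]:
            pending[child].discard(oid)
            if not pending[child]:
                heapq.heappush(ready, child)
    if len(order) != len(deps):
        raise ValueError("dependency cycle among functions")
    return order


# Function 20 is a SQL function that calls function 30, so 30 must come first.
print(ordered_oids({10: set(), 20: {30}, 30: set(), 40: {10}}))
```

Because ties are always broken by OID, the result is deterministic, and it matches plain OID order wherever no dependency forces a swap.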

> On Dec 11, 2015, at 11:43 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> Caleb Welton wrote:
>> I'm happy working these ideas forward if there is interest.
>>
>> Basic design proposal is:
>>  - keep a minimal amount of bootstrap to avoid intrusive changes to core
>> components
>>  - Add capabilities of creating objects with specific OIDs via DDL during
>> initdb
>>  - Update the caching/resolution mechanism for builtin functions to be
>> more dynamic.
>>  - Move as much of bootstrap as possible into SQL files and create catalog
>> via DDL
>
> I think the point we got stuck last time at was deciding on a good
> format for the data coming from the DATA lines.  One of the objections
> raised for formats such as JSON is that it's trivial for "git merge" (or
> similar tools) to make a mistake because object-end/object-start lines
> are all identical.  And as for the SQL-format version, the objection was
> that it's hard to modify the lines en-masse when modifying the catalog
> definition (new column, etc).  Ideally we would like a format that can
> be bulk-edited without too much trouble.
>
> A SQL file would presumably not have the merge issue, but mass-editing
> would be a pain.
>
> Crazy idea: we could just have a CSV file which can be loaded into a
> table for mass changes using regular DDL commands, then dumped back from
> there into the file.  We already know how to do these things, using
> \copy etc.  Since CSV uses one line per entry, there would be no merge
> problems either (or rather: all merge problems would become conflicts,
> which is what we want.)
>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
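
For illustration, the CSV round trip described above could look roughly like the following Python sketch. It is an assumption-laden stand-in: it uses an in-memory SQLite table instead of a PostgreSQL table and \copy, and the table name `catalog`, the function name `add_column_via_table`, and the sample rows are hypothetical.

```python
import csv
import io
import sqlite3


def add_column_via_table(csv_text, new_column, default):
    """Load CSV rows into a scratch table, add a column with DDL, dump back to CSV."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    conn = sqlite3.connect(":memory:")
    cols = ", ".join('"{}"'.format(c) for c in header)
    conn.execute("CREATE TABLE catalog ({})".format(cols))
    marks = ", ".join("?" for _ in header)
    conn.executemany("INSERT INTO catalog VALUES ({})".format(marks), reader)
    # The bulk edit itself is a single DDL statement.
    literal = "'" + default.replace("'", "''") + "'"
    conn.execute('ALTER TABLE catalog ADD COLUMN "{}" DEFAULT {}'.format(new_column, literal))
    cur = conn.execute("SELECT * FROM catalog ORDER BY rowid")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([d[0] for d in cur.description])
    writer.writerows(cur)
    conn.close()
    return out.getvalue()


# Sample rows are made up; one line per entry, as in the proposal.
print(add_column_via_table("oid,proname\n1,f_one\n2,f_two\n", "proparallel", "s"))
```

The file keeps one line per entry after the edit, which is the property that turns merge mistakes into merge conflicts.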
