Обсуждение: ALTER TABLE ... NOREWRITE option
It's hard to know whether your tables will be locked for long periods when implementing DDL changes. The NOREWRITE option would cause an ERROR if the table would be rewritten by the command. This would allow testing to highlight long running statements before code hits production. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs <simon@2ndQuadrant.com> writes:
> It's hard to know whether your tables will be locked for long periods
> when implementing DDL changes.
> The NOREWRITE option would cause an ERROR if the table would be
> rewritten by the command.
> This would allow testing to highlight long running statements before
> code hits production.
I'm not thrilled about inventing YA keyword for this.  If you have a
problem with that sort of scenario, why aren't you testing your DDL
on a test server before you do it on production?
Or even more to the point, you can always cancel the statement once
you realize it's taking too long.
Also, I don't really like the idea of exposing syntax knobs for
what ought to be purely an internal optimization.  If someday the
optimization becomes unnecessary or radically different in behavior,
you're stuck with dead syntax.  Sometimes the knob is sufficiently
important to take that risk, but it doesn't seem to be so here.
        regards, tom lane
			
		On 1 December 2012 16:38, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: >> It's hard to know whether your tables will be locked for long periods >> when implementing DDL changes. > >> The NOREWRITE option would cause an ERROR if the table would be >> rewritten by the command. > >> This would allow testing to highlight long running statements before >> code hits production. > > I'm not thrilled about inventing YA keyword for this. If you have a > problem with that sort of scenario, why aren't you testing your DDL > on a test server before you do it on production? That's the point. You run it on a test server first, and you can conclusively see that it will/will not run for a long time on production server. Greg Sabine Mullane wrote an interesting blog about a way of solving the problem in userspace. > Or even more to the point, you can always cancel the statement once > you realize it's taking too long. Which means you have to watch it, which is not always possible. > Also, I don't really like the idea of exposing syntax knobs for > what ought to be purely an internal optimization. If someday the > optimization becomes unnecessary or radically different in behavior, > you're stuck with dead syntax. Sometimes the knob is sufficiently > important to take that risk, but it doesn't seem to be so here. I think it was an interesting idea, but I agree with comments about weird syntax. We need something better and more general for impact assessment. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2012-12-01 18:27:08 +0000, Simon Riggs wrote: > On 1 December 2012 16:38, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Simon Riggs <simon@2ndQuadrant.com> writes: > >> It's hard to know whether your tables will be locked for long periods > >> when implementing DDL changes. > > > >> The NOREWRITE option would cause an ERROR if the table would be > >> rewritten by the command. > > > >> This would allow testing to highlight long running statements before > >> code hits production. > > > > I'm not thrilled about inventing YA keyword for this. If you have a > > problem with that sort of scenario, why aren't you testing your DDL > > on a test server before you do it on production? > > That's the point. You run it on a test server first, and you can > conclusively see that it will/will not run for a long time on > production server. > > Greg Sabine Mullane wrote an interesting blog about a way of solving > the problem in userspace. > > > Or even more to the point, you can always cancel the statement once > > you realize it's taking too long. > > Which means you have to watch it, which is not always possible. > > > Also, I don't really like the idea of exposing syntax knobs for > > what ought to be purely an internal optimization. If someday the > > optimization becomes unnecessary or radically different in behavior, > > you're stuck with dead syntax. Sometimes the knob is sufficiently > > important to take that risk, but it doesn't seem to be so here. > > I think it was an interesting idea, but I agree with comments about > weird syntax. > > We need something better and more general for impact assessment. My first thought is to add more detailed EXPLAIN support for DDL... Although that unfortunately broadens the scope of this a tiny bit. Greetings, Andres Freund --Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
>> I'm not thrilled about inventing YA keyword for this.  If you have a
>> problem with that sort of scenario, why aren't you testing your DDL
>> on a test server before you do it on production?
*I* do test my DDL.  However, there are literally hundreds of thousands
of Rails, Django and Hibernate developers who "test" their DDL by
running it using a 5-row dataset on their laptops before pushing it to
production.  As far as these folks are concerned, "rewrite if there's a
default" is a completely unintuitive booby-trap.
I agree that adding "NOREWRITE" is a bad solution, though.  Personally,
I'd rather have a system function which tests whether a series of DDL
statements involves a rewrite anywhere.  e.g.:
SELECT pg_test_for_rewrite('ALTER TABLE josh ADD COLUMN haircolor')
Hmmmm.  Actually, that wouldn't work with migrations tools, especially
Rails.  Better, what about a GUC?
SET message_on_rewrite=WARNING;
-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
			
		On 2012-12-01 12:24:57 -0800, Josh Berkus wrote:
> 
> >> I'm not thrilled about inventing YA keyword for this.  If you have a
> >> problem with that sort of scenario, why aren't you testing your DDL
> >> on a test server before you do it on production?
> 
> *I* do test my DDL.  However, there are literally hundreds of thousands
> of Rails, Django and Hibernate developers who "test" their DDL by
> running it using a 5-row dataset on their laptops before pushing it to
> production.  As far as these folks are concerned, "rewrite if there's a
> default" is a completely unintuitive booby-trap.
> 
> I agree that adding "NOREWRITE" is a bad solution, though.  Personally,
> I'd rather have a system function which tests whether a series of DDL
> statements involves a rewrite anywhere.  e.g.:
> 
> SELECT pg_test_for_rewrite('ALTER TABLE josh ADD COLUMN haircolor')
> 
> Hmmmm.  Actually, that wouldn't work with migrations tools, especially
> Rails.  Better, what about a GUC?
> 
> SET message_on_rewrite=WARNING;
If you have a framework and you want to test for this you can just
compare relfilenodes before/after.
Greetings,
Andres Freund
-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services
			
		> If you have a framework and you want to test for this you can just > compare relfilenodes before/after. You're targeting the wrong users. This feature isn't for helping you or me. It's for helping the Rails users. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 2012-12-01 12:35:15 -0800, Josh Berkus wrote: > > > If you have a framework and you want to test for this you can just > > compare relfilenodes before/after. > > You're targeting the wrong users. This feature isn't for helping you or > me. It's for helping the Rails users. I am not targeting anything. All I am saying is that if some framework (not the framework's users!) wants to build something like that today its not all that complicated. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > owner@postgresql.org] On Behalf Of Andres Freund > Sent: 01 December 2012 21:40 > To: Josh Berkus > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] ALTER TABLE ... NOREWRITE option > > On 2012-12-01 12:35:15 -0800, Josh Berkus wrote: > > > > > If you have a framework and you want to test for this you can just > > > compare relfilenodes before/after. > > > > You're targeting the wrong users. This feature isn't for helping you > > or me. It's for helping the Rails users. > > I am not targeting anything. All I am saying is that if some framework (not the > framework's users!) wants to build something like that today its not all that > complicated. I would prefer to be able to detect this *before* the rewrite happens though when writing tooling. Regards Petr Jelinek
On Sat, Dec 01, 2012 at 07:34:51PM +0100, Andres Freund wrote: > On 2012-12-01 18:27:08 +0000, Simon Riggs wrote: > > On 1 December 2012 16:38, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > Simon Riggs <simon@2ndQuadrant.com> writes: > > >> It's hard to know whether your tables will be locked for long periods > > >> when implementing DDL changes. > > > > > >> The NOREWRITE option would cause an ERROR if the table would be > > >> rewritten by the command. > > > > > >> This would allow testing to highlight long running statements before > > >> code hits production. > > > > > > I'm not thrilled about inventing YA keyword for this. If you have a > > > problem with that sort of scenario, why aren't you testing your DDL > > > on a test server before you do it on production? > > > > That's the point. You run it on a test server first, and you can > > conclusively see that it will/will not run for a long time on > > production server. Acquiring the lock could still take an unpredictable amount of time. > > Greg Sabine Mullane wrote an interesting blog about a way of solving > > the problem in userspace. I currently recommend using the DEBUG1 messages for this purpose: [local] test=# set client_min_messages = debug1; SET [local] test=# create table t (c int8 primary key, c1 text); DEBUG: building index "pg_toast_109381_index" on table "pg_toast_109381" DEBUG: CREATE TABLE / PRIMARY KEY will create implicit index "t_pkey" for table "t" DEBUG: building index "t_pkey" on table "t" CREATE TABLE [local] test=# alter table t alter c type int4; DEBUG: building index "pg_toast_109391_index" on table "pg_toast_109391" DEBUG: rewriting table "t" DEBUG: building index "t_pkey" on table "t" ALTER TABLE [local] test=# alter table t alter c type oid; DEBUG: building index "t_pkey" on table "t" ALTER TABLE Observe that some changes rewrite the table and all indexes, while others skip rewriting the table but rebuild one or more indexes. I've threatened to optimize type changes like (base type) -> (domain with CHECK constraint) by merely scanning the table for violations. If we do add syntax such as you have proposed, I recommend using a different name and defining it to reject any operation with complexity O(n) or worse relative to table size. That being said, I share Tom's doubts. The DEBUG1 messages are a sorry excuse for a UI, but I'm not seeing a clear improvement in NOREWRITE. > > > Or even more to the point, you can always cancel the statement once > > > you realize it's taking too long. > > > > Which means you have to watch it, which is not always possible. There's statement_timeout. > My first thought is to add more detailed EXPLAIN support for > DDL... Although that unfortunately broadens the scope of this a tiny > bit. That would be ideal. Thanks, nm
Noah Misch <noah@leadboat.com> writes: > Acquiring the lock could still take an unpredictable amount of time. I think there's a new GUC brewing about setting the lock timeout separately from the statement timeout, right? > being said, I share Tom's doubts. The DEBUG1 messages are a sorry excuse for > a UI, but I'm not seeing a clear improvement in NOREWRITE. EXPLAIN ALTER TABLE would be the next step I guess. I discovered plenty of magic tricks when working on the rewriting support for that command, and I think exposing them in the EXPLAIN output would go a long way towards reducing some POLA violations. Ideally the EXPLAIN command would include names of new objects created by the command, such as constraints and indexes. >> My first thought is to add more detailed EXPLAIN support for >> DDL... Although that unfortunately broadens the scope of this a tiny >> bit. > > That would be ideal. +1 Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
> EXPLAIN ALTER TABLE would be the next step I guess. I discovered plenty > of magic tricks when working on the rewriting support for that command, > and I think exposing them in the EXPLAIN output would go a long way > towards reducing some POLA violations. I think this would be cool on its own merits. However, for a very large user group -- users with ORMs which do their own DDL migrations -- we could also use a way to log or WARN for table rewrites. Since the ORMs are not gonna do an EXPLAIN. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
>> EXPLAIN ALTER TABLE would be the next step I guess. I discovered plenty
>> of magic tricks when working on the rewriting support for that command,
>> and I think exposing them in the EXPLAIN output would go a long way
>> towards reducing some POLA violations.
> I think this would be cool on its own merits.
> However, for a very large user group -- users with ORMs which do their
> own DDL migrations -- we could also use a way to log or WARN for table
> rewrites.  Since the ORMs are not gonna do an EXPLAIN.
An ORM is also quite unlikely to pay attention to a warning, much less a
postmaster log entry ... so your argument seems unfounded from here.
        regards, tom lane
			
		On Mon, Dec 03, 2012 at 11:37:17AM +0100, Dimitri Fontaine wrote: > Noah Misch <noah@leadboat.com> writes: > > Acquiring the lock could still take an unpredictable amount of time. > > I think there's a new GUC brewing about setting the lock timeout > separately from the statement timeout, right? Yes.
On 12/02/2012 03:07 AM, Noah Misch wrote: > On Sat, Dec 01, 2012 at 07:34:51PM +0100, Andres Freund wrote: >> On 2012-12-01 18:27:08 +0000, Simon Riggs wrote: >>> On 1 December 2012 16:38, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Simon Riggs <simon@2ndQuadrant.com> writes: >>>>> It's hard to know whether your tables will be locked for long periods >>>>> when implementing DDL changes. >>>>> The NOREWRITE option would cause an ERROR if the table would be >>>>> rewritten by the command. >>>>> This would allow testing to highlight long running statements before >>>>> code hits production. >>>> I'm not thrilled about inventing YA keyword for this. If you have a >>>> problem with that sort of scenario, why aren't you testing your DDL >>>> on a test server before you do it on production? >>> That's the point. You run it on a test server first, and you can >>> conclusively see that it will/will not run for a long time on >>> production server. > Acquiring the lock could still take an unpredictable amount of time. > >>> Greg Sabine Mullane wrote an interesting blog about a way of solving >>> the problem in userspace. > I currently recommend using the DEBUG1 messages for this purpose: > > [local] test=# set client_min_messages = debug1; > SET > [local] test=# create table t (c int8 primary key, c1 text); > DEBUG: building index "pg_toast_109381_index" on table "pg_toast_109381" > DEBUG: CREATE TABLE / PRIMARY KEY will create implicit index "t_pkey" for table "t" > DEBUG: building index "t_pkey" on table "t" > CREATE TABLE > [local] test=# alter table t alter c type int4; > DEBUG: building index "pg_toast_109391_index" on table "pg_toast_109391" > DEBUG: rewriting table "t" > DEBUG: building index "t_pkey" on table "t" > ALTER TABLE > [local] test=# alter table t alter c type oid; > DEBUG: building index "t_pkey" on table "t" > ALTER TABLE > > Observe that some changes rewrite the table and all indexes, while others skip > rewriting the table but rebuild one or more indexes. I've threatened to > optimize type changes like (base type) -> (domain with CHECK constraint) by > merely scanning the table for violations. If we do add syntax such as you > have proposed, I recommend using a different name and defining it to reject > any operation with complexity O(n) or worse relative to table size. That > being said, I share Tom's doubts. The DEBUG1 messages are a sorry excuse for > a UI, but I'm not seeing a clear improvement in NOREWRITE. > >>>> Or even more to the point, you can always cancel the statement once >>>> you realize it's taking too long. >>> Which means you have to watch it, which is not always possible. > There's statement_timeout. I think the point was in catching the rewrite behaviour in testing. for statement_timeout to kick in you may need to have both production size and production load which is not always easy to achieve in testing. >> My first thought is to add more detailed EXPLAIN support for >> DDL... Although that unfortunately broadens the scope of this a tiny >> bit. > That would be ideal. > It would also be nice to add a "dry run" support, which switches to EXPLAIN for all SQL instead of executing. SET explain_only TO TRUE; ---------------- Hannu
>> However, for a very large user group -- users with ORMs which do their >> own DDL migrations -- we could also use a way to log or WARN for table >> rewrites. Since the ORMs are not gonna do an EXPLAIN. > > An ORM is also quite unlikely to pay attention to a warning, much less a > postmaster log entry ... so your argument seems unfounded from here. But the DevOps staff *is*. There's the workflow: 1. developer writes migrations in Rails/Django/Hibernate 2. DevOps staff tests migrations on test machine. 3. DevOps staff looks at logs for rewrites. The problem with an EXPLAIN path is that you're requiring the developers to extract the DDL generated by Rails or whatever from the migration, something to which the framework is not very friendly. So we need a path for identifying rewrites which doesn't require extracting DDL in SQL form. EXCEPT: I just realized, if we have "explain ALTER" then we can just log explains, no? In which case, problem solved. Come to think of it, it would be good to warn about ACCESS EXCLUSIVE locks as well. That's another big booby trap for developers. Note that "writing stuff to the log" is far from an ideal solution for this user base. I think they'd rather have "ERROR on REWRITE". It would be good to have one of the Heroku folks speak up here ... -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
>>> However, for a very large user group -- users with ORMs which do their
>>> own DDL migrations -- we could also use a way to log or WARN for table
>>> rewrites.  Since the ORMs are not gonna do an EXPLAIN.
>> An ORM is also quite unlikely to pay attention to a warning, much less a
>> postmaster log entry ... so your argument seems unfounded from here.
> But the DevOps staff *is*.
Sure, and the DevOps staff would be using the EXPLAIN feature (if we had
it).  After which they could do little anyway except complain to the ORM
authors, who might or might not give a damn.  I don't see that there's
enough value-added from what you suggest to justify the development
time.
        regards, tom lane
			
		On 4 December 2012 22:30, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Josh Berkus <josh@agliodbs.com> writes: >>>> However, for a very large user group -- users with ORMs which do their >>>> own DDL migrations -- we could also use a way to log or WARN for table >>>> rewrites. Since the ORMs are not gonna do an EXPLAIN. > >>> An ORM is also quite unlikely to pay attention to a warning, much less a >>> postmaster log entry ... so your argument seems unfounded from here. > >> But the DevOps staff *is*. > > Sure, and the DevOps staff would be using the EXPLAIN feature (if we had > it). After which they could do little anyway except complain to the ORM > authors, who might or might not give a damn. I don't see that there's > enough value-added from what you suggest to justify the development > time. I've never had my POLA violated, so its hard to comment on so many buzzphrases. But normal people I've worked with do struggle with knowing for certain what will happen when they run a DDL change script on a production system, which is the heart of the problem. Other software lags behind the services we provide, but if we never provide it they definitely will never use it. Eyeballs work well, but something automated else would work even better. Adding EXPLAIN in front of everything doesn't work at all because many actions depend upon the results of earlier actions, so the test must actually perform the action in order for the whole script to work. Plus, if we have to edit a script to add EXPLAIN in front of everything, you can't put it live without editing out the EXPLAINs, which leaves you open to the failure caused by editing immediately prior to go live. So my NOREWRITE option fails those criteria and should be discarded. On reflection, what is needed is a way to ask "what just happened" when a script is tested. The best way to do that is a special stats collection mode that can be enabled just for that run and then the stats analyzed afterwards. Noah's contribution comes closest to what we need, but its not complete enough, yet. I'll have a play with this idea some more. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
> Sure, and the DevOps staff would be using the EXPLAIN feature (if we had > it). After which they could do little anyway except complain to the ORM > authors, who might or might not give a damn. I don't see that there's > enough value-added from what you suggest to justify the development > time. You're still thinking of a schema change as a SQL script. ORM-based applications usually do not run their schema changes as SQL scripts, thus there's nothing to EXPLAIN. Anything which assumes the presense of a distict, user-accessible SQL script is going to leave out a large class of our users. However, as I said, if we had the EXPLAIN ALTER, we could use auto-explain to log the ALTER plans (finally, a good use for auto-explain). So that's a workable workaround. And EXPLAIN ALTER would offer us more flexibility than any logging option, of course. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 5 December 2012 00:16, Josh Berkus <josh@agliodbs.com> wrote: > >> Sure, and the DevOps staff would be using the EXPLAIN feature (if we had >> it). After which they could do little anyway except complain to the ORM >> authors, who might or might not give a damn. I don't see that there's >> enough value-added from what you suggest to justify the development >> time. > > You're still thinking of a schema change as a SQL script. ORM-based > applications usually do not run their schema changes as SQL scripts, > thus there's nothing to EXPLAIN. Anything which assumes the presense of > a distict, user-accessible SQL script is going to leave out a large > class of our users. And anything which assumes the *absence* of a manual script is also leaving out a large class of users. ORMs are very important, but not the only thing we serve. Please assume that script meant a set of SQL statements that are executed in a specific sequence to change a database model from one version to another. Anything which requires editing of all (or worse, just some) of the SQL statements is not a good solution. For ORMs, this requires each ORM to make its own change to support that functionality and to have a separate mode where it is used. For manual scripts, this requires specific editing, which fails, as already described. Either way EXPLAIN is bad, since editing/separate modes can introduce bugs. I think we need a parameter called schema_change_reporting = off (default) | on [USERSET] which displays relevant statistics/reports about the actions taken by DDL statements. That will also highlight locks and the need to reduce their lock levels. That's best used as a function to turn it on and then a function to produce the report. > However, as I said, if we had the EXPLAIN ALTER, we could use > auto-explain to log the ALTER plans (finally, a good use for > auto-explain). So that's a workable workaround. And EXPLAIN ALTER would > offer us more flexibility than any logging option, of course. Auto explain executes things twice, which is not possible for DDL, so it won't work. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/5/2012 1:42 AM, Simon Riggs wrote: > I think we need a parameter called > > schema_change_reporting = off (default) | on [USERSET] > > which displays relevant statistics/reports about the actions taken by > DDL statements. That will also highlight locks and the need to reduce > their lock levels. where does this get displayed? is it just tossed into the postgres log files?
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > owner@postgresql.org] On Behalf Of Josh Berkus > Sent: 05 December 2012 01:17 > To: Tom Lane > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] ALTER TABLE ... NOREWRITE option > > However, as I said, if we had the EXPLAIN ALTER, we could use auto-explain > to log the ALTER plans (finally, a good use for auto-explain). So that's a > workable workaround. And EXPLAIN ALTER would offer us more flexibility > than any logging option, of course. > +1 And for the minority who wants to check themselves (like me) in their tooling this is also usable solution. Regards Petr Jelinek
On 5 December 2012 09:46, John R Pierce <pierce@hogranch.com> wrote: > On 12/5/2012 1:42 AM, Simon Riggs wrote: >> >> I think we need a parameter called >> >> schema_change_reporting = off (default) | on [USERSET] >> >> which displays relevant statistics/reports about the actions taken by >> DDL statements. That will also highlight locks and the need to reduce >> their lock levels. > > > where does this get displayed? is it just tossed into the postgres log > files? Good question. Where would you like? It needs to be per-session rather than global. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
John R Pierce <pierce@hogranch.com> writes:
> On 12/5/2012 1:42 AM, Simon Riggs wrote:
>> I think we need a parameter called
>> 
>> schema_change_reporting = off (default) | on   [USERSET]
>> 
>> which displays relevant statistics/reports about the actions taken by
>> DDL statements. That will also highlight locks and the need to reduce
>> their lock levels.
> where does this get displayed?   is it just tossed into the postgres log 
> files?
And perhaps more to the point, what's the advantage compared to
good old "log_statement = ddl"?
        regards, tom lane
			
		Simon Riggs <simon@2ndQuadrant.com> writes: > On 5 December 2012 09:46, John R Pierce <pierce@hogranch.com> wrote: >> On 12/5/2012 1:42 AM, Simon Riggs wrote: >>> >>> I think we need a parameter called >>> >>> schema_change_reporting = off (default) | on [USERSET] >>> >>> which displays relevant statistics/reports about the actions taken by >>> DDL statements. That will also highlight locks and the need to reduce >>> their lock levels. >> >> >> where does this get displayed? is it just tossed into the postgres log >> files? > > Good question. Where would you like? What about pg_stat_ddl, a new system's view? It would maybe need to have some xid/cid like ordering to be able to reproduce the script. It could also maybe use the ddl_rewrite module I'm working on for the event triggers framework, as far as "normalized SQL" goes. The rewrite information would be a boolean column in that view, I guess. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On 5 December 2012 15:01, Tom Lane <tgl@sss.pgh.pa.us> wrote: > John R Pierce <pierce@hogranch.com> writes: >> On 12/5/2012 1:42 AM, Simon Riggs wrote: >>> I think we need a parameter called >>> >>> schema_change_reporting = off (default) | on [USERSET] >>> >>> which displays relevant statistics/reports about the actions taken by >>> DDL statements. That will also highlight locks and the need to reduce >>> their lock levels. > >> where does this get displayed? is it just tossed into the postgres log >> files? > > And perhaps more to the point, what's the advantage compared to > good old "log_statement = ddl"? That logs DDL statements for the whole system and isn't user settable. It wouldn't be useful to extend that, since the user wouldn't be able to enable/disable and the stats would get dumped in the server log. Need something more user specific. Ideas * pg_stat_ddl (from Dimitri) which would be a temp table containing results * Stream of NOTICE statements back to client.... that seems easier * err... -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 5 December 2012 15:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> And perhaps more to the point, what's the advantage compared to
>> good old "log_statement = ddl"?
> That logs DDL statements for the whole system and isn't user settable.
Not true; you can set it with ALTER ROLE/DATABASE, and a superuser can
adjust it for his own session.
> * pg_stat_ddl (from Dimitri) which would be a temp table containing results
> * Stream of NOTICE statements back to client.... that seems easier
> * err...
And an ORM is going to do what with either, pray tell?  None of these
are of any use except with an interactive session; in which the user
could perfectly well use EXPLAIN, anyway.
        regards, tom lane
			
		Simon, > And anything which assumes the *absence* of a manual script is also > leaving out a large class of users. ORMs are very important, but not > the only thing we serve. Yes. In the long run, we'll probably need two solutions. An interactive EXPLAIN, and something which logs or aborts for the ORM folks. > Please assume that script meant a set of SQL statements that are > executed in a specific sequence to change a database model from one > version to another. Anything which requires editing of all (or worse, > just some) of the SQL statements is not a good solution. For ORMs, > this requires each ORM to make its own change to support that > functionality and to have a separate mode where it is used. Exactly. And only the ORMs which are very close to PostgreSQL would be willing to do this. Most would not. > I think we need a parameter called > > schema_change_reporting = off (default) | on [USERSET] The problem with anything which reports back to the session is that even when DBAs are running SQL scripts, migrations are seldom run in an interactive session. For example, I manage all migrations for large projects using Python and YAML files, and SQLitch uses Perl and JSON wrappers for the SQL. Doing migrations via "psql -f filename -q" is also very common. So anything reported back in an interactive session would be lost. That's why we need a mechanism which either logs, or aborts on specific actions. From the perspective of the DevOps staff, abort is possibly the better option, but there may be issues with it on our end. That was the attraction of the original NOREWRITE patch, although as I said that suffers from new keywords and a total lack of extensibility. What about adding something like: ddl_action = [ none, log, warn, abort ] ddl_events = [ all, rewrite, exclusive, access_exclusive ] I realize I'm getting out into the weeds here, but I'm thinking "as a contract DBA, what would *really* help me?" and something like the above would do it. This would allow me to do something like: "I wanna test this Rails migration, and have it die if it tries to do a full table rewrite or take an access_exclusive lock. And I'll check the logs afterwards if it blows up." ddl_action = 'log,abort' ddl_events = 'rewrite,access_exclusive' This would make it very easy to set some rules for the organization, and enforce them with automated testing. > Auto explain executes things twice, which is not possible for DDL, so > it won't work. I keep trying to find a use for auto-explain. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, Dec 5, 2012 at 1:41 PM, Josh Berkus <josh@agliodbs.com> wrote: > That's why we need a mechanism which either logs, or aborts on specific > actions. From the perspective of the DevOps staff, abort is possibly > the better option, but there may be issues with it on our end. That was > the attraction of the original NOREWRITE patch, although as I said that > suffers from new keywords and a total lack of extensibility. You know, event triggers seem like an awfully good solution to this problem. All we'd need is a new event called table_rewrite: CREATE EVENT TRIGGER my_event_trigger ON table_rewrite EXECUTE PROCEDURE consider_whether_to_throw_an_error(); -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> You know, event triggers seem like an awfully good solution to this > problem. All we'd need is a new event called table_rewrite: > > CREATE EVENT TRIGGER my_event_trigger > ON table_rewrite > EXECUTE PROCEDURE consider_whether_to_throw_an_error(); Oh, wow, that would be perfect. That also solves the problem of "I don't care if I rewrite really_small_table, but I do care if I rewrite really_big_table". -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 5 December 2012 19:15, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Dec 5, 2012 at 1:41 PM, Josh Berkus <josh@agliodbs.com> wrote: >> That's why we need a mechanism which either logs, or aborts on specific >> actions. From the perspective of the DevOps staff, abort is possibly >> the better option, but there may be issues with it on our end. That was >> the attraction of the original NOREWRITE patch, although as I said that >> suffers from new keywords and a total lack of extensibility. > > You know, event triggers seem like an awfully good solution to this > problem. All we'd need is a new event called table_rewrite: > > CREATE EVENT TRIGGER my_event_trigger > ON table_rewrite > EXECUTE PROCEDURE consider_whether_to_throw_an_error(); +1, I was just thinking that myself. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs <simon@2ndQuadrant.com> writes: >> CREATE EVENT TRIGGER my_event_trigger >> ON table_rewrite >> EXECUTE PROCEDURE consider_whether_to_throw_an_error(); > > +1, I was just thinking that myself. +1, and I think that can happen for 9.3, as soon as we agree on the list of code points where we want that event to fire. ALTER TABLE variants that are rewriting the heap, sure. CLUSTER? VACUUM FULL? TRUNCATE? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On 5 December 2012 22:21, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: >>> CREATE EVENT TRIGGER my_event_trigger >>> ON table_rewrite >>> EXECUTE PROCEDURE consider_whether_to_throw_an_error(); >> >> +1, I was just thinking that myself. > > +1, and I think that can happen for 9.3, as soon as we agree on the list > of code points where we want that event to fire. ALTER TABLE variants > that are rewriting the heap, sure. CLUSTER? VACUUM FULL? TRUNCATE? Events needed * Table rewrite * Index rebuild * Relation scan (index/table/toast etc) * AccessExclusiveLock -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 5 December 2012 22:21, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>>> CREATE EVENT TRIGGER my_event_trigger
>>>>     ON table_rewrite
>>>>     EXECUTE PROCEDURE consider_whether_to_throw_an_error();
>>>
>>> +1, I was just thinking that myself.
>>
>> +1, and I think that can happen for 9.3, as soon as we agree on the list
>> of code points where we want that event to fire. ALTER TABLE variants
>> that are rewriting the heap, sure. CLUSTER? VACUUM FULL? TRUNCATE?
>
> Events needed
> * Table rewrite
> * Index rebuild
> * Relation scan (index/table/toast etc)
> * AccessExclusiveLock
For each of those events we need to find the exact code location from
where to call the registered user defined function, if any. I would like
us to at least devise which commands are going to fire the events here.
 Table Rewrite:  ALTER TABLE, CLUSTER, VACUUM… ? Index Rebuild:  ALTER TABLE, REINDEX, CLUSTER, VACUUM FULL… ?
 Relation scan:  SELECT, ALTER TABLE … ADD CHECK …, etc
                 maybe targeting index/seq scan from the executor code                 directly would be enough in that
case?I'm not sure I                 can call into src/backend/commands/event_trigger.c                 from anywhere in
theexecutor though, I need advice 
AccessExclusiveLock on a relation when taken by *any* command? before
the lock is taken I suppose…
Regards,
--
Dimitri Fontaine                                        06 63 07 10 78
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
			
		On 2012-12-05 22:41:21 +0000, Simon Riggs wrote: > On 5 December 2012 22:21, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > > Simon Riggs <simon@2ndQuadrant.com> writes: > >>> CREATE EVENT TRIGGER my_event_trigger > >>> ON table_rewrite > >>> EXECUTE PROCEDURE consider_whether_to_throw_an_error(); > >> > >> +1, I was just thinking that myself. > > > > +1, and I think that can happen for 9.3, as soon as we agree on the list > > of code points where we want that event to fire. ALTER TABLE variants > > that are rewriting the heap, sure. CLUSTER? VACUUM FULL? TRUNCATE? > > Events needed > * Table rewrite > * Index rebuild Those should be fairly easy. > * Relation scan (index/table/toast etc) > * AccessExclusiveLock I am worried about the overhead of looking for triggers for those two though. Especially for RelationScans. Greetings, Andres Freund --Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Dec 5, 2012 at 5:56 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2012-12-05 22:41:21 +0000, Simon Riggs wrote: >> On 5 December 2012 22:21, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >> > Simon Riggs <simon@2ndQuadrant.com> writes: >> >>> CREATE EVENT TRIGGER my_event_trigger >> >>> ON table_rewrite >> >>> EXECUTE PROCEDURE consider_whether_to_throw_an_error(); >> >> >> >> +1, I was just thinking that myself. >> > >> > +1, and I think that can happen for 9.3, as soon as we agree on the list >> > of code points where we want that event to fire. ALTER TABLE variants >> > that are rewriting the heap, sure. CLUSTER? VACUUM FULL? TRUNCATE? >> >> Events needed >> * Table rewrite >> * Index rebuild > > Those should be fairly easy. > >> * Relation scan (index/table/toast etc) >> * AccessExclusiveLock > > I am worried about the overhead of looking for triggers for those two > though. Especially for RelationScans. Yep. The other thing we have to consider very carefully is which of these locations are safe places to run arbitrary code. In some cases, refactoring may be needed. I suspect that, even for table_rewrite, it's not gonna be safe to inject that at the place where the table rewrite actually happens. At that point, we've already done things like CheckTableNotInUse(); but the trigger could break that by opening a cursor that references the table and leaving it open. If we're rewriting multiple tables, is it really safe to fire a trigger after the first one has been rewritten and before the second one gets rewritten? Maybe, but I've got my doubts. Similarly, if you want to have an event trigger for an index rebuild, we'll probably have to figure out earlier than we currently do whether or not an index build is going to be required. I think we currently defer that decision until after we've rewritten the table, and I suspect that's going to be too late to safely fire a trigger. So while I certainly think this is doable, I strongly suggest that we keep our initial goals modest. Adding the new event trigger is a piece of cake. Making sure that it doesn't break anything is not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 5 December 2012 22:49, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: >> On 5 December 2012 22:21, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >>> Simon Riggs <simon@2ndQuadrant.com> writes: >>>>> CREATE EVENT TRIGGER my_event_trigger >>>>> ON table_rewrite >>>>> EXECUTE PROCEDURE consider_whether_to_throw_an_error(); >>>> >>>> +1, I was just thinking that myself. >>> >>> +1, and I think that can happen for 9.3, as soon as we agree on the list >>> of code points where we want that event to fire. ALTER TABLE variants >>> that are rewriting the heap, sure. CLUSTER? VACUUM FULL? TRUNCATE? >> >> Events needed >> * Table rewrite >> * Index rebuild >> * Relation scan (index/table/toast etc) >> * AccessExclusiveLock > > For each of those events we need to find the exact code location from > where to call the registered user defined function, if any. I would like > us to at least devise which commands are going to fire the events here. > > Table Rewrite: ALTER TABLE, CLUSTER, VACUUM… ? > Index Rebuild: ALTER TABLE, REINDEX, CLUSTER, VACUUM FULL… ? yes please > Relation scan: SELECT, ALTER TABLE … ADD CHECK …, etc > > maybe targeting index/seq scan from the executor code > directly would be enough in that case? I'm not sure I > can call into src/backend/commands/event_trigger.c > from anywhere in the executor though, I need advice Not SELECT - DDL only - for things like CREATE INDEX, so IndexBuildScan() IIRC > AccessExclusiveLock on a relation when taken by *any* command? before > the lock is taken I suppose… At end of Lock Acquire, same place we already WAL-log the action. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 5 December 2012 23:34, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Dec 5, 2012 at 5:56 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> On 2012-12-05 22:41:21 +0000, Simon Riggs wrote: >>> On 5 December 2012 22:21, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >>> > Simon Riggs <simon@2ndQuadrant.com> writes: >>> >>> CREATE EVENT TRIGGER my_event_trigger >>> >>> ON table_rewrite >>> >>> EXECUTE PROCEDURE consider_whether_to_throw_an_error(); >>> >> >>> >> +1, I was just thinking that myself. >>> > >>> > +1, and I think that can happen for 9.3, as soon as we agree on the list >>> > of code points where we want that event to fire. ALTER TABLE variants >>> > that are rewriting the heap, sure. CLUSTER? VACUUM FULL? TRUNCATE? >>> >>> Events needed >>> * Table rewrite >>> * Index rebuild >> >> Those should be fairly easy. >> >>> * Relation scan (index/table/toast etc) >>> * AccessExclusiveLock >> >> I am worried about the overhead of looking for triggers for those two >> though. Especially for RelationScans. > > Yep. The other thing we have to consider very carefully is which of > these locations are safe places to run arbitrary code. In some cases, > refactoring may be needed. I suspect that, even for table_rewrite, > it's not gonna be safe to inject that at the place where the table > rewrite actually happens. At that point, we've already done things > like CheckTableNotInUse(); but the trigger could break that by opening > a cursor that references the table and leaving it open. If we're > rewriting multiple tables, is it really safe to fire a trigger after > the first one has been rewritten and before the second one gets > rewritten? Maybe, but I've got my doubts. > > Similarly, if you want to have an event trigger for an index rebuild, > we'll probably have to figure out earlier than we currently do whether > or not an index build is going to be required. I think we currently > defer that decision until after we've rewritten the table, and I > suspect that's going to be too late to safely fire a trigger. > > So while I certainly think this is doable, I strongly suggest that we > keep our initial goals modest. Adding the new event trigger is a > piece of cake. Making sure that it doesn't break anything is not. Yes, but it is also the trigger writers problem. If they keep their goals modest, they'll work. If I understand this, Dimitri is planning to include a ready-made trigger, so this will both test and show how to use them, as well as being a useful application. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Dec 5, 2012 at 6:45 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > Yes, but it is also the trigger writers problem. Maybe to some degree. I don't think that a server crash or something like a block-read error is ever tolerable though, no matter how silly the user is with their event trigger logic. If we go down that road it will be impossible to know whether errors that are currently reliable indicators of software or hardware problems are in fact caused by event triggers. Of course, if an event trigger causes the system to error out in some softer way, that's perfectly fine... > If I understand this, Dimitri is planning to include a ready-made > trigger, so this will both test and show how to use them, as well as > being a useful application. Cool. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 6 December 2012 00:46, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Dec 5, 2012 at 6:45 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Yes, but it is also the trigger writers problem. > > Maybe to some degree. I don't think that a server crash or something > like a block-read error is ever tolerable though, no matter how silly > the user is with their event trigger logic. If we go down that road > it will be impossible to know whether errors that are currently > reliable indicators of software or hardware problems are in fact > caused by event triggers. Of course, if an event trigger causes the > system to error out in some softer way, that's perfectly fine... How are event triggers more dangerous than normal triggers/functions? -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2012-12-06 18:21:09 +0000, Simon Riggs wrote: > On 6 December 2012 00:46, Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Dec 5, 2012 at 6:45 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > >> Yes, but it is also the trigger writers problem. > > > > Maybe to some degree. I don't think that a server crash or something > > like a block-read error is ever tolerable though, no matter how silly > > the user is with their event trigger logic. If we go down that road > > it will be impossible to know whether errors that are currently > > reliable indicators of software or hardware problems are in fact > > caused by event triggers. Of course, if an event trigger causes the > > system to error out in some softer way, that's perfectly fine... > > How are event triggers more dangerous than normal triggers/functions? Normal triggers aren't run when the catalog is in an in-between state because they aren't run while catalog modifications are taking place. Consider a trigger running before CREATE INDEX CONCURRENTLY (which relies on being the first thing to do database access in a transaction) that does database access. Or a trigger running during a table rewrite that inserts into the intermediary table (pg_rewrite_xxx or whatever they are named). That possibly would lead to a crash because the pg_class entry of that table are suddently gone. Greetings, Andres Freund --Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 6 December 2012 18:31, Andres Freund <andres@2ndquadrant.com> wrote: > On 2012-12-06 18:21:09 +0000, Simon Riggs wrote: >> On 6 December 2012 00:46, Robert Haas <robertmhaas@gmail.com> wrote: >> > On Wed, Dec 5, 2012 at 6:45 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> >> Yes, but it is also the trigger writers problem. >> > >> > Maybe to some degree. I don't think that a server crash or something >> > like a block-read error is ever tolerable though, no matter how silly >> > the user is with their event trigger logic. If we go down that road >> > it will be impossible to know whether errors that are currently >> > reliable indicators of software or hardware problems are in fact >> > caused by event triggers. Of course, if an event trigger causes the >> > system to error out in some softer way, that's perfectly fine... >> >> How are event triggers more dangerous than normal triggers/functions? > > Normal triggers aren't run when the catalog is in an in-between state > because they aren't run while catalog modifications are taking place. "in-between state" means what? And what danger do you see?If its just "someone might write bad code" that horse has already bolted - functions, triggers, executor hooks, operators, indexes etc I don't see any difference between an event trigger and these statements... BEGIN; ALTER TABLE x ...; SELECT somefunction(); ALTER TABLE y ...; COMMIT; -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2012-12-06 18:42:22 +0000, Simon Riggs wrote: > On 6 December 2012 18:31, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2012-12-06 18:21:09 +0000, Simon Riggs wrote: > >> On 6 December 2012 00:46, Robert Haas <robertmhaas@gmail.com> wrote: > >> > On Wed, Dec 5, 2012 at 6:45 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > >> >> Yes, but it is also the trigger writers problem. > >> > > >> > Maybe to some degree. I don't think that a server crash or something > >> > like a block-read error is ever tolerable though, no matter how silly > >> > the user is with their event trigger logic. If we go down that road > >> > it will be impossible to know whether errors that are currently > >> > reliable indicators of software or hardware problems are in fact > >> > caused by event triggers. Of course, if an event trigger causes the > >> > system to error out in some softer way, that's perfectly fine... > >> > >> How are event triggers more dangerous than normal triggers/functions? > > > > Normal triggers aren't run when the catalog is in an in-between state > > because they aren't run while catalog modifications are taking place. > > "in-between state" means what? And what danger do you see? For example during table rewrites we have a temporary pg_class entry thats a full copy of the table, with a separate oid, relfilenode and everything. That gets dropped rather unceremonially, without the usual safety checks. If the user did anything referencing that table we would possibly have a corrupt catalog or even a segfault at our hands. For normal triggers the code takes quite some care to avoid such dangers. > If its just "someone might write bad code" that horse has already > bolted - functions, triggers, executor hooks, operators, indexes etc Not sure what you mean by that. Those don't get called in situation where they don't have a reliable work-environment. > I don't see any difference between an event trigger and these statements... > > BEGIN; > ALTER TABLE x ...; > SELECT somefunction(); > ALTER TABLE y ...; > COMMIT; Event triggers get called *during* the ALTER TABLE. So if were not careful they see something thats not easy to handle. I am for example not sure what would happen if we had a "rewrite" event trigger which inserts a log entry into a logtable. Not a stupid idea, right? Now imagine we had a deferred unique key on that logtable and the logtable is the one that gets rewritten... Greetings, Andres Freund --Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > owner@postgresql.org] On Behalf Of Andres Freund > Sent: 06 December 2012 20:04 > To: Simon Riggs > Cc: Robert Haas; Dimitri Fontaine; Josh Berkus; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] ALTER TABLE ... NOREWRITE option > > I don't see any difference between an event trigger and these > statements... > > > > BEGIN; > > ALTER TABLE x ...; > > SELECT somefunction(); > > ALTER TABLE y ...; > > COMMIT; > > Event triggers get called *during* the ALTER TABLE. So if were not careful > they see something thats not easy to handle. > I thought the point of this was to call the trigger *before* anything happens. Regards Petr Jelinek
On 2012-12-06 20:27:33 +0100, Petr Jelinek wrote: > > -----Original Message----- > > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > > owner@postgresql.org] On Behalf Of Andres Freund > > Sent: 06 December 2012 20:04 > > To: Simon Riggs > > Cc: Robert Haas; Dimitri Fontaine; Josh Berkus; > pgsql-hackers@postgresql.org > > Subject: Re: [HACKERS] ALTER TABLE ... NOREWRITE option > > > I don't see any difference between an event trigger and these > > statements... > > > > > > BEGIN; > > > ALTER TABLE x ...; > > > SELECT somefunction(); > > > ALTER TABLE y ...; > > > COMMIT; > > > > Event triggers get called *during* the ALTER TABLE. So if were not careful > > they see something thats not easy to handle. > > > > I thought the point of this was to call the trigger *before* anything > happens. Just because the rewrite hasn't started yet, doesn't mean nothing else has been changed. Note, I am not saying this is impossible or anything, the original point drawn into question was that we need to be especially careful with choosing callsites and thats its not trivial to do right. Greetings, Andres Freund --Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > owner@postgresql.org] On Behalf Of Andres Freund > Sent: 06 December 2012 20:44 > To: Petr Jelinek > Cc: 'Simon Riggs'; 'Robert Haas'; 'Dimitri Fontaine'; 'Josh Berkus'; pgsql- > hackers@postgresql.org > Subject: Re: [HACKERS] ALTER TABLE ... NOREWRITE option > > > Event triggers get called *during* the ALTER TABLE. So if were not > > > careful they see something thats not easy to handle. > > > > > > > I thought the point of this was to call the trigger *before* anything > > happens. > > Just because the rewrite hasn't started yet, doesn't mean nothing else has > been changed. > > Note, I am not saying this is impossible or anything, the original point drawn > into question was that we need to be especially careful with choosing > callsites and thats its not trivial to do right. > Ok my assumption is that the event would be fired before ALTER actually did anything, firing triggers while DDL is actually already being executed seems like bad idea. Regards Petr Jelinek
Andres Freund <andres@2ndquadrant.com> writes: > On 2012-12-06 18:42:22 +0000, Simon Riggs wrote: >> "in-between state" means what? And what danger do you see? > > For example during table rewrites we have a temporary pg_class entry > thats a full copy of the table, with a separate oid, relfilenode and > everything. That gets dropped rather unceremonially, without the usual > safety checks. If the user did anything referencing that table we would > possibly have a corrupt catalog or even a segfault at our hands. > > For normal triggers the code takes quite some care to avoid such > dangers. I think we need to be solving that problem when we implement new firing points for event trigger. The 'table_rewrite' event needs to fire at a time when the code can cope with it. That's the main difficulty in adding events in that system, asserting their code location safety. > Event triggers get called *during* the ALTER TABLE. So if were not > careful they see something thats not easy to handle. They need to fire before catalogs are modified, or after, not in between, I agree with that. I don't see other ways of implementing that than carefully placing the call to user code in the backend's code. > I am for example not sure what would happen if we had a "rewrite" event > trigger which inserts a log entry into a logtable. Not a stupid idea, > right? > Now imagine we had a deferred unique key on that logtable and the > logtable is the one that gets rewritten... The log insert needs to happen either before or after the rewrite, in terms of catalog state. I don't see any magic bullet here. -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Dec 6, 2012 at 3:34 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > I think we need to be solving that problem when we implement new firing > points for event trigger. The 'table_rewrite' event needs to fire at a > time when the code can cope with it. That's the main difficulty in > adding events in that system, asserting their code location safety. Agreed. > They need to fire before catalogs are modified, or after, not in > between, I agree with that. I don't see other ways of implementing that > than carefully placing the call to user code in the backend's code. Also agreed. > The log insert needs to happen either before or after the rewrite, in > terms of catalog state. I don't see any magic bullet here. And again agreed. I think in this case we need to work out before actually doing anything whether a rewrite will occur, and then remember that decision. If the decision is yes, then we call the trigger. After calling the trigger, we need to revalidate that it hasn't invalidated any critical assumptions upon which we're relying for the sanity of the system (e.g. the table hasn't been altered or dropped). Assuming all is well, we then proceed to do the actual operation, basing the decision as to whether or not to rewrite on the remembered state. I think ALTER TABLE actually has a lot of this machinery already in the form of a distinction between "prep" and "exec". However, there are some cases where the prep work has been folded into the execute phase (or maybe visca versa). So there may be some code cleanup that is needed there, or we may need to move some things from the exec phase to the prep phase to make it all work out. I think it's totally doable, but it's not going to be a 50-line patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > is needed there, or we may need to move some things from the exec > phase to the prep phase to make it all work out. I think it's totally > doable, but it's not going to be a 50-line patch. If you want to work on it, please be my guest :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support