Discussion: Repeatable crash in pg_dump (with -d2 info)


Repeatable crash in pg_dump (with -d2 info)

From
David Schnur
Date:
I'm seeing pg_dump [8.3.15 Windows] crash reproducibly against a particular database.  Searching the web, I found [http://grokbase.com/t/postgresql.org/pgsql-general/2001/02/pg-dump-crash/06ss55h5l35jh4bnnqfigxisy534] with a response from Tom Lane suggesting that it was probably due to a bug in pg_dump related to dumping functions.

That post was made more than 10 years ago, but I figured it was worth a try, and set my pg_dump to run with PGOPTIONS="-d2".  Sure enough, the crash is preceded by the following lines:

SELECT proretset, prosrc, probin, proallargtypes, proargmodes, proargnames, provolatile, proisstrict, prosecdef, proconfig, procost, prorows, (SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) as lanname FROM pg_catalog.pg_proc WHERE oid = '16730'::pg_catalog.oid
SELECT pg_catalog.format_type('1185'::pg_catalog.oid, NULL)
SELECT pg_catalog.format_type('1022'::pg_catalog.oid, NULL)
SELECT pg_catalog.format_type('1184'::pg_catalog.oid, NULL)
SELECT pg_catalog.format_type('701'::pg_catalog.oid, NULL)
SELECT pg_catalog.format_type('1185'::pg_catalog.oid, NULL)
SELECT pg_catalog.format_type('1022'::pg_catalog.oid, NULL)
SELECT pg_catalog.format_type('2249'::pg_catalog.oid, NULL)

The OIDs 1185, 1022, 1184 and 701 are present in proallargtypes, so pg_dump appears to be looking up argument types for that function.  I couldn't see 2249 anywhere, though, so I ran this:

SELECT * FROM pg_catalog.pg_proc WHERE oid = '16730'

And 2249 showed up in prorettype.  The result of that last format_type query is 'record', which makes sense since the function RETURNS SETOF RECORD.

That original post turned out to be some temporary trigger function that the user simply deleted to solve the problem.  This function, however, is legitimate, and I can't simply delete it.  Does anyone know what might be happening here, and how to work around it?  Thanks!

Re: Repeatable crash in pg_dump (with -d2 info)

From
Tom Lane
Date:
David Schnur <dnschnur@gmail.com> writes:
> I'm seeing pg_dump [8.3.15 Windows] crash reproducibly against a particular
> database.

Can you provide a stack trace from the crash?  There's some info about
that here:
http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
though I think it's oriented towards backend crashes more than client-side
crashes.

Alternatively, do you have the SQL definition for the function you think
is causing the crash?  There's not much that anybody else is going to be
able to do about this unless you can provide a reproducible test case.

            regards, tom lane

Re: Repeatable crash in pg_dump (with -d2 info)

From
David Schnur
Date:
Sure; the function is created programmatically as part of schema creation, by the same user who owns (almost) everything else in the database.  The definition looks like this:

        CREATE OR REPLACE FUNCTION datastore_unpack(
            data_times TIMESTAMP WITH TIME ZONE[],
            data_values DOUBLE PRECISION[],
            OUT data_time TIMESTAMP WITH TIME ZONE,
            OUT data_value DOUBLE PRECISION
        ) RETURNS SETOF RECORD AS $$
            SELECT $1[rowx] AS data_time, $2[rowx] AS data_value
            FROM generate_series(1, array_upper($1, 1)) AS rowx;
        $$ LANGUAGE SQL STABLE;

It takes two parallel arrays and unpacks them into a set of records.
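In case the intent isn't obvious from the SQL, the unpacking the function performs is roughly equivalent to zipping the two arrays. Here is a Python sketch of that logic (the sample timestamps are made up for illustration; this is not the SQL function itself):

```python
from datetime import datetime, timezone

def datastore_unpack(data_times, data_values):
    # Pair element i of the first array with element i of the second,
    # just as the SQL function does via generate_series over the indexes.
    return list(zip(data_times, data_values))

times = [datetime(2011, 11, 28, tzinfo=timezone.utc),
         datetime(2011, 11, 29, tzinfo=timezone.utc)]
values = [1.5, 2.5]

rows = datastore_unpack(times, values)
# Each row is a (data_time, data_value) pair, one per array index.
```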

Re: Repeatable crash in pg_dump (with -d2 info)

From
Tom Lane
Date:
David Schnur <dnschnur@gmail.com> writes:
> Sure; the function is created programmatically as part of schema creation,
> by the same user who owns (almost) everything else in the database.  The
> definition looks like this:

>         CREATE OR REPLACE FUNCTION datastore_unpack(
>             data_times TIMESTAMP WITH TIME ZONE[],
>             data_values DOUBLE PRECISION[],
>             OUT data_time TIMESTAMP WITH TIME ZONE,
>             OUT data_value DOUBLE PRECISION
>         ) RETURNS SETOF RECORD AS $$
>             SELECT $1[rowx] AS data_time, $2[rowx] AS data_value
>             FROM generate_series(1, array_upper($1, 1)) AS rowx;
>         $$ LANGUAGE SQL STABLE;

No crash observed here.  So either we fixed it since 8.3.15 (doesn't
seem likely, as I see nothing potentially related in the commit logs),
or the trigger for the problem is not (just) this function.  It seems
entirely plausible that the crash is associated with the
next-to-be-dumped object rather than this one, anyway.

How about that stack trace?

Alternatively, if you have an idea of the next object it might be
working on (most likely, the next function in alphabetical order)
you might see if loading just that function into an empty database
is enough to create a crash.

It's also entirely plausible that the crash requires multiple objects
to trigger it (for instance it might have something to do with
inter-object dependencies).  So if you're not eager to burrow into
stack traces, you might see if you can strip your standard schema
down to something you could post that still triggers the crash
starting from an empty database.

            regards, tom lane

Re: Repeatable crash in pg_dump (with -d2 info)

From
David Schnur
Date:
I probably can't get a stack trace, but I was able to reproduce it with just that function.  Without the function, pg_dump works fine.  I can DROP the function, pg_dump works, then add it back again and pg_dump crashes.  Here are my steps:

initdb -A md5 --no-locale -E UTF8 -U testuser -D "C:\Users\David\Desktop\testdb" -W

pg_ctl -U testuser -D "C:\Users\David\Desktop\testdb" start

CREATE DATABASE testdb OWNER testuser;

CREATE OR REPLACE FUNCTION datastore_unpack(
    data_times TIMESTAMP WITH TIME ZONE[],
    data_values DOUBLE PRECISION[],
    OUT data_time TIMESTAMP WITH TIME ZONE,
    OUT data_value DOUBLE PRECISION
) RETURNS SETOF RECORD AS $$
    SELECT $1[rowx] AS data_time, $2[rowx] AS data_value
    FROM generate_series(1, array_upper($1, 1)) AS rowx;
$$ LANGUAGE SQL STABLE;

pg_dump -v -F c -x -O -f "C:\Users\David\Desktop\dumptest" -U testuser testdb

Here's the output from pg_dump:

pg_dump: reading schemas
pg_dump: reading user-defined functions
pg_dump: reading user-defined types
pg_dump: reading procedural languages
pg_dump: reading user-defined aggregate functions
pg_dump: reading user-defined operators
pg_dump: reading user-defined operator classes
pg_dump: reading user-defined text search parsers
pg_dump: reading user-defined text search templates
pg_dump: reading user-defined text search dictionaries
pg_dump: reading user-defined text search configurations
pg_dump: reading user-defined operator families
pg_dump: reading user-defined conversions
pg_dump: reading user-defined tables
pg_dump: reading table inheritance information
pg_dump: reading rewrite rules
pg_dump: reading type casts
pg_dump: finding inheritance relationships
pg_dump: reading column info for interesting tables
pg_dump: flagging inherited columns in subtables
pg_dump: reading indexes
pg_dump: reading constraints
pg_dump: reading triggers
pg_dump: reading dependency data
pg_dump: saving encoding = UTF8
pg_dump: saving standard_conforming_strings = off
pg_dump: saving database definition

And here's the output from the postmaster:

LOG:  statement: SET DATESTYLE = ISO
LOG:  statement: SET extra_float_digits TO 2
LOG:  statement: SET synchronize_seqscans TO off
LOG:  statement: BEGIN
LOG:  statement: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
LOG:  statement: SET search_path = pg_catalog
LOG:  statement: SELECT tableoid, oid, nspname, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = nspowner) as rolname, nspacl FROM pg_namespace
LOG:  statement: SELECT tableoid, oid, proname, prolang, pronargs, proargtypes, prorettype, proacl, pronamespace,(SELECT rolname FROM pg_catalog.pg_roles WHERE oid = proowner) as rolname FROM pg_proc WHERE NOT proisagg AND pronamespace != (select oid from pg_namespace where nspname = 'pg_catalog')
LOG:  statement: SELECT tableoid, oid, typname, typnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = typowner) as rolname, typinput::oid as typinput, typoutput::oid as typoutput, typelem, typrelid, CASE WHEN typrelid = 0 THEN ' '::"char" ELSE (SELECT relkind FROM pg_class WHERE oid = typrelid) END as typrelkind, typtype, typisdefined, typname[0] = '_' AND typelem != 0 AND (SELECT typarray FROM pg_type te WHERE oid = pg_type.typelem) = oid AS isarray FROM pg_type
LOG:  statement: SELECT tableoid, oid, lanname, lanpltrusted, lanplcallfoid, lanvalidator,  lanacl, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = lanowner) as lanowner FROM pg_language WHERE lanispl ORDER BY oid
LOG:  statement: SELECT tableoid, oid, proname as aggname, pronamespace as aggnamespace, pronargs, proargtypes, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = proowner) as rolname, proacl as aggacl FROM pg_proc WHERE proisagg AND pronamespace != (select oid from pg_namespace where nspname = 'pg_catalog')
LOG:  statement: SELECT tableoid, oid, oprname, oprnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = oprowner) as rolname, oprcode::oid as oprcode FROM pg_operator
LOG:  statement: SELECT tableoid, oid, opcname, opcnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = opcowner) as rolname FROM pg_opclass
LOG:  statement: SELECT tableoid, oid, prsname, prsnamespace, prsstart::oid, prstoken::oid, prsend::oid, prsheadline::oid, prslextype::oid FROM pg_ts_parser
LOG:  statement: SELECT tableoid, oid, tmplname, tmplnamespace, tmplinit::oid, tmpllexize::oid FROM pg_ts_template
LOG:  statement: SELECT tableoid, oid, dictname, dictnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = dictowner) as rolname, dicttemplate, dictinitoption FROM pg_ts_dict
LOG:  statement: SELECT tableoid, oid, cfgname, cfgnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = cfgowner) as rolname, cfgparser FROM pg_ts_config
LOG:  statement: SELECT tableoid, oid, opfname, opfnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = opfowner) as rolname FROM pg_opfamily
LOG:  statement: SELECT tableoid, oid, conname, connamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = conowner) as rolname FROM pg_conversion
LOG:  statement: SELECT c.tableoid, c.oid, relname, relacl, relkind, relnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = relowner) as rolname, relchecks, reltriggers, relhasindex, relhasrules, relhasoids, d.refobjid as owning_tab, d.refobjsubid as owning_col, (SELECT spcname FROM pg_tablespace t WHERE t.oid = c.reltablespace) AS reltablespace, array_to_string(c.reloptions, ', ') as reloptions from pg_class c left join pg_depend d on (c.relkind = 'S' and d.classid = c.tableoid and d.objid = c.oid and d.objsubid = 0 and d.refclassid = c.tableoid and d.deptype = 'a') where relkind in ('r', 'S', 'v', 'c') order by c.oid
LOG:  statement: SELECT inhrelid, inhparent from pg_inherits
LOG:  statement: SELECT tableoid, oid, rulename, ev_class as ruletable, ev_type, is_instead, ev_enabled FROM pg_rewrite ORDER BY oid
LOG:  statement: SELECT tableoid, oid, castsource, casttarget, castfunc, castcontext FROM pg_cast ORDER BY 3,4
LOG:  statement: SELECT loid FROM pg_largeobject LIMIT 1
LOG:  statement: SELECT classid, objid, refclassid, refobjid, deptype FROM pg_depend WHERE deptype != 'p' ORDER BY 1,2
LOG:  statement: SELECT tableoid, oid, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = datdba) as dba, pg_encoding_to_char(encoding) as encoding, (SELECT spcname FROM pg_tablespace t WHERE t.oid = dattablespace) as tablespace, shobj_description(oid, 'pg_database') as description FROM pg_database WHERE datname = 'testdb'
LOG:  statement: SELECT description, classoid, objoid, objsubid FROM pg_catalog.pg_description ORDER BY classoid, objoid, objsubid
LOG:  statement: SET search_path = public, pg_catalog
LOG:  statement: SELECT proretset, prosrc, probin, proallargtypes, proargmodes, proargnames, provolatile, proisstrict, prosecdef, proconfig, procost, prorows, (SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) as lanname FROM pg_catalog.pg_proc WHERE oid = '16386'::pg_catalog.oid
LOG:  statement: SELECT pg_catalog.format_type('1185'::pg_catalog.oid, NULL)
LOG:  statement: SELECT pg_catalog.format_type('1022'::pg_catalog.oid, NULL)
LOG:  statement: SELECT pg_catalog.format_type('1184'::pg_catalog.oid, NULL)
LOG:  statement: SELECT pg_catalog.format_type('701'::pg_catalog.oid, NULL)
LOG:  statement: SELECT pg_catalog.format_type('1185'::pg_catalog.oid, NULL)
LOG:  statement: SELECT pg_catalog.format_type('1022'::pg_catalog.oid, NULL)
LOG:  statement: SELECT pg_catalog.format_type('2249'::pg_catalog.oid, NULL)
LOG:  could not receive data from client: No connection could be made because the target machine actively refused it.

LOG:  unexpected EOF on client connection

Re: Repeatable crash in pg_dump (with -d2 info)

From
Tom Lane
Date:
David Schnur <dnschnur@gmail.com> writes:
> I probably can't get a stack trace, but I was able to reproduce it with
> just that function.  Without the function, pg_dump works fine.  I can DROP
> the function, pg_dump works, then add it back again and pg_dump crashes.

Hmph.  I still can't reproduce this here, which seems to mean one of
three things:
1. We fixed this since 8.3.15 (still doesn't seem likely).
2. It's specific to Windows.
3. It's specific to the pg_dump build you are using.

I wonder whether anyone else can get this to happen on the strength of
David's description?

>  Here are my steps:

> initdb -A md5 --no-locale -E UTF8 -U testuser -D
> "C:\Users\David\Desktop\testdb" -W

> pg_ctl -U testuser -D "C:\Users\David\Desktop\testdb" start

> CREATE DATABASE testdb OWNER testuser;

> CREATE OR REPLACE FUNCTION datastore_unpack(
>     data_times TIMESTAMP WITH TIME ZONE[],
>     data_values DOUBLE PRECISION[],
>     OUT data_time TIMESTAMP WITH TIME ZONE,
>     OUT data_value DOUBLE PRECISION
> ) RETURNS SETOF RECORD AS $$
>     SELECT $1[rowx] AS data_time, $2[rowx] AS data_value
>     FROM generate_series(1, array_upper($1, 1)) AS rowx;
> $$ LANGUAGE SQL STABLE;

> pg_dump -v -F c -x -O -f "C:\Users\David\Desktop\dumptest" -U testuser
> testdb

> Here's the output from pg_dump:

> pg_dump: reading schemas
> pg_dump: reading user-defined functions
> pg_dump: reading user-defined types
> pg_dump: reading procedural languages
> pg_dump: reading user-defined aggregate functions
> pg_dump: reading user-defined operators
> pg_dump: reading user-defined operator classes
> pg_dump: reading user-defined text search parsers
> pg_dump: reading user-defined text search templates
> pg_dump: reading user-defined text search dictionaries
> pg_dump: reading user-defined text search configurations
> pg_dump: reading user-defined operator families
> pg_dump: reading user-defined conversions
> pg_dump: reading user-defined tables
> pg_dump: reading table inheritance information
> pg_dump: reading rewrite rules
> pg_dump: reading type casts
> pg_dump: finding inheritance relationships
> pg_dump: reading column info for interesting tables
> pg_dump: flagging inherited columns in subtables
> pg_dump: reading indexes
> pg_dump: reading constraints
> pg_dump: reading triggers
> pg_dump: reading dependency data
> pg_dump: saving encoding = UTF8
> pg_dump: saving standard_conforming_strings = off
> pg_dump: saving database definition

> And here's the output from the postmaster:

> LOG:  statement: SET DATESTYLE = ISO
> LOG:  statement: SET extra_float_digits TO 2
> LOG:  statement: SET synchronize_seqscans TO off
> LOG:  statement: BEGIN
> LOG:  statement: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
> LOG:  statement: SET search_path = pg_catalog
> LOG:  statement: SELECT tableoid, oid, nspname, (SELECT rolname FROM
> pg_catalog.pg_roles WHERE oid = nspowner) as rolname, nspacl FROM
> pg_namespace
> LOG:  statement: SELECT tableoid, oid, proname, prolang, pronargs,
> proargtypes, prorettype, proacl, pronamespace,(SELECT rolname FROM
> pg_catalog.pg_roles WHERE oid = proowner) as rolname FROM pg_proc WHERE NOT
> proisagg AND pronamespace != (select oid from pg_namespace where nspname =
> 'pg_catalog')
> LOG:  statement: SELECT tableoid, oid, typname, typnamespace, (SELECT
> rolname FROM pg_catalog.pg_roles WHERE oid = typowner) as rolname,
> typinput::oid as typinput, typoutput::oid as typoutput, typelem, typrelid,
> CASE WHEN typrelid = 0 THEN ' '::"char" ELSE (SELECT relkind FROM pg_class
> WHERE oid = typrelid) END as typrelkind, typtype, typisdefined, typname[0]
> = '_' AND typelem != 0 AND (SELECT typarray FROM pg_type te WHERE oid =
> pg_type.typelem) = oid AS isarray FROM pg_type
> LOG:  statement: SELECT tableoid, oid, lanname, lanpltrusted,
> lanplcallfoid, lanvalidator,  lanacl, (SELECT rolname FROM
> pg_catalog.pg_roles WHERE oid = lanowner) as lanowner FROM pg_language
> WHERE lanispl ORDER BY oid
> LOG:  statement: SELECT tableoid, oid, proname as aggname, pronamespace as
> aggnamespace, pronargs, proargtypes, (SELECT rolname FROM
> pg_catalog.pg_roles WHERE oid = proowner) as rolname, proacl as aggacl FROM
> pg_proc WHERE proisagg AND pronamespace != (select oid from pg_namespace
> where nspname = 'pg_catalog')
> LOG:  statement: SELECT tableoid, oid, oprname, oprnamespace, (SELECT
> rolname FROM pg_catalog.pg_roles WHERE oid = oprowner) as rolname,
> oprcode::oid as oprcode FROM pg_operator
> LOG:  statement: SELECT tableoid, oid, opcname, opcnamespace, (SELECT
> rolname FROM pg_catalog.pg_roles WHERE oid = opcowner) as rolname FROM
> pg_opclass
> LOG:  statement: SELECT tableoid, oid, prsname, prsnamespace,
> prsstart::oid, prstoken::oid, prsend::oid, prsheadline::oid,
> prslextype::oid FROM pg_ts_parser
> LOG:  statement: SELECT tableoid, oid, tmplname, tmplnamespace,
> tmplinit::oid, tmpllexize::oid FROM pg_ts_template
> LOG:  statement: SELECT tableoid, oid, dictname, dictnamespace, (SELECT
> rolname FROM pg_catalog.pg_roles WHERE oid = dictowner) as rolname,
> dicttemplate, dictinitoption FROM pg_ts_dict
> LOG:  statement: SELECT tableoid, oid, cfgname, cfgnamespace, (SELECT
> rolname FROM pg_catalog.pg_roles WHERE oid = cfgowner) as rolname,
> cfgparser FROM pg_ts_config
> LOG:  statement: SELECT tableoid, oid, opfname, opfnamespace, (SELECT
> rolname FROM pg_catalog.pg_roles WHERE oid = opfowner) as rolname FROM
> pg_opfamily
> LOG:  statement: SELECT tableoid, oid, conname, connamespace, (SELECT
> rolname FROM pg_catalog.pg_roles WHERE oid = conowner) as rolname FROM
> pg_conversion
> LOG:  statement: SELECT c.tableoid, c.oid, relname, relacl, relkind,
> relnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid =
> relowner) as rolname, relchecks, reltriggers, relhasindex, relhasrules,
> relhasoids, d.refobjid as owning_tab, d.refobjsubid as owning_col, (SELECT
> spcname FROM pg_tablespace t WHERE t.oid = c.reltablespace) AS
> reltablespace, array_to_string(c.reloptions, ', ') as reloptions from
> pg_class c left join pg_depend d on (c.relkind = 'S' and d.classid =
> c.tableoid and d.objid = c.oid and d.objsubid = 0 and d.refclassid =
> c.tableoid and d.deptype = 'a') where relkind in ('r', 'S', 'v', 'c') order
> by c.oid
> LOG:  statement: SELECT inhrelid, inhparent from pg_inherits
> LOG:  statement: SELECT tableoid, oid, rulename, ev_class as ruletable,
> ev_type, is_instead, ev_enabled FROM pg_rewrite ORDER BY oid
> LOG:  statement: SELECT tableoid, oid, castsource, casttarget, castfunc,
> castcontext FROM pg_cast ORDER BY 3,4
> LOG:  statement: SELECT loid FROM pg_largeobject LIMIT 1
> LOG:  statement: SELECT classid, objid, refclassid, refobjid, deptype FROM
> pg_depend WHERE deptype != 'p' ORDER BY 1,2
> LOG:  statement: SELECT tableoid, oid, (SELECT rolname FROM
> pg_catalog.pg_roles WHERE oid = datdba) as dba,
> pg_encoding_to_char(encoding) as encoding, (SELECT spcname FROM
> pg_tablespace t WHERE t.oid = dattablespace) as tablespace,
> shobj_description(oid, 'pg_database') as description FROM pg_database WHERE
> datname = 'testdb'
> LOG:  statement: SELECT description, classoid, objoid, objsubid FROM
> pg_catalog.pg_description ORDER BY classoid, objoid, objsubid
> LOG:  statement: SET search_path = public, pg_catalog
> LOG:  statement: SELECT proretset, prosrc, probin, proallargtypes,
> proargmodes, proargnames, provolatile, proisstrict, prosecdef, proconfig,
> procost, prorows, (SELECT lanname FROM pg_catalog.pg_language WHERE oid =
> prolang) as lanname FROM pg_catalog.pg_proc WHERE oid =
> '16386'::pg_catalog.oid
> LOG:  statement: SELECT pg_catalog.format_type('1185'::pg_catalog.oid, NULL)
> LOG:  statement: SELECT pg_catalog.format_type('1022'::pg_catalog.oid, NULL)
> LOG:  statement: SELECT pg_catalog.format_type('1184'::pg_catalog.oid, NULL)
> LOG:  statement: SELECT pg_catalog.format_type('701'::pg_catalog.oid, NULL)
> LOG:  statement: SELECT pg_catalog.format_type('1185'::pg_catalog.oid, NULL)
> LOG:  statement: SELECT pg_catalog.format_type('1022'::pg_catalog.oid, NULL)
> LOG:  statement: SELECT pg_catalog.format_type('2249'::pg_catalog.oid, NULL)
> LOG:  could not receive data from client: No connection could be made
> because the target machine actively refused it.

> LOG:  unexpected EOF on client connection

            regards, tom lane

Re: Repeatable crash in pg_dump (with -d2 info)

From
Craig Ringer
Date:
On 11/29/2011 06:20 AM, David Schnur wrote:
> I probably can't get a stack trace

Getting a usable stack trace on Windows isn't actually too hard. See:


http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows

That's focused on debugging the `postgres.exe' backend, but all the
setup applies just as well to pg_dump. You then start pg_dump via
windbg.exe or Visual Studio Express and, when it crashes, follow the
instructions given in the wiki to produce the backtrace.

--
Craig Ringer

Re: Repeatable crash in pg_dump (with -d2 info)

From
David Schnur
Date:
On Mon, Nov 28, 2011 at 9:48 PM, Craig Ringer <ringerc@ringerc.id.au> wrote:
> Getting a usable stack trace on Windows isn't actually too hard.

The problem isn't getting the trace - I know how to do that - it's that I don't have the PDBs for this build, and so the trace wouldn't be very useful.  I may be able to get or recreate them, but can't right now.  I am going to try this on Linux a little later today, though, to see if I can narrow it down in that regard.

transaction error handling

From
Kasia Tuszynska
Date:
Hi Everybody,

This is an architectural question.
I am testing on Postgres 9.0.2 on Windows and Linux (SUSE, RHEL, Ubuntu).

I want to make sure that I have the correct understanding of the Postgres architecture and would like to enquire if there are any plans to change it.

Comparing Oracle and Postgres from the perspective of error handling on the transaction level I observed the following:

Oracle:
Begin transaction
Insert - no error
Implicit savepoint
Insert - error raised
Implicit rollback to the savepoint, no transaction loss, error raised on the insert statement that errored out.
End transaction, implicit commit, with the single error free insert.

Postgres:
Begin transaction
Insert - no error
Insert - error raised
Transaction loss = no implicit rollback to the single error free insert.

Is this a correct interpretation of the Postgres transaction error handling?
If so, are there any changes being considered, or perhaps already implemented?

Sincerely,
Kasia

Re: transaction error handling

From
"Kevin Grittner"
Date:
Kasia Tuszynska <ktuszynska@esri.com> wrote:

> Oracle:
> Begin transaction
> Insert - no error
> Implicit savepoint
> Insert - error raised
> Implicit rollback to the savepoint, no transaction loss, error
> raised on the insert statement that errored out.
> End transaction, implicit commit, with the single error free
> insert.
>
> Postgres:
> Begin transaction
> Insert - no error
> Insert - error raised
> Transaction loss = no implicit rollback to the single error free
> insert.
>
> Is this a correct interpretation of the Postgres transaction error
> handling?

Well, in psql you can set ON_ERROR_ROLLBACK so that each statement
will be automatically preceded by a SAVEPOINT which will be
automatically rolled back if the statement has an error.  There are
various constructs for accomplishing this in supported PLs,
depending on the language.
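To make the difference concrete, here is a toy Python model of the two behaviours: by default an error aborts the whole transaction, whereas wrapping each statement in a savepoint (which is what psql's ON_ERROR_ROLLBACK automates) preserves the earlier work. This is only a sketch of the semantics, not database code:

```python
class Transaction:
    """Toy model of statement-level error behaviour in a transaction.

    use_savepoints=False: an error aborts the whole transaction
    (PostgreSQL's default behaviour).
    use_savepoints=True: each statement is preceded by an implicit
    savepoint that is rolled back on error, as ON_ERROR_ROLLBACK does.
    """
    def __init__(self, use_savepoints=False):
        self.rows = []
        self.aborted = False
        self.use_savepoints = use_savepoints

    def insert(self, row, fails=False):
        if self.aborted:
            raise RuntimeError("current transaction is aborted")
        snapshot = list(self.rows)      # the implicit SAVEPOINT
        try:
            self.rows.append(row)
            if fails:
                raise ValueError("insert error")
        except ValueError:
            if self.use_savepoints:
                self.rows = snapshot    # ROLLBACK TO SAVEPOINT
            else:
                self.aborted = True     # the whole transaction is lost

    def commit(self):
        # An aborted transaction commits nothing.
        return [] if self.aborted else self.rows
```

With the default settings the first, error-free insert is lost once the second insert fails; with per-statement savepoints it survives to commit.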

I'm not aware of any "explicitly start a transaction but guess at
whether a commit is intended" feature in PostgreSQL.  An explicit
transaction is committed if and when you say so.

-Kevin

Re: transaction error handling

From
Kasia Tuszynska
Date:
Hi Kevin,
Thank you, that is very helpful.
I am not worried about the implicit commits. The "no implicit savepoint" was more of an issue, since it created a necessity to create and destroy savepoints for each SQL statement to capture any statement-level error without losing a transaction; that approach has prohibitive performance repercussions.
I will check out the ON_ERROR_ROLLBACK feature.
Thank you,
Sincerely,
Kasia

-----Original Message-----
From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
Sent: Tuesday, November 29, 2011 10:55 AM
To: Kasia Tuszynska; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] transaction error handling

Kasia Tuszynska <ktuszynska@esri.com> wrote:

> Oracle:
> Begin transaction
> Insert - no error
> Implicit savepoint
> Insert - error raised
> Implicit rollback to the savepoint, no transaction loss, error
> raised on the insert statement that errored out.
> End transaction, implicit commit, with the single error free
> insert.
>
> Postgres:
> Begin transaction
> Insert - no error
> Insert - error raised
> Transaction loss = no implicit rollback to the single error free
> insert.
>
> Is this a correct interpretation of the Postgres transaction error
> handling?

Well, in psql you can set ON_ERROR_ROLLBACK so that each statement
will be automatically preceded by a SAVEPOINT which will be
automatically rolled back if the statement has an error.  There are
various constructs for accomplishing this in supported PLs,
depending on the language.

I'm not aware of any "explicitly start a transaction but guess at
whether a commit is intended" feature in PostgreSQL.  An explicit
transaction is committed if and when you say so.

-Kevin


Re: transaction error handling

From
Scott Marlowe
Date:
On Tue, Nov 29, 2011 at 10:57 AM, Kasia Tuszynska <ktuszynska@esri.com> wrote:
> Postgres:
> Begin transaction
> Insert - no error
> Insert - error raised
> Transaction loss = no implicit rollback to the single error free insert.
>
> Is this a correct interpretation of the Postgres transaction error handling?
> If so, are there any changes being considered, or perhaps already implemented?

You can insert a savepoint explicitly if you wish, but without setting
one, then the whole transaction will be rolled back.

Re: transaction error handling

From
Walter Hurry
Date:
On Tue, 29 Nov 2011 09:57:24 -0800, Kasia Tuszynska wrote:

> Hi Everybody,
>
> This is an architectural question.
> I am testing on Postgres 9.0.2 on windows and linux(suse, rhel, ubuntu)
>
> I want to make sure that I have the correct understanding of the
> Postgres architecture and would like to enquire if there are any plans
> to change it.
>
> Comparing Oracle and Postgres from the perspective of error handling on
> the transaction level I observed the following:
>
> Oracle:
> Begin transaction Insert - no error Implicit savepoint Insert - error
> raised Implicit rollback to the savepoint, no transaction loss, error
> raised on the insert statement that errored out.
> End transaction, implicit commit, with the single error free insert.
>
> Postgres:
> Begin transaction Insert - no error Insert - error raised Transaction
> loss = no implicit rollback to the single error free insert.
>
> Is this a correct interpretation of the Postgres transaction error
> handling?
> If so, are there any changes being considered, or perhaps already
> implemented?

I suspect you may be barking up the wrong tree. Comparing default
behaviour of PSQL to SQL*Plus is not the same thing as comparing
PostgreSQL to Oracle.


Re: transaction error handling

From
Kasia Tuszynska
Date:
Hi,
Yes, I believe that you are right.

As far as I can gather, the Postgres transaction error handling is like Oracle stored procedures: if you do not catch the error, the whole transaction is rolled back. I am curious why Postgres has gone with a model that does not allow the user a choice to deal with the statement-level errors that may arise in a long transaction.

That either calls for very short transactions or an introduction of explicit savepoint creation and explicit savepoint destruction for every statement, if you, the user, want the ability to deal with statement errors that may arise.

I realize that it is almost impossible to change that architecture now, since it would be such a low-level change, but I am surprised that it is not a common complaint from the user community, since bulk DDL loads would truly suffer.

I do not wish to compare Postgres to Oracle per se; I used Oracle because I am more familiar with it than the SQL Server transaction model, which was rewritten for SQL Server 2005 and which I never fully got into.

Sincerely,
Kasia

-----Original Message-----
From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Walter Hurry
Sent: Tuesday, November 29, 2011 12:50 PM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] transaction error handling

On Tue, 29 Nov 2011 09:57:24 -0800, Kasia Tuszynska wrote:

> Hi Everybody,
> 
> This is an architectural question.
> I am testing on Postgres 9.0.2 on windows and linux(suse, rhel, ubuntu)
> 
> I want to make sure that I have the correct understanding of the
> Postgres architecture and would like to enquire if there are any plans
> to change it.
> 
> Comparing Oracle and Postgres from the perspective of error handling on
> the transaction level I observed the following:
> 
> Oracle:
> Begin transaction Insert - no error Implicit savepoint Insert - error
> raised Implicit rollback to the savepoint, no transaction loss, error
> raised on the insert statement that errored out.
> End transaction, implicit commit, with the single error free insert.
> 
> Postgres:
> Begin transaction Insert - no error Insert - error raised Transaction
> loss = no implicit rollback to the single error free insert.
> 
> Is this a correct interpretation of the Postgres transaction error
> handling?
> If so, are there any changes being considered, or perhaps already
> implemented?

I suspect you may be barking up the wrong tree. Comparing default 
behaviour of PSQL to SQL*Plus is not the same thing as comparing 
PostgreSQL to Oracle.



-- 
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: transaction error handling

From
"Nicholson, Brad (Toronto, ON, CA)"
Date:
> -----Original Message-----
> From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-
> owner@postgresql.org] On Behalf Of Kasia Tuszynska
> Sent: Tuesday, November 29, 2011 3:35 PM
> To: Kevin Grittner; pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] transaction error handling
>
> Hi Kevin,
> Thank you, that is very helpful.
> I am not worried about the implicit commits. The "no implicit
> savepoint" was more of an issue, since it created a necessity to create
> and destroy savepoints per each sql statement to capture any statement
> level error without losing a transaction, that approach has prohibitive
> performance repercussions.
> I will check out the ON_ERROR_ROLLBACK feature.
> Thank you,
> Sincerely,
> Kasia


Be aware that this option is a psql option and not one in the database itself, which means unless you are executing your SQL via psql it will not be of help to you.

Also, the implementation of this is that psql issues implicit savepoints for you before each command in a transaction and handles the rollback for you if needed (which sounds an awful lot like the performance concern you have).

This is a major pain for porting Oracle based applications over for those that rely on this functionality.
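For what it's worth, turning the feature on is just a psql variable; a sketch against a hypothetical table t (ON_ERROR_ROLLBACK accepts on, off, or interactive):

```sql
-- in a psql session (or in ~/.psqlrc):
\set ON_ERROR_ROLLBACK on
BEGIN;
INSERT INTO t VALUES (1);        -- ok
INSERT INTO t VALUES ('oops');   -- error reported, but psql rolls back only
                                 -- to the implicit savepoint it took beforehand
INSERT INTO t VALUES (2);        -- still accepted
COMMIT;                          -- commits rows 1 and 2
```

With "interactive", psql takes the implicit savepoints only at the interactive prompt, not when running scripts.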

Brad.


Re: transaction error handling

От
Rob Richardson
Дата:
Very naïve question here:  Why would you want to save the data from the first insert?

I thought the purpose of a transaction was to make sure that all steps in the transaction executed, or none of them
executed. If Oracle saves half of the data between the beginning and ending of the transaction, doesn't that defeat
the purpose of the transaction?

RobR

Re: transaction error handling

От
Rural Hunter
Дата:
I have the same confusion...

于 2011/11/30 2:34, Rob Richardson 写道:
> Very naïve question here:  Why would you want to save the data from the first insert?
>
> I thought the purpose of a transaction was to make sure that all steps in the transaction executed, or none of them
> executed. If Oracle saves half of the data between the beginning and ending of the transaction, doesn't that defeat
> the purpose of the transaction?
>
> RobR
>


Re: transaction error handling

От
Bèrto ëd Sèra
Дата:
On 29 November 2011 21:34, Rob Richardson <RDRichardson@rad-con.com> wrote:
If Oracle saves half of the data  between the beginning and ending of the transaction, doesn't that defeat the purpose of the transaction?
 
It sure enough kills Atomicity. I can see a use for this when importing data from external sources that may violate existing unique keys, so illegal inserts are ignored, but you are still left without any knowledge of which rows were silently dropped. Since when has Oracle been doing this, FMI? (It's been a long while since I used it for anything serious)

Bèrto

--
==============================
If Pac-Man had affected us as kids, we'd all be running around in a darkened room munching pills and listening to repetitive music.

Re: transaction error handling

От
"Nicholson, Brad (Toronto, ON, CA)"
Дата:
> -----Original Message-----
> From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-
> owner@postgresql.org] On Behalf Of Rob Richardson
> Sent: Tuesday, November 29, 2011 1:35 PM
> To: pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] transaction error handling
>
> Very naïve question here:  Why would you want to save the data from the
> first insert?

You might want your code to recover from an error and take a different approach.

> I thought the purpose of a transaction was to make sure that all steps
> in the transaction executed, or none of them executed.  If Oracle saves
> half of the data  between the beginning and ending of the transaction,
> doesn't that defeat the purpose of the transaction?
>

This functionality is something that Postgres can do today.  We expose the ability to do this with explicit savepoints.
The difference is that Oracle allows you to set it on a per-transaction basis (I believe) and it will behave this way
for all statements in the transaction, whereas we need to do it explicitly.  Looking through the archives there does
seem to be a performance problem on commit in Postgres if you issue a lot of savepoints (there were discussions of a fix
but I am not sure of the status of this).

Brad.

Re: transaction error handling

От
Kasia Tuszynska
Дата:

Hi,

Oracle does not save the data, mid-transaction, in the sense of a commit.

It keeps the association of the memory addresses related to the error-free changes in the transaction, and allows you, the developer, to capture the error on that single incorrect change and then continue with the subsequent SQL statements that are part of that long transaction.

While in that state, the changes pertaining to that transaction are not written to any logs and are not written to file; you can still roll back the entire transaction.

Only when a commit occurs does the transaction get written to the SGA, archive logs, files, etc.

With Postgres that is not the case: if the 50th SQL statement in a long transaction incurs an error, the whole transaction is rolled back for you automatically. You, the developer, have no say in that unless you bracket each statement with a savepoint creation and destruction, just to be able to capture the potential error that may arise on that 50th statement.
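The bracketing described above looks roughly like this per statement; a sketch against a hypothetical table t (note that ROLLBACK TO SAVEPOINT is accepted even while the transaction is in the aborted state):

```sql
BEGIN;
SAVEPOINT sp;
INSERT INTO t VALUES ('oops');   -- the statement that may fail
ROLLBACK TO SAVEPOINT sp;        -- on error: discard only that statement
RELEASE SAVEPOINT sp;            -- on success: drop the savepoint
-- ... repeat the SAVEPOINT / statement / RELEASE pattern per statement ...
COMMIT;                          -- the error-free work survives
```

It is this per-statement SAVEPOINT/RELEASE overhead, multiplied across a long transaction, that raises the performance concern.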


Sincerely,

Kasia

From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Bèrto ëd Sèra
Sent: Wednesday, November 30, 2011 12:49 AM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] transaction error handling


On 29 November 2011 21:34, Rob Richardson <RDRichardson@rad-con.com> wrote:

If Oracle saves half of the data  between the beginning and ending of the transaction, doesn't that defeat the purpose of the transaction?


It sure enough kills Atomicity. I can see a use for this when importing data from external sources that may violate existing unique keys, so illegal inserts are ignored, but you are still left without any knowledge of which rows were silently dropped. Since when has Oracle been doing this, FMI? (It's been a long while since I used it for anything serious)


Bèrto


--
==============================
If Pac-Man had affected us as kids, we'd all be running around in a darkened room munching pills and listening to repetitive music.

Re: transaction error handling

От
Craig Ringer
Дата:
On 11/30/2011 09:19 PM, Nicholson, Brad (Toronto, ON, CA) wrote:
> This functionality is something that Postgres can do today. We expose
> the ability to do this with explicit savepoints. The difference is
> that Oracle allows you to set it on a per transaction basis (I
> believe) and it will behave this way for all statements in the
> transaction, where as we need to do it explicitly. Looking through the
> archives there does seem to be a performance problem on commit in
> Postgres if you issue a lot of savepoints (there were discussions of a
> fix but I am not sure the status of this).
Savepoint performance has had several improvements over time. Back in
8.1 when I got started developing against Pg seriously I was having
*MASSIVE* performance issues with PL/PgSQL exception blocks (which use
savepoints) in loops; these days it's perfect.

To make automatic savepoints viable, Pg would need to be able to
completely forget a savepoint once it's been released, so there's no
ongoing cost. That way a transaction would only need two savepoints at
any particular point in time. My understanding is that it's not there
yet; AFAIK released savepoints still have a non-trivial cost that would
add up if someone was using automatic savepoints in (say) a 10,000
INSERT transaction.
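The PL/pgSQL exception blocks mentioned above each establish one of these subtransactions implicitly; a minimal sketch, assuming a table t with a unique constraint (the anonymous DO block requires 9.0+; on 8.x the same body would go in a function):

```sql
DO $$
BEGIN
  FOR i IN 1..10000 LOOP
    BEGIN                              -- implicit savepoint taken here
      INSERT INTO t VALUES (i);
    EXCEPTION WHEN unique_violation THEN
      NULL;                            -- failed row skipped; savepoint rolled back
    END;
  END LOOP;
END $$;
```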

--
Craig Ringer

Re: transaction error handling

От
Bèrto ëd Sèra
Дата:
Hi

On 30 November 2011 22:30, Kasia Tuszynska <ktuszynska@esri.com> wrote:

With Postgres that is not the case, if the 50th sql statement in a long transaction incurs an error, the whole transaction is rolled back for you automatically, you the developer have no say in that unless you bracket each statement with a savepoint creation and destruction, just to be able to capture the potential error that may arise on that 50th sql statement.

 
Thanks, now I finally got what you meant. 

Bèrto


Re: Repeatable crash in pg_dump (with -d2 info)

От
David Schnur
Дата:
I finally had time to test this further on a variety of systems, and was unable to reproduce on any non-Windows platform.  The dump even works fine on Windows XP; just not Windows 7.

This prompted me to do a little more research, and this time I found this thread from Sept. 2011:
http://postgresql.1045698.n5.nabble.com/BUG-6233-pg-dump-hangs-with-Access-Violation-C0000005-td4851598.html


From Tom Lane in the above thread:

Hmm.  I can see how that would happen if you're using one of the Windows 
environments wherein malloc's done inside libpq have to be free'd inside 
libpq.  (The PQExpBuffer support code is in libpq...) 
 
However, the flaw in that explanation is that it would basically mean 
pg_dump doesn't work at all on Windows, at least not if you have any 
user-defined functions, and probably some other cases too because there 
seem to be multiple instances of the dubious coding.  It's a bit hard to 
believe that nobody's noticed that before. 

This appears to describe exactly the issue I'm encountering, and my build is in fact linked against the static runtime.  I guess the reason this hasn't come up sooner is because most Windows users either use the 'official' binaries rather than compiling from source, or link against the dynamic runtime.

Is this something I could expect to be fixed in the near future, or is it enough of an edge case that I should come up with some solution or work-around on my own?  Thanks,

David


On Mon, Nov 28, 2011 at 8:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hmph.  I still can't reproduce this here, which seems to mean one of
three things:
1. We fixed this since 8.3.15 (still doesn't seem likely).
2. It's specific to Windows.
3. It's specific to the pg_dump build you are using.

Re: [BUGS] Repeatable crash in pg_dump (with -d2 info)

От
Bruce Momjian
Дата:
On Tue, Jan 17, 2012 at 04:46:50PM -0500, David Schnur wrote:
> I finally had time to test this further on a variety of systems, and was unable
> to reproduce on any non-Windows platform.  The dump even works fine on Windows
> XP; just not Windows 7.
>
> This prompted me to do a little more research, and this time I found this
> thread from Sept. 2011:
>
> http://postgresql.1045698.n5.nabble.com/
> BUG-6233-pg-dump-hangs-with-Access-Violation-C0000005-td4851598.html
>
> From Tom Lane in the above thread:
>
>
>     Hmm.  I can see how that would happen if you're using one of the Windows
>     environments wherein malloc's done inside libpq have to be free'd inside
>     libpq.  (The PQExpBuffer support code is in libpq...)
>
>
>
>     However, the flaw in that explanation is that it would basically mean
>     pg_dump doesn't work at all on Windows, at least not if you have any
>     user-defined functions, and probably some other cases too because there
>     seem to be multiple instances of the dubious coding.  It's a bit hard to
>     believe that nobody's noticed that before.
>
>
> This appears to describe exactly the issue I'm encountering, and my build is in
> fact linked against the static runtime.  I guess the reason this hasn't come up
> sooner is because most Windows users either use the 'official' binaries rather
> than compiling from source, or link against the dynamic runtime.
>
> Is this something I could expect to be fixed in the near future, or is it
> enough of an edge case that I should come up with some solution or work-around
> on my own?  Thanks,

Late reply, but I don't see any way we could fix this easily.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: [BUGS] Repeatable crash in pg_dump (with -d2 info)

От
Robert Haas
Дата:
On Mon, Aug 27, 2012 at 9:58 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Tue, Jan 17, 2012 at 04:46:50PM -0500, David Schnur wrote:
>> I finally had time to test this further on a variety of systems, and was unable
>> to reproduce on any non-Windows platform.  The dump even works fine on Windows
>> XP; just not Windows 7.
>>
>> This prompted me to do a little more research, and this time I found this
>> thread from Sept. 2011:
>>
>> http://postgresql.1045698.n5.nabble.com/
>> BUG-6233-pg-dump-hangs-with-Access-Violation-C0000005-td4851598.html
>>
>> From Tom Lane in the above thread:
>>
>>
>>     Hmm.  I can see how that would happen if you're using one of the Windows
>>     environments wherein malloc's done inside libpq have to be free'd inside
>>     libpq.  (The PQExpBuffer support code is in libpq...)
>>
>>
>>
>>     However, the flaw in that explanation is that it would basically mean
>>     pg_dump doesn't work at all on Windows, at least not if you have any
>>     user-defined functions, and probably some other cases too because there
>>     seem to be multiple instances of the dubious coding.  It's a bit hard to
>>     believe that nobody's noticed that before.
>>
>>
>> This appears to describe exactly the issue I'm encountering, and my build is in
>> fact linked against the static runtime.  I guess the reason this hasn't come up
>> sooner is because most Windows users either use the 'official' binaries rather
>> than compiling from source, or link against the dynamic runtime.
>>
>> Is this something I could expect to be fixed in the near future, or is it
>> enough of an edge case that I should come up with some solution or work-around
>> on my own?  Thanks,
>
> Late reply, but I don't see any way we could fix this easily.

To me it seems like mostly a case of chasing down all the places where
this happens.  It's not impossible to do; it's just a bunch of work
that nobody's gotten excited about doing yet.  We've fixed similar
issues in many other cases, IIUC.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [BUGS] Repeatable crash in pg_dump (with -d2 info)

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Aug 27, 2012 at 9:58 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> From Tom Lane in the above thread:
>>> Hmm.  I can see how that would happen if you're using one of the Windows
>>> environments wherein malloc's done inside libpq have to be free'd inside
>>> libpq.  (The PQExpBuffer support code is in libpq...)

>> Late reply, but I don't see any way we could fix this easily.

> To me it seems like mostly a case of chasing down all the places where
> this happens.  It's not impossible to do; it's just a bunch of work
> that nobody's gotten excited about doing yet.  We've fixed similar
> issues in many other cases, IIUC.

Well, the problem with what I suspect you're thinking of is that even
after we fixed all the existing trouble spots, it would keep on
breaking.  Unless we found some mechanical way to warn about unsafe
coding; which in itself would be more work than I want to put into this.

However, a plan B occurs to me: what about preventing pg_dump from using
libpq's copy of PQExpBuffer?  IIUC, a copy of the PQExpBuffer code
statically linked into pg_dump would be free of this problem.  There's
not so much code there that this would be an intolerable price to pay.
(Besides, we could probably arrange for the extra copy to happen only on
Windows.)

            regards, tom lane