Обсуждение: BUG #6535: LEFT JOIN on large table is altering data

Поиск
Список
Период
Сортировка

BUG #6535: LEFT JOIN on large table is altering data

От
aren@arencambre.com
Дата:
The following bug has been logged on the website:

Bug reference:      6535
Logged by:          Aren Cambre
Email address:      aren@arencambre.com
PostgreSQL version: 9.1.3
Operating system:   Windows 7 x64
Description:=20=20=20=20=20=20=20=20

(Note that this issue is described fully at
http://stackoverflow.com/questions/9713537/postgres-left-join-is-creating-n=
ew-data.
A few knowledgeable people have weighed in, and there is no solution
identified.)

In a nutshell, I have a query LEFT JOINing table A with table B and
inserting the data into table A'. A' is a functional equivalent of table A.
Table B is actually built from a using a WITH at the top. I'm using it to
help add a rank() to the table.

Table A has just over 2 million rows.

There is a column named citation_id that has no null values. Not only does
citation_id have a NOT NULL constraint, I have verified with a query that it
has no null values.

The LEFT JOIN is pulling table A's citation_id. However, when the joined
table is inserted into table A', Postgres throws "null value in column
"citation_id" violates not-null constraint" errors.

Note that I am running Postgres 32 bit for PostGIS compatibility.

Re: BUG #6535: LEFT JOIN on large table is altering data

От
Tom Lane
Дата:
aren@arencambre.com writes:
> (Note that this issue is described fully at
> http://stackoverflow.com/questions/9713537/postgres-left-join-is-creating-new-data.
> A few knowledgeable people have weighed in, and there is no solution
> identified.)

This might be a bug, but you've not provided sufficient information for
someone else to reproduce the problem.  What would be good is a SQL
script that reproduces the error from a standing start (empty database).

            regards, tom lane

Re: BUG #6535: LEFT JOIN on large table is altering data

От
Aren Cambre
Дата:
Thanks. While creating that very file, I discovered that 1 row had blanks
in every field despite a column having a *NOT NULL* constraint and another
column being a* *serial. Removing that column appears to fixed the problem.

Something about that column made Postgres unhappy, though. If I ran these
queries:

SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NOT NULL
UNION
SELECT COUNT(*)
FROM consistent.master
UNION
SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NULL

I got this result:

2085344
2085343
0

Not clear how adding a WHERE clause, whose only practical effect is to
reduce the number of rows returned, could cause *more* rows to be returned.
That seems buggy to me.

Aren

On Fri, Mar 16, 2012 at 12:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> aren@arencambre.com writes:
> > (Note that this issue is described fully at
> >
> http://stackoverflow.com/questions/9713537/postgres-left-join-is-creating-new-data
> .
> > A few knowledgeable people have weighed in, and there is no solution
> > identified.)
>
> This might be a bug, but you've not provided sufficient information for
> someone else to reproduce the problem.  What would be good is a SQL
> script that reproduces the error from a standing start (empty database).
>
>                        regards, tom lane
>

Re: BUG #6535: LEFT JOIN on large table is altering data

От
"Kevin Grittner"
Дата:
Aren Cambre <aren@arencambre.com> wrote:

> SELECT COUNT(*)
> FROM consistent.master
> WHERE citation_id IS NOT NULL
> UNION
> SELECT COUNT(*)
> FROM consistent.master
> UNION
> SELECT COUNT(*)
> FROM consistent.master
> WHERE citation_id IS NULL
>
> I got this result:
>
> 2085344
> 2085343
> 0
>
> Not clear how adding a WHERE clause, whose only practical effect
> is to reduce the number of rows returned, could cause *more* rows
> to be returned. That seems buggy to me.

Never assume that the rows will be returned in any particular order
from a query unless you specify ORDER BY.  Assuming, as you seem to
be doing, that rows from the left side of a UNION will be output
before rows from the right side is not safe.  You have no way of
knowing which row in a result set like that came from which of the
UNIONed SELECTs.  In this case your assumption is almost certainly
wrong.

-Kevin

Re: BUG #6535: LEFT JOIN on large table is altering data

От
Aren Cambre
Дата:
Kevin,

You're absolutely correct; there's no telling what order the planner will
use.

I did in fact run the queries separately and got the same result that I
intimated in the UNIONed queries. I can no longer reproduce, however,
because I have since altered the table. So if there is a bug, I may not be
much help in nailing it down.

The only thing I can help with is this appears to have happened at row
# 583847 of just over 2 million rows, per a straight CSV dump of the table.

Aren

On Sun, Mar 18, 2012 at 3:22 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov
> wrote:

> Aren Cambre <aren@arencambre.com> wrote:
>
> > SELECT COUNT(*)
> > FROM consistent.master
> > WHERE citation_id IS NOT NULL
> > UNION
> > SELECT COUNT(*)
> > FROM consistent.master
> > UNION
> > SELECT COUNT(*)
> > FROM consistent.master
> > WHERE citation_id IS NULL
> >
> > I got this result:
> >
> > 2085344
> > 2085343
> > 0
> >
> > Not clear how adding a WHERE clause, whose only practical effect
> > is to reduce the number of rows returned, could cause *more* rows
> > to be returned. That seems buggy to me.
>
> Never assume that the rows will be returned in any particular order
> from a query unless you specify ORDER BY.  Assuming, as you seem to
> be doing, that rows from the left side of a UNION will be output
> before rows from the right side is not safe.  You have no way of
> knowing which row in a result set like that came from which of the
> UNIONed SELECTs.  In this case your assumption is almost certainly
> wrong.
>
> -Kevin
>

Re: BUG #6535: LEFT JOIN on large table is altering data

От
"Kevin Grittner"
Дата:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:
> Aren Cambre <aren@arencambre.com> wrote:
>
>> SELECT COUNT(*)
>> FROM consistent.master
>> WHERE citation_id IS NOT NULL
>> UNION
>> SELECT COUNT(*)
>> FROM consistent.master
>> UNION
>> SELECT COUNT(*)
>> FROM consistent.master
>> WHERE citation_id IS NULL
>>
>> I got this result:
>>
>> 2085344
>> 2085343
>> 0
>>
>> Not clear how adding a WHERE clause, whose only practical effect
>> is to reduce the number of rows returned, could cause *more* rows
>> to be returned.

> Never assume that the rows will be returned in any particular
> order from a query unless you specify ORDER BY.

Hmm.  That doesn't explain why the numbers don't add up, though. Is
that a copy/paste from an actual query run, or was there some
hand-editing there?  In particular, you might easily get that result
if that last line was really:

  WHERE citation_id = ''

instead of the IS NULL test.  In the ANSI standard and in PostgreSQL
there is a big difference between an empty string and NULL, although
there is at least one product I know of which breaks from standard
compliance by treating them as equivalent.

-Kevin

Re: BUG #6535: LEFT JOIN on large table is altering data

От
Aren Cambre
Дата:
Kevin,

Thanks. That was a direct copy/paste. It happened that the *UNION*ed
queries spat out those results in the same order that the queries appeared.

Just want to again emphasize that my database state has changed, so I am
not sure this is remains a good case for finding a bug.

Aren

On Sun, Mar 18, 2012 at 3:33 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov
> wrote:

> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:
> > Aren Cambre <aren@arencambre.com> wrote:
> >
> >> SELECT COUNT(*)
> >> FROM consistent.master
> >> WHERE citation_id IS NOT NULL
> >> UNION
> >> SELECT COUNT(*)
> >> FROM consistent.master
> >> UNION
> >> SELECT COUNT(*)
> >> FROM consistent.master
> >> WHERE citation_id IS NULL
> >>
> >> I got this result:
> >>
> >> 2085344
> >> 2085343
> >> 0
> >>
> >> Not clear how adding a WHERE clause, whose only practical effect
> >> is to reduce the number of rows returned, could cause *more* rows
> >> to be returned.
>
> > Never assume that the rows will be returned in any particular
> > order from a query unless you specify ORDER BY.
>
> Hmm.  That doesn't explain why the numbers don't add up, though. Is
> that a copy/paste from an actual query run, or was there some
> hand-editing there?  In particular, you might easily get that result
> if that last line was really:
>
>  WHERE citation_id = ''
>
> instead of the IS NULL test.  In the ANSI standard and in PostgreSQL
> there is a big difference between an empty string and NULL, although
> there is at least one product I know of which breaks from standard
> compliance by treating them as equivalent.
>
> -Kevin
>

Re: BUG #6535: LEFT JOIN on large table is altering data

От
Tom Lane
Дата:
Aren Cambre <aren@arencambre.com> writes:
> Just want to again emphasize that my database state has changed, so I am
> not sure this is remains a good case for finding a bug.

Your description of an apparently-all-nulls row sounds a great deal like
data corruption.  I suspect there may be more wrong than just the one
row, and that that's what's leading to the inconsistent results.  It'd
be worthwhile to run hardware diagnostics (memtest, disk tests), make
sure you are on the latest minor release of Postgres, and perhaps do a
database dump and reload.  (If the latter seems impractical, at least
consider reindexing everything.)

            regards, tom lane

Re: BUG #6535: LEFT JOIN on large table is altering data

От
Aren Cambre
Дата:
Thanks. I am running this on 1 year old hardware on a wholly-encrypted C
drive, encrypted with TrueCrypt. I suspect that I would have been alerted
already with that or SMART if I had hardware issues?

I am on 9.1.3, although I think the data was inserted back when I was
running 9.0.x. I upgraded to 9.1 almost as soon as it came out, doing the
requisite dumps and reloads. I have also reindexed a couple of times as
diagnostic steps.

So far, looks like only that one row has data corruption.

Aren

On Sun, Mar 18, 2012 at 9:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Aren Cambre <aren@arencambre.com> writes:
> > Just want to again emphasize that my database state has changed, so I am
> > not sure this is remains a good case for finding a bug.
>
> Your description of an apparently-all-nulls row sounds a great deal like
> data corruption.  I suspect there may be more wrong than just the one
> row, and that that's what's leading to the inconsistent results.  It'd
> be worthwhile to run hardware diagnostics (memtest, disk tests), make
> sure you are on the latest minor release of Postgres, and perhaps do a
> database dump and reload.  (If the latter seems impractical, at least
> consider reindexing everything.)
>
>                        regards, tom lane
>

Re: BUG #6535: LEFT JOIN on large table is altering data

От
Alvaro Herrera
Дата:
Excerpts from Aren Cambre's message of dom mar 18 23:39:16 -0300 2012:
> Thanks. I am running this on 1 year old hardware on a wholly-encrypted C
> drive, encrypted with TrueCrypt. I suspect that I would have been alerted
> already with that or SMART if I had hardware issues?

That sounds excessively optimistic -- not necessarily a good
characteristic on a database administrator.

--=20
=C3=81lvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: BUG #6535: LEFT JOIN on large table is altering data

От
Aren Cambre
Дата:
>
> > Thanks. I am running this on 1 year old hardware on a wholly-encrypted C
> > drive, encrypted with TrueCrypt. I suspect that I would have been alerted
> > already with that or SMART if I had hardware issues?
>
> That sounds excessively optimistic -- not necessarily a good
> characteristic on a database administrator.
>

Which is why I could never be a DBA. :-)

I am the only user. This is for my doctoral research project.

Aren