Обсуждение: Bug in ordered views?
Hello all, I think I found a little but annoying bug in views when ordering is involved. First, my version of Postgres: PostgreSQL 8.1.3 on i386-portbld-freebsd6.1, compiled by GCC cc (GCC) 3.4.4 [FreeBSD] 20050518 Please try the following: CREATE TABLE datum ( projekt_id INTEGER NOT NULL, datum DATE NOT NULL, UNIQUE (projekt_id, datum) ) WITHOUT OIDS; CREATE TABLE test ( id SERIAL PRIMARY KEY, projekt_id INTEGER NOT NULL, datum DATE NOT NULL, approved BOOLEAN NOT NULL DEFAULT FALSE, test_id INTEGER, test_text TEXT ) WITHOUT OIDS; CREATE OR REPLACE VIEW bug AS SELECT DISTINCT ON (test_id,projekt_id,datum) t.id, d.projekt_id, d.datum, t.approved, t.test_id, t.test_text FROM datum d JOIN test t ON (t.projekt_id = d.projekt_id OR t.approved IS TRUE) AND t.datum <= d.datum ORDER BY t.test_id DESC, d.projekt_id DESC, d.datum DESC, t.datum DESC; INSERT INTO datum (projekt_id,datum) VALUES (1,now()); INSERT INTO datum (projekt_id,datum) VALUES (1,now()+'1d'::interval); INSERT INTO test (projekt_id,datum,test_id,test_text) VALUES (1,now(),1,'old'); INSERT INTO test (projekt_id,datum,test_id,test_text) VALUES (1,now(),2,'old'); UPDATE test SET approved = TRUE WHERE projekt_id = 1; INSERT INTO datum (projekt_id,datum) VALUES (2,now()); INSERT INTO datum (projekt_id,datum) VALUES (2,now()+'1d'::interval); INSERT INTO test (projekt_id,datum,test_id,test_text) VALUES (2,now(),1,'new'); INSERT INTO test (projekt_id,datum,test_id,test_text) VALUES (2,now()+'1d'::interval,2,'new'); Now do a simple select: SELECT * FROM bug; id | projekt_id | datum | approved | test_id | test_text ----+------------+------------+----------+---------+----------- 4 | 2 | 16.05.2006 | f | 2 | new 2 | 2 | 15.05.2006 | t | 2 | old 2 | 1 | 16.05.2006 | t | 2 | old 2 | 1 | 15.05.2006 | t | 2 | old 3 | 2 | 16.05.2006 | f | 1 | new 1 | 2 | 15.05.2006 | t | 1 | old 1 | 1 | 16.05.2006 | t | 1 | old 1 | 1 | 15.05.2006 | t | 1 | old And now constrain the above select: SELECT * FROM bug WHERE test_id = 1; id | projekt_id | datum | approved | test_id | test_text ----+------------+------------+----------+---------+----------- 1 | 2 | 16.05.2006 | t | 1 | old 1 | 2 | 15.05.2006 | t | 1 | old 1 | 1 | 16.05.2006 | t | 1 | old 1 | 1 | 15.05.2006 | t | 1 | old Notice that the should be 1 line with test_text showing "new"! Did I miss anything or is it a bug? Sebastian
Sebastian Böck wrote: > Hello all, > > I think I found a little but annoying bug in views when ordering is > involved. First, my version of Postgres: > > PostgreSQL 8.1.3 on i386-portbld-freebsd6.1, compiled by GCC cc (GCC) > 3.4.4 [FreeBSD] 20050518 > > Please try the following: > > CREATE TABLE datum ( > projekt_id INTEGER NOT NULL, > datum DATE NOT NULL, > UNIQUE (projekt_id, datum) > ) WITHOUT OIDS; > > CREATE TABLE test ( > id SERIAL PRIMARY KEY, > projekt_id INTEGER NOT NULL, > datum DATE NOT NULL, > approved BOOLEAN NOT NULL DEFAULT FALSE, > test_id INTEGER, > test_text TEXT > ) WITHOUT OIDS; > > CREATE OR REPLACE VIEW bug AS > SELECT DISTINCT ON (test_id,projekt_id,datum) > t.id, d.projekt_id, d.datum, t.approved, > t.test_id, t.test_text > FROM datum d > JOIN test t ON > (t.projekt_id = d.projekt_id OR t.approved IS TRUE) AND > t.datum <= d.datum > ORDER BY t.test_id DESC, d.projekt_id DESC, d.datum DESC, t.datum DESC; > > INSERT INTO datum (projekt_id,datum) VALUES (1,now()); > INSERT INTO datum (projekt_id,datum) VALUES (1,now()+'1d'::interval); > INSERT INTO test (projekt_id,datum,test_id,test_text) VALUES > (1,now(),1,'old'); > INSERT INTO test (projekt_id,datum,test_id,test_text) VALUES > (1,now(),2,'old'); > > UPDATE test SET approved = TRUE WHERE projekt_id = 1; > > INSERT INTO datum (projekt_id,datum) VALUES (2,now()); > INSERT INTO datum (projekt_id,datum) VALUES (2,now()+'1d'::interval); > INSERT INTO test (projekt_id,datum,test_id,test_text) VALUES > (2,now(),1,'new'); > INSERT INTO test (projekt_id,datum,test_id,test_text) VALUES > (2,now()+'1d'::interval,2,'new'); > > Now do a simple select: > > SELECT * FROM bug; > > id | projekt_id | datum | approved | test_id | test_text > ----+------------+------------+----------+---------+----------- > 4 | 2 | 16.05.2006 | f | 2 | new > 2 | 2 | 15.05.2006 | t | 2 | old > 2 | 1 | 16.05.2006 | t | 2 | old > 2 | 1 | 15.05.2006 | t | 2 | old > 3 | 2 | 16.05.2006 | f | 1 | new > 1 | 2 | 15.05.2006 | t | 1 | old > 1 | 1 | 16.05.2006 | t | 1 | old > 1 | 1 | 15.05.2006 | t | 1 | old > > And now constrain the above select: > > SELECT * FROM bug WHERE test_id = 1; > > id | projekt_id | datum | approved | test_id | test_text > ----+------------+------------+----------+---------+----------- > 1 | 2 | 16.05.2006 | t | 1 | old > 1 | 2 | 15.05.2006 | t | 1 | old > 1 | 1 | 16.05.2006 | t | 1 | old > 1 | 1 | 15.05.2006 | t | 1 | old > > Notice that the should be 1 line with test_text showing "new"! Try removing the DISTINCT ON from your view - that should make things clearer to you. When t.approved is true, the row is joined to all rows of the datum table satisfying the criteria. The sort order you specify does not guarantee a unique ordering of the rows, which explains the inconsistency between the two cases. /Nis
Nis Jorgensen <nis@superlativ.dk> writes: > Try removing the DISTINCT ON from your view - that should make things > clearer to you. When t.approved is true, the row is joined to all rows > of the datum table satisfying the criteria. The sort order you specify > does not guarantee a unique ordering of the rows, which explains the > inconsistency between the two cases. More specifically, look at this: select t.test_id,d.projekt_id,d.datum,t.datum, t.id, t.approved, t.test_text FROM datum d JOIN test t ON (t.projekt_id = d.projekt_id OR t.approved IS TRUE) AND t.datum <= d.datum ORDER BY t.test_id DESC, d.projekt_id DESC, d.datum DESC, t.datum DESC; test_id | projekt_id | datum | datum | id | approved | test_text ---------+------------+------------+------------+----+----------+----------- 2 | 2 | 2006-05-16 | 2006-05-16 | 4 | f | new 2 | 2 | 2006-05-16 | 2006-05-15 | 2 | t | old 2 | 2 | 2006-05-15 | 2006-05-15 | 2 | t | old 2 | 1 | 2006-05-16 | 2006-05-15 | 2 | t | old 2 | 1 | 2006-05-15 | 2006-05-15 | 2 | t | old 1 | 2 | 2006-05-16 | 2006-05-15 | 1 | t | old 1 | 2 | 2006-05-16 | 2006-05-15 | 3 | f | new * 1 | 2 | 2006-05-15 | 2006-05-15 | 3 | f | new * 1 | 2 | 2006-05-15 | 2006-05-15 | 1 | t | old 1 | 1 | 2006-05-16 | 2006-05-15 | 1 | t | old 1 | 1 | 2006-05-15 | 2006-05-15 | 1 | t | old (11 rows) The two rows I've marked with * are identical in all the columns that are used in the DISTINCT ON and ORDER BY clauses, which means it's unspecified which one you get out of the DISTINCT ON. I'm not entirely sure why adding the test_id condition changes the results, but it may be an artifact of qsort() behavior. Anyway you need to constrain the ORDER BY some more to ensure you get well-defined results from the DISTINCT ON. regards, tom lane
Tom Lane wrote: > Nis Jorgensen <nis@superlativ.dk> writes: > >>Try removing the DISTINCT ON from your view - that should make things >>clearer to you. When t.approved is true, the row is joined to all rows >>of the datum table satisfying the criteria. The sort order you specify >>does not guarantee a unique ordering of the rows, which explains the >>inconsistency between the two cases. > > > More specifically, look at this: > > select t.test_id,d.projekt_id,d.datum,t.datum, t.id, t.approved, > t.test_text > FROM datum d > JOIN test t ON > (t.projekt_id = d.projekt_id OR t.approved IS TRUE) AND > t.datum <= d.datum > ORDER BY t.test_id DESC, d.projekt_id DESC, d.datum DESC, t.datum DESC; > test_id | projekt_id | datum | datum | id | approved | test_text > ---------+------------+------------+------------+----+----------+----------- > 2 | 2 | 2006-05-16 | 2006-05-16 | 4 | f | new > 2 | 2 | 2006-05-16 | 2006-05-15 | 2 | t | old > 2 | 2 | 2006-05-15 | 2006-05-15 | 2 | t | old > 2 | 1 | 2006-05-16 | 2006-05-15 | 2 | t | old > 2 | 1 | 2006-05-15 | 2006-05-15 | 2 | t | old > 1 | 2 | 2006-05-16 | 2006-05-15 | 1 | t | old > 1 | 2 | 2006-05-16 | 2006-05-15 | 3 | f | new > * 1 | 2 | 2006-05-15 | 2006-05-15 | 3 | f | new > * 1 | 2 | 2006-05-15 | 2006-05-15 | 1 | t | old > 1 | 1 | 2006-05-16 | 2006-05-15 | 1 | t | old > 1 | 1 | 2006-05-15 | 2006-05-15 | 1 | t | old > (11 rows) > > The two rows I've marked with * are identical in all the columns that > are used in the DISTINCT ON and ORDER BY clauses, which means it's > unspecified which one you get out of the DISTINCT ON. I'm not entirely > sure why adding the test_id condition changes the results, but it may be > an artifact of qsort() behavior. Anyway you need to constrain the ORDER > BY some more to ensure you get well-defined results from the DISTINCT ON. > > regards, tom lane Classical "pilot error". I recognized the missing order by a few minutes after sending my message. Sorry for the noise, but it looked totally reproducible, no matter what kind of where clause I added. Thanks anyway Sebastian