Discussion: Still a bug in the VACUUM ??? !!!
Hello *,

I have a problem with "vacuum full". Every time I run the vacuum command I
lose data from the parent tables, maybe also from the subtables (haven't
checked yet). I have tried it a few times up to now and I can reproduce the
phenomenon.

I am running postgresql 7.3.2 on solaris 8 (E450 4x 400 sun4u 1.5 GB).

regards

-Andreas

--
Andreas Schmitz - Phone +49 201 8501 318
Cityweb-Technik-Service-Gesellschaft mbH
Friedrichstr. 12 - Fax +49 201 8501 104
45128 Essen - email a.schmitz@cityweb.de
Andreas Schmitz <a.schmitz@cityweb.de> writes:
> I have a problem with "vacuum full". Every time I run the vacuum command I
> lose data from the parent tables, maybe also from the subtables (haven't
> checked yet). I have tried it a few times up to now and I can reproduce the
> phenomenon.

That sounds ugly ... but are you sure you don't have a hardware problem?
I don't think anyone's ever reported such behavior before.

If it is a Postgres bug, we can't do much to help you without a lot more
detail.

			regards, tom lane
On Friday 14 February 2003 17:55, Tom Lane wrote:
> Andreas Schmitz <a.schmitz@cityweb.de> writes:
> > I have a problem with "vacuum full". Every time I run the vacuum
> > command I lose data from the parent tables, maybe also from the
> > subtables (haven't checked yet). I have tried it a few times up to now
> > and I can reproduce the phenomenon.
>
> That sounds ugly ... but are you sure you don't have a hardware problem?
> I don't think anyone's ever reported such behavior before.
>
> If it is a Postgres bug, we can't do much to help you without a lot more
> detail.
>
> 			regards, tom lane

hi,

it does sound ugly. I checked the hardware and I can't see any problems with
it. I know, sometimes you need a lot of luck to see a CPU problem under
solaris, but I think the hardware is ok. However, what kind of details do you
need to qualify whether it's a postgres problem or not?

regards

-andreas schmitz
> > > I have a problem with "vacuum full". Every time I run the vacuum
> > > command I lose data from the parent tables, maybe also from the
> > > subtables (haven't checked yet). I have tried it a few times up to now
> > > and I can reproduce the phenomenon.
> >
> > That sounds ugly ... but are you sure you don't have a hardware problem?
> > I don't think anyone's ever reported such behavior before.
> >
> > If it is a Postgres bug, we can't do much to help you without a lot more
> > detail.
> >
> > 			regards, tom lane

I have a similar problem with VACUUM FULL ANALYZE. I do not lose any data,
but get hundreds of uninitialized pages and a segmentation fault. Processing
is very slow (twenty minutes). The only thing unusual about my configuration
is that the system indices are bloated. I expect the hardware to be ok, but I
cannot verify it because the server is hosted elsewhere.

The last lines of output were:

NOTICE:  Rel pg_class: Uninitialized page 3344 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3345 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3346 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3347 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3348 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3349 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3350 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3351 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3352 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3353 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3354 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3355 - fixing
NOTICE:  Rel pg_class: Uninitialized page 3356 - fixing
batch/nachts.sh: line 3: 30855 Segmentation fault  /usr/bin/php -q /usr/local/httpd/htdocs/kunden/web41/html/wcopy/batch/vacuum.php

Running VACUUM FULL ANALYZE another time there were no errors.

> hi,
>
> it does sound ugly. I checked the hardware and I can't see any problems
> with it. I know, sometimes you need a lot of luck to see a CPU problem
> under solaris, but I think the hardware is ok. However, what kind of
> details do you need to qualify whether it's a postgres problem or not?
>
> regards
>
> -andreas schmitz
Andreas Schmitz <a.schmitz@cityweb.de> writes:
> however, what kind of details do you need to
> qualify if it's a postgres problem or not ?

Ultimately, we need a way to reproduce the problem for debugging.
If it is a Postgres bug, it should be possible to reproduce it.

			regards, tom lane
daniel alvarez <d-alvarez@gmx.de> writes:
> NOTICE:  Rel pg_class: Uninitialized page 3344 - fixing
> NOTICE:  Rel pg_class: Uninitialized page 3345 - fixing
> NOTICE:  Rel pg_class: Uninitialized page 3346 - fixing
> NOTICE:  Rel pg_class: Uninitialized page 3347 - fixing
> NOTICE:  Rel pg_class: Uninitialized page 3348 - fixing
> NOTICE:  Rel pg_class: Uninitialized page 3349 - fixing
> [etc]

This is a known and not very serious problem --- see
http://archives.postgresql.org/pgsql-hackers/2002-11/msg00486.php

> batch/nachts.sh: line 3: 30855 Segmentation fault  /usr/bin/php -q
> /usr/local/httpd/htdocs/kunden/web41/html/wcopy/batch/vacuum.php

As best I can tell, that's your own client code crashing, not Postgres.

			regards, tom lane
> The last lines of output were:
>
> NOTICE:  Rel pg_class: Uninitialized page 3344 - fixing
> [...]
> NOTICE:  Rel pg_class: Uninitialized page 3355 - fixing
> NOTICE:  Rel pg_class: Uninitialized page 3356 - fixing
> batch/nachts.sh: line 3: 30855 Segmentation fault  /usr/bin/php -q
> /usr/local/httpd/htdocs/kunden/web41/html/wcopy/batch/vacuum.php
>
> Running VACUUM FULL ANALYZE another time there were no errors.

Hi,

I think it is not the same. I ran the vacuum when no other clients were
connected to the database. I did

	update dpa_text set titleidx=txt2txtidx(volltext);

and then ran

	vacuumdb --full --verbose --analyze -d newsdb2 >>/tmp/vacuum.log 2>&1

Maybe the log will provide some information. I noticed a few messages like
this in the database log while running the vacuum:

Feb 17 11:19:36 postgres2 postgres[1803]: [ID 553393 local0.info] [5] LOG:  pq_flush: send() failed: Broken pipe

regards

-andreas
Attachments
Andreas Schmitz <a.schmitz@cityweb.de> writes:
> I think it is not the same. I ran the vacuum when no other clients were
> connected to the database.

The vacuum that reports the NOTICEs is not the one that created the
problem.  The scenario I was talking about requires concurrent clients
during the preceding vacuum.

			regards, tom lane
On Monday 17 February 2003 19:56, Tom Lane wrote:
> Andreas Schmitz <a.schmitz@cityweb.de> writes:
> > I think it is not the same. I ran the vacuum when no other clients
> > were connected to the database.
>
> The vacuum that reports the NOTICEs is not the one that created the
> problem.  The scenario I was talking about requires concurrent clients
> during the preceding vacuum.
>
> 			regards, tom lane

Hi,

ok, I got that one and I was able to reproduce it. But it still doesn't
solve the problem. The fact is that I lose data, and that is a big problem.

regards

-andreas
oops ... forgot to attach the schema
Attachments
> Ultimately, we need a way to reproduce the problem for debugging.
> If it is a Postgres bug, it should be possible to reproduce it.
>
> 			regards, tom lane

Hello,

the vacuum seems to be ok. I discovered that it is "cluster" that kills the
data for some reason. I could reproduce it on two Sun machines (SS20 sol9,
E450 sol8) and on redhat 7.3, all running postgresql-7.3.2. I loaded the
same dump onto all of these machines and ran the cluster command.

tables: multicom(pk) -> multithumb(fk) (see attached schema)

Before the cluster run I checked:

	select id from multithumb where id not in (select id from multicom) order by id desc;

and it returned 0 row(s). After this I ran

	cluster multicom_txt_t_idx on multicom;

and repeated the statement above. 33 rows of data from the multicom table
were gone, and 32 references in multithumb were still there. The same 33
rows (same ids, same data) were killed on all three databases. I tried it a
few times and I always got the same output. I picked the ids and had a look
at the data:

select * from multicom where id=18379;
  id   | zulieferer | prioritaet |   rubrik   | kategorie |         datum          |           kennung            | dachzeile | ueberschrift | unterzeile | autor | zwischentitel | vorspann | kurztext |      gueltig_bis       |  region  |                 dateiname                 |            volltext            | titleidx
-------+------------+------------+------------+-----------+------------------------+------------------------------+-----------+--------------+------------+-------+---------------+----------+----------+------------------------+----------+-------------------------------------------+--------------------------------+----------
 18379 | otz        |         10 | Wirtschaft | BIL       | 2003-02-26 19:47:00+01 | onbildotzBILWirNational37676 |           |              |            |       |               |          |          | 2003-03-02 19:47:00+01 | National | /var/data/multikom/otz/m/WIRT/bulaptop.as | <p class="contentfliess"> </p> |
(1 row)

No txtidx was built on that data row because no long text was inserted. I
filled the empty fields with the "<NODATA>" string, but it didn't help.

Can someone give me some advice on this?
thanks and regards

-andreas
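[Editor's note: the integrity check Andreas describes can be packaged as a
single before/after script. This is a minimal sketch, assuming the table,
column, and index names from his message (multicom.id is the primary key,
multithumb.id references it). The NOT EXISTS form is equivalent to his NOT
IN query as long as id is never NULL, and avoids the NULL pitfall of NOT IN.]

```sql
-- Record the state before clustering.
SELECT count(*) AS multicom_rows FROM multicom;

SELECT count(*) AS orphaned_refs
FROM multithumb t
WHERE NOT EXISTS (SELECT 1 FROM multicom c WHERE c.id = t.id);

-- Rewrite the table in index order (the step reported to lose rows).
CLUSTER multicom_txt_t_idx ON multicom;

-- Repeat the checks; both counts should be unchanged if CLUSTER is safe.
SELECT count(*) AS multicom_rows FROM multicom;

SELECT count(*) AS orphaned_refs
FROM multithumb t
WHERE NOT EXISTS (SELECT 1 FROM multicom c WHERE c.id = t.id);
```

Comparing the two pairs of counts makes the data loss visible without having
to eyeball individual ids.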
Andreas Schmitz <a.schmitz@cityweb.de> writes:
> the vacuum seems to be ok. I discovered that it is "cluster" that kills
> the data for some reason.

Hmm.  I'm not sure whether clustering on a GIST index is really useful
(seeing that there's no linear ordering associated with GIST).  But it
shouldn't lose data.

I was able to reproduce a problem as follows: run the tsearch regression
test, then do "cluster wowidx on test_txtidx".  This appears to lose one
row:

regression=# select count(*) from test_txtidx;
 count
-------
   502
(1 row)

regression=# cluster wowidx on test_txtidx;
CLUSTER
regression=# select count(*) from test_txtidx;
 count
-------
   501
(1 row)

Oleg, Teodor, can you investigate this?

			regards, tom lane
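[Editor's note: until the CLUSTER bug is tracked down, the table can be
compacted without CLUSTER by rewriting it by hand. This is only a sketch,
not a fix from the thread; it assumes the multicom table from the attached
schema, and the index/constraint rebuild steps depend entirely on the actual
schema. As Tom notes, a GIST index has no linear ordering, so nothing is
lost by not clustering on it.]

```sql
BEGIN;

-- Copy every row into a scratch table.
CREATE TABLE multicom_new AS SELECT * FROM multicom;

-- Verify nothing was lost before swapping the tables.
SELECT count(*) FROM multicom;
SELECT count(*) FROM multicom_new;

-- If the counts match: any foreign keys referencing multicom must be
-- dropped first, then the old table replaced and its indexes and
-- constraints recreated (details depend on the actual schema).
DROP TABLE multicom;
ALTER TABLE multicom_new RENAME TO multicom;

COMMIT;
```

This achieves the space reclamation that VACUUM FULL / CLUSTER would
provide, at the cost of recreating indexes and constraints afterwards.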