"make check" fails over NFS or tmpfs

From SODA Noriyuki
Subject "make check" fails over NFS or tmpfs
Date
Msg-id 17521.15090.901883.16485@srapc2586.sra.co.jp
Replies Re: "make check" fails over NFS or tmpfs  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: "make check" fails over NFS or tmpfs  (Greg Stark <gsstark@mit.edu>)
Re: "make check" fails over NFS or tmpfs  ("Rafael Martinez, Guerrero" <r.m.guerrero@usit.uio.no>)
List pgsql-general
Hi,

We've encountered failures of "make check" when we put the PostgreSQL
data directory on an NFS filesystem or a tmpfs filesystem.
It doesn't fail every time, only occasionally.

Is this expected behavior of PostgreSQL?

If it's expected, what is the reason for this symptom?
I grep'ed the PostgreSQL source code, but it doesn't seem to use
operations that are problematic over NFS, like flock(2) or the
F_SETLK/F_SETLKW commands of fcntl(2)...  So I guess that
(theoretically) it should work fine over NFS or tmpfs.  The only idea
that strikes me is that there is some nasty bug in Linux. ;-)

Of course, we are using a single instance of PostgreSQL on a single
machine, i.e. we are NOT accessing the data directory from
multiple machines or multiple PostgreSQL instances.

To give an actual example,
when we invoked the following shell script:
    $ cat ~/regress-loop.sh
    #!/bin/sh
    loop=1
    make clean
    while true; do
        echo "############### loop = $loop ##################"
        make check
        ret=$?
        if [ $ret -ne 0 ]; then
            echo "error @ loop = $loop (return value = $ret)"
            exit $ret
        fi
        loop=$((loop + 1))
    done
errors like the following sometimes occur:
    $ sh ~/regress-loop.sh
       :
       :
    make: *** [check] Error 2
    error @ loop = 26 (return value = 2)
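
Since the failure is intermittent, it may help to keep the regression output from the failing iteration before anything overwrites it. The following is only a minimal sketch, not the script we actually ran: the function name `run_until_failure` and the `failed-loop-N` directory names are made up for illustration, and it assumes the current directory is the one where `regression.diffs` and `regression.out` are written (src/test/regress in a PostgreSQL source tree).

```shell
#!/bin/sh
# Hypothetical helper: repeat a command until it fails or max_loops is reached.
# On failure, copy regression.diffs / regression.out (if they exist in the
# current directory) into failed-loop-N/ so a later run cannot overwrite them.
run_until_failure() {
    max_loops=$1
    shift
    loop=1
    while [ "$loop" -le "$max_loops" ]; do
        echo "############### loop = $loop ##################"
        "$@"
        ret=$?
        if [ "$ret" -ne 0 ]; then
            echo "error @ loop = $loop (return value = $ret)"
            mkdir -p "failed-loop-$loop"
            for f in regression.diffs regression.out; do
                if [ -f "$f" ]; then
                    cp "$f" "failed-loop-$loop/"
                fi
            done
            return "$ret"
        fi
        loop=$((loop + 1))
    done
    return 0
}

# Example (assumed invocation): run_until_failure 5000 make check
```

The loop limit is there only so that an unattended run eventually stops on a filesystem where the failure never shows up.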

We observed this symptom under the following conditions:

1. putting PGDATA on NFS-async
  filesystem:
    NFS (async)
  NFS client:
    PostgreSQL version: 8.1.3
    OS version: Fedora Core 3 Linux
  NFS server:
    OS version: Fedora Core 3 Linux
    "async" is specified in /etc/exports, thus the server violates
    the NFS protocol and replies to requests before it has stored
    the changes on its disk.
  How many loops until it fails:
    3000 loops or more

2. putting PGDATA on NFS
  filesystem:
    NFS
  NFS client:
    PostgreSQL version: 8.1.3
    OS version: Fedora Core 4 Linux
  NFS server:
    OS version: Fedora Core 5 Linux
  How many loops until it fails:
     approximately 300 loops

3. putting PGDATA on tmpfs
  filesystem:
    tmpfs
  PostgreSQL version: 8.1.3
  OS version: Fedora Core 5 Linux
  How many loops until it fails:
     approximately 100 loops
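
When comparing setups like the three above, it can be worth confirming which filesystem actually holds the data directory. A small sketch, assuming GNU coreutils `stat` is available (as on the Fedora systems listed); the function name `fs_type` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the type of the filesystem that holds a directory.
# Relies on GNU coreutils stat: -f reports filesystem status, %T is the
# filesystem type in human-readable form (for example "nfs" or "tmpfs").
fs_type() {
    stat -f -c %T "$1"
}

# Example (assumed invocation): fs_type "$PGDATA"
```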

As far as we have seen, this symptom never happens on ext3.
I have attached the diffs between the expected results and the actual
results to this mail.

Any ideas are appreciated, except for using a local filesystem. ;-)
--
SODA Noriyuki

*** ./expected/tablespace.out    Tue May 16 13:03:24 2006
--- ./results/tablespace.out    Fri May 19 21:04:30 2006
***************
*** 35,37 ****
--- 35,38 ----
  NOTICE:  drop cascades to table testschema.foo
  -- Should succeed
  DROP TABLESPACE testspace;
+ ERROR:  tablespace "testspace" is not empty

======================================================================

*** ./expected/tablespace.out    Fri May 19 15:28:32 2006
--- ./results/tablespace.out    Sat May 20 06:13:18 2006
***************
*** 35,37 ****
--- 35,38 ----
  NOTICE:  drop cascades to table testschema.foo
  -- Should succeed
  DROP TABLESPACE testspace;
+ ERROR:  tablespace "testspace" is not empty

======================================================================

*** ./expected/sanity_check.out    Fri Sep  9 05:07:42 2005
--- ./results/sanity_check.out    Fri May 19 16:31:37 2006
***************
*** 17,22 ****
--- 17,24 ----
   circle_tbl          | t
   fast_emp4000        | t
   func_index_heap     | t
+  gcircle_tbl         | t
+  gpolygon_tbl        | t
   hash_f8_heap        | t
   hash_i4_heap        | t
   hash_name_heap      | t
***************
*** 68,74 ****
   shighway            | t
   tenk1               | t
   tenk2               | t
! (58 rows)

  --
  -- another sanity check: every system catalog that has OIDs should have
--- 70,76 ----
   shighway            | t
   tenk1               | t
   tenk2               | t
! (60 rows)

  --
  -- another sanity check: every system catalog that has OIDs should have

======================================================================
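
The `tablespace "testspace" is not empty` error suggests that something was still present in the tablespace directory when DROP TABLESPACE ran. A sketch for looking at what is left right after a failure; `list_leftovers` is a made-up name, the directory to pass is whatever location the test used for the tablespace, and `-mindepth` assumes GNU or BSD find. On NFS, one thing such a listing might show is `.nfsXXXX` files, which an NFS client creates when a file is unlinked while still open, but that is only a guess here and would not explain the tmpfs case.

```shell
#!/bin/sh
# Hypothetical helper: list every entry left under a directory, including
# dot-files. Prints nothing if the directory is empty.
list_leftovers() {
    find "$1" -mindepth 1 -print
}

# Example (assumed invocation): list_leftovers /path/to/tablespace/location
```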

