Re: large object seek/write bug

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: large object seek/write bug
Дата
Msg-id 18322.961050495@sss.pgh.pa.us
обсуждение исходный текст
Ответ на  (Ian Grant <Ian.Grant@cl.cam.ac.uk>)
Ответы Re: large object seek/write bug
Список pgsql-bugs
Ian Grant <Ian.Grant@cl.cam.ac.uk> writes:
> I'm using V 7.0 on a Linux machine and I believe I have found a bug in the
> large object interface provided by libpq.

Thank you for the carefully developed test case.  The bug is actually
in the backend, not in libpq: there's an ancient hack in inv_write
that looks at rel->rd_nblocks to decide if the large object is empty.
Unfortunately rd_nblocks isn't maintained very carefully, so the test
might mistakenly consider a recently-created LO to be empty, leading
to the Wrong Thing.  But that hack is no longer necessary, since the
index bug it was trying to work around was swatted ages ago; diking
out the check suffices to fix it.

Another error I noticed while testing the fix is that inv_read doesn't
cope gracefully with "holes" in the large object, ie, ranges never
written because of a seek and write past the previous end of file.
With the attached patch, reads of "holes" reliably return zeroes,
as is considered standard behavior for Unix files.  (The garbage data
you saw was actually from this error, although the case would not have
been triggered if not for the first bug.)

I believe the attached patch fixes these problems, but I haven't been
able to wring it out very thoroughly because I don't have applications
that do random seeks and writes in large objects.  If you could bang
on it a little more and report back, that'd be great.

            regards, tom lane


*** src/backend/storage/large_object/inv_api.c.orig    Wed Apr 12 13:15:37 2000
--- src/backend/storage/large_object/inv_api.c    Thu Jun 15 02:10:27 2000
***************
*** 185,190 ****
--- 185,191 ----
      retval->idesc = RelationGetDescr(indr);
      retval->offset = retval->lowbyte = retval->highbyte = 0;
      ItemPointerSetInvalid(&(retval->htid));
+     retval->flags = 0;

      if (flags & INV_WRITE)
      {
***************
*** 233,238 ****
--- 234,240 ----
      retval->idesc = RelationGetDescr(indrel);
      retval->offset = retval->lowbyte = retval->highbyte = 0;
      ItemPointerSetInvalid(&(retval->htid));
+     retval->flags = 0;

      if (flags & INV_WRITE)
      {
***************
*** 371,384 ****
      if (whence == SEEK_CUR)
      {
          offset += obj_desc->offset;        /* calculate absolute position */
-         return inv_seek(obj_desc, offset, SEEK_SET);
      }
!
!     /*
!      * if you seek past the end (offset > 0) I have no clue what happens
!      * :-(                  B.L.     9/1/93
!      */
!     if (whence == SEEK_END)
      {
          /* need read lock for getsize */
          if (!(obj_desc->flags & IFS_RDLOCK))
--- 373,380 ----
      if (whence == SEEK_CUR)
      {
          offset += obj_desc->offset;        /* calculate absolute position */
      }
!     else if (whence == SEEK_END)
      {
          /* need read lock for getsize */
          if (!(obj_desc->flags & IFS_RDLOCK))
***************
*** 389,396 ****
          offset += _inv_getsize(obj_desc->heap_r,
                                 obj_desc->hdesc,
                                 obj_desc->index_r);
-         return inv_seek(obj_desc, offset, SEEK_SET);
      }

      /*
       * Whenever we do a seek, we turn off the EOF flag bit to force
--- 385,392 ----
          offset += _inv_getsize(obj_desc->heap_r,
                                 obj_desc->hdesc,
                                 obj_desc->index_r);
      }
+     /* now we can assume that the operation is SEEK_SET */

      /*
       * Whenever we do a seek, we turn off the EOF flag bit to force
***************
*** 414,422 ****
       * stores the offset of the last byte that appears on it, and we have
       * an index on this.
       */
-
-
-     /* right now, just assume that the operation is SEEK_SET */
      if (obj_desc->iscan != (IndexScanDesc) NULL)
      {
          d = Int32GetDatum(offset);
--- 410,415 ----
***************
*** 424,430 ****
      }
      else
      {
-
          ScanKeyEntryInitialize(&skey, 0x0, 1, F_INT4GE,
                                 Int32GetDatum(offset));

--- 417,422 ----
***************
*** 487,495 ****

          /* copy the data from this block into the buffer */
          d = heap_getattr(&tuple, 2, obj_desc->hdesc, &isNull);
          ReleaseBuffer(buffer);

!         fsblock = (struct varlena *) DatumGetPointer(d);

          off = obj_desc->offset - obj_desc->lowbyte;
          ncopy = obj_desc->highbyte - obj_desc->offset + 1;
--- 479,505 ----

          /* copy the data from this block into the buffer */
          d = heap_getattr(&tuple, 2, obj_desc->hdesc, &isNull);
+         fsblock = (struct varlena *) DatumGetPointer(d);
          ReleaseBuffer(buffer);

!         /*
!          * If block starts beyond current seek point, then we are looking
!          * at a "hole" (unwritten area) in the object.  Return zeroes for
!          * the "hole".
!          */
!         if (obj_desc->offset < obj_desc->lowbyte)
!         {
!             int        nzeroes = obj_desc->lowbyte - obj_desc->offset;
!
!             if (nzeroes > (nbytes - nread))
!                 nzeroes = (nbytes - nread);
!             MemSet(buf, 0, nzeroes);
!             buf += nzeroes;
!             nread += nzeroes;
!             obj_desc->offset += nzeroes;
!             if (nread >= nbytes)
!                 break;
!         }

          off = obj_desc->offset - obj_desc->lowbyte;
          ncopy = obj_desc->highbyte - obj_desc->offset + 1;
***************
*** 535,548 ****
          Buffer        buffer;

          /*
!          * Fetch the current inversion file system block.  If the class
!          * storing the inversion file is empty, we don't want to do an
!          * index lookup, since index lookups choke on empty files (should
!          * be fixed someday).
           */

!         if ((obj_desc->flags & IFS_ATEOF)
!             || obj_desc->heap_r->rd_nblocks == 0)
              tuple.t_data = NULL;
          else
              inv_fetchtup(obj_desc, &tuple, &buffer);
--- 545,555 ----
          Buffer        buffer;

          /*
!          * Fetch the current inversion file system block.  We can skip
!          * the work if we already know we are at EOF.
           */

!         if (obj_desc->flags & IFS_ATEOF)
              tuple.t_data = NULL;
          else
              inv_fetchtup(obj_desc, &tuple, &buffer);

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Vadim Passynkov
Дата:
Сообщение: Some problem with inet type on PostgreSQL-7.0
Следующее
От: Ian Grant
Дата:
Сообщение: Re: large object seek/write bug