Обсуждение: Fairly serious bug induced by latest guc enum changes

Поиск
Список
Период
Сортировка

Fairly serious bug induced by latest guc enum changes

От
Tom Lane
Дата:
I see that assign_xlog_sync_method() is still assigning to sync_method.
This is 100% wrong: an assign hook is chartered to set derived values,
but not to set the GUC variable itself.  This will for example result
in set_config_option() stacking the wrong value (the already-updated
one) as the value to roll back to if the transaction aborts.

You could just remove the assignment from assign_xlog_sync_method,
although it might look a bit odd to be setting open_sync_bit but
not sync_method.  It also bothers me slightly that the derived and
main values wouldn't be set at exactly the same point --- problems
inside guc.c might lead to them getting out of sync.

Another possibility is to stick with something equivalent to the former
design: what GUC thinks is the variable is just a dummy static integer
in guc.c, and the assign hook is still setting the "real" value that the
rest of the code looks at.  A minor advantage of this second way is that
the "real" value could still be declared as enum rather than int.

Please fix this, and take another look at the prior WAL enum changes
to see if the same problem hasn't been created elsewhere.
        regards, tom lane


Re: Fairly serious bug induced by latest guc enum changes

От
Magnus Hagander
Дата:
Tom Lane wrote:
> I see that assign_xlog_sync_method() is still assigning to
> sync_method. This is 100% wrong: an assign hook is chartered to set
> derived values, but not to set the GUC variable itself.  This will
> for example result in set_config_option() stacking the wrong value
> (the already-updated one) as the value to roll back to if the
> transaction aborts.

Hm. I never quite did get my head around how the stacking work, that's
probably why :-)


> You could just remove the assignment from assign_xlog_sync_method,
> although it might look a bit odd to be setting open_sync_bit but
> not sync_method.  It also bothers me slightly that the derived and
> main values wouldn't be set at exactly the same point --- problems
> inside guc.c might lead to them getting out of sync.
>
> Another possibility is to stick with something equivalent to the
> former design: what GUC thinks is the variable is just a dummy static
> integer in guc.c, and the assign hook is still setting the "real"
> value that the rest of the code looks at.  A minor advantage of this
> second way is that the "real" value could still be declared as enum
> rather than int.

That value never was an enum, so that part is not an advantage.

I still think going with the older method would be the safest, though,
for the other reasons. You agree?


> Please fix this, and take another look at the prior WAL enum changes
> to see if the same problem hasn't been created elsewhere.

(I assume you mean GUC enum here, that seems fairly obvious)


The only other assign hooks are assign_syslog_facility and
assign_session_replication_role. The changes there are:

In these, the value was previously derived from a string and set in the
hook. It's now set directly by the GUC code, and the hook only updates
"other things" (setting the actual syslog facility, and resetting the
cache, respectively).

I think that means there are no bugs there.

//Magnus


Re: Fairly serious bug induced by latest guc enum changes

От
Tom Lane
Дата:
Magnus Hagander <magnus@hagander.net> writes:
> I still think going with the older method would be the safest, though,
> for the other reasons. You agree?

Seems reasonable to me.

> (I assume you mean GUC enum here, that seems fairly obvious)

Sorry, was writing in too much of a hurry.

> In these, the value was previously derived from a string and set in the
> hook. It's now set directly by the GUC code, and the hook only updates
> "other things" (setting the actual syslog facility, and resetting the
> cache, respectively).

> I think that means there are no bugs there.

Yeah, that's fine.  I think though that I may have created a bug inside
GUC itself: the new stacking code could conceivably fail (palloc error)
between success return from the assign hook and setting up the stack
entry that is needed to undo the assignment on abort.  In this situation
the assign hook would've made its "other thing" changes but there is no
GUC state to cause the hook to be called again to undo 'em.  I need to
fix it so that any palloc'ing needed is done before calling the assign
hook.
        regards, tom lane


Re: Fairly serious bug induced by latest guc enum changes

От
Magnus Hagander
Дата:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> I still think going with the older method would be the safest, though,
>> for the other reasons. You agree?
>
> Seems reasonable to me.

Since it didn't really sound like a nice option, here's a third one I
came up with later. Basically, this one splits things apart so we only
use one variable, which is sync_method. Instead of using a macro to get
the open sync bit, it uses a function. This makes the assign hook only
responsible for flushing and closing the old file.

Thoughts? And if you like it, is it enough to expect the compiler to
figure out to inline it or should we explicitly inline it?

>> In these, the value was previously derived from a string and set in the
>> hook. It's now set directly by the GUC code, and the hook only updates
>> "other things" (setting the actual syslog facility, and resetting the
>> cache, respectively).
>
>> I think that means there are no bugs there.
>
> Yeah, that's fine.  I think though that I may have created a bug inside
> GUC itself: the new stacking code could conceivably fail (palloc error)
> between success return from the assign hook and setting up the stack
> entry that is needed to undo the assignment on abort.  In this situation
> the assign hook would've made its "other thing" changes but there is no
> GUC state to cause the hook to be called again to undo 'em.  I need to
> fix it so that any palloc'ing needed is done before calling the assign
> hook.

Ok. I'll leave that to you for now :) I still need to figure out how the
  stacking works because I want to add the "this setting came from file
X line Y" field to pg_settings, but that's a separate issue.

//Magnus
Index: xlog.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.307
diff -c -r1.307 xlog.c
*** xlog.c    12 May 2008 19:45:23 -0000    1.307
--- xlog.c    13 May 2008 06:03:45 -0000
***************
*** 86,97 ****
  #define XLOGfileslop    (2*CheckPointSegments + 1)


- /* these are derived from XLOG_sync_method by assign_xlog_sync_method */
- int            sync_method = DEFAULT_SYNC_METHOD;
- static int    open_sync_bit = DEFAULT_SYNC_FLAGBIT;
-
- #define XLOG_SYNC_BIT  (enableFsync ? open_sync_bit : 0)
-
  /*
   * GUC support
   */
--- 86,91 ----
***************
*** 444,449 ****
--- 438,444 ----
  static bool read_backup_label(XLogRecPtr *checkPointLoc,
                    XLogRecPtr *minRecoveryLoc);
  static void rm_redo_error_callback(void *arg);
+ static int get_sync_bit(void);


  /*
***************
*** 1960,1966 ****
       */
      if (*use_existent)
      {
!         fd = BasicOpenFile(path, O_RDWR | PG_BINARY | XLOG_SYNC_BIT,
                             S_IRUSR | S_IWUSR);
          if (fd < 0)
          {
--- 1955,1961 ----
       */
      if (*use_existent)
      {
!         fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(),
                             S_IRUSR | S_IWUSR);
          if (fd < 0)
          {
***************
*** 1986,1992 ****

      unlink(tmppath);

!     /* do not use XLOG_SYNC_BIT here --- want to fsync only at end of fill */
      fd = BasicOpenFile(tmppath, O_RDWR | O_CREAT | O_EXCL | PG_BINARY,
                         S_IRUSR | S_IWUSR);
      if (fd < 0)
--- 1981,1987 ----

      unlink(tmppath);

!     /* do not use get_sync_bit() here --- want to fsync only at end of fill */
      fd = BasicOpenFile(tmppath, O_RDWR | O_CREAT | O_EXCL | PG_BINARY,
                         S_IRUSR | S_IWUSR);
      if (fd < 0)
***************
*** 2064,2070 ****
      *use_existent = false;

      /* Now open original target segment (might not be file I just made) */
!     fd = BasicOpenFile(path, O_RDWR | PG_BINARY | XLOG_SYNC_BIT,
                         S_IRUSR | S_IWUSR);
      if (fd < 0)
          ereport(ERROR,
--- 2059,2065 ----
      *use_existent = false;

      /* Now open original target segment (might not be file I just made) */
!     fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(),
                         S_IRUSR | S_IWUSR);
      if (fd < 0)
          ereport(ERROR,
***************
*** 2115,2121 ****

      unlink(tmppath);

!     /* do not use XLOG_SYNC_BIT here --- want to fsync only at end of fill */
      fd = BasicOpenFile(tmppath, O_RDWR | O_CREAT | O_EXCL | PG_BINARY,
                         S_IRUSR | S_IWUSR);
      if (fd < 0)
--- 2110,2116 ----

      unlink(tmppath);

!     /* do not use get_sync_bit() here --- want to fsync only at end of fill */
      fd = BasicOpenFile(tmppath, O_RDWR | O_CREAT | O_EXCL | PG_BINARY,
                         S_IRUSR | S_IWUSR);
      if (fd < 0)
***************
*** 2297,2303 ****

      XLogFilePath(path, ThisTimeLineID, log, seg);

!     fd = BasicOpenFile(path, O_RDWR | PG_BINARY | XLOG_SYNC_BIT,
                         S_IRUSR | S_IWUSR);
      if (fd < 0)
          ereport(PANIC,
--- 2292,2298 ----

      XLogFilePath(path, ThisTimeLineID, log, seg);

!     fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(),
                         S_IRUSR | S_IWUSR);
      if (fd < 0)
          ereport(PANIC,
***************
*** 3650,3656 ****

      unlink(tmppath);

!     /* do not use XLOG_SYNC_BIT here --- want to fsync only at end of fill */
      fd = BasicOpenFile(tmppath, O_RDWR | O_CREAT | O_EXCL,
                         S_IRUSR | S_IWUSR);
      if (fd < 0)
--- 3645,3651 ----

      unlink(tmppath);

!     /* do not use get_sync_bit() here --- want to fsync only at end of fill */
      fd = BasicOpenFile(tmppath, O_RDWR | O_CREAT | O_EXCL,
                         S_IRUSR | S_IWUSR);
      if (fd < 0)
***************
*** 6328,6341 ****


  /*
!  * GUC support
   */
! bool
! assign_xlog_sync_method(int new_sync_method, bool doit, GucSource source)
  {
!     int            new_sync_bit = 0;

!     switch (new_sync_method)
      {
          /*
           * Values for these sync options are defined even if they are not
--- 6323,6339 ----


  /*
!  * Return the (possible) sync flag used for opening a file, depending on the
!  * value of the GUC wal_sync_method.
   */
! static int
! get_sync_bit(void)
  {
!     /* If fsync is disabled, never open in sync mode */
!     if (!enableFsync)
!         return 0;

!     switch (sync_method)
      {
          /*
           * Values for these sync options are defined even if they are not
***************
*** 6346,6362 ****
          case SYNC_METHOD_FSYNC:
          case SYNC_METHOD_FSYNC_WRITETHROUGH:
          case SYNC_METHOD_FDATASYNC:
!             new_sync_bit = 0;
!             break;
  #ifdef OPEN_SYNC_FLAG
          case SYNC_METHOD_OPEN:
!             new_sync_bit = OPEN_SYNC_FLAG;
!             break;
  #endif
  #ifdef OPEN_DATASYNC_FLAG
          case SYNC_METHOD_OPEN_DSYNC:
!             new_sync_bit = OPEN_DATASYNC_FLAG;
!         break;
  #endif
          default:
              /*
--- 6344,6357 ----
          case SYNC_METHOD_FSYNC:
          case SYNC_METHOD_FSYNC_WRITETHROUGH:
          case SYNC_METHOD_FDATASYNC:
!             return 0;
  #ifdef OPEN_SYNC_FLAG
          case SYNC_METHOD_OPEN:
!             return OPEN_SYNC_FLAG;
  #endif
  #ifdef OPEN_DATASYNC_FLAG
          case SYNC_METHOD_OPEN_DSYNC:
!             return OPEN_DATASYNC_FLAG;
  #endif
          default:
              /*
***************
*** 6364,6377 ****
               * new_sync_method are controlled by the available enum
               * options.
               */
!             elog(PANIC, "unrecognized wal_sync_method: %d", new_sync_method);
!             break;
      }

      if (!doit)
          return true;

!     if (sync_method != new_sync_method || open_sync_bit != new_sync_bit)
      {
          /*
           * To ensure that no blocks escape unsynced, force an fsync on the
--- 6359,6379 ----
               * new_sync_method are controlled by the available enum
               * options.
               */
!             elog(PANIC, "unrecognized wal_sync_method: %d", sync_method);
!             return 0; /* silence warning */
      }
+ }

+ /*
+  * GUC support
+  */
+ bool
+ assign_xlog_sync_method(int new_sync_method, bool doit, GucSource source)
+ {
      if (!doit)
          return true;

!     if (sync_method != new_sync_method)
      {
          /*
           * To ensure that no blocks escape unsynced, force an fsync on the
***************
*** 6386,6396 ****
                          (errcode_for_file_access(),
                           errmsg("could not fsync log file %u, segment %u: %m",
                                  openLogId, openLogSeg)));
!             if (open_sync_bit != new_sync_bit)
                  XLogFileClose();
          }
-         sync_method = new_sync_method;
-         open_sync_bit = new_sync_bit;
      }

      return true;
--- 6388,6397 ----
                          (errcode_for_file_access(),
                           errmsg("could not fsync log file %u, segment %u: %m",
                                  openLogId, openLogSeg)));
!             if ((sync_method == SYNC_METHOD_OPEN && new_sync_method == SYNC_METHOD_OPEN_DSYNC) ||
!                 (sync_method == SYNC_METHOD_OPEN_DSYNC && new_sync_method == SYNC_METHOD_OPEN))
                  XLogFileClose();
          }
      }

      return true;

Re: Fairly serious bug induced by latest guc enum changes

От
Tom Lane
Дата:
Magnus Hagander <magnus@hagander.net> writes:
> Since it didn't really sound like a nice option, here's a third one I 
> came up with later. Basically, this one splits things apart so we only 
> use one variable, which is sync_method. Instead of using a macro to get 
> the open sync bit, it uses a function. This makes the assign hook only 
> responsible for flushing and closing the old file.

Okay, but you failed to correctly reproduce the conditions for closing
the old file.

> Thoughts? And if you like it, is it enough to expect the compiler to 
> figure out to inline it or should we explicitly inline it?

I don't think we care that much, since it's only invoked when we're
about to do a moderately expensive kernel call.
        regards, tom lane


Re: Fairly serious bug induced by latest guc enum changes

От
Tom Lane
Дата:
I wrote:
> Okay, but you failed to correctly reproduce the conditions for closing
> the old file.

A more bulletproof solution might involve passing sync_method to
get_sync_bit as an explicit parameter, and then the assign hook
could doif (get_sync_bit(sync_method) != get_sync_bit(new_sync_method))    XLogFileClose();
        regards, tom lane


Re: Fairly serious bug induced by latest guc enum changes

От
Magnus Hagander
Дата:
Tom Lane wrote:
> I wrote:
> > Okay, but you failed to correctly reproduce the conditions for
> > closing the old file.
> 
> A more bulletproof solution might involve passing sync_method to
> get_sync_bit as an explicit parameter, and then the assign hook
> could do
>     if (get_sync_bit(sync_method) !=
> get_sync_bit(new_sync_method)) XLogFileClose();

Right, but I still need the other part of the check, right? This one
still fails the same check as my patch, no? Because I assume the hole
you found there was that get_sync_bit() will return 0 for two different
sync methods as long as none of them are O_SYNC or O_DSYNC...

//Magnus


Re: Fairly serious bug induced by latest guc enum changes

От
Tom Lane
Дата:
Magnus Hagander <magnus@hagander.net> writes:
> Right, but I still need the other part of the check, right? This one
> still fails the same check as my patch, no? Because I assume the hole
> you found there was that get_sync_bit() will return 0 for two different
> sync methods as long as none of them are O_SYNC or O_DSYNC...

No, my point was that there are three possible states of sync_bit and
your patch only accounted for transitions between two of 'em.  For
instance, if sync_bit goes to 0 we must close and reopen the file,
else we'll be doing both O_SYNC flush and whatever flush method
is supposed to be getting used.
        regards, tom lane


Re: Fairly serious bug induced by latest guc enum changes

От
Bruce Momjian
Дата:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > Right, but I still need the other part of the check, right? This one
> > still fails the same check as my patch, no? Because I assume the hole
> > you found there was that get_sync_bit() will return 0 for two different
> > sync methods as long as none of them are O_SYNC or O_DSYNC...
> 
> No, my point was that there are three possible states of sync_bit and
> your patch only accounted for transitions between two of 'em.  For
> instance, if sync_bit goes to 0 we must close and reopen the file,
> else we'll be doing both O_SYNC flush and whatever flush method
> is supposed to be getting used.

Did this every get addressed?  I don't see a commit for it.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Fairly serious bug induced by latest guc enum changes

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> No, my point was that there are three possible states of sync_bit and
>> your patch only accounted for transitions between two of 'em.

> Did this every get addressed?  I don't see a commit for it.

I thought it got fixed here:

2008-05-14 10:02  mha
* src/backend/access/transam/xlog.c: Remove the special variablefor open_sync_bit used in O_SYNC and O_DSYNC modes,
replacingitwith a call to a function that derives it from the sync_methodvariable, now that it has distinct values for
thesetwo cases.This means that assign_xlog_sync_method() no longer changes anysettings, thus fixing the bug introduced
inthe change to use a gucenum for wal_sync_method.
 

Hmm ... or at least more or less fixed.  Seems like there's no provision
to close and reopen the file if enableFsync changes.  Not sure if that's
worth worrying about.
        regards, tom lane


Re: Fairly serious bug induced by latest guc enum changes

От
Magnus Hagander
Дата:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>> Tom Lane wrote:
>>> No, my point was that there are three possible states of sync_bit and
>>> your patch only accounted for transitions between two of 'em.
> 
>> Did this every get addressed?  I don't see a commit for it.
> 
> I thought it got fixed here:
> 
> 2008-05-14 10:02  mha
> 
>     * src/backend/access/transam/xlog.c: Remove the special variable
>     for open_sync_bit used in O_SYNC and O_DSYNC modes, replacing it
>     with a call to a function that derives it from the sync_method
>     variable, now that it has distinct values for these two cases.
>     
>     This means that assign_xlog_sync_method() no longer changes any
>     settings, thus fixing the bug introduced in the change to use a guc
>     enum for wal_sync_method.
> 
> Hmm ... or at least more or less fixed.  Seems like there's no provision
> to close and reopen the file if enableFsync changes.  Not sure if that's
> worth worrying about.

We didn't have that before either, did we? We close it when the sync bit
changes, but not just if we change say between fsync() and fdatasync().
Is there any actual reason we'd want to close it?

//Magnus


Re: Fairly serious bug induced by latest guc enum changes

От
Tom Lane
Дата:
Magnus Hagander <magnus@hagander.net> writes:
> Tom Lane wrote:
>> Hmm ... or at least more or less fixed.  Seems like there's no provision
>> to close and reopen the file if enableFsync changes.  Not sure if that's
>> worth worrying about.

> We didn't have that before either, did we?

No, so I think it's a pre-existing bug.

> We close it when the sync bit
> changes, but not just if we change say between fsync() and fdatasync().
> Is there any actual reason we'd want to close it?

The point is that if you turn the fsync GUC on or off while using a wal
sync mode that requires supplying an option flag to open(), then really
you ought to close the WAL file and re-open it with the new correct
option flags.  The fact that we're not doing that implies that the
effects of a change in fsync might not fully take effect until the next
WAL segment is started.  Whether this is worth fixing isn't real clear.
        regards, tom lane


Re: Fairly serious bug induced by latest guc enum changes

От
Magnus Hagander
Дата:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Tom Lane wrote:
>>> Hmm ... or at least more or less fixed.  Seems like there's no provision
>>> to close and reopen the file if enableFsync changes.  Not sure if that's
>>> worth worrying about.
> 
>> We didn't have that before either, did we?
> 
> No, so I think it's a pre-existing bug.

Ok, at least I'm reading the code right.


>> We close it when the sync bit
>> changes, but not just if we change say between fsync() and fdatasync().
>> Is there any actual reason we'd want to close it?
> 
> The point is that if you turn the fsync GUC on or off while using a wal
> sync mode that requires supplying an option flag to open(), then really
> you ought to close the WAL file and re-open it with the new correct
> option flags.  The fact that we're not doing that implies that the
> effects of a change in fsync might not fully take effect until the next
> WAL segment is started.  Whether this is worth fixing isn't real clear.

What scenario does it actually happen in, though? Doesn't the check: if (get_sync_bit(sync_method) !=
get_sync_bit(new_sync_method))

take care of that? If the sync bit changed, we close the file?


Or are you talking about changing the variable "fsync"? If so, doesn't
"fsync=off" also change the behavior of other parts of the code, so it's
not just WAL, which means it'd be pretty unsafe *anyway* unless you
actually "sync" the disks, and not just fsync?

//Magnus


//Magnus


Re: Fairly serious bug induced by latest guc enum changes

От
Tom Lane
Дата:
Magnus Hagander <magnus@hagander.net> writes:
> Or are you talking about changing the variable "fsync"? If so, doesn't
> "fsync=off" also change the behavior of other parts of the code, so it's
> not just WAL, which means it'd be pretty unsafe *anyway* unless you
> actually "sync" the disks, and not just fsync?

No, because the other uses of it are controlling whether to issue
fsync() calls dynamically.  The use in get_sync_bit is the only one
that sets persistent state.  In fact md.c goes out of its way to ensure
that changing fsync on the fly behaves as expected.
        regards, tom lane


Re: Fairly serious bug induced by latest guc enum changes

От
Magnus Hagander
Дата:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Or are you talking about changing the variable "fsync"? If so, doesn't
>> "fsync=off" also change the behavior of other parts of the code, so it's
>> not just WAL, which means it'd be pretty unsafe *anyway* unless you
>> actually "sync" the disks, and not just fsync?
> 
> No, because the other uses of it are controlling whether to issue
> fsync() calls dynamically.  The use in get_sync_bit is the only one
> that sets persistent state.  In fact md.c goes out of its way to ensure
> that changing fsync on the fly behaves as expected.

Not having looked at md.c (I confess...) but don't we have a problem in
case we have closed the file without fsyncing it, and then change the
fsync parameter?

Either way, I see your point, but I doubt it's worth getting upset over.
Funning with fsync=off in the first place is bad, and if it takes you
one WAL segment to "recover", I think that's acceptable...

//Magnus



Re: Fairly serious bug induced by latest guc enum changes

От
Tom Lane
Дата:
Magnus Hagander <magnus@hagander.net> writes:
> Not having looked at md.c (I confess...) but don't we have a problem in
> case we have closed the file without fsyncing it, and then change the
> fsync parameter?

Well, we don't promise to retroactively fsync stuff we didn't before;
and I wouldn't expect that to happen if I were changing the setting.
What I *would* expect is that the system immediately starts to act
according to the new setting, and that's not true as the code stands.

As you say, the whole thing is pretty dubious from a data safety
standpoint anyway.  What I am concerned about here is people trying to
compare performance measurements under different settings, and not being
aware that the system's behavior doesn't change when they tell it to.
        regards, tom lane


Re: Fairly serious bug induced by latest guc enum changes

От
Magnus Hagander
Дата:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Not having looked at md.c (I confess...) but don't we have a problem in
>> case we have closed the file without fsyncing it, and then change the
>> fsync parameter?
> 
> Well, we don't promise to retroactively fsync stuff we didn't before;
> and I wouldn't expect that to happen if I were changing the setting.
> What I *would* expect is that the system immediately starts to act
> according to the new setting, and that's not true as the code stands.
> 
> As you say, the whole thing is pretty dubious from a data safety
> standpoint anyway.  What I am concerned about here is people trying to
> compare performance measurements under different settings, and not being
> aware that the system's behavior doesn't change when they tell it to.

Well, if they're running a performance measure that generates <16Mb
data, I don't think they'll get very usable numbers anyway...

//Magnus