Allow pg_archivecleanup to ignore extensions

From Greg Smith
Subject Allow pg_archivecleanup to ignore extensions
Date
Msg-id 4D50F776.6080701@2ndquadrant.com
Responses Re: Allow pg_archivecleanup to ignore extensions  (Euler Taveira de Oliveira <euler@timbira.com>)
Re: Allow pg_archivecleanup to ignore extensions  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
One bit of feedback I keep getting from people who archive their WAL
files is that the fairly new pg_archivecleanup utility doesn't handle
the case where those archives are compressed.  The users who care about
compression are often the same ones with giant archives that they
struggle to clean up, so they would certainly appreciate having a
bundled utility to take care of that.

The attached patch adds an option to the utility that provides this
capability.  It just strips a provided extension off any
matching file it considers before running the test for whether it should
be deleted or not.  It includes updates to the usage message and some
docs about how this might be used.  Code by Jaime Casanova and myself.
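
To make the matching rule concrete, here is a minimal standalone sketch of it
in C.  It is not the patch code itself: the helper names strip_extension and
is_removable, the WAL_NAME_LEN constant, and the 1024-byte buffer are
assumptions made for illustration.  The idea is the same, though: drop the
extension if present, then apply the usual 24-hex-digit test and compare
everything after the timeline part against the cutoff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed length of a WAL segment name: 8 hex digits each for timeline, log, segment. */
#define WAL_NAME_LEN 24

/*
 * Copy name into buf, dropping ext from the end when it is present.
 * Returns false if name does not fit into buf.
 */
bool
strip_extension(const char *name, const char *ext, char *buf, size_t buflen)
{
    size_t      namelen;
    size_t      extlen = (ext != NULL) ? strlen(ext) : 0;

    assert(name != NULL && buf != NULL);

    namelen = strlen(name);
    if (namelen >= buflen)
        return false;
    memcpy(buf, name, namelen + 1);     /* includes the terminating NUL */

    if (extlen > 0 && namelen > extlen &&
        strcmp(buf + namelen - extlen, ext) == 0)
        buf[namelen - extlen] = '\0';
    return true;
}

/*
 * True if name, with ext stripped, looks like a WAL segment file that sorts
 * before cutoff.  The first 8 characters (the timeline) are ignored.
 */
bool
is_removable(const char *name, const char *ext, const char *cutoff)
{
    char        base[1024];     /* hypothetical stand-in for MAXPGPATH */

    if (strlen(cutoff) < WAL_NAME_LEN)
        return false;
    if (!strip_extension(name, ext, base, sizeof(base)))
        return false;

    return strlen(base) == WAL_NAME_LEN &&
           strspn(base, "0123456789ABCDEF") == WAL_NAME_LEN &&
           strcmp(base + 8, cutoff + 8) < 0;
}
```

With a cutoff of 000000010000000000000028 and an extension of .gz, this
accepts 000000010000000000000025.gz and rejects both
000000010000000000000028.gz and the .backup.gz history file, because the
latter is still longer than 24 characters once .gz is removed.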

Here's an example of it working:

$ psql -c "show archive_command"
                  archive_command
----------------------------------------------------
 cp -i %p archive/%f < /dev/null && gzip archive/%f
[Yes, I know that can be written more cleanly.  I call external scripts
with more serious error handling than you can put into a single command
line for this sort of thing in production.]

$ psql -c "select pg_start_backup('test',true)"
$ psql -c "select pg_stop_backup()"
$ psql -c "checkpoint"
$ psql -c "select pg_switch_xlog()"

$ cd $PGDATA/archive
$ ls
000000010000000000000025.gz
000000010000000000000026.gz
000000010000000000000027.gz
000000010000000000000028.00000020.backup.gz
000000010000000000000028.gz
000000010000000000000029.gz

$ pg_archivecleanup -d -x .gz `pwd` 000000010000000000000028.00000020.backup
pg_archivecleanup: keep WAL file "/home/gsmith/pgwork/data/archivecleanup/archive/000000010000000000000028" and later
pg_archivecleanup: removing file "/home/gsmith/pgwork/data/archivecleanup/archive/000000010000000000000025.gz"
pg_archivecleanup: removing file "/home/gsmith/pgwork/data/archivecleanup/archive/000000010000000000000027.gz"
pg_archivecleanup: removing file "/home/gsmith/pgwork/data/archivecleanup/archive/000000010000000000000026.gz"
$ ls
000000010000000000000028.00000020.backup.gz
000000010000000000000028.gz
000000010000000000000029.gz
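
Note that the .backup name given on the command line above has no .gz on
it, even though the file on disk does.  The following is a hypothetical sketch
of why that matters, assuming the cutoff is taken from the first 24 characters
of an argument that is either a plain segment name or a name ending in
.backup.  The function name cutoff_from_argument and both length constants
are made up for illustration and are not the utility's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define WAL_NAME_LEN    24  /* assumed: 8 hex digits x 3 */
#define BACKUP_NAME_LEN 40  /* assumed: 24 + ".00000020" (9) + ".backup" (7) */

/*
 * Derive the 24-character cutoff segment name from a command-line argument.
 * Returns false when the argument is neither a plain segment name nor a
 * .backup history file name of the expected length.
 */
bool
cutoff_from_argument(const char *arg, char cutoff[WAL_NAME_LEN + 1])
{
    size_t      len = strlen(arg);
    bool        plain = (len == WAL_NAME_LEN);
    bool        backup = (len == BACKUP_NAME_LEN &&
                          strcmp(arg + len - 7, ".backup") == 0);

    if (!(plain || backup))
        return false;

    /* The leading 24 characters must all be uppercase hex digits. */
    if (strspn(arg, "0123456789ABCDEF") != WAL_NAME_LEN)
        return false;

    memcpy(cutoff, arg, WAL_NAME_LEN);
    cutoff[WAL_NAME_LEN] = '\0';
    return true;
}
```

Under those assumptions, an argument with .gz appended fails the length
check and is rejected, while the bare .backup name yields
000000010000000000000028, matching the "keep WAL file" line in the debug
output.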

We recently got some on-list griping that pg_standby doesn't handle
archive files that are compressed, either.  Given how the job I'm
working on this week is going, I'll probably have to add that feature
next.  That's actually an easier source code hack than this one, because
of how the pg_standby code modularizes the concept of a restore command.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


diff --git a/contrib/pg_archivecleanup/pg_archivecleanup.c b/contrib/pg_archivecleanup/pg_archivecleanup.c
index 7989207..a95b659 100644
*** a/contrib/pg_archivecleanup/pg_archivecleanup.c
--- b/contrib/pg_archivecleanup/pg_archivecleanup.c
*************** const char *progname;
*** 36,41 ****
--- 36,42 ----

  /* Options and defaults */
  bool        debug = false;        /* are we debugging? */
+ char       *additional_ext = NULL;    /* Extension to remove from filenames */

  char       *archiveLocation;    /* where to find the archive? */
  char       *restartWALFileName; /* the file from which we can restart restore */
*************** static void
*** 93,105 ****
--- 94,135 ----
  CleanupPriorWALFiles(void)
  {
      int            rc;
+     int            chop_at;
      DIR           *xldir;
      struct dirent *xlde;
+     char        walfile[MAXPGPATH];

      if ((xldir = opendir(archiveLocation)) != NULL)
      {
          while ((xlde = readdir(xldir)) != NULL)
          {
+             strncpy(walfile, xlde->d_name, MAXPGPATH);
+             /*
+              * Remove any specified additional extension from the filename
+              * before testing it against the conditions below.
+              */
+             if (additional_ext)
+             {
+                 chop_at = strlen(walfile) - strlen(additional_ext);
+                 /*
+                  * Only chop if this is long enough to be a file name and the
+                  * extension matches.
+                  */
+                 if ((chop_at >= (XLOG_DATA_FNAME_LEN - 1)) &&
+                     (strcmp(walfile + chop_at,additional_ext)==0))
+                 {
+                     walfile[chop_at] = '\0';
+                     /*
+                      * This is too chatty even for regular debug output, but
+                      * leaving it in for program testing.
+                      */
+                     if (false)
+                         fprintf(stderr,
+                             "removed extension='%s' from file=%s result=%s\n",
+                             additional_ext,xlde->d_name,walfile);
+                 }
+             }
+
              /*
               * We ignore the timeline part of the XLOG segment identifiers in
               * deciding whether a segment is still needed.    This ensures that
*************** CleanupPriorWALFiles(void)
*** 113,122 ****
               * file. Note that this means files are not removed in the order
               * they were originally written, in case this worries you.
               */
!             if (strlen(xlde->d_name) == XLOG_DATA_FNAME_LEN &&
!             strspn(xlde->d_name, "0123456789ABCDEF") == XLOG_DATA_FNAME_LEN &&
!                 strcmp(xlde->d_name + 8, exclusiveCleanupFileName + 8) < 0)
              {
                  snprintf(WALFilePath, MAXPGPATH, "%s/%s",
                           archiveLocation, xlde->d_name);
                  if (debug)
--- 143,156 ----
               * file. Note that this means files are not removed in the order
               * they were originally written, in case this worries you.
               */
!             if (strlen(walfile) == XLOG_DATA_FNAME_LEN &&
!             strspn(walfile, "0123456789ABCDEF") == XLOG_DATA_FNAME_LEN &&
!                 strcmp(walfile + 8, exclusiveCleanupFileName + 8) < 0)
              {
+                 /*
+                  * Use the original file name again now, including any extension
+                  * that might have been chopped off before testing the sequence.
+                  */
                  snprintf(WALFilePath, MAXPGPATH, "%s/%s",
                           archiveLocation, xlde->d_name);
                  if (debug)
*************** usage(void)
*** 214,219 ****
--- 248,254 ----
             "  pg_archivecleanup /mnt/server/archiverdir 000000010000000000000010.00000020.backup\n");
      printf("\nOptions:\n");
      printf("  -d                 generates debug output (verbose mode)\n");
+     printf("  -x EXT             cleanup files if they have this same extension\n");
      printf("  --help             show this help, then exit\n");
      printf("  --version          output version information, then exit\n");
      printf("\nReport bugs to <pgsql-bugs@postgresql.org>.\n");
*************** main(int argc, char **argv)
*** 241,253 ****
          }
      }

!     while ((c = getopt(argc, argv, "d")) != -1)
      {
          switch (c)
          {
              case 'd':            /* Debug mode */
                  debug = true;
                  break;
              default:
                  fprintf(stderr, "Try \"%s --help\" for more information.\n", progname);
                  exit(2);
--- 276,291 ----
          }
      }

!     while ((c = getopt(argc, argv, "x:d")) != -1)
      {
          switch (c)
          {
              case 'd':            /* Debug mode */
                  debug = true;
                  break;
+             case 'x':
+                 additional_ext = optarg;
+                 break;
              default:
                  fprintf(stderr, "Try \"%s --help\" for more information.\n", progname);
                  exit(2);
diff --git a/doc/src/sgml/pgarchivecleanup.sgml b/doc/src/sgml/pgarchivecleanup.sgml
index 725f3ed..0c215fb 100644
*** a/doc/src/sgml/pgarchivecleanup.sgml
--- b/doc/src/sgml/pgarchivecleanup.sgml
*************** pg_archivecleanup:  removing file "archi
*** 98,103 ****
--- 98,118 ----
        </listitem>
       </varlistentry>

+      <varlistentry>
+       <term><option>-x</option> <replaceable>extension</></term>
+       <listitem>
+        <para>
+         When using the program as a standalone utility, provide an extension
+         that will be stripped from all file names before deciding if they
+         should be deleted.  This is typically useful for cleaning up archives
+         that have been compressed during storage, and therefore have had an
+         extension added by the compression program.  Note that the
+         <filename>.backup</> file name passed to the program should not
+         include the extension.
+        </para>
+       </listitem>
+      </varlistentry>
+
      </variablelist>
     </para>

