Re: pg_dump dump catalog ACLs

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: pg_dump dump catalog ACLs
Дата
Msg-id 20160425043909.GW10850@tamriel.snowman.net
обсуждение исходный текст
Ответ на Re: pg_dump dump catalog ACLs  (Noah Misch <noah@leadboat.com>)
Ответы Re: pg_dump dump catalog ACLs  (Noah Misch <noah@leadboat.com>)
Re: pg_dump dump catalog ACLs  (Noah Misch <noah@leadboat.com>)
Список pgsql-hackers
Noah, all,

* Noah Misch (noah@leadboat.com) wrote:
> On Fri, Apr 22, 2016 at 12:31:41PM -0400, Stephen Frost wrote:
> > After looking through the code a bit, I realized that there are a lot of
> > object types which don't have ACLs at all but which exist in pg_catalog
> > and were being analyzed because the bitmask for pg_catalog included ACLs
> > and therefore was non-zero.
> >
> > Clearing that bit for object types which don't have ACLs improved the
> > performance for empty databases quite a bit (from about 3s to a bit
> > under 1s on my laptop).  That's a 42-line patch, with comment lines
> > being half of that, which I'll push once I've looked into the other
> > concerns which were brought up on this thread.
>
> That's good news.

Attached patch-set includes this change in patch #2.

Patch #1 is the fix for the incorrect WHERE clause.

> > Much of the remaining inefficiancy is how we query for column
> > information one table at a time (that appears to be around 300ms of the
> > 900ms or so total time).  I'm certainly interested in improving that but
> > that would require adding more complex data structures to pg_dump than
> > what we use currently (we'd want to grab all of the columns we care
> > about in an entire schema and store it locally and then provide a way to
> > look up that information, etc), so I'm not sure it'd be appropriate to
> > do now.
>
> I'm not sure, either; I'd need to see more to decide.  If I were you, I would
> draft a patch to the point where the community can see the complexity and the
> performance change.  That should be enough to inform the choice among moving
> forward with the proposed complexity, soliciting other designs, reverting the
> original changes, or accepting for 9.6 the slowdown as it stands at that time.
> Other people may have more definite opinions already.

I'll look at doing this once I'm done with the regression test
improvements (see below).

* Noah Misch (noah@leadboat.com) wrote:
> On Fri, Apr 22, 2016 at 08:24:53PM -0400, Stephen Frost wrote:
> > * Noah Misch (noah@leadboat.com) wrote:
> > > I wrote the attached test script to verify which types of ACLs dump/reload
> > > covers.  Based on the commit message, I expected these results:
> > >
> > >   Dumped: type, class, attribute, proc, largeobject_metadata,
> > >       foreign_data_wrapper, foreign_server,
> > >       language(in-extension), namespace(in-extension)
> > >   Not dumped: database, tablespace,
> > >       language(from-initdb), namespace(from-initdb)
> > >
> > > Did I interpret the commit message correctly?  The script gave these results:
> > >
> > >   Dumped: type, class, attribute, namespace(from-initdb)
> > >   Not dumped: database, tablespace, proc, language(from-initdb)
> >
> > You interpreted the commit message correctly and in a number of cases
> > the correct results are generated, but there's an issue in the WHERE
> > clause being used for some of the object types.
>
> I'm wondering whether it would be a slightly better design to treat
> language(from-initdb) like language(in-extension).  Likewise for namespaces.
> The code apparently already exists to dump from-initdb namespace ACLs.  Is
> there a user interface design reason to want to distinguish the from-initdb
> and in-extension cases?

Reviewing the code, we do record from-initdb namespace privileges, and
we also record in-extension namespace privileges (see
ExecGrant_Namespace()).

We also record from-initdb language ACLs (initdb.c:2071) and
in-extension language ACLs (see ExecGrant_Language()), so I'm not
entirely sure what, if any, issue exists here either.

For both, we also grab the ACL info vs. pg_init_privs in pg_dump.

The issue in these cases is a bit of an interesting one- they are not
part of a namespace; this patch was focused on allowing users to modify,
specifically, the ACLs of objects in the 'pg_catalog' namespace.

For languages, I believe that means that we simply need to modify the
selectDumpableProcLang() function to use the same default we use for the
'pg_catalog' namespace- DUMP_COMPONENT_ACL, instead of
DUMP_COMPONENT_NONE.

What's not clear to me is what, if any, issue there is with namespaces.

Certainly, in my testing at least, if you do:

REVOKE CREATE ON SCHEMA public FROM public;

Then you get the appropriate commands from pg_dump to implement the
resulting ACLs on the public schema.  If you change the permissions back
to match what is there at initdb-time (or you just don't change them),
then there aren't any GRANT or REVOKE commands from pg_dump, but that's
correct, isn't it?

> > That's relatively straight-forward to fix, but I'm going to put
> > together a series of TAP tests to go with these fixes.  While I tested
> > various options and bits and pieces of the code while developing, this
> > really needs a proper test suite that runs through all of these
> > permutations with each change.
>
> Sounds great.  Thanks.
>
> > I'm planning to have that done by the end of the weekend.

In the attached patch-set, patch #3 includes support for

src/bin/pg_dump: make check

implemented using the TAP testing system.  There are a total of 360
tests, generally covering:

Various invalid sets of command-line options.

Valid command-line options (generally independently used):

  (no options- defaults)
  --clean
  --clean --if-exists
  --data-only
  --format=c (tested with pg_restore)
  --format=d (tested with pg_restore)
  --format=t (tested with pg_restore)
  --format=d --jobs=2 (tested with pg_restore)
  --exclude-schema
  --exclude-table
  --no-privileges
  --no-owner
  --schema
  --schema-only

Note that this is only including tests for basic schemas, tables, data,
and privileges, so far.  I believe it's pretty obvious that we want to
include all object types and include testing of extensions, I just
haven't gotten there yet and figured now would be a good time to solicit
feedback on the approach I've used.

I'm not sure how valuable it is to test all the different combinations
of command-line options, though clearly some should be tested (eg:
include-schema, exclude table in that schema, and similar cases).

I'm planning to work on adding in other object types and, once that's
more-or-less complete, adding in a test for extensions and then adding
tests for GRANT/REVOKE on from-initdb and in-extension objects.

Thoughts?

Thanks!

Stephen

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Why doesn't src/backend/port/win32/socket.c implement bind()?
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: checkpoint_flush_after documentation inconsistency