Thread: vacuumdb -f and -j options (was Question / requests.)

vacuumdb -f and -j options (was Question / requests.)

From
Amit Kapila
Date:
On Fri, Oct 7, 2016 at 10:16 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> On Wed, Oct 5, 2016 at 10:58 AM, Francisco Olarte
>> I don't know, but it seems like the documentation for vacuumdb
>> currently says, more or less, "Hey, if you use -j with -f, it may not
>> work!", which seems unacceptable to me.  It should be the job of the
>> person writing the feature to make it work in all cases, not the job
>> of the person using the feature to work around the problem when it
>> doesn't.
>
> The most interesting use case of vacuumdb is lazy vacuuming, I think, so
> committing that patch as it was submitted previously was a good step
> forward even if it didn't handle VACUUM FULL 100%.
>
> I agree that it's better to have both modes Just Work in parallel, which
> is the point of this subsequent patch.  So let's move forward.  I
> support Francisco's effort to make -f work with -j.  I don't have a
> strong opinion on which of the various proposals presented so far is the
> best way to implement it, but let's figure that out and get it done.
>

After reading Francisco's proposal [1], I don't think it is directly
trying to make -f and -j work together.  He is proposing to make it
work by providing some new options.  As you are wondering upthread, I
think it seems reasonable to disallow -f with parallel vacuuming if no
tables are specified.


[1] -
https://www.postgresql.org/message-id/CA%2BbJJbx8%2BSKBU%3DXUE%2BHxZHysh9226iMfTnA69AznwRTOEGtR7Q%40mail.gmail.com
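
A minimal sketch of the restriction suggested above, assuming the option values have already been parsed. The function name `check_full_parallel` and the message text are hypothetical and are not taken from vacuumdb itself:

```python
from typing import Optional, Sequence


def check_full_parallel(full: bool, jobs: int, tables: Sequence[str]) -> Optional[str]:
    """Return an error message if the option combination should be refused.

    Hypothetical rule: VACUUM FULL (-f) with more than one job (-j) is
    refused only when no tables were listed with -t.
    """
    if full and jobs > 1 and not tables:
        return "cannot use -f with -j greater than 1 unless tables are given with -t"
    return None  # combination is accepted


# -f with -j 4 and no table list is refused; with a table list it is accepted.
print(check_full_parallel(True, 4, []))
print(check_full_parallel(True, 4, ["public.orders"]))
```

A check like this only narrows the problem: a table list can still name catalog tables that lock against each other.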

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: vacuumdb -f and -j options (was Question / requests.)

From
Michael Paquier
Date:
On Sat, Oct 8, 2016 at 9:12 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Oct 7, 2016 at 10:16 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> Robert Haas wrote:
>>> On Wed, Oct 5, 2016 at 10:58 AM, Francisco Olarte
>>> I don't know, but it seems like the documentation for vacuumdb
>>> currently says, more or less, "Hey, if you use -j with -f, it may not
>>> work!", which seems unacceptable to me.  It should be the job of the
>>> person writing the feature to make it work in all cases, not the job
>>> of the person using the feature to work around the problem when it
>>> doesn't.
>>
>> The most interesting use case of vacuumdb is lazy vacuuming, I think, so
>> committing that patch as it was submitted previously was a good step
>> forward even if it didn't handle VACUUM FULL 100%.
>>
>> I agree that it's better to have both modes Just Work in parallel, which
>> is the point of this subsequent patch.  So let's move forward.  I
>> support Francisco's effort to make -f work with -j.  I don't have a
>> strong opinion on which of the various proposals presented so far is the
>> best way to implement it, but let's figure that out and get it done.
>>
>
> After reading Francisco's proposal [1], I don't think it is directly
> trying to make -f and -j work together.  He is proposing to make it
> work by providing some new options.  As you are wondering upthread, I
> think it seems reasonable to disallow -f with parallel vacuuming if no
> tables are specified.

Rather than restricting things completely, I'd like to think that
making both of them work together is the right move in the end.
From what I recall of the vacuumdb code, I agree with Alvaro's
position: it would not be very complicated to vacuum all the
catalogs in one worker and spread the rest of the tables across the
remaining workers. We need to be careful, though, when the user gives
a list of tables via -t that mixes catalog and normal relations.
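
The scheduling idea above can be sketched as follows. This is only an illustration under stated assumptions: the static, up-front assignment and the name `assign_workers` are hypothetical, and catalog tables are recognised simply by the `pg_catalog` schema name. Because it partitions whatever list it is given, it also covers a -t list that mixes catalog and normal relations.

```python
from typing import Dict, List, Tuple

Table = Tuple[str, str]  # (schema name, table name)


def assign_workers(tables: List[Table], jobs: int) -> Dict[int, List[Table]]:
    """Give every pg_catalog table to worker 0; spread the rest round-robin."""
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    plan: Dict[int, List[Table]] = {worker: [] for worker in range(jobs)}
    catalogs = [t for t in tables if t[0] == "pg_catalog"]
    others = [t for t in tables if t[0] != "pg_catalog"]
    # Catalog tables are processed one after another by a single worker,
    # so they never run concurrently with each other.
    plan[0].extend(catalogs)
    # With a single job everything falls back to worker 0.
    targets = list(range(1, jobs)) or [0]
    for i, table in enumerate(others):
        plan[targets[i % len(targets)]].append(table)
    return plan


plan = assign_workers(
    [("pg_catalog", "pg_class"), ("public", "a"), ("public", "b")], jobs=3
)
print(plan)
```

One simplification to note: worker 0 sits idle when the list contains no catalog tables.
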
-- 
Michael



Re: vacuumdb -f and -j options (was Question / requests.)

From
Amit Kapila
Date:
On Sat, Oct 8, 2016 at 5:52 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sat, Oct 8, 2016 at 9:12 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Fri, Oct 7, 2016 at 10:16 PM, Alvaro Herrera
>> <alvherre@2ndquadrant.com> wrote:
>>> Robert Haas wrote:
>>>> On Wed, Oct 5, 2016 at 10:58 AM, Francisco Olarte
>>>> I don't know, but it seems like the documentation for vacuumdb
>>>> currently says, more or less, "Hey, if you use -j with -f, it may not
>>>> work!", which seems unacceptable to me.  It should be the job of the
>>>> person writing the feature to make it work in all cases, not the job
>>>> of the person using the feature to work around the problem when it
>>>> doesn't.
>>>
>>> The most interesting use case of vacuumdb is lazy vacuuming, I think, so
>>> committing that patch as it was submitted previously was a good step
>>> forward even if it didn't handle VACUUM FULL 100%.
>>>
>>> I agree that it's better to have both modes Just Work in parallel, which
>>> is the point of this subsequent patch.  So let's move forward.  I
>>> support Francisco's effort to make -f work with -j.  I don't have a
>>> strong opinion on which of the various proposals presented so far is the
>>> best way to implement it, but let's figure that out and get it done.
>>>
>>
>> After reading Francisco's proposal [1], I don't think it is directly
>> trying to make -f and -j work together.  He is proposing to make it
>> work by providing some new options.  As you are wondering upthread, I
>> think it seems reasonable to disallow -f with parallel vacuuming if no
>> tables are specified.
>
> Rather than restricting things completely, I'd like to think that
> making both of them work together is the right move in the end.
>

Sure, if somebody can come up with a patch which can safely avoid the
deadlock when both -f and -j options are used, then we should go that
way. Otherwise we can block those options from being used together
rather than just having a note in the docs.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: vacuumdb -f and -j options (was Question / requests.)

From
Pavel Stehule
Date:


2016-10-09 7:54 GMT+02:00 Amit Kapila <amit.kapila16@gmail.com>:
> On Sat, Oct 8, 2016 at 5:52 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Sat, Oct 8, 2016 at 9:12 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Fri, Oct 7, 2016 at 10:16 PM, Alvaro Herrera
>>> <alvherre@2ndquadrant.com> wrote:
>>>> Robert Haas wrote:
>>>>> On Wed, Oct 5, 2016 at 10:58 AM, Francisco Olarte
>>>>> I don't know, but it seems like the documentation for vacuumdb
>>>>> currently says, more or less, "Hey, if you use -j with -f, it may not
>>>>> work!", which seems unacceptable to me.  It should be the job of the
>>>>> person writing the feature to make it work in all cases, not the job
>>>>> of the person using the feature to work around the problem when it
>>>>> doesn't.
>>>>
>>>> The most interesting use case of vacuumdb is lazy vacuuming, I think, so
>>>> committing that patch as it was submitted previously was a good step
>>>> forward even if it didn't handle VACUUM FULL 100%.
>>>>
>>>> I agree that it's better to have both modes Just Work in parallel, which
>>>> is the point of this subsequent patch.  So let's move forward.  I
>>>> support Francisco's effort to make -f work with -j.  I don't have a
>>>> strong opinion on which of the various proposals presented so far is the
>>>> best way to implement it, but let's figure that out and get it done.
>>>>
>>>
>>> After reading Francisco's proposal [1], I don't think it is directly
>>> trying to make -f and -j work together.  He is proposing to make it
>>> work by providing some new options.  As you are wondering upthread, I
>>> think it seems reasonable to disallow -f with parallel vacuuming if no
>>> tables are specified.
>>
>> Rather than restricting things completely, I'd like to think that
>> making both of them work together is the right move in the end.
>>
>
> Sure, if somebody can come up with a patch which can safely avoid the
> deadlock when both -f and -j options are used, then we should go that
> way. Otherwise we can block those options from being used together
> rather than just having a note in the docs.

+1

Pavel
 


Re: vacuumdb -f and -j options (was Question / requests.)

From
Francisco Olarte
Date:
On Sat, Oct 8, 2016 at 2:22 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sat, Oct 8, 2016 at 9:12 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Fri, Oct 7, 2016 at 10:16 PM, Alvaro Herrera
>> <alvherre@2ndquadrant.com> wrote:
>>> Robert Haas wrote:
>>>> On Wed, Oct 5, 2016 at 10:58 AM, Francisco Olarte
>>>> I don't know, but it seems like the documentation for vacuumdb
>>>> currently says, more or less, "Hey, if you use -j with -f, it may not
>>>> work!", which seems unacceptable to me.  It should be the job of the
>>>> person writing the feature to make it work in all cases, not the job
>>>> of the person using the feature to work around the problem when it
>>>> doesn't.
>>> The most interesting use case of vacuumdb is lazy vacuuming, I think, so
>>> committing that patch as it was submitted previously was a good step
>>> forward even if it didn't handle VACUUM FULL 100%.
>>>
>>> I agree that it's better to have both modes Just Work in parallel, which
>>> is the point of this subsequent patch.  So let's move forward.  I
>>> support Francisco's effort to make -f work with -j.  I don't have a
>>> strong opinion on which of the various proposals presented so far is the
>>> best way to implement it, but let's figure that out and get it done.
>> After reading Francisco's proposal [1], I don't think it is directly
>> trying to make -f and -j work together.  He is proposing to make it
>> work by providing some new options.  As you are wondering upthread, I
>> think it seems reasonable to disallow -f with parallel vacuuming if no
>> tables are specified.

For me, -f combined with -j is not perfect, but it is better than not
having it. It can deadlock when given certain sets of catalog tables,
either by running it against the full db or through a perverse set of
-t options. But any DBA needing them together should, IMO, have the
resources to write (or have someone else write for him) a 20-line
wrapper that selects the tables and feeds them in via -t. After all,
not every tool/option is for everyone, and everything has its
prerequisites.

What I'm trying to do in my patch is just to give vacuumdb the ability
to build a list of -t options by schema filtering. I felt this was
really easy to do, that it could be used to script around the problem
in a large share of cases, and that it could be useful for other
things too (the ability to vacuum some schemas, or all but some
schemas, seems useful even without -f or -j).
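
A rough sketch of the kind of wrapper and schema filter described above, written as an external script rather than as the proposed patch. The function name `build_vacuumdb_argv` and its parameters are hypothetical; the `(schema, table)` pairs are assumed to come from elsewhere, for instance a query on the `pg_tables` view. Only existing vacuumdb options are used (`--full`, `--jobs`, `--dbname`, `--table`), and names that need quoting are not handled.

```python
from typing import Iterable, List, Optional, Set, Tuple

Table = Tuple[str, str]  # (schema name, table name)


def build_vacuumdb_argv(
    dbname: str,
    tables: Iterable[Table],
    jobs: int,
    include_schemas: Optional[Set[str]] = None,
    exclude_schemas: Optional[Set[str]] = None,
) -> List[str]:
    """Build a vacuumdb command line with one --table option per kept table."""
    if exclude_schemas is None:
        # Assumption: leave out the catalogs, which the thread identifies
        # as the tables involved in the deadlocks.
        exclude_schemas = {"pg_catalog"}
    selected = [
        f"{schema}.{table}"
        for schema, table in tables
        if (include_schemas is None or schema in include_schemas)
        and schema not in exclude_schemas
    ]
    if not selected:
        # Without any --table option vacuumdb would process the whole
        # database, which is exactly the case to avoid here.
        raise ValueError("schema filter selected no tables")
    argv = ["vacuumdb", "--full", "--jobs", str(jobs), "--dbname", dbname]
    for name in selected:
        argv += ["--table", name]
    return argv


tables = [("pg_catalog", "pg_class"), ("public", "orders"), ("audit", "log")]
print(build_vacuumdb_argv("mydb", tables, jobs=4))
```

The resulting list can be passed to `subprocess.run(argv, check=True)`. Refusing to continue on an empty selection is a deliberate choice, since a bare `--full --jobs N` run is the risky combination under discussion.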

> Rather than restricting things completely, I'd like to think that
> making both of them work together is the right move in the end.
> From what I recall of the vacuumdb code, I agree with Alvaro's
> position: it would not be very complicated to vacuum all the
> catalogs in one worker and spread the rest of the tables across the
> remaining workers. We need to be careful, though, when the user gives
> a list of tables via -t that mixes catalog and normal relations.

This is more complicated and needs more involved patching than what I
was trying to do. If someone is going to look at it, I would hold back
until it is done and then look at the code again, as I feel it needs
scarcer postgres-internals talent than mine and the patches are going
to interfere. As the schema filtering is just some string manipulation
code, I feel the proper order is for it to go in last.

I am also against disallowing it completely. Maybe warn a bit more, at
most, or perhaps restrict it when no table list is given. I feel
being able to combine -j and -f is like rocket fuel, messy and
dangerous, but sometimes you need to use it, and if you are careful
all should be fine.

Francisco Olarte.



Re: vacuumdb -f and -j options (was Question / requests.)

From
Amit Kapila
Date:
On Sun, Oct 9, 2016 at 10:59 PM, Francisco Olarte
<folarte@peoplecall.com> wrote:
> On Sat, Oct 8, 2016 at 2:22 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Sat, Oct 8, 2016 at 9:12 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>> After reading Francisco's proposal [1], I don't think it is directly
>>> trying to make -f and -j work together.  He is proposing to make it
>>> work by providing some new options.  As you are wondering upthread, I
>>> think it seems reasonable to disallow -f with parallel vacuuming if no
>>> tables are specified.
>
> For me, -f combined with -j is not perfect, but it is better than not
> having it. It can deadlock when given certain sets of catalog tables,
> either by running it against the full db or through a perverse set of
> -t options. But any DBA needing them together should, IMO, have the
> resources to write (or have someone else write for him) a 20-line
> wrapper that selects the tables and feeds them in via -t. After all,
> not every tool/option is for everyone, and everything has its
> prerequisites.
>

Okay, but I think that doesn't mean it should deadlock when used by a
somewhat naive user.  I am not sure every user who wants to use -f and
-j is smart enough to write a script as you are suggesting.  I think
if more people see your proposal as meaningful and want to leave the
current usage of -f and -j as it is, then we should probably issue a
warning indicating such a risk.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: vacuumdb -f and -j options (was Question / requests.)

From
Francisco Olarte
Date:
On Mon, Oct 10, 2016 at 3:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sun, Oct 9, 2016 at 10:59 PM, Francisco Olarte
> <folarte@peoplecall.com> wrote:
>> For me, -f combined with -j is not perfect, but it is better than not
>> having it. It can deadlock when given certain sets of catalog tables,
>> either by running it against the full db or through a perverse set of
>> -t options. But any DBA needing them together should, IMO, have the
>> resources to write (or have someone else write for him) a 20-line
>> wrapper that selects the tables and feeds them in via -t. After all,
>> not every tool/option is for everyone, and everything has its
>> prerequisites.
> Okay, but I think that doesn't mean it should deadlock when used by a
> somewhat naive user.  I am not sure every user who wants to use -f and
> -j is smart enough to write a script as you are suggesting.  I think
> if more people see your proposal as meaningful and want to leave the
> current usage of -f and -j as it is, then we should probably issue a
> warning indicating such a risk.

I agree. It should NOT deadlock, but sadly it does. And disallowing it
feels wrong. I'm all for emitting a warning whenever it is used, and
even for disallowing it when no table list is given, but I was trying
to avoid the suggestion of just disallowing it always because it may
deadlock (even with a table list it may, as nothing forbids you from
entering a lock-prone set of tables). And some people wrote what I
interpreted as 'throw it out if it is not perfect: either put logic in
to make full parallel work partially in series so it does not
deadlock, or forbid it altogether' (which is IMHO not easy, and I
would dislike it: if I order full parallel, vacuumdb does it, and if
it deadlocks it is because I asked for it; if someone wants that
behaviour, a new '--auto-serial-as-needed' switch could be added).
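
For illustration only, here is one way the behaviour behind such a switch might look. The `--auto-serial-as-needed` switch does not exist in vacuumdb; `vacuum_with_serial_fallback`, `DeadlockDetected` and the `vacuum_one` callback are hypothetical names. The sketch assumes `vacuum_one` raises `DeadlockDetected` when the server aborts its statement because of a deadlock.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


class DeadlockDetected(Exception):
    """Hypothetical error raised by vacuum_one when the server reports a deadlock."""


def vacuum_with_serial_fallback(
    tables: List[str], vacuum_one: Callable[[str], None], jobs: int
) -> List[str]:
    """Vacuum tables in parallel, then retry deadlock victims one at a time.

    Returns the tables that needed the serial retry.
    """

    def attempt(table: str) -> bool:
        try:
            vacuum_one(table)
            return True
        except DeadlockDetected:
            return False

    # Parallel phase: up to `jobs` tables are processed at once.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(attempt, tables))

    # Serial phase: nothing else is running, so these cannot deadlock
    # against each other. Any remaining error propagates to the caller.
    retried = [table for table, ok in zip(tables, results) if not ok]
    for table in retried:
        vacuum_one(table)
    return retried
```

The parallel phase stays fully parallel, in keeping with the preference stated above. Serialisation happens only for tables that actually lost a deadlock.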

Francisco Olarte.