Обсуждение: "buffer too small" or "path too long"?

Поиск
Список
Период
Сортировка

"buffer too small" or "path too long"?

От
Kyotaro Horiguchi
Дата:
Recently we added the error messages "buffer for root directory too
small" and siblings to pg_upgrade.  This means "<new_cluster's
pgdata>/pg_upgrade_output.d" was longer than MAXPGPATH.

I feel that the "root directory" is obscure here, and moreover "buffer
is too small" looks pointless since no user can do something on the
buffer length.  At least I can't read out from the message concretely
what I should do next..

The root cause of the errors is that the user-provided directory path
of new cluster's root was too long.  Anywhich one of the four buffers
is overflowed, it doesn't makes any difference for users and doesn't
offer any further detail to suppoerters/developers.  I see "output
directory path of new cluster too long" clear enough.

Above all, this change reduces the number of messages that need
translation:)

# And the messages are missing trailing line breaks.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Вложения

Re: "buffer too small" or "path too long"?

От
Tom Lane
Дата:
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> The root cause of the errors is that the user-provided directory path
> of new cluster's root was too long.  Anywhich one of the four buffers
> is overflowed, it doesn't makes any difference for users and doesn't
> offer any further detail to suppoerters/developers.  I see "output
> directory path of new cluster too long" clear enough.

+1, but I'm inclined to make it read "... is too long".

> # And the messages are missing trailing line breaks.

I was about to question that, but now I remember that pg_upgrade has
its own logging facility with a different idea about who provides
the trailing newline than common/logging.[hc] has.  Undoubtedly
that's the source of this mistake.  We really need to get pg_upgrade
out of the business of having its own logging conventions.

            regards, tom lane



Re: "buffer too small" or "path too long"?

От
Kyotaro Horiguchi
Дата:
At Mon, 13 Jun 2022 13:25:01 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
> Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> > The root cause of the errors is that the user-provided directory path
> > of new cluster's root was too long.  Anywhich one of the four buffers
> > is overflowed, it doesn't makes any difference for users and doesn't
> > offer any further detail to suppoerters/developers.  I see "output
> > directory path of new cluster too long" clear enough.
> 
> +1, but I'm inclined to make it read "... is too long".

Yeah, I feel so and it is what I wondered about recently when I saw
some complete error messages.  Is that because of the length of the
subject?

> > # And the messages are missing trailing line breaks.
> 
> I was about to question that, but now I remember that pg_upgrade has
> its own logging facility with a different idea about who provides
> the trailing newline than common/logging.[hc] has.  Undoubtedly
> that's the source of this mistake.  We really need to get pg_upgrade
> out of the business of having its own logging conventions.

Yes... I don't find a written reason excluding pg_upgrade in both the
commit 9a374b77fb and or the thread [1].  But I guess that we decided
that we first provide the facility in the best style ignoring the
current impletent in pg_upgrade then let pg_upgrade use it.  So I
think it should emerge in the next cycle?  I'll give it a shot if no
one is willing to do that for now. (I believe it is straightforward..)

[1] https://www.postgresql.org/message-id/941719.1645587865%40sss.pgh.pa.us

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: "buffer too small" or "path too long"?

От
Kyotaro Horiguchi
Дата:
At Tue, 14 Jun 2022 09:48:26 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> At Mon, 13 Jun 2022 13:25:01 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
> > +1, but I'm inclined to make it read "... is too long".
> 
> Yeah, I feel so and it is what I wondered about recently when I saw
> some complete error messages.  Is that because of the length of the
> subject?

And I found that it is alrady done. Thanks!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: "buffer too small" or "path too long"?

От
Michael Paquier
Дата:
On Tue, Jun 14, 2022 at 09:52:52AM +0900, Kyotaro Horiguchi wrote:
> At Tue, 14 Jun 2022 09:48:26 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
>> Yeah, I feel so and it is what I wondered about recently when I saw
>> some complete error messages.  Is that because of the length of the
>> subject?
>
> And I found that it is alrady done. Thanks!

I have noticed this thread and 4e54d23 as a result this morning.  If
you want to spread this style more, wouldn't it be better to do that
in all the places of pg_upgrade where we store paths to files?  I can
see six code paths with log_opts.basedir that could do the same, as of
the attached.  The hardcoded file names have various lengths, and some
of them are quite long making the generated paths more exposed to
being cut in the middle.
--
Michael

Вложения

Re: "buffer too small" or "path too long"?

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> I have noticed this thread and 4e54d23 as a result this morning.  If
> you want to spread this style more, wouldn't it be better to do that
> in all the places of pg_upgrade where we store paths to files?  I can
> see six code paths with log_opts.basedir that could do the same, as of
> the attached.  The hardcoded file names have various lengths, and some
> of them are quite long making the generated paths more exposed to
> being cut in the middle.

Well, I just fixed the ones in make_outputdirs because it seemed weird
that that part of the function was not doing something the earlier parts
did.  I didn't look around for more trouble.

I think that pg_fatal'ing on the grounds of path-too-long once we've
already started the upgrade isn't all that great.  Really we want to
fail on that early on --- so coding make_outputdirs like this is
fine, but maybe we need a different plan for files made later.

            regards, tom lane



Re: "buffer too small" or "path too long"?

От
Tom Lane
Дата:
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> At Mon, 13 Jun 2022 13:25:01 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
>> I was about to question that, but now I remember that pg_upgrade has
>> its own logging facility with a different idea about who provides
>> the trailing newline than common/logging.[hc] has.  Undoubtedly
>> that's the source of this mistake.  We really need to get pg_upgrade
>> out of the business of having its own logging conventions.

> Yes... I don't find a written reason excluding pg_upgrade in both the
> commit 9a374b77fb and or the thread [1].

Well, as far as 9a374b77fb went, Peter had left pg_upgrade out of the
mix in the original creation of common/logging.c, and I didn't think
that dealing with that was a reasonable part of my update patch.

> But I guess that we decided
> that we first provide the facility in the best style ignoring the
> current impletent in pg_upgrade then let pg_upgrade use it.  So I
> think it should emerge in the next cycle?  I'll give it a shot if no
> one is willing to do that for now. (I believe it is straightforward..)

Actually, I spent some time earlier today looking into that, and I can
see why Peter stayed away from it :-(.  There are a few issues:

* The inconsistency with the rest of the world about trailing newlines.
That aspect actually seems fixable fairly easily, and I have a patch
mostly done for it.

* logging.c believes it should prefix every line of output with the
program's name and so on.  This doesn't seem terribly appropriate
for pg_upgrade's use --- at least, not unless we make pg_upgrade
WAY less chatty.  Perhaps that'd be fine, I dunno.

* pg_upgrade's pg_log_v duplicates all (well, most) stdout messages
into the INTERNAL_LOG_FILE log file, something logging.c has no
provision for (and it'd not be too easy to do, because of the C
standard's restrictions on use of va_list).  Personally I'd be okay
with nuking the INTERNAL_LOG_FILE log file from orbit, but I bet
somebody will fight to keep it.

* pg_log_v has also got a bunch of specialized rules around how
to format PG_STATUS message traffic.  Again I wonder how useful
that whole behavior really is, but taking it out would be a big
user-visible change.

In short, it seems like pg_upgrade's logging habits are sufficiently
far out in left field that we couldn't rebase it on top of logging.c
without some seriously large user-visible behavioral changes.
I have better things to spend my time on than advocating for that.

However, the inconsistency in newline handling is a problem:
I found that there are already other bugs with missing or extra
newlines, and it will only get worse if we don't unify that
behavior.  So my inclination for now is to fix that and let the
other issues go.  Patch coming.

            regards, tom lane



Re: "buffer too small" or "path too long"?

От
Bruce Momjian
Дата:
On Mon, Jun 13, 2022 at 10:41:41PM -0400, Tom Lane wrote:
> * logging.c believes it should prefix every line of output with the
> program's name and so on.  This doesn't seem terribly appropriate
> for pg_upgrade's use --- at least, not unless we make pg_upgrade
> WAY less chatty.  Perhaps that'd be fine, I dunno.

pg_upgrade was designed to be chatty because it felt it could fail under
unpredictable circumstances --- I am not sure how true that is today.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson




Re: "buffer too small" or "path too long"?

От
Peter Eisentraut
Дата:
On 14.06.22 03:55, Michael Paquier wrote:
> On Tue, Jun 14, 2022 at 09:52:52AM +0900, Kyotaro Horiguchi wrote:
>> At Tue, 14 Jun 2022 09:48:26 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
>>> Yeah, I feel so and it is what I wondered about recently when I saw
>>> some complete error messages.  Is that because of the length of the
>>> subject?
>>
>> And I found that it is alrady done. Thanks!
> 
> I have noticed this thread and 4e54d23 as a result this morning.  If
> you want to spread this style more, wouldn't it be better to do that
> in all the places of pg_upgrade where we store paths to files?  I can
> see six code paths with log_opts.basedir that could do the same, as of
> the attached.  The hardcoded file names have various lengths, and some
> of them are quite long making the generated paths more exposed to
> being cut in the middle.

We have this problem of long file names being silently truncated all 
over the source code.  Instead of equipping each one of them with a 
length check, why don't we get rid of the fixed-size buffers and 
allocate dynamically, as in the attached patch.
Вложения

Re: "buffer too small" or "path too long"?

От
Robert Haas
Дата:
On Wed, Jun 15, 2022 at 2:51 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> We have this problem of long file names being silently truncated all
> over the source code.  Instead of equipping each one of them with a
> length check, why don't we get rid of the fixed-size buffers and
> allocate dynamically, as in the attached patch.

I've always wondered why we rely on MAXPGPATH instead of dynamic
allocation. It seems pretty lame.

I don't know how much we gain by fixing one place and not all the
others, but maybe it would set a trend.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: "buffer too small" or "path too long"?

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jun 15, 2022 at 2:51 AM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
>> We have this problem of long file names being silently truncated all
>> over the source code.  Instead of equipping each one of them with a
>> length check, why don't we get rid of the fixed-size buffers and
>> allocate dynamically, as in the attached patch.

> I don't know how much we gain by fixing one place and not all the
> others, but maybe it would set a trend.

Yeah, that was what was bugging me about this proposal.  Removing
one function's dependency on MAXPGPATH isn't much of a step forward.

I note also that the patch leaks quite a lot of memory (a kilobyte or
so per pathname, IIRC).  That's probably negligible in this particular
context, but anyplace that was called more than once per program run
would need to be more tidy.

            regards, tom lane



Re: "buffer too small" or "path too long"?

От
Michael Paquier
Дата:
On Wed, Jun 15, 2022 at 02:02:03PM -0400, Tom Lane wrote:
> Yeah, that was what was bugging me about this proposal.  Removing
> one function's dependency on MAXPGPATH isn't much of a step forward.

This comes down to out-of-memory vs path length at the end.  Changing
only the paths of make_outputdirs() without touching all the paths in
check.c and the one in function.c does not sound good to me, as this
increases the risk of failing pg_upgrade in the middle, and that's
what we should avoid, as said upthread.

> I note also that the patch leaks quite a lot of memory (a kilobyte or
> so per pathname, IIRC).  That's probably negligible in this particular
> context, but anyplace that was called more than once per program run
> would need to be more tidy.

Surely.
--
Michael

Вложения

Re: "buffer too small" or "path too long"?

От
Peter Eisentraut
Дата:
On 15.06.22 19:08, Robert Haas wrote:
> On Wed, Jun 15, 2022 at 2:51 AM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
>> We have this problem of long file names being silently truncated all
>> over the source code.  Instead of equipping each one of them with a
>> length check, why don't we get rid of the fixed-size buffers and
>> allocate dynamically, as in the attached patch.
> 
> I've always wondered why we rely on MAXPGPATH instead of dynamic
> allocation. It seems pretty lame.

I think it came in before we had extensible string buffers APIs.