Re: [proposal] de-TOAST'ing using a iterator

Поиск
Список
Период
Сортировка
От John Naylor
Тема Re: [proposal] de-TOAST'ing using a iterator
Дата
Msg-id CACPNZCsKfHMg4uW5Jk+1Od=U4mcod6fWybeYkz0kHVAoMxWUzg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [proposal] de-TOAST'ing using a iterator  (Binguo Bao <djydewang@gmail.com>)
Ответы Re: [proposal] de-TOAST'ing using a iterator  (Binguo Bao <djydewang@gmail.com>)
Список pgsql-hackers
On Tue, Jul 30, 2019 at 8:20 PM Binguo Bao <djydewang@gmail.com> wrote:
>
> John Naylor <john.naylor@2ndquadrant.com> 于2019年7月29日周一 上午11:49写道:
>>
>> 1). For every needle comparison, text_position_next_internal()
>> calculates how much of the value is needed and passes that to
>> detoast_iterate(), which then calculates if it has to do something or
>> not. This is a bit hard to follow. There might also be a performance
>> penalty -- the following is just a theory, but it sounds plausible:
>> The CPU can probably correctly predict that detoast_iterate() will
>> usually return the same value it did last time, but it still has to
>> call the function and make sure, which I imagine is more expensive
>> than advancing the needle. Ideally, we want to call the iterator only
>> if we have to.
>>
>> In the attached patch (applies on top of your v5),
>> text_position_next_internal() simply compares hptr to the detoast
>> buffer limit, and calls detoast_iterate() until it can proceed. I
>> think this is clearer.
>
>
> Yes, I think this is a general scenario where the caller continually
> calls detoast_iterate until gets enough data, so I think such operations can
> be extracted as a macro, as I did in patch v6. In the macro, the detoast_iterate
> function is called only when the data requested by the caller is greater than the
> buffer limit.

I like the use of a macro here. However, I think we can find a better
location for the definition. See the header comment of fmgr.h:
"Definitions for the Postgres function manager and function-call
interface." Maybe tuptoaster.h is as good a place as any?

>> I also noticed that no one updates or looks at
>> "toast_iter.done" so I removed that as well.
>
>
> toast_iter.done is updated when the buffer limit reached the buffer capacity now.
> So, I added it back.

Okay.

>> 2). detoast_iterate() and fetch_datum_iterate() return a value but we
>> don't check it or do anything with it. Should we do something with it?
>> It's also not yet clear if we should check the iterator state instead
>> of return values. I've added some XXX comments as a reminder. We
>> should also check the return value of pglz_decompress_iterate().
>
>
> IMO, we need to provide users with a simple iterative interface.
> Using the required data pointer to compare with the buffer limit is an easy way.
> And the application scenarios of the iterator are mostly read operations.
> So I think there is no need to return a value, and the iterator needs to throw an
> exception for some wrong calls, such as all the data have been iterated,
> but the user still calls the iterator.

Okay, and see these functions now return void. The orignal
pglz_decompress() returned a value that was check against corruption.
Is there a similar check we can do for the iterative version?

>> 3). Speaking of pglz_decompress_iterate(), I diff'd it with
>> pglz_decompress(), and I have some questions on it:
>>
>> a).
>> + srcend = (const unsigned char *) (source->limit == source->capacity
>> ? source->limit : (source->limit - 4));
>>
>> What does the 4 here mean in this expression?
>
>
> Since we fetch chunks one by one, if we make srcend equals to the source buffer limit,
> In the while loop "while (sp < srcend && dp < destend)", sp may exceed the source buffer limit and read unallocated
bytes.

Why is this? That tells me the limit is incorrect. Can the setter not
determine the right value?

> Giving a four-byte buffer can prevent sp from exceeding the source buffer limit.

Why 4? That's a magic number. Why not 2, or 27?

> If we have read all the chunks, we don't need to be careful to cross the border,
> just make srcend equal to source buffer limit. I've added comments to explain it in patch v6.

That's a good thing to comment on, but it doesn't explain why. This
logic seems like a band-aid and I think a committer would want this to
be handled in a more principled way.

>> Is it possible it's
>> compensating for this bit in init_toast_buffer()?
>>
>> + buf->limit = VARDATA(buf->buf);
>>
>> It seems the initial limit should also depend on whether the datum is
>> compressed, right? Can we just do this:
>>
>> + buf->limit = buf->position;
>
>
> I'm afraid not. buf->position points to the data portion of the buffer, but the beginning of
> the chunks we read may contain header information. For example, for compressed data chunks,
> the first four bytes record the size of raw data, this means that limit is four bytes ahead of position.
> This initialization doesn't cause errors, although the position is less than the limit in other cases.
> Because we always fetch chunks first, then decompress it.

I see what you mean now. This could use a comment or two to explain
the stated constraints may not actually be satisfied at
initialization.

>> b).
>> - while (sp < srcend && dp < destend)
>> ...
>> + while (sp + 1 < srcend && dp < destend &&
>> ...
>>
>> Why is it here "sp + 1"?
>
>
> Ignore it, I set the inactive state of detoast_iter->ctrl to 8 in patch v6 to
> achieve the purpose of parsing ctrl correctly every time.

Please explain further. Was the "sp + 1" correct behavior (and why),
or only for debugging setting ctrl/c correctly? Also, I don't think
the new logic for the ctrl/c variables is an improvement:

1. iter->ctrlc is intialized with '8' (even in the uncompressed case,
which is confusing). Any time you initialize with something not 0 or
1, it's a magic number, and here it's far from where the loop variable
is used. This is harder to read.

2. First time though the loop, iter->ctrlc = 8, which immediately gets
set back to 0.

3. At the end of the loop, iter->ctrl/c are unconditionally set. In
v5, there was a condition which would usually avoid this copying of
values through pointers.

>> 4. Note that varlena.c has a static state variable, and a cleanup
>> function that currently does:
>>
>> static void
>> text_position_cleanup(TextPositionState *state)
>> {
>> /* no cleanup needed */
>> }
>>
>> It seems to be the detoast iterator could be embedded in this state
>> variable, and then free-ing can happen here. That has a possible
>> advantage that the iterator struct would be on the same cache line as
>> the state data. That would also remove the need to pass "iter" as a
>> parameter, since these functions already pass "state". I'm not sure if
>> this would be good for other users of the iterator, so maybe we can
>> hold off on that for now.
>
>
> Good idea. I've implemented it in patch v6.

That's better, and I think we can take it a little bit farther.

1. Notice that TextPositionState is allocated on the stack in
text_position(), which passes both the "state" pointer and the "iter"
pointer to text_position_setup(), and only then sets state->iter =
iter. We can easily set this inside text_position(). That would get
rid of the need for other callers to pass NULL iter to
text_position_setup().

2. DetoastIteratorData is fixed size, so I see no reason to allocate
it on the heap. We could allocate it on the stack in text_pos(), and
pass the pointer to create_detoast_iterator() (in this case maybe a
better name is init_detoast_iterator), which would return a bool to
tell text_pos() whether to pass down the pointer or a NULL. The
allocation of other structs (toast buffer and fetch iterator) probably
can't be changed without more work.

>> 5. Would it be a good idea to add tests (not always practical), or
>> more Assert()'s? You probably already know this, but as a reminder
>> it's good to develop with asserts enabled, but never build with them
>> for performance testing.
>
>
> I've added more Assert()'s to check iterator state.

Okay.

> BTW, I found that iterators come in handy for json/jsonb's find field value or get array elements operations.
> I will continue to optimize the json/jsonb query based on the detoast iterator patch.

That will be an interesting use case.

There are other aspects of the patch I should investigate, but I'll
put that off for another time. Commitfest is over, but note that
review can happen at any time. I'll continue to do so as time permits.

--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Hash join explain is broken
Следующее
От: Sergei Kornilov
Дата:
Сообщение: Re: complier warnings from ecpg tests