Обсуждение: Missing CFI in iterate_word_similarity()

Поиск
Список
Период
Сортировка

Missing CFI in iterate_word_similarity()

От
Robins Tharakan
Дата:
Hi,

For long strings, iterate_word_similarity() can run into long-running
tight-loops without honouring interrupts or statement_timeouts. For
example:

postgres=# set statement_timeout='1s';
SET
postgres=# select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet';
?column?
----------
(0 rows)
Time: 29615.842 ms (00:29.616)

The associated perf report:

+ 99.98% 0.00% postgres postgres [.] ExecQual
+ 99.98% 0.00% postgres postgres [.] ExecEvalExprSwitchContext
+ 99.98% 0.00% postgres pg_trgm.so [.] strict_word_similarity_commutator_op
+ 99.98% 0.00% postgres pg_trgm.so [.] calc_word_similarity
+ 99.68% 99.47% postgres pg_trgm.so [.] iterate_word_similarity
0.21% 0.03% postgres postgres [.] pg_qsort
0.16% 0.00% postgres [kernel.kallsyms] [k] asm_sysvec_apic_timer_interrupt
0.16% 0.00% postgres [kernel.kallsyms] [k] sysvec_apic_timer_interrupt
0.16% 0.11% postgres [kernel.kallsyms] [k] __softirqentry_text_start
0.16% 0.00% postgres [kernel.kallsyms] [k] irq_exit_rcu

Adding CHECK_FOR_INTERRUPTS() ensures that such queries respond to
statement_timeout & Ctrl-C signals. With the patch applied, the
above query will interrupt more quickly:

postgres=# select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet';
ERROR: canceling statement due to statement timeout
Time: 1000.768 ms (00:01.001)

Please find the patch attached. The patch does not show any performance
regressions when run against the above use-case. Thanks to SQLSmith
for indirectly leading me to this scenario.

-
Robins Tharakan
Amazon Web Services

Вложения

Re: Missing CFI in iterate_word_similarity()

От
Daniel Gustafsson
Дата:
> On 2 Aug 2022, at 04:41, Robins Tharakan <tharakan@gmail.com> wrote:

> For long strings, iterate_word_similarity() can run into long-running
> tight-loops without honouring interrupts or statement_timeouts.

> Adding CHECK_FOR_INTERRUPTS() ensures that such queries respond to
> statement_timeout & Ctrl-C signals. With the patch applied, the
> above query will interrupt more quickly:

Makes sense.  While this might be a bit of a theoretical issue given the
lengths required, the fix is still sane and any such query should honor
statement timeouts (especially in a trusted extension).

> Please find the patch attached. The patch does not show any performance
> regressions when run against the above use-case.

I wasn't able to find one either.

+       CHECK_FOR_INTERRUPTS();
+
        /* Get index of next trigram */
        int         trgindex = trg2indexes[i];

Placing code before declarations will generate a compiler warning, so the check
must go after trgindex is declared.  I've fixed that in the attached to get the
cfbot green.  Marking this ready for committer in the meantime.

Looking at this I also noticed that commit be8a7a68662 changed the check_only
param to instead use a flag value but didn't update all comments.  0002 fixes
that while in there.

--
Daniel Gustafsson        https://vmware.com/


Вложения

Re: Missing CFI in iterate_word_similarity()

От
Tom Lane
Дата:
Daniel Gustafsson <daniel@yesql.se> writes:
> Placing code before declarations will generate a compiler warning, so the check
> must go after trgindex is declared.  I've fixed that in the attached to get the
> cfbot green.  Marking this ready for committer in the meantime.

I noticed the same thing, but sticking the CFI immediately after the
declaration didn't read well either.  I was considering moving it to
the bottom of the loop instead of that.  A possible objection is that
if there's ever a "continue;" in the loop, those iterations would bypass
the CFI; but we don't necessarily need a CFI every time.

            regards, tom lane



Re: Missing CFI in iterate_word_similarity()

От
Daniel Gustafsson
Дата:
> On 2 Sep 2022, at 14:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>> Placing code before declarations will generate a compiler warning, so the check
>> must go after trgindex is declared.  I've fixed that in the attached to get the
>> cfbot green.  Marking this ready for committer in the meantime.
>
> I noticed the same thing, but sticking the CFI immediately after the
> declaration didn't read well either.  I was considering moving it to
> the bottom of the loop instead of that.

I was contemplating that too, but kept it at the top after seeing quite a few
examples of that in other contrib modules (like amcheck/verify_nbtree.c and
pg_visibility/pg_visibility.c).  I don't have any strong feelings either way,
I'm happy to move it last.

> A possible objection is that
> if there's ever a "continue;" in the loop, those iterations would bypass
> the CFI; but we don't necessarily need a CFI every time.

Yeah, I don't think we need to worry about that.  If an added continue;
shortcuts the loop to the point where keeping the CFI last becomes a problem
then it's probably time to look at rewriting the loop.

--
Daniel Gustafsson        https://vmware.com/




Re: Missing CFI in iterate_word_similarity()

От
Tom Lane
Дата:
Daniel Gustafsson <daniel@yesql.se> writes:
>> On 2 Sep 2022, at 14:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I noticed the same thing, but sticking the CFI immediately after the
>> declaration didn't read well either.  I was considering moving it to
>> the bottom of the loop instead of that.

> I was contemplating that too, but kept it at the top after seeing quite a few
> examples of that in other contrib modules (like amcheck/verify_nbtree.c and
> pg_visibility/pg_visibility.c).  I don't have any strong feelings either way,
> I'm happy to move it last.

You could keep it at the top, but then I'd be inclined to split up
the existing code:

        int            trgindex;

        CHECK_FOR_INTERRUPTS();

        /* Get index of next trigram */
        trgindex = trg2indexes[i];

        /* Update last position of this trigram */
        ...

What's annoying me about the one-liner fix is that it makes it
look like CFI is part of the "Get index" action.

            regards, tom lane



Re: Missing CFI in iterate_word_similarity()

От
Daniel Gustafsson
Дата:
> On 2 Sep 2022, at 15:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> What's annoying me about the one-liner fix is that it makes it
> look like CFI is part of the "Get index" action.

Thats a good point, I'll split the code up to make it clearer.

--
Daniel Gustafsson        https://vmware.com/




Re: Missing CFI in iterate_word_similarity()

От
Daniel Gustafsson
Дата:
> On 2 Sep 2022, at 15:22, Daniel Gustafsson <daniel@yesql.se> wrote:
> 
>> On 2 Sep 2022, at 15:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
>> What's annoying me about the one-liner fix is that it makes it
>> look like CFI is part of the "Get index" action.
> 
> Thats a good point, I'll split the code up to make it clearer.

Done that way and pushed, thanks!

--
Daniel Gustafsson        https://vmware.com/