Re: inconsistent results querying table partitioned by date

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: inconsistent results querying table partitioned by date
Дата
Msg-id 20190510.153337.236910404.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: inconsistent results querying table partitioned by date  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Ответы Re: inconsistent results querying table partitioned by date  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Список pgsql-bugs
At Fri, 10 May 2019 14:37:34 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<49cd5235-cbe6-e686-8014-85c1e45efe56@lab.ntt.co.jp>
> > because timestamp-against-timestamptz comparison is inherently only
> > stable; the pruning code is way exceeding its authority by supposing
> > that a comparison that holds at plan time will hold at runtime,
> > even with a constant comparison value.
> 
> I looked into it and the problem is not really that plan-time pruning is
> comparing stable expressions against partition bounds.  If it had, it
> wouldn't have pruned dataid_201902 anyway, because its bounding range for
> datadatetime is '2019-02-01' to '2019-03-01', which is clearly less than
> '2019-05-09' (Tom's localtimestamp).
> 
> The real problem seems to be with the way range partition pruning assumes
> an operator strategy to perform pruning with.  Quoted query's WHERE clause
> looks something like this: (firstkey = CONSTANT AND secondkey <
> STABLE_EXPRESSION).  From this set of clauses, a (CONSTANT,
> STABLE_EXPRESSION) tuple is formed to be compared against partition bounds
> using row-comparison-like semantics.  As things stand today, the actual
> comparison function (partprune.c: get_matching_range_bounds()) receives
> only the strategy of the last expression, which in this case is that of a
> LESS operator.  When the tuple (CONSTANT, STABLE_EXPRESSION) passes
> through the last step to extract Datum values to be passed to
> get_matching_range_bounds, it's correctly determined that
> STABLE_EXPRESSION cannot be computed during planning and so the tuple is
> truncated to just (CONSTANT-Datum), but the strategy to assume during
> pruning is still that of the LESS operator, whereas now it should really
> be EQUAL.  With LESS semantics, get_matching_range_bounds() concludes that
> no partition bounds are smaller than 1 (extracted from id=1 in the above
> query), except the default partition, so it prunes 'dataid_201902'.

I concluded the same.

> I've attached a patch to fix that.  Actually, I've attached two patches --
> the 1st one adds a test for the misbehaving case with *wrong* output
> wherein a partition is incorrectly pruned, and the 2nd actually fixes the
> bug and updates the output of the test added by the 1st patch.  Divided
> the patch this way just to show the bug clearly.

But this seems a bit wrong.

If the two partition keys were in reverse order, pruning still
fails.

CREATE TABLE dataid2 (
       datadatetime timestamp without time zone NOT NULL,
       id integer not null,
       CONSTRAINT dataid2_pkey PRIMARY KEY (datadatetime, id)
) PARTITION BY RANGE (datadatetime, id);

CREATE TABLE dataid2_201902 PARTITION OF dataid2 FOR VALUES FROM ('2019-02-01 00:00:00', 1) TO ('2019-03-01 00:00:00',
1);

CREATE TABLE dataid2_default PARTITION OF dataid2 DEFAULT;

insert into dataid2 values ('2019-02-24T00:00:00', 1);

select * from dataid2 where id = 1 and datadatetime <  (('2019-02-26T00:00:00'::timestamp::timestamp at time zone
'America/New_York'+ '2 days'::interval) at time zone 'UTC');
 
 datadatetime | id 
--------------+----
(0 rows)

This is wrong.

The condition is divided into two part (id = 1) and (datadatetime
< ..) and the latter reduces to nothing and the former remains
unchanged. Pruning continues using id = 1 and (I suppose) but
that is not partition_range_datum_bsearch()'s assumption. As the
result all partitions (other than default) are gone.

In passing I found a typo while looking this issue.

|   case BTLessStrategyNumber:
|
|     /*
|      * Look for the greatest bound that is < or <= lookup value and
|      * set minoff to its offset.

I think the "minoff" is typo of "maxoff".


> > Seems to be equally broken in v11 and HEAD.  I didn't try v10.
> 
> v10 is fine, as it uses constraint exclusion.
> 
> Attached patches apply to both v11 and HEAD.

Mmm. This doesn't apply on head on my environment.

> patching file src/test/regress/expected/partition_prune.out
> Hunk #1 FAILED at 951.

git rev-parse --short HEAD
d0bbf871ca

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




В списке pgsql-bugs по дате отправления:

Предыдущее
От: Amit Langote
Дата:
Сообщение: Re: inconsistent results querying table partitioned by date
Следующее
От: Amit Langote
Дата:
Сообщение: Re: inconsistent results querying table partitioned by date