Re: Fully documenting the design of nbtree row comparison scan keys
| От | Peter Geoghegan |
|---|---|
| Тема | Re: Fully documenting the design of nbtree row comparison scan keys |
| Дата | |
| Msg-id | CAH2-Wz=dCBTZk+a=R1q8tm7MdPshPZiuugRdcZx57KetJRK26A@mail.gmail.com обсуждение исходный текст |
| Ответ на | Re: Fully documenting the design of nbtree row comparison scan keys (Victor Yegorov <vyegorov@gmail.com>) |
| Ответы |
Re: Fully documenting the design of nbtree row comparison scan keys
|
| Список | pgsql-hackers |
On Fri, Oct 31, 2025 at 5:06 AM Victor Yegorov <vyegorov@gmail.com> wrote: > I took a look at the patch. Proposed comments look highly valuable, especially around NULLs, doesn't look immediately obvious,so definitely requires a comment. > Looks good to commit. Cool. > Wouldn't it be good to add such information also into the user documentation, say into > https://www.postgresql.org/docs/current/functions-comparisons.html#ROW-WISE-COMPARISON I wasn't thinking about the user-facing documentation. But I think that you make a good point. The main problem with that documentation of row compares is that we don't say anything about keyset pagination with a multicolumn index (not anywhere). That is almost the entire reason why this feature exists -- but we don't actually say anything about it to users. No wonder the feature is underused. Separately, there is a risk that NULLs will break applications that implement keyset pagination. Attached test case shows what I mean by this. To summarize the test case: The test case shows that QUERY 1 returns slightly more rows than QUERY 2 due to the presence of NULLs in lower-order index columns. This seems surprising. (The underlying issue is more or less the same issue that makes row comparisons confusing to the implementation, especially in places like _bt_set_startikey). A user might expect that it won't matter how many or how few "keyset pages"/row compare queries were used -- one big query (no pagination) should give the same answer as (say) 4 "keyset pagination" queries (just in smaller pieces/sets of rows). But as the test case shows, when there are NULLs in lower-order index columns, that isn't necessarily true. QUERY 1 and QUERY 2 are only guaranteed to return precisely the same rows if we somehow make sure that there can be no rows returned with NULLs in lower order columns (e.g., by using a primary key for keyset pagination queries, or by always adding IS NOT NULL conditions against lower-order columns like "b" and "c"). -- Peter Geoghegan
Вложения
В списке pgsql-hackers по дате отправления: