Re: [WIP]Vertical Clustered Index (columnar store extension) - take2
From | Peter Smith |
---|---|
Subject | Re: [WIP]Vertical Clustered Index (columnar store extension) - take2 |
Date | |
Msg-id | CAHut+PtF0Mu=QPhCyTuUJg0RuGSC7Vjr5f6rsasmr+SeMk7L2g@mail.gmail.com |
In response to | Re: [WIP]Vertical Clustered Index (columnar store extension) - take2 (Japin Li <japinli@hotmail.com>) |
List | pgsql-hackers |
Hi Japin,

Thanks for your README questions.

On Fri, Jul 11, 2025 at 7:18 PM Japin Li <japinli@hotmail.com> wrote:
...
>
> 3.
> In the README, 'TID' seems to have conflicting definitions:
> Transaction ID (2.1) vs. tuple physical identifier (2.3.1).
>
> Could you confirm the intended meaning? Suggest using 'XID' for Transaction ID
> if my understanding is correct.
>

Yes, TID was meant only for the Tuple identifier. Some terms became
muddled. Hopefully, those are fixed now.

> 4.
> -1: TID relation (maps CRID to original TID)
> -5: TID-CRID mapping table
>
> I'm trying to understand the distinctions here. Based on the definition in
> vci_tidcrid.h, it seems plausible to use just one relation for the mapping,
> suggesting a potential redundancy.
>
> /*
>  * TID-CRID pair used for TIDCRID update list
>  */
> typedef struct vcis_tidcrid_pair_item
> {
>     ItemPointerData page_item_id;  /* TID on the original relation */
>     vcis_Crid       crid;          /* CRID */
> } vcis_tidcrid_pair_item_t;
>
> How they are different? I see the code in vci_tidcrid.c
>

AFAIK, the distinction is described by the code comments in vci_columns.h:

+/** Column ID of special column */
+#define VCI_COLUMN_ID_TID (-1)
+#define VCI_COLUMN_ID_NULL (-2)
+#define VCI_COLUMN_ID_DELETE (-3)

So those are all special columns in the ROS data part. In other words,
these internal relations all have data that is indexed by the CRID,
e.g. "Delete vector" (2.3.3) and "Null information" (2.3.4). So here,
the TID relation is the mapping from the CRID back to the original TID.

On the other hand, the other relations...

+/** The data below are not column-stored data.
+ * We prepare them for convenience.
+ */
+#define VCI_COLUMN_ID_TID_CRID (-5)
+#define VCI_COLUMN_ID_TID_CRID_UPDATE (-6)
+#define VCI_COLUMN_ID_TID_CRID_WRITE (-7)
+#define VCI_COLUMN_ID_TID_CRID_CDR (-8)
+#define VCI_COLUMN_ID_DATA_WOS (-9)
+#define VCI_COLUMN_ID_WHITEOUT_WOS (-10)

... are not "column-stored". In other words, these ones, including the
"TID-CRID mapping table" (-5), are *not* indexed by CRID.

You may be right about a potential redundancy. But right now we're
focused on making these patches ready for open source - removing dead
code to shrink the size, improving the PostgreSQL core interface, and
fixing bugs. Rewriting or optimising the logic will have to wait.

> 5.
> Typo in README.
> - Each extent can have its own independent compression dictionary or all
>   extents can share a comon dictionary
> --> s/comon/common/g
>

Fixed.

~~~

Please see the updated README that I attached in the previous post.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
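
The two groups of column IDs described in the reply can be put side by side in a small standalone C sketch. The `#define` values are the ones quoted from vci_columns.h in the message. Everything prefixed `sketch_` or `SKETCH_` is a hypothetical name made up for illustration and is not part of the VCI patch. IDs that the message does not quote (ordinary columns, and -4) are deliberately left unclassified rather than guessed.

```c
#include <assert.h>

/* Special columns in the ROS data part (values as quoted in the message). */
#define VCI_COLUMN_ID_TID              (-1)
#define VCI_COLUMN_ID_NULL             (-2)
#define VCI_COLUMN_ID_DELETE           (-3)

/* "Not column-stored" data (values as quoted in the message). */
#define VCI_COLUMN_ID_TID_CRID         (-5)
#define VCI_COLUMN_ID_TID_CRID_UPDATE  (-6)
#define VCI_COLUMN_ID_TID_CRID_WRITE   (-7)
#define VCI_COLUMN_ID_TID_CRID_CDR     (-8)
#define VCI_COLUMN_ID_DATA_WOS         (-9)
#define VCI_COLUMN_ID_WHITEOUT_WOS     (-10)

/* Hypothetical classification, for illustration only. */
typedef enum sketch_column_kind
{
    SKETCH_CRID_INDEXED,        /* special ROS column, data indexed by CRID */
    SKETCH_NOT_COLUMN_STORED,   /* auxiliary relation, not indexed by CRID */
    SKETCH_NOT_QUOTED           /* ID not covered by the quoted definitions */
} sketch_column_kind;

/* Hypothetical helper: map a column ID to the group the reply puts it in. */
sketch_column_kind
sketch_classify_column(int column_id)
{
    switch (column_id)
    {
        case VCI_COLUMN_ID_TID:
        case VCI_COLUMN_ID_NULL:
        case VCI_COLUMN_ID_DELETE:
            return SKETCH_CRID_INDEXED;

        case VCI_COLUMN_ID_TID_CRID:
        case VCI_COLUMN_ID_TID_CRID_UPDATE:
        case VCI_COLUMN_ID_TID_CRID_WRITE:
        case VCI_COLUMN_ID_TID_CRID_CDR:
        case VCI_COLUMN_ID_DATA_WOS:
        case VCI_COLUMN_ID_WHITEOUT_WOS:
            return SKETCH_NOT_COLUMN_STORED;

        default:
            return SKETCH_NOT_QUOTED;
    }
}

/* Usage example: the two relations asked about in question 4. */
void
sketch_self_check(void)
{
    /* -1: TID relation, looked up by CRID to get the original TID. */
    assert(sketch_classify_column(VCI_COLUMN_ID_TID) == SKETCH_CRID_INDEXED);
    /* -5: TID-CRID mapping table, not indexed by CRID. */
    assert(sketch_classify_column(VCI_COLUMN_ID_TID_CRID) == SKETCH_NOT_COLUMN_STORED);
}
```

Read this way, -1 and -5 fall into different groups: the first is stored like any other CRID-indexed column, while the second is one of the auxiliary relations that are not indexed by CRID.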