Re: [WIP]Vertical Clustered Index (columnar store extension) - take2
From | Japin Li |
---|---|
Subject | Re: [WIP]Vertical Clustered Index (columnar store extension) - take2 |
Date | |
Msg-id | ME0P300MB04455377D5C3926CE2B47423B654A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM |
List | pgsql-hackers |
On Mon, 14 Jul 2025 at 18:47, Peter Smith <smithpb2250@gmail.com> wrote:
> Hi Japin,
>
> Thanks for your README questions.
>
> On Fri, Jul 11, 2025 at 7:18 PM Japin Li <japinli@hotmail.com> wrote:
> ...
>>
>> 3.
>> In the README, 'TID' seems to have conflicting definitions:
>> Transaction ID (2.1) vs. tuple physical identifier (2.3.1).
>>
>> Could you confirm the intended meaning? Suggest using 'XID' for Transaction ID
>> if my understanding is correct.
>>
>
> Yes, TID was meant only for the Tuple identifier. Some terms became
> muddled. Hopefully, those are fixed now.
>

Thanks for your confirmation.

>> 4.
>> -1: TID relation (maps CRID to original TID)
>> -5: TID-CRID mapping table
>>
>> I'm trying to understand the distinctions here. Based on the definition in
>> vci_tidcrid.h, it seems plausible to use just one relation for the mapping,
>> suggesting a potential redundancy.
>>
>> /*
>>  * TID-CRID pair used for TIDCRID update list
>>  */
>> typedef struct vcis_tidcrid_pair_item
>> {
>>     ItemPointerData page_item_id;  /* TID on the original relation */
>>     vcis_Crid       crid;          /* CRID */
>> } vcis_tidcrid_pair_item_t;
>>
>> How they are different? I see the code in vci_tidcrid.c
>>
>
> AFAIK, the distinction is described by the code comments in vci_columns.h:
>
> +/** Column ID of special column */
> +#define VCI_COLUMN_ID_TID (-1)
> +#define VCI_COLUMN_ID_NULL (-2)
> +#define VCI_COLUMN_ID_DELETE (-3)
>
> So those are all special columns in the ROS data part. In other words,
> these internal relations all have data that is indexed by the CRID –
> e.g “Delete vector” (2.3.3) and “Null information” (2.3.4). So here,
> the TID relation is the mapping from the CRID back to the original
> TID.
>
> On the other hand, the other relations...
>
> +/** The data below are not column-stored data.
> + * We prepare them for convenience.
> + */
> +#define VCI_COLUMN_ID_TID_CRID (-5)
> +#define VCI_COLUMN_ID_TID_CRID_UPDATE (-6)
> +#define VCI_COLUMN_ID_TID_CRID_WRITE (-7)
> +#define VCI_COLUMN_ID_TID_CRID_CDR (-8)
> +#define VCI_COLUMN_ID_DATA_WOS (-9)
> +#define VCI_COLUMN_ID_WHITEOUT_WOS (-10)
>
> … are not “column-stored” – In other words, these ones, including the
> “TID-CRID mapping table” (-5), are *not* indexed by CRID.
>
> You may be right about a potential redundancy. But right now we're
> focused on making these patches ready for open source - removing dead
> code to shrink the size, improving the PostgreSQL core interface, and
> fixing bugs. Rewriting or optimising the logic will have to wait.
>

Appreciate the detailed explanation! I'll dive deeper into it.

-- 
Regards,
Japin Li
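
The distinction drawn above (a TID relation indexed by CRID versus a TID-CRID mapping table that is not) can be pictured with a toy in-memory model. This is only a sketch under stated assumptions: `toy_tid`, `toy_crid`, `toy_pair`, and both lookup functions are hypothetical stand-ins, not the VCI patch's on-disk structures, and the idea that the mapping table is searched by TID is inferred from its name rather than confirmed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for ItemPointerData and vcis_Crid. */
typedef struct toy_tid { uint32_t block; uint16_t offset; } toy_tid;
typedef uint64_t toy_crid;

#define TOY_NROWS 3

/*
 * Direction 1 (like column -1): indexed by CRID.
 * The array position *is* the CRID, so the lookup is a direct index.
 */
static const toy_tid tid_by_crid[TOY_NROWS] = {
    {7, 2},   /* CRID 0 */
    {3, 1},   /* CRID 1 */
    {7, 1},   /* CRID 2 */
};

/*
 * Direction 2 (like relation -5): NOT indexed by CRID.
 * Assumption: entries are kept ordered by TID so a TID can be searched for.
 */
typedef struct toy_pair { toy_tid tid; toy_crid crid; } toy_pair;

static const toy_pair crid_by_tid[TOY_NROWS] = {
    {{3, 1}, 1},
    {{7, 1}, 2},
    {{7, 2}, 0},
};

/* Order TIDs by block number, then by offset within the block. */
static int toy_tid_cmp(toy_tid a, toy_tid b)
{
    if (a.block != b.block)
        return a.block < b.block ? -1 : 1;
    if (a.offset != b.offset)
        return a.offset < b.offset ? -1 : 1;
    return 0;
}

/* CRID -> TID: constant-time array access. */
static toy_tid toy_crid_to_tid(toy_crid crid)
{
    assert(crid < TOY_NROWS);
    return tid_by_crid[crid];
}

/* TID -> CRID: binary search over the TID-ordered pairs; returns 1 if found. */
static int toy_tid_to_crid(toy_tid tid, toy_crid *out)
{
    size_t lo = 0, hi = TOY_NROWS;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = toy_tid_cmp(crid_by_tid[mid].tid, tid);

        if (cmp == 0)
        {
            *out = crid_by_tid[mid].crid;
            return 1;
        }
        if (cmp < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

In this toy model both arrays hold the same pairs, which is the redundancy raised in question 4. They differ only in which key makes the lookup cheap.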