Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

From: Japin Li
Msg-id: ME0P300MB04455377D5C3926CE2B47423B654A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
List: pgsql-hackers
On Mon, 14 Jul 2025 at 18:47, Peter Smith <smithpb2250@gmail.com> wrote:
> Hi Japin,
>
> Thanks for your README questions.
>
> On Fri, Jul 11, 2025 at 7:18 PM Japin Li <japinli@hotmail.com> wrote:
> ...
>>
>> 3.
>> In the README, 'TID' seems to have conflicting definitions:
>> Transaction ID (2.1) vs. tuple physical identifier (2.3.1).
>>
>> Could you confirm the intended meaning? Suggest using 'XID' for Transaction ID
>> if my understanding is correct.
>>
>
> Yes, TID was meant only for the Tuple identifier. Some terms became
> muddled. Hopefully, those are fixed now.
>

Thanks for your confirmation.

>> 4.
>> -1:  TID relation (maps CRID to original TID)
>> -5:  TID-CRID mapping table
>>
>> I'm trying to understand the distinctions here. Based on the definition in
>> vci_tidcrid.h, it seems plausible to use just one relation for the mapping,
>> suggesting a potential redundancy.
>>
>> /*
>>  * TID-CRID pair used for TIDCRID update list
>>  */
>> typedef struct vcis_tidcrid_pair_item
>> {
>>     ItemPointerData page_item_id;   /* TID on the original relation */
>>     vcis_Crid   crid;           /* CRID */
>> } vcis_tidcrid_pair_item_t;
>>
>> How are they different? I see the code in vci_tidcrid.c
>>
>
> AFAIK, the distinction is described by the code comments in vci_columns.h:
>
> +/** Column ID of special column */
> +#define VCI_COLUMN_ID_TID          (-1)
> +#define VCI_COLUMN_ID_NULL       (-2)
> +#define VCI_COLUMN_ID_DELETE  (-3)
>
> So those are all special columns in the ROS data part. In other words,
> these internal relations all have data that is indexed by the CRID,
> e.g. "Delete vector" (2.3.3) and "Null information" (2.3.4). So here,
> the TID relation is the mapping from the CRID back to the original
> TID.
>
> On the other hand, the other relations...
>
> +/**  The data below are not column-stored data.
> + * We prepare them for convenience.
> + */
> +#define VCI_COLUMN_ID_TID_CRID                  (-5)
> +#define VCI_COLUMN_ID_TID_CRID_UPDATE  (-6)
> +#define VCI_COLUMN_ID_TID_CRID_WRITE     (-7)
> +#define VCI_COLUMN_ID_TID_CRID_CDR         (-8)
> +#define VCI_COLUMN_ID_DATA_WOS                (-9)
> +#define VCI_COLUMN_ID_WHITEOUT_WOS     (-10)
>
> ... are not "column-stored". In other words, these ones, including the
> "TID-CRID mapping table" (-5), are *not* indexed by CRID.
>
> You may be right about a potential redundancy. But right now we're
> focused on making these patches ready for open source - removing dead
> code to shrink the size, improving the PostgreSQL core interface, and
> fixing bugs. Rewriting or optimising the logic will have to wait.
>
>

Appreciate the detailed explanation!  I'll dive deeper into it.
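
A minimal sketch of the distinction as I read it from the excerpts above, not
code from the patch. The macro values are copied from the quoted vci_columns.h
lines; the enum and the function sketch_classify_column() are hypothetical
names introduced only for illustration. IDs that do not appear in the quoted
excerpt (for example -4, or ordinary column numbers) are deliberately reported
as unknown rather than guessed.

```c
#include <assert.h>

/* Values copied from the vci_columns.h excerpt quoted above. */
#define VCI_COLUMN_ID_TID              (-1)
#define VCI_COLUMN_ID_NULL             (-2)
#define VCI_COLUMN_ID_DELETE           (-3)
#define VCI_COLUMN_ID_TID_CRID         (-5)
#define VCI_COLUMN_ID_TID_CRID_UPDATE  (-6)
#define VCI_COLUMN_ID_TID_CRID_WRITE   (-7)
#define VCI_COLUMN_ID_TID_CRID_CDR     (-8)
#define VCI_COLUMN_ID_DATA_WOS         (-9)
#define VCI_COLUMN_ID_WHITEOUT_WOS     (-10)

/* Hypothetical classification, not part of the patch. */
typedef enum sketch_column_kind
{
    SKETCH_CRID_INDEXED,        /* special column in the ROS data part */
    SKETCH_NOT_COLUMN_STORED,   /* kept "for convenience", not indexed by CRID */
    SKETCH_UNKNOWN              /* ID not covered by the quoted excerpt */
} sketch_column_kind;

/* Classify a special column ID according to the explanation in this thread. */
static sketch_column_kind
sketch_classify_column(int column_id)
{
    switch (column_id)
    {
        case VCI_COLUMN_ID_TID:
        case VCI_COLUMN_ID_NULL:
        case VCI_COLUMN_ID_DELETE:
            return SKETCH_CRID_INDEXED;

        case VCI_COLUMN_ID_TID_CRID:
        case VCI_COLUMN_ID_TID_CRID_UPDATE:
        case VCI_COLUMN_ID_TID_CRID_WRITE:
        case VCI_COLUMN_ID_TID_CRID_CDR:
        case VCI_COLUMN_ID_DATA_WOS:
        case VCI_COLUMN_ID_WHITEOUT_WOS:
            return SKETCH_NOT_COLUMN_STORED;

        default:
            return SKETCH_UNKNOWN;
    }
}

/* Small self-check covering the two relations asked about in question 4. */
static void
sketch_self_check(void)
{
    assert(sketch_classify_column(VCI_COLUMN_ID_TID) == SKETCH_CRID_INDEXED);
    assert(sketch_classify_column(VCI_COLUMN_ID_TID_CRID) == SKETCH_NOT_COLUMN_STORED);
}
```

Under this reading, -1 answers "which TID does this CRID come from?", while
-5 belongs to the group of relations that are not addressed by CRID at all.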

--
Regards,
Japin Li


