Re: [v9.3] writable foreign tables

From Kohei KaiGai
Subject Re: [v9.3] writable foreign tables
Date
Msg-id CADyhKSVvKz+YKVJ91uBBOZzT1QfAc-QrdtdrjwSfxuXZ0JMDCw@mail.gmail.com
In response to Re: [v9.3] writable foreign tables  ("Albe Laurenz" <laurenz.albe@wien.gv.at>)
Responses Re: [v9.3] writable foreign tables
Re: [v9.3] writable foreign tables
List pgsql-hackers
2012/8/27 Albe Laurenz <laurenz.albe@wien.gv.at>:
> Kohei KaiGai wrote:
>> 2012/8/25 Robert Haas <robertmhaas@gmail.com>:
>>> On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>>>> It is a responsibility of FDW extension (and DBA) to ensure each
>>>> foreign-row has a unique identifier that has 48-bits width integer
>>>> data type in maximum.
>
>>> It strikes me as incredibly short-sighted to decide that the row
>>> identifier has to have the same format as what our existing heap AM
>>> happens to have.  I think we need to allow the row identifier to be of
>>> any data type, and even compound.  For example, the foreign side might
>>> have no equivalent of CTID, and thus use primary key.  And the primary
>>> key might consist of an integer and a string, or some such.
>
>> I assume it is a task of FDW extension to translate between the pseudo
>> ctid and the primary key in remote side.
>>
>> For example, if primary key of the remote table is Text data type, an idea
>> is to use a hash table to track the text-formed primary being associated
>> with a particular 48-bits integer. The pseudo ctid shall be utilized to track
>> the tuple to be modified on the scan-stage, then FDW can reference the
>> hash table to pull-out the primary key to be provided on the prepared
>> statement.
>
> And what if there is a hash collision?  Then you would not be able to
> determine which row is meant.
>
Even if we had a hash collision, each hash entry can hold the original
key itself, so the keys can be compared on lookup. But anyway, I would
rather support an opaque pointer to track a particular remote row.
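
To make the collision handling concrete, here is a minimal sketch of one way to read the hash-table idea. It is not code from any patch: RowKeyMap, rowkey_intern and rowkey_lookup are hypothetical names, the hash is plain FNV-1a, and the pseudo identifier is assumed to be a sequential counter bounded to 48 bits rather than the hash value itself. Each entry keeps a copy of the original text key, so two keys that land in the same bucket are still told apart by strcmp.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 4                          /* tiny on purpose, so chains form */
#define MAX_ROWID ((UINT64_C(1) << 48) - 1) /* largest value a 48-bit id can hold */

/* Hypothetical entry: keeps the original key so collisions can be resolved. */
typedef struct RowKeyEntry {
    char *key;                  /* private copy of the remote primary key */
    uint64_t rowid;             /* pseudo identifier, always <= MAX_ROWID */
    struct RowKeyEntry *next;   /* next entry in the same bucket */
} RowKeyEntry;

/* Hypothetical map; initialize with {0}. */
typedef struct RowKeyMap {
    RowKeyEntry *buckets[NBUCKETS];
    RowKeyEntry **by_id;        /* by_id[rowid - 1] is the entry for rowid */
    size_t nused;
    size_t cap;
} RowKeyMap;

/* FNV-1a over the bytes of a NUL-terminated string. */
static uint32_t hash_text(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char) *s;
        h *= 16777619u;
    }
    return h;
}

/* Return the pseudo id for key, assigning a new one if needed; 0 on failure. */
uint64_t rowkey_intern(RowKeyMap *map, const char *key)
{
    uint32_t b = hash_text(key) % NBUCKETS;
    RowKeyEntry *e;

    /* Same bucket does not mean same key: compare the stored original. */
    for (e = map->buckets[b]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e->rowid;

    if ((uint64_t) map->nused >= MAX_ROWID)
        return 0;                           /* 48-bit id space exhausted */
    if (map->nused == map->cap) {
        size_t ncap = map->cap ? map->cap * 2 : 8;
        RowKeyEntry **grown = realloc(map->by_id, ncap * sizeof *grown);
        if (grown == NULL)
            return 0;
        map->by_id = grown;
        map->cap = ncap;
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    size_t len = strlen(key) + 1;
    e->key = malloc(len);
    if (e->key == NULL) {
        free(e);
        return 0;
    }
    memcpy(e->key, key, len);
    e->rowid = (uint64_t) map->nused + 1;   /* ids start at 1; 0 means invalid */
    assert(e->rowid <= MAX_ROWID);
    e->next = map->buckets[b];
    map->buckets[b] = e;
    map->by_id[map->nused++] = e;
    return e->rowid;
}

/* Map a pseudo id back to the original key, or NULL if the id is unknown. */
const char *rowkey_lookup(const RowKeyMap *map, uint64_t rowid)
{
    if (rowid == 0 || rowid > map->nused)
        return NULL;
    return map->by_id[rowid - 1]->key;
}
```

With only four buckets, any five distinct keys force at least one chain, yet every key still gets its own id. The cost is that the FDW must keep this table alive between the scan and modify stages, which is part of why an opaque row identifier looks more attractive.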

> I agree with Robert that this should be flexible enough to cater for
> all kinds of row identifiers.  Oracle, for example, uses ten byte
> identifiers which would give me a headache with your suggested design.
>
>> Do we have some other reasonable ideas?
>
> Would it be too invasive to introduce a new pointer in TupleTableSlot
> that is NULL for anything but virtual tuples from foreign tables?
>
I'm not certain whether the lifetime of a TupleTableSlot is long enough
to carry a private datum between the scan and modify stages.
For example, the TupleTableSlot will be cleared in ExecNestLoop
before the slot is delivered to ExecModifyTable.

postgres=# EXPLAIN UPDATE t1 SET b = 'abcd' WHERE a IN (SELECT x FROM t2 WHERE x % 2 = 0);
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Update on t1  (cost=0.00..54.13 rows=6 width=16)
   ->  Nested Loop  (cost=0.00..54.13 rows=6 width=16)
         ->  Seq Scan on t2  (cost=0.00..28.45 rows=6 width=10)
               Filter: ((x % 2) = 0)
         ->  Index Scan using t1_pkey on t1  (cost=0.00..4.27 rows=1 width=10)
               Index Cond: (a = t2.x)
(6 rows)

Is it possible to use the ctid field to carry a private pointer?
The TID data type is internally represented as a pointer to ItemPointerData,
so it is wide enough to track an opaque remote-row identifier,
whether that is a string, an int64 or something else.
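
As a rough illustration of the pointer idea, the sketch below uses only mock types and does not include any PostgreSQL header: MockDatum stands in for a pointer-width datum, and RemoteRowId, remote_rowid_make and the other helpers are hypothetical names. It only shows that a pointer-sized value can reference a row identifier of any length, for example a ten-byte one or a text key.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mock stand-in for a pointer-width datum; not the PostgreSQL type. */
typedef uintptr_t MockDatum;

/* Hypothetical opaque row identifier: a length plus raw bytes. */
typedef struct RemoteRowId {
    size_t len;
    unsigned char data[];       /* flexible array member holding len bytes */
} RemoteRowId;

/* Copy len bytes into a new identifier and return it as a datum; 0 on failure. */
MockDatum remote_rowid_make(const void *bytes, size_t len)
{
    RemoteRowId *rid = malloc(sizeof(RemoteRowId) + len);

    if (rid == NULL)
        return (MockDatum) 0;
    rid->len = len;
    if (len > 0)
        memcpy(rid->data, bytes, len);
    return (MockDatum) rid;
}

/* Recover the identifier from the datum. */
const RemoteRowId *remote_rowid_get(MockDatum d)
{
    return (const RemoteRowId *) d;
}

/* Two identifiers are equal when their lengths and bytes match. */
int remote_rowid_equal(MockDatum a, MockDatum b)
{
    const RemoteRowId *x = remote_rowid_get(a);
    const RemoteRowId *y = remote_rowid_get(b);

    if (x == NULL || y == NULL)
        return x == y;
    return x->len == y->len && memcmp(x->data, y->data, x->len) == 0;
}

/* Release an identifier created by remote_rowid_make. */
void remote_rowid_free(MockDatum d)
{
    free((void *) d);
}
```

The sketch says nothing about memory lifetime. In a real executor the referenced bytes would have to live at least from the scan stage to the modify stage, which is the same concern raised above for the TupleTableSlot.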

One disadvantage is that the "ctid" system column shows a nonsense value
when the user references it explicitly. But that does not seem to me
a fundamental problem, because we have not given any special meaning
to the "ctid" field of a foreign table.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>


