connecting multiple INSERT CTEs to same record?

From: Assaf Gordon
Subject: connecting multiple INSERT CTEs to same record?
Msg-id: 68708e28-dd17-13d6-2acd-9ca4f9735379@gmail.com
Replies: Re: connecting multiple INSERT CTEs to same record?  ("David G. Johnston" <david.g.johnston@gmail.com>)
List: pgsql-general

Hello,

I'm looking for a way to insert rows into multiple tables (connected with
foreign keys) using CTEs.

I found a few similar questions on Stack Overflow and the mailing list,
but no solution.

Please consider this contrived example:
A table of students, and a table of classes they are in:
====
   create temp table students(
         id serial primary key,
         name varchar);

   create temp table classes(
         id serial primary key,
         student_id int not null references students(id),
         subject varchar);
====

And,
I am given a list of new students and their classes (from an external 
source, imported into a temp table):

====
create temp table new_data (
         name varchar,
         subject varchar);

insert into new_data values
      ('John','Math'),
      ('Jane','Physics'),
      ('Moe','Science'),
      ('John','English'); -- different student with same name
====


I want to first create a new 'students' record for each row, and then
create a corresponding 'classes' record. For that, I need both the newly
assigned 'id' and the corresponding source record from 'new_data'.

A trivial usage of "insert ... returning *" doesn't work - it can't
return columns that were not directly inserted (and the 'subject' string 
wasn't inserted):

===
   with
      new_students as (
        insert into students(name)
        select name from new_data
        returning *
       )

    insert into classes(student_id, subject)
    select new_students.id,
            -- how to get the subject matching to
            --- the newly inserted student?
            ??????
   from new_students ;
===

I found a work-around, but it relies on a flimsy assumption - that 
within this transaction, the CTIDs for the 'new_data' and the newly
inserted 'students' rows maintain the same order.
Using this assumption, I calculate a unique value for each row with 
'row_number() over (order by ctid)',
assuming that the order stays the same,
then join the tables to match the new student to its 'subject':

===
with
   new_data_with_order as (
        select
            row_number() over (order by ctid) as new_order,
            *
        from new_data )
   ,
   new_students as (
       insert into students(name)
       select name from new_data_with_order
       returning ctid,*
       )
   ,
   new_students_with_order as (
        select
            row_number() over (order by ctid) as new_order,
            *
        from new_students
   )
   ,

   merged_data as (
      -- Is this correct??
      -- it is based on the assumption that the order
      -- of inserted rows into 'students' (and hence, their CTID order)
      -- is the same as the order of the rows in 'new_data'
      -- and their CTID.
      select *
      from new_students_with_order
      join new_data_with_order
      on
        new_students_with_order.new_order =
          new_data_with_order.new_order
   )

   -- for troubleshooting:
   -- select * from merged_data ;

   insert into classes(student_id, subject)
   select
      id, -- merged_data.id is the newly assigned student.id primary key.
      subject
   from merged_data ;

===


So my question is:
Is this assumption about CTID maintaining order (within the transaction)
correct? Even when considering concurrency?
Even if, by strange coincidence, 'VACUUM' is running in parallel to this
query?

Or,
Is there another way to achieve this? Keep in mind that this is a
contrived example; in my actual use case I need to update several tables.
Of course it can be done at the application level, inserting each new
'student' and then its 'classes' one by one, but I prefer to avoid that.

Sadly, I can't assume the student name is unique, so I can't "join" on it.
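
One direction that might avoid the ordering assumption altogether,
sketched here and not tested across PostgreSQL versions: assign the ids
up front by calling nextval() on the sequence behind students.id, so that
each 'new_data' row carries its own id before anything is inserted. The
CTE name 'new_data_with_id' is made up for this sketch, and it assumes
the sequence is the one pg_get_serial_sequence() reports for the serial
column:

```sql
with
   -- Assumption: pre-assign one sequence value per source row, so the
   -- id and the subject travel together (no join on name or ctid).
   new_data_with_id as (
        select
            nextval(pg_get_serial_sequence('students', 'id')) as student_id,
            name,
            subject
        from new_data
   )
   ,
   -- Insert the students using the pre-assigned ids instead of the
   -- column default.
   new_students as (
       insert into students(id, name)
       select student_id, name from new_data_with_id
       returning id
   )

   -- Insert the classes from the same pre-assigned ids.
   insert into classes(student_id, subject)
   select student_id, subject
   from new_data_with_id ;
```

As far as I understand, this relies on two things: the foreign key on
classes.student_id being checked at the end of the statement rather than
row by row, and 'new_data_with_id' being evaluated only once (it is
referenced twice and calls a volatile function, so it should not be
inlined).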


Thanks!
  - Assaf Gordon



