От: Matthew Wakeling
Тема: plpgsql arrays
Дата: ,
Msg-id: alpine.DEB.2.00.0904031420470.21772@aragorn.flymine.org
(см: обсуждение, исходный текст)
Ответы: Re: plpgsql arrays  (Robert Haas)
Re: plpgsql arrays  (Tom Lane)
Re: plpgsql arrays  (Merlin Moncure)
Список: pgsql-performance

Скрыть дерево обсуждения

plpgsql arrays  (Matthew Wakeling, )
 Re: plpgsql arrays  (Robert Haas, )
  Re: plpgsql arrays  (Matthew Wakeling, )
   Re: plpgsql arrays  (Tom Lane, )
    Re: plpgsql arrays  (Matthew Wakeling, )
     Re: plpgsql arrays  (Tom Lane, )
      Re: plpgsql arrays  (Matthew Wakeling, )
       Re: plpgsql arrays  (Matthew Wakeling, )
       Re: plpgsql arrays  (Tom Lane, )
        Re: plpgsql arrays  (Matthew Wakeling, )
        Re: plpgsql arrays  (Nathan Boley, )
    Re: plpgsql arrays  (Simon Riggs, )
     Re: plpgsql arrays  (Alvaro Herrera, )
     Re: plpgsql arrays  (Tom Lane, )
      Re: plpgsql arrays  (Matthew Wakeling, )
     Re: plpgsql arrays  (Matthew Wakeling, )
      Re: plpgsql arrays  (Robert Haas, )
 Re: plpgsql arrays  (Tom Lane, )
  Re: plpgsql arrays  (Matthew Wakeling, )
   Re: plpgsql arrays  (justin, )
    Re: plpgsql arrays  (Matthew Wakeling, )
     Re: plpgsql arrays  (justin, )
     Re: plpgsql arrays  (Tom Lane, )
      Re: plpgsql arrays  (Matthew Wakeling, )
   Re: plpgsql arrays  (Merlin Moncure, )
    Re: plpgsql arrays  (Tom Lane, )
     Re: plpgsql arrays  (Matthew Wakeling, )
      Re: plpgsql arrays  (Tom Lane, )
 Re: plpgsql arrays  (Merlin Moncure, )
  Re: plpgsql arrays  (Merlin Moncure, )
   Re: plpgsql arrays  (Matthew Wakeling, )
    Re: plpgsql arrays  (Merlin Moncure, )

I'm writing a plpgsql function that effectively does a merge join on the
results of two queries. Now, it appears that I cannot read the results of
two queries as streams in plpgsql, so I need to copy the contents of one
query into an array first, and then iterate over the second query
afterwards.

I have discovered that creating large arrays in plpgql is rather slow. In
fact, it seems to be O(n^2). The following code fragment is incredibly
slow:

  genes = '{}';
  next_new = 1;
  FOR loc IN SELECT location.* FROM location, gene WHERE location.subjectid = gene.id ORDER BY objectid,
intermine_start,intermine_end LOOP 
      genes[next_new] = loc;
      IF (next_new % 10000 = 0) THEN
          RAISE NOTICE 'Scanned % gene locations', next_new;
      END IF;
      next_new = next_new + 1;
  END LOOP;
  genes_size = coalesce(array_upper(genes, 1), 0);
  RAISE NOTICE 'Scanned % gene locations', genes_size;

For 200,000 rows it takes 40 minutes.

So, is there a way to dump the results of a query into an array quickly in
plpgsql, or alternatively is there a way to read two results streams
simultaneously?

Matthew

--
 I would like to think that in this day and age people would know better than
 to open executables in an e-mail. I'd also like to be able to flap my arms
 and fly to the moon.                                    -- Tim Mullen


В списке pgsql-performance по дате сообщения:

От: Nathan Boley
Дата:
Сообщение: Re: plpgsql arrays
От: David Kerr
Дата:
Сообщение: Question on pgbench output