On Fri, Apr 3, 2009 at 11:02 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Fri, Apr 3, 2009 at 9:32 AM, Matthew Wakeling <matthew@flymine.org> wrote:
>> genes = '{}';
>> next_new = 1;
>> FOR loc IN SELECT location.* FROM location, gene WHERE location.subjectid =
>> gene.id ORDER BY objectid, intermine_start, intermine_end LOOP
>> genes[next_new] = loc;
>> IF (next_new % 10000 = 0) THEN
>> RAISE NOTICE 'Scanned % gene locations', next_new;
>> END IF;
>> next_new = next_new + 1;
>> END LOOP;
>> genes_size = coalesce(array_upper(genes, 1), 0);
>> RAISE NOTICE 'Scanned % gene locations', genes_size;
>>
>> For 200,000 rows it takes 40 minutes.
>>
>> So, is there a way to dump the results of a query into an array quickly in
>> plpgsql, or alternatively is there a way to read two results streams
>> simultaneously?
>
> try this:
> select array(SELECT location.* FROM location, gene WHERE
> location.subjectid = gene.id ORDER BY objectid, intermine_start,
> intermine_end)) into genes;
one more time:
select array(SELECT location FROM location, gene WHERE
location.subjectid = gene.id ORDER BY objectid, intermine_start,
intermine_end)) into genes;
this will make array of location records. when you access the records
to do the merge, make sure to use () noation:
if (genes[x]).field > something then
...
merlin