Re: Alternative to UPDATE (As COPY to INSERT)

Поиск
Список
Период
Сортировка
От Alban Hertroys
Тема Re: Alternative to UPDATE (As COPY to INSERT)
Дата
Msg-id ECEC31F4-B351-4C48-8455-34559EB3AD73@solfertje.student.utwente.nl
обсуждение исходный текст
Ответ на Alternative to UPDATE (As COPY to INSERT)  (Yan Cheng CHEOK <yccheok@yahoo.com>)
Список pgsql-general
On 23 Feb 2010, at 10:26, Yan Cheng CHEOK wrote:

> I realize update operation speed in PostgreSQL doesn't meet my speed expectation.
>
> Is there any fast alternative to UPDATE? as using fast COPY to INSERT operation.

Well, since an UPDATE is just a DELETE + INSERT and you're already doing this in one transaction, you could use DELETE
+COPY instead. That's not as easy to do with your current approach though - converting those arrays to something COPY
understandswill probably take about as much time as your function is taking now. 

It would probably be convenient to use a staging table (maybe temporary) to put the new values in before you act on
them,so you don't need that hassle with arrays. I don't know where your data is coming from (I recall you work with a
measurementmachine in a capacitor plant?), but you could use COPY to fill the staging table (no constraints, yay!) and
then:

  BEGIN;
  DELETE FROM statistics WHERE (fk_lot_id, measurement_type, statistic_type) IN (SELECT fk_lot_id, measurement_type,
statistic_typeFROM staging_table); 
  INSERT INTO statistics (value, fk_lot_id, measurement_type, statistic_type) SELECT value, fk_lot_id,
measurement_type,statistic_type FROM staging_table); 
  TRUNCATE staging_table;
  COMMIT;

This isn't concurrency-safe, so you need to make sure no values are added to the staging table while you're doing the
above.

I'm not sure you'll get to <1 ms doing this, that's a pretty steep requirement, but considering you can do it the
"slow"way in 20ms it might just work. 

Of course, if you have a staging table you could choose to operate on it less frequently; that would give you more time
tooperate on it and reduces the amount of overhead a little. I'm guessing your data comes in 24/7, so finding the right
batch-sizeis part of your problem. 

> Thanks!
>
> I am using update in the following case :
>
> CREATE OR REPLACE FUNCTION update_or_insert_statistic(integer, text[], text[], double precision[])
>  RETURNS void AS
> $BODY$DECLARE
>  _lotID ALIAS FOR $1;
>  _measurementTypes ALIAS FOR $2;
>  _statisticTypes ALIAS FOR $3;
>  _values ALIAS FOR $4;
>  _row_count int;
>  i int;
> BEGIN
>    -- Parameters validation.
>    IF array_upper(_measurementTypes, 1) != array_upper(_statisticTypes, 1) OR array_upper(_measurementTypes, 1) !=
array_upper(_values,1) THEN 
>        RAISE EXCEPTION 'Inconsistency in array size';
>    END IF;
>
>    FOR i IN SELECT generate_subscripts(_measurementTypes, 1)
>    LOOP
>        EXECUTE 'UPDATE statistic SET value = $1 WHERE fk_lot_id = $2 AND measurement_type = $3 AND statistic_type =
$4'
>        USING _values[i], _lotID, _measurementTypes[i], _statisticTypes[i];
>
>        GET DIAGNOSTICS _row_count = ROW_COUNT;
>
>        IF _row_count = 0 THEN
>            EXECUTE 'INSERT INTO statistic(fk_lot_id, "value", measurement_type, statistic_type) VALUES ($1, $2, $3,
$4)'
>            USING _lotID, _value, _measurementType, _statisticType;
>        END IF;
>    END LOOP;
> END;$BODY$
>
> I use the following "minimal" version
>
> SELECT * FROM update_or_insert_statistic(1,array['Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1
Area','Pad1 
> Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Area','Pad1 Pitch','Pad1 Pitch','Pad1
Pitch','Pad1Pitch','Pad1  
> Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch','Pad1 Pitch'],
>
array['LSL','USL','Nominal','Average','StdDev','StdDev3','CPK','Min','Max','Total','Valid','Invalid','Variance','LSL','USL','Nominal','Av
> erage','StdDev','StdDev3','CPK','Min','Max','Total','Valid','Invalid','Variance'],
>
array[0,0,0,4.94422589800714,3.16063453753607,0,0,1.01620532853175,9.98406933805353,20,20,0,9.98961067986587,0,0,0,6.56297341837825,2.512
> 73949943937,0,0,1.69188512833033,9.56794946134831,20,20,0,6.31385979204282])
>
> It takes around 20ms :(
>
> I am expecting < 1ms
>
> Or shall I just go back to plain text in this case?
>
> Thanks and Regards
> Yan Cheng CHEOK
>
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>
>
>

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4b83c7c410447773417439!



В списке pgsql-general по дате отправления:

Предыдущее
От: "A. Kretschmer"
Дата:
Сообщение: Re: Subqueries or Joins? Problems with multiple table query
Следующее
От: Alban Hertroys
Дата:
Сообщение: Re: pg_dump new version