Re: Performance problem with query

Поиск

Список

Период

Сортировка

От	Merlin Moncure
Тема	Re: Performance problem with query
Дата	18 июля 2006 г. 19:39:39
Msg-id	b42b73150607181239t60e17e19ub54b21d7dadc6bc7@mail.gmail.com обсуждение исходный текст
Ответ на	Antw: Performance problem with query ("Christian Rengstl" <Christian.Rengstl@klinik.uni-regensburg.de>)
Ответы	Re: Performance problem with query
Список	pgsql-general

Дерево обсуждения

On 7/18/06, Christian Rengstl
<Christian.Rengstl@klinik.uni-regensburg.de> wrote:
> now finally after a long time i have the query plan for the whole filled table. I hope somebody can explain me why it
takesso much longer... 
>                                                                             QUERY PLAN
>
> --------------------------------------------------------------------------------
> --------------------------------------------------------------------------------
> -----
> Result  (cost=0.09..254643.09 rows=8044000 width=39) (actual time=0.043..0.043
> rows=0 loops=1)
>   One-Time Filter: ((((split_part(($1)::text, '_'::text, 2))::smallint = 1) IS
> NOT TRUE) AND (((split_part(($2)::text, '_'::text, 2))::smallint = 22) IS NOT TR
> UE))
>   InitPlan
>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (never executed)
>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
> 10) (never executed)
>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.006..0.006 rows
> =1 loops=1)
>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
> 10) (actual time=0.004..0.004 rows=1 loops=1)
>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (never executed)
>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
> 10) (never executed)
>   ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=39) (nev
> er executed)
> Total runtime: 0.238 ms
>
> Result  (cost=0.06..254643.06 rows=8044000 width=39) (actual time=0.056..655772
> .273 rows=8044000 loops=1)
>   One-Time Filter: ((split_part(($1)::text, '_'::text, 2))::smallint = 1)
>   InitPlan
>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.003..0.003 rows
> =1 loops=1)
>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
> 10) (actual time=0.002..0.002 rows=1 loops=1)
>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.006..0.007 rows
> =1 loops=1)
>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
> 10) (actual time=0.004..0.004 rows=1 loops=1)
>   ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=39) (act
> ual time=0.002..191672.344 rows=8044000 loops=1)
> Total runtime: 62259544.896 ms
>
> Result  (cost=0.06..254643.06 rows=8044000 width=39) (actual time=1.245..1.245
> rows=0 loops=1)
>   One-Time Filter: ((split_part(($1)::text, '_'::text, 2))::smallint = 22)
>   InitPlan
>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (never executed)
>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
> 10) (never executed)
>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.013..0.014 rows
> =1 loops=1)
>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
> 10) (actual time=0.009..0.009 rows=1 loops=1)
>   ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=39) (nev
> er executed)
> Total runtime: 22.270 ms
> (31 Zeilen)
>
>
>
> >>> "Christian Rengstl" <Christian.Rengstl@klinik.uni-regensburg.de> 13.07.06 8.37 Uhr >>>
> Good morning list,
>
> the following query takes about 15 to 20 minutes for around 2 million lines in the file myfile.txt, but with 8
millionlines it takes around 5 hours and i just don't understand why there is such a huge discrepancy in performance. 
>
> COPY public.temp_table FROM 'myfile.txt' WITH DELIMITER '\t';
>
> INSERT INTO public.master(pid,smid, val1, val2, chr)
>             SELECT pid, smid, val1, val12, CAST(split_part((SELECT chr from public.temp_table LIMIT 1), '_', 2) as
int2)
>            FROM public.temp_table;
>
> INSERT INTO public.values(smid, pos, chr)
>         SELECT DISTINCT smid, pos, CAST(split_part((SELECT chr from public.temp_table LIMIT 1), '_', 2) as int2)
>         FROM public.temp_table;
>

what is this phrase doing exactly?
CAST(split_part((SELECT chr from public.temp_table LIMIT 1), '_', 2)

it looks fishy.
merlin

В списке pgsql-general по дате отправления:

Предыдущее

От: Robert Treat
Дата: 18 июля 2006 г., 19:12:47
Сообщение: Re: databases hidden in phppgadmin

Следующее

От: Michael Fuhr
Дата: 18 июля 2006 г., 20:10:06
Сообщение: Re: function taking a long time. any options o speed it up.

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Performance problem with query

Предыдущее

Следующее