Re: Performance problem with query
From: Christian Rengstl
Subject: Re: Performance problem with query
Msg-id: 44BE0A36.0AD0.0080.0@klinik.uni-regensburg.de
In reply to: Re: Antw: Performance problem with query (Q <qdolan@gmail.com>)
Responses: Re: Performance problem with query (Q <qdolan@gmail.com>)
List: pgsql-general
The EXPLAIN ANALYZE is from the exact query, and I dropped the indexes before the insert as well, without improvement. The target table is also completely empty, and the insert is supposed to write, in this case, more or less 8 million lines into the table. There is a rule though, because I have an inheritance table structure with one master table and around 20 child tables.

Q <qdolan@gmail.com> wrote on 07/19/06 4:37 am:

> On 19/07/2006, at 4:24 AM, Christian Rengstl wrote:
>
>> now finally after a long time i have the query plan for the whole
>> filled table. I hope somebody can explain to me why it takes so much
>> longer...
>
> These explain analyze results don't appear to be from the queries you
> posted previously. For these results to mean anything, you need to
> include the EXACT queries you used to generate them.
>
>> Result  (cost=0.06..254643.06 rows=8044000 width=39) (actual time=0.056..655772.273 rows=8044000 loops=1)
>>   One-Time Filter: ((split_part(($1)::text, '_'::text, 2))::smallint = 1)
>>   InitPlan
>>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.003..0.003 rows=1 loops=1)
>>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=10) (actual time=0.002..0.002 rows=1 loops=1)
>>     ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.006..0.007 rows=1 loops=1)
>>           ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=10) (actual time=0.004..0.004 rows=1 loops=1)
>>   ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=39) (actual time=0.002..191672.344 rows=8044000 loops=1)
>> Total runtime: 62259544.896 ms
>
> This is the query you want to be interested in; the others took no
> time at all.
>
> As a guess, I would say the query is an INSERT INTO ... FROM
> SELECT ... WHERE (split_part(???, '_', 2))::smallint = 1 statement.
> The majority of the time appears to be taken up by the actual INSERT
> and not the SELECT part.
>
> How many rows are in the target table, and what indexes does it have?
> Does it have any triggers, check constraints, or rules applied to it?
> All these things can make the insert take longer as the number of
> rows you already have in the table increases.
>
> More than likely you have a target table with a LOT of rows and a
> bunch of indexes on it, and your disks are being thrashed because the
> indexes are not able to stay cached in RAM. At this point you should
> ensure your machine is not swapping to disk, and at the very least
> you should go through one of the many tuning guidelines available and
> ensure you have allocated the appropriate amount of memory to
> postgresql for your needs.
>
> You may also want to consider dropping the indexes before you do the
> INSERT and recreating them afterwards.
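The drop/recreate pattern suggested above looks something like this in SQL (a minimal sketch only; the index name and column list are invented for illustration, not taken from the actual schema):

    BEGIN;
    -- Drop the index so the bulk INSERT does not have to maintain it
    -- incrementally for every one of the 8 million rows.
    DROP INDEX master_pid_idx;  -- hypothetical index name

    INSERT INTO public.master (pid, smid, val1, val2, chr)
    SELECT pid, smid, val1, val2, 1::smallint
    FROM public.temp_table;

    -- Rebuild the index in a single pass over the finished table.
    CREATE INDEX master_pid_idx ON public.master (pid);
    COMMIT;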
>
>>>>> "Christian Rengstl" <Christian.Rengstl@klinik.uni-regensburg.de> 13.07.06 8.37 Uhr >>>
>
>> Good morning list,
>>
>> the following query takes about 15 to 20 minutes for around 2
>> million lines in the file myfile.txt, but with 8 million lines it
>> takes around 5 hours, and I just don't understand why there is such
>> a huge discrepancy in performance.
>>
>> COPY public.temp_table FROM 'myfile.txt' WITH DELIMITER '\t';
>>
>> INSERT INTO public.master(pid, smid, val1, val2, chr)
>> SELECT pid, smid, val1, val12, CAST(split_part((SELECT
>> chr from public.temp_table LIMIT 1), '_', 2) as int2)
>> FROM public.temp_table;
>>
>> INSERT INTO public.values(smid, pos, chr)
>> SELECT DISTINCT smid, pos, CAST(split_part((SELECT chr from
>> public.temp_table LIMIT 1), '_', 2) as int2)
>> FROM public.temp_table;
>>
>> I came up with this query because I wanted to use the COPY command
>> to load huge files into the db, but I don't want to load all the
>> columns contained in the file into only one table; I want to copy
>> some of them into one table and some into a second table. As I found
>> out with logging, the data is loaded into temp_table within 15
>> minutes, but to transfer it from the temp_table again takes only
>> something like 10 minutes. Can it be that the cast takes up so much
>> more time than when reading and transferring 2 million lines?
>
> --
> Seeya...Q
>
> Quinton Dolan - qdolan@gmail.com
> Gold Coast, QLD, Australia
> Ph: +61 419 729 806

--
Christian Rengstl M.A.
Klinik und Poliklinik für Innere Medizin II
Kardiologie - Forschung
Universitätsklinikum Regensburg
B3 1.388
Franz-Josef-Strauss-Allee 11
93053 Regensburg
Tel.: +49-941-944-7230
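The rule and inheritance structure mentioned at the top of this message is worth sketching out, since conditional rewrite rules are a common reason bulk inserts through a master table slow down. The following is only a rough guess at what such a setup looks like; all table names, columns, and the partitioning condition are assumptions, not the actual schema:

    -- Master table; one child table per chromosome inherits its columns.
    CREATE TABLE public.master (
        pid  integer,
        smid integer,
        val1 numeric,
        val2 numeric,
        chr  smallint
    );

    CREATE TABLE public.master_chr1 () INHERITS (public.master);

    -- Conditional rewrite rule: redirect matching rows into the child.
    CREATE RULE master_insert_chr1 AS
        ON INSERT TO public.master
        WHERE NEW.chr = 1
        DO INSTEAD
        INSERT INTO public.master_chr1 VALUES (NEW.*);

With one such rule per child table, a single INSERT ... SELECT against the master is rewritten into one statement per rule, each carrying the rule's condition. The One-Time Filter ((split_part(($1)::text, '_'::text, 2))::smallint = 1) in the EXPLAIN output above looks exactly like such a rule condition being evaluated once per rewritten statement.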
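As for the closing question about the cast: the subselect feeding split_part is uncorrelated, so it is planned as an InitPlan and executed only once per statement, which the rows=1 InitPlan nodes in the plan above already show; the remaining split_part call and cast on that single short value are cheap, so they are unlikely to account for the slowdown. The behavior can be checked in isolation (the literal 'chr_7' is just a made-up sample value):

    -- split_part('chr_7', '_', 2) returns the text '7';
    -- the cast to int2 then yields the smallint 7.
    SELECT CAST(split_part('chr_7', '_', 2) AS int2);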