Re: Performance problem with query
От | Q |
---|---|
Тема | Re: Performance problem with query |
Дата | |
Msg-id | 8ACBB832-F600-4C90-AFD0-621CB966E082@gmail.com обсуждение исходный текст |
Ответ на | Re: Performance problem with query ("Christian Rengstl" <Christian.Rengstl@klinik.uni-regensburg.de>) |
Ответы |
Re: Performance problem with query
|
Список | pgsql-general |
On 19/07/2006, at 8:49 PM, Christian Rengstl wrote: > Obviously it had something to do with the rule, because now > everything finished within 20 minutes. the problem is just that i > don't really want to give up the inheritance design. is there a way > to maintain the inheritance that doesn't cause this huge > performance problem? That is hard to say unless you post the rule and table schema you are currently using. > Q <qdolan@gmail.com> wrote on 07/19/06 11:54 am: >> On 19/07/2006, at 6:32 PM, Christian Rengstl wrote: >> >>> The analyze is from the exact query and i dropped the indexes >>> before the insert as well without imrpvement. The target table is >>> as well completely empty and the insert is supposed to write, in >>> this case, more or less 8 million lines in the table. There is a >>> rule though, because i have inheritance table structure with one >>> master table and around 20 child tables. >> >> I would say the problem is in the rule. Try doing the insert into a >> duplicate table with no rules or inheritance and see how long it >> takes. >> >> Perhaps you should provide the actual schema of tables and rules that >> are involved in the query in question. >> >>> Q <qdolan@gmail.com> wrote on 07/19/06 4:37 am: >>>> >>>> On 19/07/2006, at 4:24 AM, Christian Rengstl wrote: >>>> >>>>> now finally after a long time i have the query plan for the whole >>>>> filled table. I hope somebody can explain me why it takes so much >>>>> longer... >>>> >>>> >>>> These explain analyze results don't appear to be from the >>>> queries you >>>> posted previously. For these results to mean anything you need to >>>> include the EXACT queries you used to generate them. >>>> >>>>> Result (cost=0.06..254643.06 rows=8044000 width=39) (actual >>>>> time=0.056..655772 >>>>> .273 rows=8044000 loops=1) >>>>> One-Time Filter: ((split_part(($1)::text, '_'::text, >>>>> 2))::smallint = 1) >>>>> InitPlan >>>>> -> Limit (cost=0.00..0.02 rows=1 width=10) (actual >>>>> time=0.003..0.003 rows >>>>> =1 loops=1) >>>>> -> Seq Scan on temp_table (cost=0.00..194313.00 >>>>> rows=8044000 width= >>>>> 10) (actual time=0.002..0.002 rows=1 loops=1) >>>>> -> Limit (cost=0.00..0.02 rows=1 width=10) (actual >>>>> time=0.006..0.007 rows >>>>> =1 loops=1) >>>>> -> Seq Scan on temp_table (cost=0.00..194313.00 >>>>> rows=8044000 width= >>>>> 10) (actual time=0.004..0.004 rows=1 loops=1) >>>>> -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 >>>>> width=39) (act >>>>> ual time=0.002..191672.344 rows=8044000 loops=1) >>>>> Total runtime: 62259544.896 ms >>>> >>>> >>>> This is the query you want to be interested in, the others took no >>>> time at all. >>>> >>>> As a guess I would say the query is an INSERT INTO ... FROM >>>> SELECT ... WHERE (split_part(???, '_', 2))::smallint = 1 statement. >>>> The majority of the time appears to be taken up on the actual >>>> INSERT >>>> and not the SELECT part. >>>> >>>> How many rows are in the target table and what indexes does it >>>> have? >>>> Does it have any triggers, check constraints, or rules applied >>>> to it? >>>> All these things can make the insert take longer as the number of >>>> rows you have already in the table increases. >>>> >>>> More than likely you have a target table with a LOT of rows and a >>>> bunch of indexes on it and your disks are being thrashed because >>>> the >>>> indexes are not able to stay cached in RAM. At this point you >>>> should >>>> ensure your machine is not swapping do disk, and at the very least >>>> you should go through one of the many tuning guidelines >>>> available and >>>> ensure you have allocated the appropriate amount of memory to >>>> postgresql for your needs. >>>> >>>> You may also want to consider dropping the indexes before you do >>>> the >>>> INSERT and recreate them afterwards. >>>> >>>> >>>> >>>>>>>> "Christian Rengstl" <Christian.Rengstl@klinik.uni- >>>>>>>> regensburg.de> >>>>>>>> 13.07.06 8.37 Uhr >>> >>>>> Good morning list, >>>>> >>>>> the following query takes about 15 to 20 minutes for around 2 >>>>> million lines in the file myfile.txt, but with 8 million lines it >>>>> takes around 5 hours and i just don't understand why there is such >>>>> a huge discrepancy in performance. >>>>> >>>>> COPY public.temp_table FROM 'myfile.txt' WITH DELIMITER '\t'; >>>>> >>>>> INSERT INTO public.master(pid,smid, val1, val2, chr) >>>>> SELECT pid, smid, val1, val12, CAST(split_part((SELECT >>>>> chr from public.temp_table LIMIT 1), '_', 2) as int2) >>>>> FROM public.temp_table; >>>>> >>>>> INSERT INTO public.values(smid, pos, chr) >>>>> SELECT DISTINCT smid, pos, CAST(split_part((SELECT chr >>>>> from >>>>> public.temp_table LIMIT 1), '_', 2) as int2) >>>>> FROM public.temp_table; >>>>> >>>>> I came up with this query, because i wanted to use the COPY >>>>> command >>>>> to load huge files into the db, but i don't want to load all the >>>>> columns contained in the file in only one table but copy some of >>>>> them into one table and some in a second table. As i found out >>>>> with >>>>> logging, the data is loaded into temp_table within 15 minutes, but >>>>> to transfer it from the temp_table toagain only something like 10 >>>>> minutes. Can it be that the cast takes up so much more time than >>>>> when reading and transferring 2 million lines? >>>> >>>> >>>> -- >>>> Seeya...Q >>>> >>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- >>>> >>>> _____ / Quinton Dolan - >>>> qdolan@gmail.com >>>> __ __/ / / __/ / / >>>> / __ / _/ / / Gold Coast, QLD, Australia >>>> __/ __/ __/ ____/ / - / Ph: +61 419 729 806 >>>> _______ / >>>> _\ >>>> >>>> >>>> >>>> >>>> ---------------------------(end of >>>> broadcast)--------------------------- >>>> TIP 2: Don't 'kill -9' the postmaster >>> >>> >>> -- >>> Christian Rengstl M.A. >>> Klinik und Poliklinik für Innere Medizin II >>> Kardiologie - Forschung >>> Universitätsklinikum Regensburg >>> B3 1.388 >>> Franz-Josef-Strauss-Allee 11 >>> 93053 Regensburg >>> Tel.: +49-941-944-7230 >> >> >> -- >> Seeya...Q >> >> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- >> >> _____ / Quinton Dolan - qdolan@gmail.com >> __ __/ / / __/ / / >> / __ / _/ / / Gold Coast, QLD, Australia >> __/ __/ __/ ____/ / - / Ph: +61 419 729 806 >> _______ / >> _\ >> >> > > > -- > Christian Rengstl M.A. > Klinik und Poliklinik für Innere Medizin II > Kardiologie - Forschung > Universitätsklinikum Regensburg > B3 1.388 > Franz-Josef-Strauss-Allee 11 > 93053 Regensburg > Tel.: +49-941-944-7230 -- Seeya...Q -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- _____ / Quinton Dolan - qdolan@gmail.com __ __/ / / __/ / / / __ / _/ / / Gold Coast, QLD, Australia __/ __/ __/ ____/ / - / Ph: +61 419 729 806 _______ / _\
В списке pgsql-general по дате отправления: