Re: Possible future performance improvement: sort updates/deletes by ctid

Поиск
Список
Период
Сортировка
От Stephen Denne
Тема Re: Possible future performance improvement: sort updates/deletes by ctid
Дата
Msg-id F0238EBA67824444BC1CB4700960CB48048E111F@dmpeints002.isotach.com
обсуждение исходный текст
Ответ на Re: Possible future performance improvement: sort updates/deletes by ctid  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
> At the planner level that would be entirely the wrong way to go about
> it, because that's forcing the equivalent of a nestloop join, which is
> very unlikely to be faster for the numbers of rows that we're talking
> about here.  The reason it looks faster to you is that the benefits of
> updating the document_file rows in ctid order outweigh the
> costs of the
> dumb join strategy ... but what we want to achieve here is to
> have both
> benefits, or at least to give the planner the opportunity to make a
> cost-driven decision about what to do.

Ok.

Here are some more data points, using a smaller table, v8.2.6:


Seq Scan on document_file df  (cost=0.00..208480.85 rows=25101 width=662) (actual time=0.239..773.834 rows=25149
loops=1)SubPlan   ->  Index Scan using pk_document_id on document d  (cost=0.00..8.27 rows=1 width=4) (actual
time=0.011..0.015rows=1 loops=25149)         Index Cond: (id = $0) 
Total runtime: 4492.363 ms



vs


Hash Join  (cost=1048.85..6539.32 rows=25149 width=666) (actual time=575.079..1408.363 rows=25149 loops=1) Hash Cond:
(df.document_id= d.id) ->  Seq Scan on document_file df  (cost=0.00..4987.49 rows=25149 width=662) (actual
time=60.724..824.195rows=25149 loops=1) ->  Hash  (cost=734.49..734.49 rows=25149 width=8) (actual time=40.271..40.271
rows=25149loops=1)       ->  Seq Scan on document d  (cost=0.00..734.49 rows=25149 width=8) (actual time=0.055..22.559
rows=25149loops=1) 
Total runtime: 34961.504 ms


These are fairly repeatable for me after doing a vacuum full analyze of the two tables.


Have I simply not tuned postgres so that it knows it has everything on a single old IDE drive, not split over a few
setsof raided SSD drives, hence random_page_cost should perhaps be larger than 4.0? Would that make the second estimate
largerthan the first estimate? 

Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any
attachmentsis confidential and may be subject to legal privilege.  If it is not intended for you please advise by reply
immediately,destroy it and do not copy, disclose or use it in any way. 

__________________________________________________________________ This email has been scanned by the DMZGlobal
BusinessQuality              Electronic Messaging Suite. 
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Possible future performance improvement: sort updates/deletes by ctid
Следующее
От: "Dann Corbit"
Дата:
Сообщение: Will PostgreSQL get ported to CUDA?