Re: parallel joins, and better parallel explain

From: Dilip Kumar
Subject: Re: parallel joins, and better parallel explain
Msg-id: CAFiTN-ti8PS7Ku8a63P=ePVWtjCAYvpidfD1+sEs+GAfjeJeKw@mail.gmail.com
In reply to: Re: parallel joins, and better parallel explain  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Wed, Jan 6, 2016 at 10:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jan 4, 2016 at 8:52 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > One strange behaviour, after increasing number of processor for VM,
> > max_parallel_degree=0; is also performing better.
>
> So, you went from 6 vCPUs to 8?  In general, adding more CPUs means
> that there is less contention for CPU time, but if you already had 6
> CPUs and nothing else running, I don't know why the backend running
> the query would have had a problem getting a whole CPU to itself.  If
> you previously only had 1 or 2 CPUs then there might have been some
> CPU competition with background processes, but if you had 6 then I
> don't know why the max_parallel_degree=0 case got faster with 8.

I am really not sure about this case. Maybe CPU allocation in the virtual machine had a problem, but I can't say anything definite.
 
> Anyway, I humbly suggest that this query isn't the right place to put
> our attention.  There's no reason why we can't improve things further
> in the future, and if it turns out that lots of people have problems
> with the cost estimates on multi-batch parallel hash joins, then we
> can revise the cost model.  We wouldn't treat a single query where a
> non-parallel multi-batch hash join run slower than the costing would
> suggest as a reason to revise the cost model for that case, and I
> don't think this patch should be held to a higher standard.  In this
> particular case, you can easily make the problem go away by tuning
> configuration parameters, which seems like an acceptable answer for
> people who run into this,

Yes, I agree with this point; the cost model can always be improved. And anyway, in most of the queries, even in the TPC-H benchmark, we have seen a big improvement with parallel join.

I have done further testing to observe planning time, using TPC-H queries and some other many-table join queries (7-8 tables).

I did not find any visible regression in planning time.

I have tested many combinations of queries. Because the queries and results are large, I have not attached them to this mail. Let me know if anybody wants the details of the queries.


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
