Re: asynchronous and vectorized execution

From: Andres Freund
Subject: Re: asynchronous and vectorized execution
Msg-id: 20160511005016.j3m7wkk6cafx2ccr@alap3.anarazel.de
In response to: Re: asynchronous and vectorized execution  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: asynchronous and vectorized execution  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On 2016-05-10 12:56:17 -0400, Robert Haas wrote:
> I suspect the number of queries that are being hurt by fmgr overhead
> is really large, and I think it would be nice to attack that problem
> more directly.  It's a bit hard to discuss what's worthwhile in the
> abstract, without performance numbers, but when you vectorize, how
> much is the benefit from using SIMD instructions and how much is the
> benefit from just not going through the fmgr every time?

I think fmgr overhead is an issue, but in most profiles of
execution-heavy loads I've seen, the bottlenecks are elsewhere. They
often look roughly like:
+   15.47%  postgres  postgres           [.] slot_deform_tuple
+   12.99%  postgres  postgres           [.] slot_getattr
+   10.36%  postgres  postgres           [.] ExecMakeFunctionResultNoSets
+    9.76%  postgres  postgres           [.] heap_getnext
+    6.34%  postgres  postgres           [.] HeapTupleSatisfiesMVCC
+    5.09%  postgres  postgres           [.] heapgetpage
+    4.59%  postgres  postgres           [.] hash_search_with_hash_value
+    4.36%  postgres  postgres           [.] ExecQual
+    3.30%  postgres  postgres           [.] ExecStoreTuple
+    3.29%  postgres  postgres           [.] ExecScan

or

-   33.67%  postgres  postgres           [.] ExecMakeFunctionResultNoSets
   - ExecMakeFunctionResultNoSets
      + 99.11% ExecEvalOr
      + 0.89% ExecQual
+   14.32%  postgres  postgres           [.] slot_getattr
+    5.66%  postgres  postgres           [.] ExecEvalOr
+    5.06%  postgres  postgres           [.] check_stack_depth
+    5.02%  postgres  postgres           [.] slot_deform_tuple
+    4.05%  postgres  postgres           [.] pgstat_end_function_usage
+    3.69%  postgres  postgres           [.] heap_getnext
+    3.41%  postgres  postgres           [.] ExecEvalScalarVarFast
+    3.36%  postgres  postgres           [.] ExecEvalConst


with a healthy dose of _bt_compare and heap_hot_search_buffer in more
index-heavy workloads.

(Yes, I just pulled these example profiles from somewhere, but I've more
often seen profiles look like this than very fmgr-heavy ones.)


That seems to suggest that we need to restructure how we get to calling
fmgr functions, before worrying about the actual fmgr call.
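
As a rough illustration of what "restructuring how we get to the call" could mean, here is a toy sketch. It is not PostgreSQL code, and every name in it (Node, EqStep, eval_or, eval_flat_or, same_result) is hypothetical. It assumes a predicate that is an OR of integer equality tests and contrasts a recursive tree walk, with one indirect call per node, against a flat array of steps run by a single loop. It makes no claim about actual speedups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; none of this is PostgreSQL code. */
typedef struct Node Node;
struct Node {
    bool (*eval)(const Node *self, const int *row); /* indirect call per node */
    const Node *left;   /* children, used only by OR nodes */
    const Node *right;
    int col;            /* column index, used only by equality leaves */
    int val;            /* constant to compare against */
};

/* Leaf: row[col] == val */
bool eval_eq(const Node *n, const int *row)
{
    return row[n->col] == n->val;
}

/* Inner node: recurse into both children, short-circuiting on true. */
bool eval_or(const Node *n, const int *row)
{
    return n->left->eval(n->left, row) || n->right->eval(n->right, row);
}

/* Flat form: the same predicate as an array of steps, no recursion. */
typedef struct {
    int col;
    int val;
} EqStep;

bool eval_flat_or(const EqStep *steps, size_t nsteps, const int *row)
{
    for (size_t i = 0; i < nsteps; i++) {
        if (row[steps[i].col] == steps[i].val)
            return true;    /* short-circuit, like OR */
    }
    return false;
}

/* Sanity helper: both forms must agree on a given row. */
bool same_result(const Node *root, const EqStep *steps, size_t nsteps,
                 const int *row)
{
    bool tree = root->eval(root, row);
    bool flat = eval_flat_or(steps, nsteps, row);

    assert(tree == flat);
    return tree;
}
```

In the tree form, the path to each comparison goes through several function calls before any real work happens, which is the kind of cost the second profile attributes to the callers rather than to the called function itself.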


Tomas, Mark, IIRC you'd both generated perf profiles for TPC-H (I think?)
queries at some point. Any chance the results are online somewhere?

Greetings,

Andres Freund


