Thread: Re: [HACKERS] Parallel Select query performance and shared buffers

Re: [HACKERS] Parallel Select query performance and shared buffers

From: Metin Doslu
>   I think all of this data cannot fit in shared_buffers, you might want to increase shared_buffers
>   to larger size (not 30GB but close to your data size) to see how it behaves.

When I use a shared_buffers setting larger than my data size, such as 10 GB, results scale nearly as expected, at least for this instance type.

> You queries have Aggregation, ORDER/GROUP BY, so there is a chance
> that I/O can happen for those operation's
> if PG doesn't have sufficient memory (work_mem) to perform such operation.

I used a work_mem of 32 MB, which should be enough for these queries. I also tested with higher values of work_mem and didn't observe any difference.
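
(For reference, work_mem can be raised per session without a restart, so re-testing with larger values is cheap; a minimal sketch, using an illustrative 256 MB value and a simplified aggregate query:)

psql -d postgres <<'SQL'
SET work_mem = '256MB';
EXPLAIN ANALYZE
SELECT l_returnflag, l_linestatus, count(*)
FROM lineitem_1
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
SQL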

> Can you simplify your queries (simple scan or in other words no
> aggregation or other things) to see how
> they behave in your env., once you are able to see simple queries
> scaling as per your expectation, you
> can try with complex one's.

Actually, we observe the problem when the queries get simpler, such as select count(*). Here are the results in a more compact table (a shell sketch of the parallel run follows it):

                       select count(*)   TPC-H Simple (#6)   TPC-H Complex (#1)
1 table  / 1 query          1.5 s              2.5 s               8.4 s
2 tables / 2 queries        1.5 s              2.5 s               8.4 s
4 tables / 4 queries        2.0 s              2.9 s               8.8 s
8 tables / 8 queries        3.3 s              4.0 s               9.6 s
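
(For example, the "8 tables / 8 queries" row corresponds to a run along these lines; the table names beyond lineitem_1 are assumed here:)

# launch 8 copies of the simple count concurrently, one per table, and time the batch
time (
    for i in $(seq 1 8); do
        psql -d postgres -c "SELECT count(*) FROM lineitem_$i" &
    done
    wait
)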

> Can we have the explain analyze of those queries, postgres
> configuration, perhaps vmstat output during execution?

postgres=# explain analyze SELECT count(*) from lineitem_1;
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=199645.01..199645.02 rows=1 width=0) (actual time=11317.391..11317.393 rows=1 loops=1)
   ->  Seq Scan on lineitem_1  (cost=0.00..184641.81 rows=6001281 width=0) (actual time=0.011..5805.255 rows=6001215 loops=1)
 Total runtime: 11317.440 ms
(3 rows)

postgres=# explain analyze SELECT
postgres-#     sum(l_extendedprice * l_discount) as revenue
postgres-# FROM
postgres-#     lineitem_1
postgres-# WHERE
postgres-#     l_shipdate >= date '1994-01-01'
postgres-#     AND l_shipdate < date '1994-01-01' + interval '1' year
postgres-#     AND l_discount between 0.06 - 0.01 AND 0.06 + 0.01
postgres-#     AND l_quantity < 24;
                                                 QUERY PLAN                                                                           
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=260215.36..260215.37 rows=1 width=16) (actual time=1751.775..1751.776 rows=1 loops=1)
   ->  Seq Scan on lineitem_1  (cost=0.00..259657.82 rows=111508 width=16) (actual time=0.031..1630.449 rows=114160 loops=1)
         Filter: ((l_shipdate >= '1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp without time zone) AND (l_discount >= 0.05::double precision) AND (l_discount <= 0.07::double precision) AND
 (l_quantity < 24::double precision))
         Rows Removed by Filter: 5887055
 Total runtime: 1751.830 ms
(5 rows)

postgres=# explain analyze SELECT
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
FROM
    lineitem_1
WHERE
    l_shipdate <= date '1998-12-01' - interval '90' day
GROUP BY
    l_returnflag,
    l_linestatus
ORDER BY
    l_returnflag,
    l_linestatus;
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=436342.68..436342.69 rows=6 width=36) (actual time=18720.932..18720.936 rows=4 loops=1)
   Sort Key: l_returnflag, l_linestatus
   Sort Method: quicksort  Memory: 25kB
   ->  HashAggregate  (cost=436342.49..436342.60 rows=6 width=36) (actual time=18720.887..18720.892 rows=4 loops=1)
         ->  Seq Scan on lineitem_1  (cost=0.00..199645.01 rows=5917437 width=36) (actual time=0.011..6754.619 rows=5916591 loops=1)
               Filter: (l_shipdate <= '1998-09-02 00:00:00'::timestamp without time zone)
               Rows Removed by Filter: 84624
 Total runtime: 18721.021 ms
(8 rows)


Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple (#6) queries. Although there should be no need for I/O, "wa" fluctuates between 0 and 1.

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 30093568  84892 38723896    0    0     0     0   22   14  0  0 100  0  0
 8  1      0 30043056  84892 38723896    0    0     0     0 27080 52708 16 14 70  0  0
 8  1      0 30006600  84892 38723896    0    0     0     0 44952 118286 43 44 12  1  0
 8  0      0 29986264  84900 38723896    0    0     0    20 28043 95934 49 42  8  1  0
 7  0      0 29991976  84900 38723896    0    0     0     0 8308 73641 52 42  6  0  0
 0  0      0 30091828  84900 38723896    0    0     0     0 3996 30978 23 24 53  0  0
 0  0      0 30091968  84900 38723896    0    0     0     0   17   23  0  0 100  0  0

I installed PostgreSQL 9.3.1 from source, and in the postgres configuration file I only changed shared_buffers (4 GB) and work_mem (32 MB).
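
(Concretely, that amounts to something like the following; $PGDATA and the pg_ctl restart are assumptions about a standard source install:)

cat >> $PGDATA/postgresql.conf <<'EOF'
# only non-default settings used for these tests
shared_buffers = 4GB
work_mem = 32MB
EOF
# shared_buffers only takes effect after a restart
pg_ctl -D $PGDATA restart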

Re: [HACKERS] Parallel Select query performance and shared buffers

From: Claudio Freire
On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu <metin@citusdata.com> wrote:
>
> Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple
> (#6) queries:  Although there is no need for I/O, "wa" fluctuates between 0
> and 1.
>
> procs -----------memory---------- ---swap-- -----io---- --system--
> -----cpu-----
>  r  b   swpd   free   buff    cache     si   so    bi    bo    in     cs us sy  id wa st
>  0  0      0 30093568  84892 38723896    0    0     0     0    22     14  0  0 100  0  0
>  8  1      0 30043056  84892 38723896    0    0     0     0 27080  52708 16 14  70  0  0
>  8  1      0 30006600  84892 38723896    0    0     0     0 44952 118286 43 44  12  1  0
>  8  0      0 29986264  84900 38723896    0    0     0    20 28043  95934 49 42   8  1  0
>  7  0      0 29991976  84900 38723896    0    0     0     0  8308  73641 52 42   6  0  0
>  0  0      0 30091828  84900 38723896    0    0     0     0  3996  30978 23 24  53  0  0
>  0  0      0 30091968  84900 38723896    0    0     0     0    17    23   0  0 100  0  0


Notice the huge %sy

What kind of VM are you using? HVM or paravirtual?


Re: [HACKERS] Parallel Select query performance and shared buffers

From: Andres Freund
On 2013-12-04 14:27:10 -0200, Claudio Freire wrote:
> On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu <metin@citusdata.com> wrote:
> >
> > Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple
> > (#6) queries:  Although there is no need for I/O, "wa" fluctuates between 0
> > and 1.
> >
> > procs -----------memory---------- ---swap-- -----io---- --system--
> > -----cpu-----
> >  r  b   swpd   free   buff    cache     si   so    bi    bo    in     cs us sy  id wa st
> >  0  0      0 30093568  84892 38723896    0    0     0     0    22     14  0  0 100  0  0
> >  8  1      0 30043056  84892 38723896    0    0     0     0 27080  52708 16 14  70  0  0
> >  8  1      0 30006600  84892 38723896    0    0     0     0 44952 118286 43 44  12  1  0
> >  8  0      0 29986264  84900 38723896    0    0     0    20 28043  95934 49 42   8  1  0
> >  7  0      0 29991976  84900 38723896    0    0     0     0  8308  73641 52 42   6  0  0
> >  0  0      0 30091828  84900 38723896    0    0     0     0  3996  30978 23 24  53  0  0
> >  0  0      0 30091968  84900 38723896    0    0     0     0    17    23   0  0 100  0  0
>
>
> Notice the huge %sy

My bet is on transparent hugepage defragmentation. Alternatively, it's
scheduler overhead, due to superfluous context switches around the buffer
mapping locks.
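
(One way to check or rule that out on the guest is something along these lines; the sysfs path differs slightly between distributions, e.g. redhat_transparent_hugepage on RHEL-derived kernels:)

# see whether transparent hugepages and their defrag are active
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag

# temporarily disable both to test (as root)
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag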

I'd strongly suggest doing a "perf record -g -a <wait a bit, ctrl-c>;
perf report" run to check what's eating up the time.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Parallel Select query performance and shared buffers

From: Metin Doslu
>Notice the huge %sy
>What kind of VM are you using? HVM or paravirtual?

This instance is paravirtual.

Re: [HACKERS] Parallel Select query performance and shared buffers

From: Metin Doslu
> I'd strongly suggest doing a "perf record -g -a <wait a bit, ctrl-c>;
> perf report" run to check what's eating up the time.

Here is one example:

+  38.87%   swapper  [kernel.kallsyms]   [k] hypercall_page
+   9.32%  postgres  [kernel.kallsyms]   [k] hypercall_page
+   6.80%  postgres  [kernel.kallsyms]   [k] xen_set_pte_at
+   5.83%  postgres  [kernel.kallsyms]   [k] copy_user_generic_string
+   2.06%  postgres  [kernel.kallsyms]   [k] file_read_actor
+   1.89%  postgres  postgres            [.] heapgettup_pagemode
+   1.83%  postgres  postgres            [.] hash_search_with_hash_value
+   1.33%  postgres  [kernel.kallsyms]   [k] get_phys_to_machine
+   1.25%  postgres  [kernel.kallsyms]   [k] find_get_page
+   1.00%  postgres  postgres            [.] heapgetpage
+   0.99%  postgres  [kernel.kallsyms]   [k] radix_tree_lookup_element
+   0.98%  postgres  postgres            [.] advance_aggregates
+   0.96%  postgres  postgres            [.] ExecProject
+   0.94%  postgres  postgres            [.] advance_transition_function
+   0.88%  postgres  postgres            [.] ExecScan
+   0.87%  postgres  postgres            [.] HeapTupleSatisfiesMVCC
+   0.86%  postgres  postgres            [.] LWLockAcquire
+   0.82%  postgres  [kernel.kallsyms]   [k] put_page
+   0.82%  postgres  postgres            [.] MemoryContextReset
+   0.80%  postgres  postgres            [.] SeqNext
+   0.78%  postgres  [kernel.kallsyms]   [k] pte_mfn_to_pfn
+   0.69%  postgres  postgres            [.] ExecClearTuple
+   0.57%  postgres  postgres            [.] ExecProcNode
+   0.54%  postgres  postgres            [.] heap_getnext
+   0.53%  postgres  postgres            [.] LWLockRelease
+   0.53%  postgres  postgres            [.] ExecStoreTuple
+   0.51%  postgres  libc-2.12.so        [.] __GI___libc_read
+   0.42%  postgres  [kernel.kallsyms]   [k] xen_spin_lock
+   0.40%  postgres  postgres            [.] ReadBuffer_common
+   0.38%  postgres  [kernel.kallsyms]   [k] __do_fault
+   0.37%  postgres  [kernel.kallsyms]   [k] shmem_fault
+   0.37%  postgres  [kernel.kallsyms]   [k] unmap_single_vma
+   0.35%  postgres  [kernel.kallsyms]   [k] __wake_up_bit
+   0.33%  postgres  postgres            [.] StrategyGetBuffer
+   0.33%  postgres  [kernel.kallsyms]   [k] set_page_dirty
+   0.33%  postgres  [kernel.kallsyms]   [k] handle_pte_fault
+   0.33%  postgres  postgres            [.] ExecAgg
+   0.31%  postgres  postgres            [.] XidInMVCCSnapshot
+   0.31%  postgres  [kernel.kallsyms]   [k] __audit_syscall_entry
+   0.31%  postgres  postgres            [.] CheckForSerializableConflictOut
+   0.29%  postgres  [kernel.kallsyms]   [k] handle_mm_fault
+   0.25%  postgres  [kernel.kallsyms]   [k] shmem_getpage_gfp




Re: [HACKERS] Parallel Select query performance and shared buffers

From: Andres Freund
On 2013-12-04 18:43:35 +0200, Metin Doslu wrote:
> > I'd strongly suggest doing a "perf record -g -a <wait a bit, ctrl-c>;
> > perf report" run to check what's eating up the time.
>
> Here is one example:
>
> +  38.87%   swapper  [kernel.kallsyms]   [k] hypercall_page
> +   9.32%  postgres  [kernel.kallsyms]   [k] hypercall_page
> +   6.80%  postgres  [kernel.kallsyms]   [k] xen_set_pte_at

All that time is spent in your virtualization solution. One thing to try
is to look at the host system; sometimes profiles there can be more
meaningful.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Parallel Select query performance and shared buffers

From: Claudio Freire
On Wed, Dec 4, 2013 at 1:54 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-12-04 18:43:35 +0200, Metin Doslu wrote:
>> > I'd strongly suggest doing a "perf record -g -a <wait a bit, ctrl-c>;
>> > perf report" run to check what's eating up the time.
>>
>> Here is one example:
>>
>> +  38.87%   swapper  [kernel.kallsyms]   [k] hypercall_page
>> +   9.32%  postgres  [kernel.kallsyms]   [k] hypercall_page
>> +   6.80%  postgres  [kernel.kallsyms]   [k] xen_set_pte_at
>
> All that time is spent in your virtualization solution. One thing to try
> is to look on the host system, sometimes profiles there can be more
> meaningful.

You cannot profile the host on EC2.

You could try HVM. I've noticed it fares better under heavy CPU load,
and it's not fully HVM (it still uses paravirtualized network and
I/O).


Re: [HACKERS] Parallel Select query performance and shared buffers

From: Metin Doslu
> You could try HVM. I've noticed it fare better  under heavy CPU load,
> and it's not fully-HVM (it still uses paravirtualized network and
> I/O).

I already tried with HVM (a cc2.8xlarge instance on Amazon EC2) and observed the same problem.

Re: [HACKERS] Parallel Select query performance and shared buffers

From: Andres Freund
On 2013-12-04 16:00:40 -0200, Claudio Freire wrote:
> On Wed, Dec 4, 2013 at 1:54 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > All that time is spent in your virtualization solution. One thing to try
> > is to look on the host system, sometimes profiles there can be more
> > meaningful.
>
> You cannot profile the host on EC2.

Didn't follow the thread from the start. So, this is EC2? Have you
checked, with a recent enough version of top or whatever, how much time
is reported as "stolen"?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Parallel Select query performance and shared buffers

From: Metin Doslu
> Didn't follow the thread from the start. So, this is EC2? Have you
> checked, with a recent enough version of top or whatever, how much time
> is reported as "stolen"?

Yes, this is EC2. "stolen" is occasionally reported as 1, but mostly as 0.