On 2015-11-25 19:35:15 +0300, Evgeniy Shishkin wrote:
> Fast:
>
> Sort (cost=193101.41..195369.80 rows=907357 width=129) (actual time=3828.176..3831.261 rows=43615 loops=1)
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome,
dim_vendedor.a3_nome,(btrim((dim_cliente.tipocliente)::text))
> Sort Key: (btrim((dim_cliente.tipocliente)::text)), dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome
> Sort Method: quicksort Memory: 13121kB
> -> HashAggregate (cost=91970.52..103312.49 rows=907357 width=129) (actual time=2462.690..2496.729 rows=43615
loops=1)
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome,
dim_vendedor.a3_nome,btrim((dim_cliente.tipocliente)::text)
> -> Hash Join (cost=856.30..80628.56 rows=907357 width=129) (actual time=29.524..1533.880 rows=907357
loops=1)
>
>
> Slow:
>
> Group (cost=170417.48..184027.84 rows=907357 width=129) (actual time=36649.329..37235.158 rows=43615 loops=1)
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome,
dim_vendedor.a3_nome
> -> Sort (cost=170417.48..172685.88 rows=907357 width=129) (actual time=36649.315..36786.760 rows=907357 loops=1)
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome,
dim_vendedor.a3_nome
> Sort Key: dim_cliente.tipocliente, dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome,
dim_vendedor.a3_nome
> Sort Method: quicksort Memory: 265592kB
> -> Hash Join (cost=856.30..80628.56 rows=907357 width=129) (actual time=26.719..1593.693 rows=907357
loops=1)
>
>
> The difference is in the top of plans.
> As we see, hashjoin time is practically the same.
> But fast plan uses hashagg first and only 43k rows require sorting.
> Slow plan dominated by sorting 900k rows.
>
> I wonder if increasing cpu_tuple_cost will help.
> As cost difference between two plans is negligible now.
Seems plausible. Also I'm wondering what CPU this is: 36 seconds for an
in-memory sort of 900k rows seems slow to me. I tested this on my PC at
home (1.8 GHz Core2 Dual, so a rather old and slow box) and I could sort
1E6 rows of 128 random bytes in 5.6 seconds. Even if I kept the first 96
bytes constant (so only the last 32 were random), it took only 21
seconds. Either this CPU is really slow or the data is heavily skewed -
is it possible that all dimensions except dim_vendedor.a3_nome have only
one or very few values? In that case changing the sort order might help.
hp
--
_ | Peter J. Holzer | I want to forget all about both belts and
|_|_) | | suspenders; instead, I want to buy pants
| | | hjp@hjp.at | that actually fit.
__/ | http://www.hjp.at/ | -- http://noncombatant.org/