Обсуждение: Performance problems with postgres and null Values?

Поиск
Список
Период
Сортировка

Performance problems with postgres and null Values?

От
"Sven Kerkling"
Дата:

After remodeling a table we have some performance problems.

 

The Original tables have much more fields and we thought it came from these many fields. After some testing I tried these test layout and the performance problem is not solved.

 

Postgresql 9.3.12

 

Former DB-Layout was table _masterOld_ with 2 tables inherits from _masterOld_: _part1Old_ and _part2Old_. In _masterOld_ were 7 million rows (_part1Old_: 5 millions, _part2Old_: 2 millions).

 

Now we have only one new table _masterNew_ with 7 million rows.

 

DDL: 

export:

 

    CREATE TABLE public.export (

      id_firma BIGINT,

      status VARCHAR(32)

    )

    WITH (oids = false);

 

    CREATE INDEX export_idx ON public.export

      USING btree (id_firma);

 

masterNew:

  

    CREATE TABLE public."masterNew" (

    id_firma BIGINT,

    id_bestand BIGINT NOT NULL,

    status VARCHAR(32),

    sperre VARCHAR(32),

    CONSTRAINT "masterNew_2016_pkey" PRIMARY KEY(id_bestand)

    )

    WITH (oids = false);

 

    CREATE INDEX "masterNew_2016_pi_idx" ON public."masterNew"

    USING btree (id_firma)

    WHERE ((status IS NULL) AND (sperre IS NULL));

 

    CREATE INDEX "masterNew_sperre_2016" ON public."masterNew"

    USING btree (sperre COLLATE pg_catalog."default");

 

    CREATE INDEX "masterNew_status_2016" ON public."masterNew"

    USING btree (status COLLATE pg_catalog."default");

 

masterOld:

 

    CREATE TABLE public."masterOld" (

      id_firma BIGINT,

      id_bestand BIGINT NOT NULL,

      status VARCHAR(32),

      sperre VARCHAR(32),

      CONSTRAINT "masterOld_pkey" PRIMARY KEY(id_bestand)

    )

    WITH (oids = false);

 

    CREATE INDEX "masterOld_idx" ON public."masterOld"

      USING btree (id_firma);

 

    CREATE INDEX "masterOld_sperre" ON public."masterOld"

      USING btree (sperre COLLATE pg_catalog."default");

 

    CREATE INDEX "masterOld_status" ON public."masterOld"

      USING btree (status COLLATE pg_catalog."default");

 

part1Old:

 

    CREATE TABLE public."part1Old" (

      CONSTRAINT "part1Old_idx" PRIMARY KEY(id_bestand)

    ) INHERITS (public."masterOld")

 

    WITH (oids = false);

 

    CREATE INDEX "part1Old_idx1" ON public."part1Old"

      USING btree (id_firma);

 

    CREATE INDEX "part1Old_idx2" ON public."part1Old"

      USING btree (status COLLATE pg_catalog."default");

 

    CREATE INDEX "part1Old_idx3" ON public."part1Old"

      USING btree (sperre COLLATE pg_catalog."default");

 

part2Old:

 

    CREATE TABLE public."part2Old" (

      CONSTRAINT "part2Old_idx" PRIMARY KEY(id_bestand)

    ) INHERITS (public."masterOld")

 

    WITH (oids = false);

 

    CREATE INDEX "part2Old_idx1" ON public."part2Old"

      USING btree (id_firma);

 

    CREATE INDEX "part2Old_idx2" ON public."part2Old"

      USING btree (status COLLATE pg_catalog."default");

 

    CREATE INDEX "part2Old_idx3" ON public."part2Old"

      USING btree (sperre COLLATE pg_catalog."default");

 

In the _export_ table are 1.2 million rows.

 

Old:

 

    EXPLAIN

    SELECT b.id, b.status

    FROM export b, masterOld mb

    WHERE mb.sperre IS NULL

      AND mb.status IS NULL

      AND b.id_firma = mb.id_firma

    LIMIT 100; 

 

<a href="http://explain.depesz.com/s/SCBo">Plan on explain.depesz.com</a>

 

    - Plan:

    Node Type: "Limit"

    Startup Cost: 0.00

    Total Cost: 0.09

    Plan Rows: 100

    Plan Width: 90

    Plans:

      - Node Type: "Nested Loop"

        Parent Relationship: "Outer"

        Join Type: "Inner"

        Startup Cost: 0.00

        Total Cost: 118535034.59

        Plan Rows: 126126068850

        Plan Width: 90

        Plans:

          - Node Type: "Seq Scan"

            Parent Relationship: "Outer"

            Relation Name: "export"

            Alias: "b"

            Startup Cost: 0.00

            Total Cost: 79129.80

            Plan Rows: 5485680

            Plan Width: 90

          - Node Type: "Append"

            Parent Relationship: "Inner"

            Startup Cost: 0.00

            Total Cost: 21.56

            Plan Rows: 3

            Plan Width: 8

            Plans:

              - Node Type: "Seq Scan"

                Parent Relationship: "Member"

                Relation Name: "masterOld"

                Alias: "mb"

                Startup Cost: 0.00

                Total Cost: 1.10

                Plan Rows: 1

                Plan Width: 8

                Filter: "((sperre IS NULL) AND (status IS NULL) AND (b.id = id))"

              - Node Type: "Index Scan"

                Parent Relationship: "Member"

                Scan Direction: "Forward"

                Index Name: "part1Old_idx9"

                Relation Name: "part1Old"

                Alias: "mb_1"

                Startup Cost: 0.43

                Total Cost: 12.20

                Plan Rows: 1

                Plan Width: 8

                Index Cond: "(id = b.id)"

                Filter: "((sperre IS NULL) AND (status IS NULL))"

              - Node Type: "Index Scan"

                Parent Relationship: "Member"

                Scan Direction: "Forward"

                Index Name: "part2Old_idx"

                Relation Name: "part2Old"

                Alias: "mb_2"

                Startup Cost: 0.43

                Total Cost: 8.26

                Plan Rows: 1

                Plan Width: 8

                Index Cond: "(id = b.id)"

                Filter: "((sperre IS NULL) AND (status IS NULL))"

 

There were no speed problems.

 

New:

 

    EXPLAIN

    SELECT b.id, b.status

    FROM export b, masterNew mb

    WHERE mb.sperre IS NULL

      AND mb.status IS NULL

      AND b.id = mb.id

    LIMIT 100;

 

<a href="http://explain.depesz.com/s/eAqG">Plan on explain.depesz.com</a>   

 

    - Plan:

    Node Type: "Limit"

    Startup Cost: 5.38

    Total Cost: 306.99

    Plan Rows: 100

    Plan Width: 90

    Plans:

      - Node Type: "Nested Loop"

        Parent Relationship: "Outer"

        Join Type: "Inner"

        Startup Cost: 5.38

        Total Cost: 14973468.06

        Plan Rows: 4964540

        Plan Width: 90

        Join Filter: "(b.id = mb.id)"

        Plans:

          - Node Type: "Seq Scan"

            Parent Relationship: "Outer"

            Relation Name: "export"

            Alias: "b"

            Startup Cost: 0.00

            Total Cost: 79129.80

            Plan Rows: 5485680

            Plan Width: 90

          - Node Type: "Materialize"

            Parent Relationship: "Inner"

            Startup Cost: 5.38

            Total Cost: 717.51

            Plan Rows: 181

            Plan Width: 8

            Plans:

              - Node Type: "Bitmap Heap Scan"

                Parent Relationship: "Outer"

                Relation Name: "masterNew"

                Alias: "mb"

                Startup Cost: 5.38

                Total Cost: 716.61

                Plan Rows: 181

                Plan Width: 8

                Recheck Cond: "((status IS NULL) AND (sperre IS NULL))"

                Plans:

                  - Node Type: "Bitmap Index Scan"

                    Parent Relationship: "Outer"

                    Index Name: "masterNew_2016_pi_idx"

                    Startup Cost: 0.00

                    Total Cost: 5.34

                    Plan Rows: 181

                    Plan Width: 0

 

There we have our problem.

We have tried to fix it using a partial Index on _id_ with `status is null and sperre is null` .

If we don't use `sperre is null` in this query it is quick. I think we have these problems because _sperre_ and _status_ have much null values. _status_: 67% null and _sperre_: 97% null .

On each table there are btree indexes on _id_, _sperre_ and _status_.

On _masterNew_ there is a partial Index on _id_ with `sperre is null and status is null`.

 

Can somebody help me with these performance Problem.

What can I try to solve this?

 

Best regards,

Sven Kerkling

Re: Performance problems with postgres and null Values?

От
Merlin Moncure
Дата:
On Thu, Apr 21, 2016 at 4:49 AM, Sven Kerkling <kerkling@bds-online.com> wrote:
> Can somebody help me with these performance Problem.
>
> What can I try to solve this?

can you explain what the problem actually is?  Which query is running
slow and how fast do you think it should run?

merlin


Re: Performance problems with postgres and null Values?

От
"Sven Kerkling"
Дата:
This one is quick, running in 20ms:

    SELECT b.id, b.status
    FROM export b, masterOld mb
    WHERE mb.sperre IS NULL
      AND mb.status IS NULL
      AND b.id_firma = mb.id_firma
    LIMIT 100;

http://explain.depesz.com/s/SCBo

This one ist the burden, running at least 100 seconds:

    SELECT b.id, b.status
    FROM export b, masterNew mb
    WHERE mb.sperre IS NULL
      AND mb.status IS NULL
      AND b.id = mb.id
    LIMIT 100;

http://explain.depesz.com/s/eAqG

There should be only slight differences between them.

Sven

-----Ursprüngliche Nachricht-----
Von: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] Im Auftrag von Merlin
Moncure
Gesendet: Samstag, 23. April 2016 00:11
An: Sven Kerkling
Cc: postgres performance list
Betreff: Re: [PERFORM] Performance problems with postgres and null Values?

On Thu, Apr 21, 2016 at 4:49 AM, Sven Kerkling <kerkling@bds-online.com> wrote:
> Can somebody help me with these performance Problem.
>
> What can I try to solve this?

can you explain what the problem actually is?  Which query is running slow and how fast do you think it should run?

merlin


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



Re: Performance problems with postgres and null Values?

От
Albe Laurenz
Дата:
Sven Kerkling wrote:
> This one ist the burden, running at least 100 seconds:
> 
>     SELECT b.id, b.status
>     FROM export b, masterNew mb
>     WHERE mb.sperre IS NULL
>       AND mb.status IS NULL
>       AND b.id = mb.id
>     LIMIT 100;
> 
> http://explain.depesz.com/s/eAqG

I think the problem is here:

Bitmap Index Scan on masterNew_2016_pi_idx (cost=0.00..5.34 rows=181 width=0) (actual time=805.225..805.225
rows=4,764,537loops=1)
 

Perhaps you should ANALYZE "masterNew" to get better statistics.

Yours,
Laurenz Albe

Re: Performance problems with postgres and null Values?

От
"Sven Kerkling"
Дата:
Thx.

All queries are now running as usual.

Thx for helping me.

Best regards
Sven

-----Ursprüngliche Nachricht-----
Von: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] Im Auftrag von Albe Laurenz
Gesendet: Montag, 25. April 2016 10:08
An: 'Sven Kerkling *EXTERN*'; 'Merlin Moncure'
Cc: 'postgres performance list'
Betreff: Re: [PERFORM] Performance problems with postgres and null Values?

Sven Kerkling wrote:
> This one ist the burden, running at least 100 seconds:
>
>     SELECT b.id, b.status
>     FROM export b, masterNew mb
>     WHERE mb.sperre IS NULL
>       AND mb.status IS NULL
>       AND b.id = mb.id
>     LIMIT 100;
>
> http://explain.depesz.com/s/eAqG

I think the problem is here:

Bitmap Index Scan on masterNew_2016_pi_idx (cost=0.00..5.34 rows=181 width=0) (actual time=805.225..805.225
rows=4,764,537loops=1) 

Perhaps you should ANALYZE "masterNew" to get better statistics.

Yours,
Laurenz Albe

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance