Citext Performance

Поиск
Список
Период
Сортировка
От Deepak Somaiya
Тема Citext Performance
Дата
Msg-id 1048175283.1009672.1523999154175@mail.yahoo.com
обсуждение исходный текст
Ответ на Re: Unexplainable execution time difference between two testfunctions...one using IF (SELECT COUNT(*) FROM...) and the other using IFEXISTS (SELECT 1 FROM...)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-performance
Apology for sending you emails directly but I do see you guys responding on email related to performance so thought of copying you folks.

Folks,
 I read following (PostgreSQL: Documentation: 9.6: citext) and it does not hold true in my testing.. i.e citext is not performing better than lower.Am I missing something? help is appreciated.




"citext is not as efficient as text because the operator functions and the B-tree comparison functions must make copies of the data and convert it to lower case for comparisons. It is, however, slightly more efficient than using lower to get case-insensitive matching."


Here is what I have done 

drop table test;
drop table testci;

CREATE TABLE test (
id INTEGER PRIMARY KEY,
name character varying(254)
);
CREATE TABLE testci (
id INTEGER PRIMARY KEY,
name citext

);

INSERT INTO test(id, name)
SELECT generate_series(1000001,2000000), (md5(random()::text));

INSERT INTO testci(id, name)
SELECT generate_series(1,1000000), (md5(random()::text));


Now, I have done sequential search

explain (analyze on, format yaml) select * from test where lower(name)=lower('f6d7d5be1d0bed1cca11540d3a2667de');
- Plan: 
    Node Type: "Seq Scan"
    Parallel Aware: false
    Relation Name: "test"
    Alias: "test"
    Startup Cost: 0.00
    Total Cost: 23334.00
    Plan Rows: 5000
    Plan Width: 37
    Actual Startup Time: 0.016
    Actual Total Time: 680.199
    Actual Rows: 1
    Actual Loops: 1
    Filter: "(lower((name)::text) = 'f6d7d5be1d0bed1cca11540d3a2667de'::text)"
    Rows Removed by Filter: 999999
  Planning Time: 0.045
  Triggers: 
  Execution Time: 680.213


explain (analyze on, format yaml) select * from testci where name='956d692092f0b9f85f36bf2b2501f3ad';
- Plan: 
    Node Type: "Seq Scan"
    Parallel Aware: false
    Relation Name: "testci"
    Alias: "testci"
    Startup Cost: 0.00
    Total Cost: 20834.00
    Plan Rows: 1
    Plan Width: 37
    Actual Startup Time: 0.017
    Actual Total Time: 1184.485
    Actual Rows: 1
    Actual Loops: 1
    Filter: "(name = '956d692092f0b9f85f36bf2b2501f3ad'::citext)"
    Rows Removed by Filter: 999999
  Planning Time: 0.029
  Triggers: 
  Execution Time: 1184.496



You can see sequential searches with lower working twice as fast as citext.

Now I added index on citext and equivalent functional index (lower) on text.


CREATE INDEX textlowerindex ON test (lower(name));
create index textindex on test(name);


Index creation took longer with citext v/s creating lower functional index.


Now here comes execution with indexes

explain (analyze on, format yaml) select * from test where lower(name)=lower('f6d7d5be1d0bed1cca11540d3a2667de');

- Plan: 
    Node Type: "Bitmap Heap Scan"
    Parallel Aware: false
    Relation Name: "test"
    Alias: "test"
    Startup Cost: 187.18
    Total Cost: 7809.06
    Plan Rows: 5000
    Plan Width: 37
    Actual Startup Time: 0.020
    Actual Total Time: 0.020
    Actual Rows: 1
    Actual Loops: 1
    Recheck Cond: "(lower((name)::text) = 'f6d7d5be1d0bed1cca11540d3a2667de'::text)"
    Rows Removed by Index Recheck: 0
    Exact Heap Blocks: 1
    Lossy Heap Blocks: 0
    Plans: 
      - Node Type: "Bitmap Index Scan"
        Parent Relationship: "Outer"
        Parallel Aware: false
        Index Name: "textlowerindex"
        Startup Cost: 0.00
        Total Cost: 185.93
        Plan Rows: 5000
        Plan Width: 0
        Actual Startup Time: 0.016
        Actual Total Time: 0.016
        Actual Rows: 1
        Actual Loops: 1
        Index Cond: "(lower((name)::text) = 'f6d7d5be1d0bed1cca11540d3a2667de'::text)"
  Planning Time: 0.051
  Triggers: 
  Execution Time: 0.035




explain (analyze on, format yaml) select * from testci where name='956d692092f0b9f85f36bf2b2501f3ad'; 

- Plan: 
    Node Type: "Index Scan"
    Parallel Aware: false
    Scan Direction: "Forward"
    Index Name: "citextindex"
    Relation Name: "testci"
    Alias: "testci"
    Startup Cost: 0.42
    Total Cost: 8.44
    Plan Rows: 1
    Plan Width: 37
    Actual Startup Time: 0.049
    Actual Total Time: 0.050
    Actual Rows: 1
    Actual Loops: 1
    Index Cond: "(name = '956d692092f0b9f85f36bf2b2501f3ad'::citext)"
    Rows Removed by Index Recheck: 0
  Planning Time: 0.051
  Triggers: 
  Execution Time: 0.064


В списке pgsql-performance по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Unexplainable execution time difference between two testfunctions...one using IF (SELECT COUNT(*) FROM...) and the other using IFEXISTS (SELECT 1 FROM...)
Следующее
От: "Ahmed, Nawaz"
Дата:
Сообщение: RE: Installing PostgreSQL 9.5 in centos 6 using YUM