Custom Operator for citext LIKE predicates question

Поиск
Список
Период
Сортировка
От Efrain J. Berdecia
Тема Custom Operator for citext LIKE predicates question
Дата
Msg-id 1197272596.186456.1642049516576@mail.yahoo.com
обсуждение исходный текст
Ответы Re: Custom Operator for citext LIKE predicates question  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
After attempting to use gin and gist indexes for our queries that run against citext columns, our team has come up with the following to make our queries run from 2 mins to 25ms;
CREATE EXTENSION pg_trgm
CREATE EXTENSION btree_gin --may not be needed, checking

CREATE OPERATOR CLASS gin_trgm_ops_ci_new
FOR TYPE citext USING gin
AS
OPERATOR 1 % (text, text),
FUNCTION 1 btint4cmp (int4, int4),
FUNCTION 2 gin_extract_value_trgm (text, internal),
FUNCTION 3 gin_extract_query_trgm (text, internal, int2, internal, internal, internal, internal),
FUNCTION 4 gin_trgm_consistent (internal,int2, text, int4, internal, internal, internal, internal),
STORAGE int4;

ALTER OPERATOR FAMILY gin_trgm_ops_ci_new USING gin ADD
OPERATOR 3 ~~ (citext, citext),
OPERATOR 4 ~~* (citext, citext);
ALTER OPERATOR FAMILY gin_trgm_ops_ci_new USING gin ADD
OPERATOR 7 %> (text, text),
FUNCTION 6 (text,text) gin_trgm_triconsistent (internal, int2, text, int4, internal, internal, internal);


Our question is, does anyone see any flaw on this? 

Also, could this not be incorporated into postgres natively?

I'm posting the old and new explain plans;

New explain;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------
 Aggregate  (cost=874327.76..874327.77 rows=1 width=8) (actual time=21.952..21.954 rows=1 loops=1)
->  Nested Loop  (cost=1620.95..874284.13 rows=17449 width=0) (actual time=6.259..21.948 rows=9 loops=1)
->  Bitmap Heap Scan on t775 b1  (cost=1620.39..525029.25 rows=45632 width=35) (actual time=6.212..8.189 rows=13 loops=1)
Recheck Cond: ((c240001002 ~~ 'smp%'::citext) OR (c200000020 ~~ 'smp%'::citext) OR (c200000001 ~~ 'smp%'::citext))
Rows Removed by Index Recheck: 259
Filter: ((c400079600 <> 'ABC_BUSINESSSERVICE'::citext) AND (c400127400 = 'ABC.ASSET'::citext) AND ((c1000000001 = 'Mrictton Global'::citext) OR (c1000000001 = 'ABCOpsMonitoring'::citext) OR (c100000
0001 = 'Mrictton'::citext) OR (c1000000001 = 'Mrictton EITTE'::citext) OR (c1000000001 = 'Mrictton Finance'::citext) OR (c1000000001 = 'Mrictton Generic Services and Support'::citext) OR (c1000000001 = 'Mrictton G
lobal'::citext) OR (c1000000001 = 'Mrictton Global Demo Solutions'::citext) OR (c1000000001 = 'Mrictton HR Direct'::citext) OR (c1000000001 = 'Mrictton Marketing and Communications'::citext) OR (c1000000001 = 'Eri
csson Master Data Management'::citext) OR (c1000000001 = 'Mrictton OHS'::citext) OR (c1000000001 = 'Mrictton Patents and Licensing'::citext) OR (c1000000001 = 'Mrictton Sales'::citext) OR (c1000000001 = 'Mrictton
Security'::citext) OR (c1000000001 = 'Mrictton Shared Services'::citext) OR (c1000000001 = 'Mrictton Sourcing'::citext) OR (c1000000001 = 'Mrictton Supply ROD'::citext) OR (c1000000001 = 'Mrictton SW Supply Operat
ions'::citext) OR (c1000000001 = 'Remedy,a ABC Software Company'::citext)) AND (c400079600 = ANY ('{ABC_DATABASE,ABC_ACCOUNT,ABC_MEDIA,ABC.CORE:ABC_CONCRETECOLLECTION,ABC_PACKAGE,ABC_BIOS,ABC_SYSTEMSOFTWARE,ABC_KEYBOARD,ABC_LAN,ABC_LOGICALSYSTEMCOMPONENT,ABC_LNSGROUP,ABC_PHYSICALLOCATION,ABC_FLOPPYDRIVE,ABC_DOCUMENT,ABC_BUSINESSSERVICE,ABC_DATABASESTORAGE,ABC_NETWORKPORT,ABC_VIRTUALSYSTEMENABLER,ABC_POINTINGDEVICE,ABC_PRINTER,ABC_SYSTEMRESOURCE,ABC_CONNECTIVITYSEGMENT,ABC.CORE:ABC_BUSINESSPROCESS,ABC_PROTOCOLENDPOINT,ABC_TRANSACTION,ABC_APPLICATIONINFRASTRUCTURE,ABC_SOFTWARESERVER,ABC_UPS,ABC_ACTIVITY,ABC_CDROMDRIVE,ABC.CORE:ABC_RASD,ABC_PRODUCT,ABC_REMOTEFILESYSTEM,ABC_IPENDPOINT,ABC_LOCALFILESYSTEM,ABC_APPLICATION,ABC_IPCONNECTIVITYSUBNET,ABC_CLUSTER,ABC_CHASSIS,ABC_WAN,ABC_PATCH,ABC_ADMINDOMAIN,ABC.CORE:ABC_RESOURCEPOOL,ABC_IPXCONNECTIVITYNETWORK,ABC_HARDWARESYSTEMCOMPONENT,ABC_FILESYSTEM,ABC_MONITOR,ABC_CONNECTIVITYGROUP,ABC_EQUIPMENT,ABC_MAINFRAME,ABC_RACK,ABC_OPERATINGSYSTEM,ABC_PROCESSOR,ABC_SHARE,ABC_LANENDPOINT,ABC_HARDWAREPACKAGE,ABC_TAPEDRIVE,ABC_COMMUNICATIONENDPOINT,ABC_APPLICATIONSYSTEM,ABC_CARD,ABC_DISKPARTITION,ABC.CORE:ABC_VIRTUALSYSTEMSETTINGDATA,ABC_MEMORY,ABC_NTDOMAIN,ABC_COMPUTERSYSTEM,ABC_DISKDRIVE,ABC_SERVICEOFFERINGINSTANCE,ABC_ROLE,ABC_APPLICATIONSERVICE}'::citext[])))
Rows Removed by Filter: 62
Heap Blocks: exact=313
->  BitmapOr  (cost=1620.39..1620.39 rows=163489 width=0) (actual time=5.703..5.704 rows=0 loops=1)
->  Bitmap Index Scan on oto2  (cost=0.00..528.72 rows=54496 width=0) (actual time=0.724..0.724 rows=41 loops=1)
Index Cond: (c240001002 ~~ 'smp%'::citext)
->  Bitmap Index Scan on oto3  (cost=0.00..528.72 rows=54496 width=0) (actual time=4.852..4.852 rows=331 loops=1)
Index Cond: (c200000020 ~~ 'smp%'::citext)
->  Bitmap Index Scan on oto4  (cost=0.00..528.72 rows=54496 width=0) (actual time=0.127..0.127 rows=0 loops=1)
Index Cond: (c200000001 ~~ 'smp%'::citext)
->  Index Scan using i1279_0_400129200_t1279 on t1279 b2  (cost=0.56..7.64 rows=1 width=35) (actual time=1.057..1.058 rows=1 loops=13)
Index Cond: (c400129200 = b1.c400129200)
Filter: ((c7 <> 6) AND (c7 <> 8))
Rows Removed by Filter: 0
Planning Time: 2.478 ms
Execution Time: 22.059 ms
(21 rows)

Time: 26.510 ms

Old explain with slow plan;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1926420.44..1926420.70 rows=102 width=1199) (actual time=16396.091..16569.194 rows=9 loops=1)
->  Sort  (cost=1926420.44..1926458.76 rows=15326 width=1199) (actual time=16396.089..16569.190 rows=9 loops=1)
Sort Key: b1.c200000020 NULLS FIRST, ((concat((concat(b1.c1, '|'))::citext, COALESCE(b2.c1, ''::citext)))::citext)
Sort Method: quicksort  Memory: 29kB
->  WindowAgg  (cost=1000.56..1925832.51 rows=15326 width=1199) (actual time=16396.025..16569.138 rows=9 loops=1)
->  Gather  (cost=1000.56..1925564.30 rows=15326 width=1191) (actual time=4288.742..16569.068 rows=9 loops=1)
Workers Planned: 6
Workers Launched: 6
->  Nested Loop  (cost=0.56..1923031.70 rows=2554 width=1191) (actual time=9430.362..16387.794 rows=1 loops=7)
->  Parallel Seq Scan on t1279 b2  (cost=0.00..530806.15 rows=416134 width=910) (actual time=0.016..575.311 rows=353200 loops=7)
Filter: ((c7 <> 6) AND (c7 <> 8))
Rows Removed by Filter: 574840
->  Index Scan using efrain_test_ix_t775_2 on t775 b1  (cost=0.56..3.34 rows=1 width=316) (actual time=0.044..0.044 rows=0 loops=2472402)
Index Cond: ((c400129200 = b2.c400129200) AND (c400127400 = 'ABC.ASSET'::citext))
Filter: ((c400079600 <> 'ABC_BUSINESSSERVICE'::citext) AND ((c240001002 ~~ 'smp%'::citext) OR (c200000020 ~~ 'smp%'::citext) OR (c200000001 ~~ 'smp%'::citext)) AND ((c1000000001 =
'Mrictton Global'::citext) OR (c1000000001 = 'ABCOpsMonitoring'::citext) OR (c1000000001 = 'Mrictton'::citext) OR (c1000000001 = 'Mrictton EITTE'::citext) OR (c1000000001 = 'Mrictton Finance'::citext) OR (c1000000001 = 'Mrictton Generic Services and Support'::citext) OR (c1000000001 = 'Mrictton Global'::citext) OR (c1000000001 = 'Mrictton Global Demo Solutions'::citext) OR (c1000000001 = 'Mrictton HR Direct'::citext) OR (c1000000001 = 'Mrictton Marketing and Communications'::citext) OR (c1000000001 = 'Mrictton Master Data Management'::citext) OR (c1000000001 = 'Mrictton OHS'::citext) OR (c1000000001 = 'Mrictton Patents and Licensing'::citext) OR (c1000000001 = 'Mrictton Sales'::citext) OR (c1000000001 = 'Mrictton Security'::citext) OR (c1000000001 = 'Mrictton Shared Services'::citext) OR (c1000000001 = 'Mrictton Sourcing'::citext) OR (c1000000001 = 'Mrictton Supply ROD'::citext) OR (c1000000001 = 'Mrictton SW Supply Operations'::citext) OR (c1000000001 = 'Remedy,a ABC Software Company'::citext)) AND (c400079600 = ANY ('{ABC_DATABASE,ABC_ACCOUNT,ABC_MEDIA,ABC.CORE:ABC_CONCRETECOLLECTION,ABC_PACKAGE,ABC_BIOS,ABC_SYSTEMSOFTWARE,ABC_KEYBOARD,ABC_LAN,ABC_LOGICALSYSTEMCOMPONENT,ABC_LNSGROUP,ABC_PHYSICALLOCATION,ABC_FLOPPYDRIVE,ABC_DOCUMENT,ABC_BUSINESSSERVICE,ABC_DATABASESTORAGE,ABC_NETWORKPORT,ABC_VIRTUALSYSTEMENABLER,ABC_POINTINGDEVICE,ABC_PRINTER,ABC_SYSTEMRESOURCE,ABC_CONNECTIVITYSEGMENT,ABC.CORE:ABC_BUSINESSPROCESS,ABC_PROTOCOLENDPOINT,ABC_TRANSACTION,ABC_APPLICATIONINFRASTRUCTURE,ABC_SOFTWARESERVER,ABC_UPS,ABC_ACTIVITY,ABC_CDROMDRIVE,ABC.CORE:ABC_RASD,ABC_PRODUCT,ABC_REMOTEFILESYSTEM,ABC_IPENDPOINT,ABC_LOCALFILESYSTEM,ABC_APPLICATION,ABC_IPCONNECTIVITYSUBNET,ABC_CLUSTER,ABC_CHASSIS,ABC_WAN,ABC_PATCH,ABC_ADMINDOMAIN,ABC.CORE:ABC_RESOURCEPOOL,ABC_IPXCONNECTIVITYNETWORK,ABC_HARDWARESYSTEMCOMPONENT,ABC_FILESYSTEM,ABC_MONITOR,ABC_CONNECTIVITYGROUP,ABC_EQUIPMENT,ABC_MAINFRAME,ABC_RACK,ABC_OPERATINGSYSTEM,ABC_PROCESSOR,ABC_SHARE,ABC_LANENDPOINT,ABC_HARDWAREPACKAGE,ABC_TAPEDRIVE,ABC_COMMUNICATIONENDPOINT,ABC_APPLICATIONSYSTEM,ABC_CARD,ABC_DISKPARTITION,ABC.CORE:ABC_VIRTUALSYSTEMSETTINGDATA,ABC_MEMORY,
ABC_NTDOMAIN,ABC_COMPUTERSYSTEM,ABC_DISKDRIVE,ABC_SERVICEOFFERINGINSTANCE,ABC_ROLE,ABC_APPLICATIONSERVICE}'::citext[])))
Rows Removed by Filter: 1
Planning Time: 3.205 ms
Execution Time: 16569.351 ms
(18 rows)

Time: 16577.806 ms (00:16.578)

Products
PostgreSQL Community Edition
Product Version
PostgreSQL 12


Thanks.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Windows vs recovery tests
Следующее
От: Amit Langote
Дата:
Сообщение: Re: a misbehavior of partition row movement (?)