Re: oracle to psql migration - slow query in postgres
От | Mladen Gogala |
---|---|
Тема | Re: oracle to psql migration - slow query in postgres |
Дата | |
Msg-id | 4CB86A84.1000101@vmsinfo.com обсуждение исходный текст |
Ответ на | Re: oracle to psql migration - slow query in postgres (Samuel Gendler <sgendler@ideasculptor.com>) |
Список | pgsql-performance |
Samuel Gendler wrote: > > > On Thu, Oct 14, 2010 at 8:59 PM, Mladen Gogala > <mladen.gogala@vmsinfo.com <mailto:mladen.gogala@vmsinfo.com>> wrote: > > If working with partitioning, be very aware that PostgreSQL > optimizer has certain problems with partitions, especially with > group functions. If you want speed, everything must be prefixed > with partitioning column: indexes, expressions, joins. There is no > explicit star schema and creating hash indexes will not buy you > much, as a matter of fact, Postgres community is extremely > suspicious of the hash indexes and I don't see them widely used. > Having said that, I was able to solve the problems with my speed > and partitioning. > > > Could you elaborate on this, please? What do you mean by 'everythin > must be prefixed with partitioning column?' > > --sam If you have partitioned table part_tab, partitioned on the column item_date and if there is a global primary key in Oracle, let's call it item_id, then queries like "select * from part_tab where item_id=12345" will perform worse than queries with item_date" select * from part_tab where item_id=12345 and item_date='2010-10-15' This also applies to inserts and updates. Strictly speaking, the item_date column in the query above is not necessary, after all, the item_id column is the primary key. However, with range scans you will get much better results if you include the item_date column than if you use combination of columns without. The term "prefixed indexes" is borrowed from Oracle RDBMS and means that the beginning column in the index is the column on which the table is partitioned. Oracle, as opposed to Postgres, has global indexes, the indexes that span all partitions. PostgreSQL only maintains indexes on each of the partitions separately. Oracle calls such indexes "local indexes" and defines them on the partitioned table level. Here is a brief and rather succinct explanation of the terminology: http://www.oracle-base.com/articles/8i/PartitionedTablesAndIndexes.php Of, course, there are other differences between Oracle partitioning and PostgreSQL partitioning. The main difference is $10000/CPU. I am talking from experience: news=> \d moreover_documents Table "moreover.moreover_documents" Column | Type | Modifiers ----------------------+-----------------------------+----------- document_id | bigint | not null dre_reference | bigint | not null headline | character varying(4000) | author | character varying(200) | url | character varying(1000) | rank | bigint | content | text | stories_like_this | character varying(1000) | internet_web_site_id | bigint | not null harvest_time | timestamp without time zone | valid_time | timestamp without time zone | keyword | character varying(200) | article_id | bigint | not null media_type | character varying(20) | source_type | character varying(20) | created_at | timestamp without time zone | autonomy_fed_at | timestamp without time zone | language | character varying(150) | Indexes: "moreover_documents_pkey" PRIMARY KEY, btree (document_id) Triggers: insert_moreover_trigger BEFORE INSERT ON moreover_documents FOR EACH ROW EXE CUTE PROCEDURE moreover_insert_trgfn() Number of child tables: 8 (Use \d+ to list them.) The child tables are, of course, partitions. Here is the original: Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Production With the Partitioning, Real Application Clusters, OLAP, Data Mining and Real Application Testing options SQL> desc moreover_documents Name Null? Type ----------------------------------------- -------- ---------------------------- DOCUMENT# NOT NULL NUMBER DRE_REFERENCE NOT NULL NUMBER HEADLINE VARCHAR2(4000) AUTHOR VARCHAR2(200) URL VARCHAR2(1000) RANK NUMBER CONTENT CLOB STORIES_LIKE_THIS VARCHAR2(1000) INTERNET_WEB_SITE# NOT NULL NUMBER HARVEST_TIME DATE VALID_TIME DATE KEYWORD VARCHAR2(200) ARTICLE_ID NOT NULL NUMBER MEDIA_TYPE VARCHAR2(20) CREATED_AT DATE SOURCE_TYPE VARCHAR2(50) PUBLISH_DATE DATE AUTONOMY_FED_AT DATE LANGUAGE VARCHAR2(150) SQL> I must say that it took me some time to get things right. -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 http://www.vmsinfo.com The Leader in Integrated Media Intelligence Solutions
В списке pgsql-performance по дате отправления: