Обсуждение: [GENERAL] Question about paritioning
Hi
I have a large table about 3B rows, that I would like to partition on a column called _received which is of type timestamp
I followed this
and came up with
CREATE OR REPLACE FUNCTION create_partition_and_insert() RETURNS trigger AS
$BODY$
DECLARE
partition_date TEXT;
partition TEXT;
BEGIN
partition_date := to_char(NEW._received,'YYYYMM');
partition := TG_TABLE_NAME || '_' || partition_date;
IF NOT EXISTS(SELECT relname FROM pg_class WHERE relname=partition) THEN
RAISE NOTICE 'A partition has been created %',partition;
EXECUTE 'CREATE TABLE ' || partition || ' (check ( to_char(_received, '''YYYYMM''') = ''' || partition_date || ''')) INHERITS (' || TG_TABLE_NAME || ');';
END IF;
EXECUTE 'INSERT INTO ' || partition || ' SELECT(' || TG_TABLE_NAME || ' ' || quote_literal(NEW) || ').* RETURNING patent_id;';
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
Which I believe generates something like this
CREATE TABLE feedintra_201707( check (to_char(_received,'YYYYMM') = '201707' )) INHERITS (XXXXXX);
My concern is the cost of doing to_char on every row.
So
Is this going to work
Is this the right way to do it.
I was thinking if I could change the check to be something like
check ( _recieved >= YYYYMM 1 00:00 and _recieved <= YYYYMM <last day of the month> 23:59:59.999 )
so I am not sure how to make up the above line based on a timestamp
Thanks
Alex
On 7/26/2017 10:08 PM, Alex Samad wrote: > I have a large table about 3B rows, that I would like to partition on > a column called _received which is of type timestamp > a good goal is to have no more than about 100 partitions max, and ideally more like 25. when we partition on time stamp, we typically do it by the week, as we're doing 6 month data retention. IIRC, we're using DATE_TRUNC('week', timestamp)::DATE for use as the partition label and key. -- john r pierce, recycling bits in santa cruz
So is date_trunc better than to_char ? I'm thinking it probably is
as for the number of partitions, well we don't plan on deleting anything, but from my reading as long as the queries stay on a small amount of parts that we should be okay.
A
On 27 July 2017 at 15:33, John R Pierce <pierce@hogranch.com> wrote:
On 7/26/2017 10:08 PM, Alex Samad wrote:I have a large table about 3B rows, that I would like to partition on a column called _received which is of type timestamp
a good goal is to have no more than about 100 partitions max, and ideally more like 25.
when we partition on time stamp, we typically do it by the week, as we're doing 6 month data retention.
IIRC, we're using DATE_TRUNC('week', timestamp)::DATE for use as the partition label and key.
--
john r pierce, recycling bits in santa cruz
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 7/27/2017 12:43 AM, Alex Samad wrote: > > ... as long as the queries stay on a small amount of parts that we > should be okay. thats true as long as the planner can restrict the queries to the correct partition... but there's lots of queries that end up having to hit all partitions because the planner can't guess correctly. -- john r pierce, recycling bits in santa cruz
so I have a pgsql function that will create tables based on YYYYMD for x years in advance and inherits of base table.
with this
CHECK (_received >= '2013-01-01 00:00:00+11'::timestamp with time zone AND _received < '2013-02-01 01:00:00+11'::timestamp with time zone)
now for the insert, do I create / update this monthly have had a 2 or 3 level if then check before inserting
or do I create a programatic insert that works out the table name
On 27 July 2017 at 18:36, John R Pierce <pierce@hogranch.com> wrote:
On 7/27/2017 12:43 AM, Alex Samad wrote:
... as long as the queries stay on a small amount of parts that we should be okay.
thats true as long as the planner can restrict the queries to the correct partition... but there's lots of queries that end up having to hit all partitions because the planner can't guess correctly.
--
john r pierce, recycling bits in santa cruz
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general