Обсуждение: index over timestamp not being used
Hi all, I've got the following two tables running on postgresql 8.1.4 transactions Column | Type | Modifiers ----------------------+-----------------------------+--------------- transaction_id | character varying(32) | not null user_id | bigint | not null timestamp_in | timestamp without time zone | default now() type_id | integer | technology_id | integer | Indexes: "pk_phusrtrans_transid" PRIMARY KEY, btree (transaction_id) "idx_phusrtrans_paytyptech" btree (type_id, technology_id) "idx_putrnsctns_tstampin" btree (timestamp_in) statistics Column | Type | Modifiers ----------------------+-----------------------------+------------------- statistic_id | bigint | not null duration | bigint | transaction_id | character varying(32) | Indexes: "pk_phstat_statid" PRIMARY KEY, btree (statistic_id) "idx_phstat_transid" btree (transaction_id) the idea is to have a summary of how many transactions, duration, and type for every date. To do so, I've done the following query: SELECT count(t.transaction_id) AS num_transactions , SUM(s.duration) AS duration , date(t.timestamp_in) as date , t.type_id FROM transactions t LEFT OUTER JOIN statistics s ON t.transaction_id = s.transaction_id WHERE t.timestamp_in >= to_timestamp('20070101', 'YYYYMMDD') GROUP BY date, t.type_id; I think this could be speed up if the index idx_putrnsctns_tstampin (index over the timestamp) could be used, but I haven't been able to do it. Any suggestion? Thanks all -- Arnau
Arnau <arnaulist@andromeiberica.com> writes: > timestamp_in | timestamp without time zone | default now() > SELECT ... > FROM > transactions t > LEFT OUTER JOIN statistics s ON t.transaction_id = s.transaction_id > WHERE > t.timestamp_in >= to_timestamp('20070101', 'YYYYMMDD') > GROUP BY date, t.type_id; to_timestamp() produces timestamp *with* timezone, so your WHERE query is effectively t.timestamp_in::timestamptz >= to_timestamp('20070101', 'YYYYMMDD') which doesn't match the index. The first question you should ask yourself is whether you picked the right datatype for the column. IMHO timestamp with tz is the more appropriate choice in the majority of cases. If you do want to stick with timestamp without tz, you'll need to cast the result of to_timestamp to that. Alternatively, do you really need to_timestamp at all? The standard timestamp input routine won't have any problem with that format: t.timestamp_in >= '20070101' regards, tom lane
Hi Tom, > > Alternatively, do you really need to_timestamp at all? The standard > timestamp input routine won't have any problem with that format: > t.timestamp_in >= '20070101' This is always I think I'm worried, what happens if one day the internal format in which the DB stores the date/timestamps changes. I mean, if instead of being stored as YYYYMMDD is stored as DDMMYYYY, should we have to change all the queries? I thought the to_char/to_date/to_timestamp functions were intented for this purposes -- Arnau
Arnau <arnaulist@andromeiberica.com> writes: >> Alternatively, do you really need to_timestamp at all? The standard >> timestamp input routine won't have any problem with that format: >> t.timestamp_in >= '20070101' > This is always I think I'm worried, what happens if one day the internal > format in which the DB stores the date/timestamps changes. I mean, if > instead of being stored as YYYYMMDD is stored as DDMMYYYY, should we > have to change all the queries? You are confusing internal storage format with the external representation. > I thought the > to_char/to_date/to_timestamp functions were intented for this purposes No, they're intended for dealing with wacky formats that the regular input/output routines can't understand or produce. regards, tom lane
Am Dienstag 24 Juli 2007 schrieb Tom Lane: > > I thought the > > to_char/to_date/to_timestamp functions were intented for this purposes > > No, they're intended for dealing with wacky formats that the regular > input/output routines can't understand or produce. Really? I use them alot, because of possible problems with different date formats. 20070503 means May 3, 2007 for germans, I don't know what it means to US citizens, but following the strange logic of having the month first (8/13/2005) it might mean March 5, 2007 too. Therefore, using to_timestamp seems to be a safe choice for me, working in any environment regardless of the "date_style" setting.
On 2007-07-25 Mario Weilguni wrote: > Am Dienstag 24 Juli 2007 schrieb Tom Lane: >>> I thought the to_char/to_date/to_timestamp functions were intented >>> for this purposes >> >> No, they're intended for dealing with wacky formats that the regular >> input/output routines can't understand or produce. > > Really? I use them alot, because of possible problems with different > date formats. 20070503 means May 3, 2007 for germans, Actually, no. 20070503 is the condensed form of the ISO international calendar date format (see ISO 8601). German formats would be 03.05.2007 or 3. Mai 2007. Regards Ansgar Wiechers -- "The Mac OS X kernel should never panic because, when it does, it seriously inconveniences the user." --http://developer.apple.com/technotes/tn2004/tn2118.html