Обсуждение: Query Performance with Indexes on Integer type vs. Date type.

Поиск
Список
Период
Сортировка

Query Performance with Indexes on Integer type vs. Date type.

От
Dhimant Patel
Дата:
Hi All,

I am a new comer on postgres world and now using it for some serious (at least for me)  projects. I have a need where I am running some analytical + aggregate functions on data where ordering is done on Date type column.

From my initial read on documentation I believe internally a date type is represented by integer type of data. This makes me wonder would it make any good to create additional  column of Integer type and update it as data gets added and use this integer column for all ordering purposes for my sqls - or should I not hasitate using Date type straight into my sql for ordering? 

Better yet, is there anyway I can verify impact of ordering on Date type vs. Integer type, apart from using \timing and explain plan?


Thanks for sharing your insights.
-DP.

Re: Query Performance with Indexes on Integer type vs. Date type.

От
Tom Lane
Дата:
Dhimant Patel <drp4kri@gmail.com> writes:
> From my initial read on documentation I believe internally a date type is
> represented by integer type of data. This makes me wonder would it make any
> good to create additional  column of Integer type and update it as data gets
> added and use this integer column for all ordering purposes for my sqls - or
> should I not hasitate using Date type straight into my sql for ordering?

Don't overcomplicate things.  Comparison of dates is just about as fast as
comparison of integers, anyway.

            regards, tom lane

Re: Query Performance with Indexes on Integer type vs. Date type.

От
Maciek Sakrejda
Дата:
> This makes me wonder would it make any good to create additional  column of Integer type and update it as data gets
addedand use this integer column for all ordering purposes for my sqls - or should I not hasitate using Date type
straightinto my sql for ordering? 

Keep in mind what Michael A. Jackson (among others) had to say on
this: "The First Rule of Program Optimization: Don't do it. The Second
Rule of Program Optimization (for experts only!): Don't do it yet."
For one thing, adding an extra column to your data would mean more
data you need to cram in the cache as you query, so even if the *raw*
integer versus date ordering is faster, the "optimization" could still
be a net loss due to the fatter tuples. If you're willing to live with
*only* integer-based dates, that could help, but that seems
exceptionally painful and not worth considering unless you run into
trouble.

> Better yet, is there anyway I can verify impact of ordering on Date type vs. Integer type, apart from using \timing
andexplain plan? 

Remember to use explain analyze (and not just explain) when validating
these sorts of things. Explain is really just a guess. Also remember
to ensure that stats are up to date before you test this.

---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com

Re: Query Performance with Indexes on Integer type vs. Date type.

От
"Kevin Grittner"
Дата:
Dhimant Patel <drp4kri@gmail.com> wrote:

> I am a new comer on postgres world and now using it for some
> serious (at least for me)  projects. I have a need where I am
> running some analytical + aggregate functions on data where
> ordering is done on Date type column.
>
> From my initial read on documentation I believe internally a date
> type is represented by integer type of data. This makes me wonder
> would it make any good to create additional column of Integer type
> and update it as data gets added and use this integer column for
> all ordering purposes for my sqls - or should I not hasitate using
> Date type straight into my sql for ordering?

I doubt that this will improve performance, particularly if you ever
want to see your dates formatted as dates.

> Better yet, is there anyway I can verify impact of ordering on
> Date type vs. Integer type, apart from using \timing and explain
> plan?

You might be better off just writing the code in the most natural
way, using the date type for dates, and then asking about any
queries which aren't performing as you hope they would.  Premature
optimization is often counter-productive.  If you really want to do
some benchmarking of relative comparison speeds, though, see the
generate_series function -- it can be good at generating test tables
for such things.

-Kevin

Re: Query Performance with Indexes on Integer type vs. Date type.

От
Phoenix Kiula
Дата:
On Thu, Apr 28, 2011 at 12:17 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
>
> Dhimant Patel <drp4kri@gmail.com> wrote:
>
> > I am a new comer on postgres world and now using it for some
> > serious (at least for me)  projects. I have a need where I am
> > running some analytical + aggregate functions on data where
> > ordering is done on Date type column.
> >
> > From my initial read on documentation I believe internally a date
> > type is represented by integer type of data. This makes me wonder
> > would it make any good to create additional column of Integer type
> > and update it as data gets added and use this integer column for
> > all ordering purposes for my sqls - or should I not hasitate using
> > Date type straight into my sql for ordering?
>
> I doubt that this will improve performance, particularly if you ever
> want to see your dates formatted as dates.
>
> > Better yet, is there anyway I can verify impact of ordering on
> > Date type vs. Integer type, apart from using \timing and explain
> > plan?
>
> You might be better off just writing the code in the most natural
> way, using the date type for dates, and then asking about any
> queries which aren't performing as you hope they would.  Premature
> optimization is often counter-productive.  If you really want to do
> some benchmarking of relative comparison speeds, though, see the
> generate_series function -- it can be good at generating test tables
> for such things.




There is a lot of really good advice here already. I'll just add one thought.

If the dates in your tables are static based only on creation (as in
only a CREATE_DATE, which will never be modified per row like a
MODIFY_DATE for each record), then your thought might have made sense.
But in that case you can already use the ID field if you have one?

In most real world cases however the DATE field will likely be storing
an update time as well. Which would make your thought about numbering
with integers pointless.

Re: Query Performance with Indexes on Integer type vs. Date type.

От
Dhimant Patel
Дата:
Thanks for all valuable insights. I decided to drop the idea of adding additional column and 
will just rely on Date column for all ordering.

Tom - thanks for clear answer on the issue I was concerned about.
Maciek,Kevin - 
thanks for ideas, hint on generate_series() - I will have to go through cpl of times of postgres documentation before I will have better grasp of all available tools but this forum is very valuable.


-DP.


On Wed, Apr 27, 2011 at 12:46 PM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
On Thu, Apr 28, 2011 at 12:17 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
>
> Dhimant Patel <drp4kri@gmail.com> wrote:
>
> > I am a new comer on postgres world and now using it for some
> > serious (at least for me)  projects. I have a need where I am
> > running some analytical + aggregate functions on data where
> > ordering is done on Date type column.
> >
> > From my initial read on documentation I believe internally a date
> > type is represented by integer type of data. This makes me wonder
> > would it make any good to create additional column of Integer type
> > and update it as data gets added and use this integer column for
> > all ordering purposes for my sqls - or should I not hasitate using
> > Date type straight into my sql for ordering?
>
> I doubt that this will improve performance, particularly if you ever
> want to see your dates formatted as dates.
>
> > Better yet, is there anyway I can verify impact of ordering on
> > Date type vs. Integer type, apart from using \timing and explain
> > plan?
>
> You might be better off just writing the code in the most natural
> way, using the date type for dates, and then asking about any
> queries which aren't performing as you hope they would.  Premature
> optimization is often counter-productive.  If you really want to do
> some benchmarking of relative comparison speeds, though, see the
> generate_series function -- it can be good at generating test tables
> for such things.




There is a lot of really good advice here already. I'll just add one thought.

If the dates in your tables are static based only on creation (as in
only a CREATE_DATE, which will never be modified per row like a
MODIFY_DATE for each record), then your thought might have made sense.
But in that case you can already use the ID field if you have one?

In most real world cases however the DATE field will likely be storing
an update time as well. Which would make your thought about numbering
with integers pointless.