Обсуждение: Database design advice

Поиск
Список
Период
Сортировка

Database design advice

От
Neil Saunders
Дата:
Hi all,

I'm writing a property rental web application, and one of the metrics
I wish to track is the number of page views per month for each
property.

I originally envisaged a table with a column for each month (one row
per property), for which the relevant column would be incremented each
time a property is viewed, depending on the month. But this raises
questions as to the best way to maintain this table (i.e. Create a new
month column each month) I'd prefer to keep all logic in the database,
so would prefer not to use a cron job to do this.

I then thought of using a trigger to check if the relevant column
existed before attempting to increment, but this would only be
utilised once, and then just create overhead at every update in the
future. Should I just create a table with 10 years worth of columns?
Is there a different alternative entirely?

None of the above strike me as architecturally sound, and so any
advice from someone more seasoned in database design would be
gratefully received.

Kind Regards,

Neil.

Re: Database design advice

От
"Daniel T. Staal"
Дата:
On Fri, October 21, 2005 11:47 am, Neil Saunders said:
> Hi all,
>
> I'm writing a property rental web application, and one of the metrics
> I wish to track is the number of page views per month for each
> property.
>
> I originally envisaged a table with a column for each month (one row
> per property), for which the relevant column would be incremented each
> time a property is viewed, depending on the month. But this raises
> questions as to the best way to maintain this table (i.e. Create a new
> month column each month) I'd prefer to keep all logic in the database,
> so would prefer not to use a cron job to do this.
>
> I then thought of using a trigger to check if the relevant column
> existed before attempting to increment, but this would only be
> utilised once, and then just create overhead at every update in the
> future. Should I just create a table with 10 years worth of columns?
> Is there a different alternative entirely?
>
> None of the above strike me as architecturally sound, and so any
> advice from someone more seasoned in database design would be
> gratefully received.

Are you tracking anything else per page/property view?

My first thought would be to have a separate table of 'view' information:
When, what property, and whatever else you want to track.  Then, if you
want to know how many times a particular page/property was viewed you just
do a select on that table for all the records related to it during that
time period and count the rows.

As a bonus, you aren't limited to any particular time interval, and can
add fields in the future.

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------


Re: Database design advice

От
Дата:
--- "Daniel T. Staal" <DStaal@usa.net> wrote:

> On Fri, October 21, 2005 11:47 am, Neil Saunders
> said:
> > Hi all,
> >
> > I'm writing a property rental web application, and
> one of the metrics
> > I wish to track is the number of page views per
> month for each
> > property.
> >
> > I originally envisaged a table with a column for
> each month (one row
> > per property), for which the relevant column would
> be incremented each
> > time a property is viewed, depending on the month.
> But this raises
> > questions as to the best way to maintain this
> table (i.e. Create a new
> > month column each month) I'd prefer to keep all
> logic in the database,
> > so would prefer not to use a cron job to do this.
> >
> > I then thought of using a trigger to check if the
> relevant column
> > existed before attempting to increment, but this
> would only be
> > utilised once, and then just create overhead at
> every update in the
> > future. Should I just create a table with 10 years
> worth of columns?
> > Is there a different alternative entirely?
> >
> > None of the above strike me as architecturally
> sound, and so any
> > advice from someone more seasoned in database
> design would be
> > gratefully received.
>
> Are you tracking anything else per page/property
> view?
>
> My first thought would be to have a separate table
> of 'view' information:
> When, what property, and whatever else you want to
> track.  Then, if you
> want to know how many times a particular
> page/property was viewed you just
> do a select on that table for all the records
> related to it during that
> time period and count the rows.
>
> As a bonus, you aren't limited to any particular
> time interval, and can
> add fields in the future.
>
> Daniel T. Staal

i'd just add that you could take the table that Daniel
discusses and populate a second table with addresses
and monthly hits.  each address and month would be a
new record in the table (as opposed to a new column).
off hand, i see a link table connecting a table of
months to a table of addresses and a table of years -
but i'm not sure this would be the optimal design.

this would probably only be necessary if you got to
the point that your query on the main table became
time consuming - and you hope to have that problem!
;-)

good luck.



__________________________________
Yahoo! FareChase: Search multiple travel sites in one click.
http://farechase.yahoo.com

Importing data

От
"Wes Williams"
Дата:
I've been using MySQL for about the past year and am trying to convert to
PostgreSQL now but don't know the simplest way to import my table data into
Postgre.  There are many free tools (phpmyadmin, Sqlyog, MySQL Query
Browser, etc.) for MySQL that let you import Excel, CSV, Text, XML, and many
more into the database but I can't find anything other than the EMS Data
Import for Postgre.  Are there any free tools for Postgre that I can use for
uploading data into the database?

Is I am very new to Postgre, I would also appreciate correction if I am
overlooking something simple or just going about populating my database the
wrong way.

Thanks!


Re: Importing data

От
Дата:
--- Wes Williams <wes_williams@fcbonline.net> wrote:

> I've been using MySQL for about the past year and am
> trying to convert to
> PostgreSQL now but don't know the simplest way to
> import my table data into
> Postgre.  There are many free tools (phpmyadmin,
> Sqlyog, MySQL Query
> Browser, etc.) for MySQL that let you import Excel,
> CSV, Text, XML, and many
> more into the database but I can't find anything
> other than the EMS Data
> Import for Postgre.  Are there any free tools for
> Postgre that I can use for
> uploading data into the database?
>
> Is I am very new to Postgre, I would also appreciate
> correction if I am
> overlooking something simple or just going about
> populating my database the
> wrong way.
>
> Thanks!

Wes, there is definitely a way to do it.  i think it
is through the command line, though.  i had some data
in an openoffice.org's calc and i imported it into a
table via cygwin (which means the commandline was
involved).  i'm trying to research my old posts that
addessed this.

update:

the save you data to a csv format and then use the
copy command (copy from, specifically), to get the job
done.

i had a problem with the encoding being wrong out of
the default csv export in OOo, however, when i
exported to ASCII/US, the copy from command worked
great.

good luck.




__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com

Re: Database design advice

От
Michael Glaesemann
Дата:
On Oct 22, 2005, at 0:47 , Neil Saunders wrote:

> I'm writing a property rental web application, and one of the metrics
> I wish to track is the number of page views per month for each
> property.
>
> I originally envisaged a table with a column for each month (one row
> per property), for which the relevant column would be incremented each
> time a property is viewed, depending on the month. But this raises
> questions as to the best way to maintain this table (i.e. Create a new
> month column each month) I'd prefer to keep all logic in the database,
> so would prefer not to use a cron job to do this.

I strongly suggest *not* making a column for each month, but rather a
row for each month--or even for each page view for greater
granularity. Databases are a great way of storing raw information and
transforming it into different kinds of summaries, one of which could
be a cross tab of properties and the hits per month. In this case
your raw data is tracking when a page view occurred, so you'd want to
have a table that captures this information: property, timestamp (or
date). Take a look at tablefunc in contrib about how to generate
cross tabs. I've found them very helpful for generating this kind of
summary (though in my case it's been orders per month).

Depending on your performance needs, you may want to generate an
interim table that pre-calculates your totals per month. For example,
if your property view tracking table is

create table property_views (
     property_id integer not null references properties (property_id)
     , view_timestamp timestamp(0) with time zone not null
) without oids;

you'd have a table

create table property_views_per_month as
select property_id
     , date_trunc('month', view_timestamp) as view_month
     , count(property_id) as view_count
from property_views
group by property_id, view_month;

You'd then generate your crosstab from the property_views_per_month
table. (Of course, you'd have to drop and recreate or otherwise
update this table periodically, as it doesn't capture up-to-date
data). This is an optimization step, however, so unless you find that
you need the data faster, you can just use a view and generate the
crosstab from the view, e.g.,

create view property_views_per_month_view as
select property_id
     , date_truc('month', view_timestamp) as view_month
     , count(property_id) as view_count
from property_views
group by property_id, view_month;

Hope this helps!

Michael Glaesemann
grzm myrealbox com


Re: Database design advice

От
Neil Saunders
Дата:
Apologies for the delayed reply - Thank you all for the advice.

I'll get on to installing crosstab tonight. I really like the idea of
dumping each page view to a table and creating a view to present that
data, but since this data will be used to render a graph every time a
user logs in I've got this feeling that this will prove a bottleneck
if (when!!) the site becomes successful and the query starts to slow
to a crawl. Of course my fears are totally without substance, but I'd
rather spend more time thinking about the design than undo-ing a bad
design mistake in the future.

Thanks once again for you input!

Kind Regards,

Neil.


On 10/21/05, Michael Glaesemann <grzm@myrealbox.com> wrote:
>
> On Oct 22, 2005, at 0:47 , Neil Saunders wrote:
>
> > I'm writing a property rental web application, and one of the metrics
> > I wish to track is the number of page views per month for each
> > property.
> >
> > I originally envisaged a table with a column for each month (one row
> > per property), for which the relevant column would be incremented each
> > time a property is viewed, depending on the month. But this raises
> > questions as to the best way to maintain this table (i.e. Create a new
> > month column each month) I'd prefer to keep all logic in the database,
> > so would prefer not to use a cron job to do this.
>
> I strongly suggest *not* making a column for each month, but rather a
> row for each month--or even for each page view for greater
> granularity. Databases are a great way of storing raw information and
> transforming it into different kinds of summaries, one of which could
> be a cross tab of properties and the hits per month. In this case
> your raw data is tracking when a page view occurred, so you'd want to
> have a table that captures this information: property, timestamp (or
> date). Take a look at tablefunc in contrib about how to generate
> cross tabs. I've found them very helpful for generating this kind of
> summary (though in my case it's been orders per month).
>
> Depending on your performance needs, you may want to generate an
> interim table that pre-calculates your totals per month. For example,
> if your property view tracking table is
>
> create table property_views (
>     property_id integer not null references properties (property_id)
>     , view_timestamp timestamp(0) with time zone not null
> ) without oids;
>
> you'd have a table
>
> create table property_views_per_month as
> select property_id
>     , date_trunc('month', view_timestamp) as view_month
>     , count(property_id) as view_count
> from property_views
> group by property_id, view_month;
>
> You'd then generate your crosstab from the property_views_per_month
> table. (Of course, you'd have to drop and recreate or otherwise
> update this table periodically, as it doesn't capture up-to-date
> data). This is an optimization step, however, so unless you find that
> you need the data faster, you can just use a view and generate the
> crosstab from the view, e.g.,
>
> create view property_views_per_month_view as
> select property_id
>     , date_truc('month', view_timestamp) as view_month
>     , count(property_id) as view_count
> from property_views
> group by property_id, view_month;
>
> Hope this helps!
>
> Michael Glaesemann
> grzm myrealbox com
>
>

Re: Database design advice

От
Neil Saunders
Дата:
Following the advice given in this thread, I'd tought I'd just post
back what I'd come up with for the benefit of interested parties.

After a bit of searching I found a little program called pgLogd
http://www.digitalstratum.com/pglogd/

Which is a lightweight deamon that saves Apache access logs to a
PostgreSQL table. In order to generate statistics for property views
(A single page, passed a GET variable), I will use an intermediate
table as suggseted by Daniel which would be a updated via script that
is cronned to run every night.

With appropriate indexes on the date field in the primary log table,
speed shouldn't be significantly affected as the table grows. It also
has the advantage of retaining all data to allow for wider offline
analysis to assist with site trend analysis etc. The 'filtered' data
table will stay lean and mean, containing only number of hits per
property per month.


Knid Regards,

Neil Saunders.

On 10/25/05, Neil Saunders <n.j.saunders@gmail.com> wrote:
> Apologies for the delayed reply - Thank you all for the advice.
>
> I'll get on to installing crosstab tonight. I really like the idea of
> dumping each page view to a table and creating a view to present that
> data, but since this data will be used to render a graph every time a
> user logs in I've got this feeling that this will prove a bottleneck
> if (when!!) the site becomes successful and the query starts to slow
> to a crawl. Of course my fears are totally without substance, but I'd
> rather spend more time thinking about the design than undo-ing a bad
> design mistake in the future.
>
> Thanks once again for you input!
>
> Kind Regards,
>
> Neil.
>
>
> On 10/21/05, Michael Glaesemann <grzm@myrealbox.com> wrote:
> >
> > On Oct 22, 2005, at 0:47 , Neil Saunders wrote:
> >
> > > I'm writing a property rental web application, and one of the metrics
> > > I wish to track is the number of page views per month for each
> > > property.
> > >
> > > I originally envisaged a table with a column for each month (one row
> > > per property), for which the relevant column would be incremented each
> > > time a property is viewed, depending on the month. But this raises
> > > questions as to the best way to maintain this table (i.e. Create a new
> > > month column each month) I'd prefer to keep all logic in the database,
> > > so would prefer not to use a cron job to do this.
> >
> > I strongly suggest *not* making a column for each month, but rather a
> > row for each month--or even for each page view for greater
> > granularity. Databases are a great way of storing raw information and
> > transforming it into different kinds of summaries, one of which could
> > be a cross tab of properties and the hits per month. In this case
> > your raw data is tracking when a page view occurred, so you'd want to
> > have a table that captures this information: property, timestamp (or
> > date). Take a look at tablefunc in contrib about how to generate
> > cross tabs. I've found them very helpful for generating this kind of
> > summary (though in my case it's been orders per month).
> >
> > Depending on your performance needs, you may want to generate an
> > interim table that pre-calculates your totals per month. For example,
> > if your property view tracking table is
> >
> > create table property_views (
> >     property_id integer not null references properties (property_id)
> >     , view_timestamp timestamp(0) with time zone not null
> > ) without oids;
> >
> > you'd have a table
> >
> > create table property_views_per_month as
> > select property_id
> >     , date_trunc('month', view_timestamp) as view_month
> >     , count(property_id) as view_count
> > from property_views
> > group by property_id, view_month;
> >
> > You'd then generate your crosstab from the property_views_per_month
> > table. (Of course, you'd have to drop and recreate or otherwise
> > update this table periodically, as it doesn't capture up-to-date
> > data). This is an optimization step, however, so unless you find that
> > you need the data faster, you can just use a view and generate the
> > crosstab from the view, e.g.,
> >
> > create view property_views_per_month_view as
> > select property_id
> >     , date_truc('month', view_timestamp) as view_month
> >     , count(property_id) as view_count
> > from property_views
> > group by property_id, view_month;
> >
> > Hope this helps!
> >
> > Michael Glaesemann
> > grzm myrealbox com
> >
> >
>