Обсуждение: Discovering time of last database write

Поиск
Список
Период
Сортировка

Discovering time of last database write

От
"Andy Dale"
Дата:
Hi,

I need to be able to determine the last time (and date) that a database was written to.  I know it could be possible just to check the last modified dates in the PGDATA directory, but i need to compare the last write time of 3 databases (connecting via JDBC).  Hopefully the last write date is contained somewhere in a system table (information schema) but i have no idea of the table(s) i would need to query.

Thanks in advance,

Andy

Re: Discovering time of last database write

От
Scott Marlowe
Дата:
On Thu, 2007-01-04 at 11:11, Andy Dale wrote:
> Hi,
>
> I need to be able to determine the last time (and date) that a
> database was written to.  I know it could be possible just to check
> the last modified dates in the PGDATA directory, but i need to compare
> the last write time of 3 databases (connecting via JDBC).  Hopefully
> the last write date is contained somewhere in a system table
> (information schema) but i have no idea of the table(s) i would need
> to query.

Bad news, it's not generally stored.

Good news, it's not that hard to implement.

Perhaps if you give us the bigger picture we can make more logical
suggestions on how to accomplish it.

Re: Discovering time of last database write

От
"Andy Dale"
Дата:
Hi,

Sorry for the slight delay in my response.

I am using 3 PostgreSQL databases and writing to them using an SQL proxy.  These databases have a high write volume.  On rebooting all 3 servers for OS/Software updates, i would like to figure out which was the last written to DB (this is assuming the DB/Servers are not all taken down at the same time), the times are kept in sync with NTP.

I imagine it is possible to get this behaviour with after triggers, but this means i have to attach the same trigger to each table ??

Thanks,

Andy

On 04/01/07, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
On Thu, 2007-01-04 at 11:11, Andy Dale wrote:
> Hi,
>
> I need to be able to determine the last time (and date) that a
> database was written to.  I know it could be possible just to check
> the last modified dates in the PGDATA directory, but i need to compare
> the last write time of 3 databases (connecting via JDBC).  Hopefully
> the last write date is contained somewhere in a system table
> (information schema) but i have no idea of the table(s) i would need
> to query.

Bad news, it's not generally stored.

Good news, it's not that hard to implement.

Perhaps if you give us the bigger picture we can make more logical
suggestions on how to accomplish it.

Re: Discovering time of last database write

От
Bruno Wolff III
Дата:
On Mon, Jan 08, 2007 at 09:22:05 +0100,
  Andy Dale <andy.dale@gmail.com> wrote:
> Hi,
>
> Sorry for the slight delay in my response.
>
> I am using 3 PostgreSQL databases and writing to them using an SQL proxy.
> These databases have a high write volume.  On rebooting all 3 servers for
> OS/Software updates, i would like to figure out which was the last written
> to DB (this is assuming the DB/Servers are not all taken down at the same
> time), the times are kept in sync with NTP.
>
> I imagine it is possible to get this behaviour with after triggers, but this
> means i have to attach the same trigger to each table ??

I think what Scott was suggesting was that you tell us what you are planning
to do with the time. Depending on what you are trying to do, there may be
better ways of doing things.

Also the time of last update for an MVCC is a bit nebulous and to get it
in the database might not be possible with the semantics you want. For example
getting the time a transaction is committed is going to be hard without
modifying the backend, as any triggers will run before a transaction is
committed and can't know the precise time of the commit.

>
> Thanks,
>
> Andy
>
> On 04/01/07, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> >
> >On Thu, 2007-01-04 at 11:11, Andy Dale wrote:
> >> Hi,
> >>
> >> I need to be able to determine the last time (and date) that a
> >> database was written to.  I know it could be possible just to check
> >> the last modified dates in the PGDATA directory, but i need to compare
> >> the last write time of 3 databases (connecting via JDBC).  Hopefully
> >> the last write date is contained somewhere in a system table
> >> (information schema) but i have no idea of the table(s) i would need
> >> to query.
> >
> >Bad news, it's not generally stored.
> >
> >Good news, it's not that hard to implement.
> >
> >Perhaps if you give us the bigger picture we can make more logical
> >suggestions on how to accomplish it.
> >

Re: Discovering time of last database write

От
"Andy Dale"
Дата:
Ok.

The SQL Proxy i am using (HA-JDBC) has some limitations with regard to getting it's "cluster" back into sync.  If ha-jdbc uses the wrong DB (one that has been out of action for a while) as the starting point for the cluster it will then try and delete stuff from the other DB's on their introduction to the cluster.

I thought the easiest way to control a complete "cluster" restart would be to extract the last write date and introduce the one with the last write date first, this will make certain the above scenario does not happen.

Thanks,

Andy

On 08/01/07, Bruno Wolff III <bruno@wolff.to> wrote:
On Mon, Jan 08, 2007 at 09:22:05 +0100,
  Andy Dale <andy.dale@gmail.com> wrote:
> Hi,
>
> Sorry for the slight delay in my response.
>
> I am using 3 PostgreSQL databases and writing to them using an SQL proxy.
> These databases have a high write volume.  On rebooting all 3 servers for
> OS/Software updates, i would like to figure out which was the last written
> to DB (this is assuming the DB/Servers are not all taken down at the same
> time), the times are kept in sync with NTP.
>
> I imagine it is possible to get this behaviour with after triggers, but this
> means i have to attach the same trigger to each table ??

I think what Scott was suggesting was that you tell us what you are planning
to do with the time. Depending on what you are trying to do, there may be
better ways of doing things.

Also the time of last update for an MVCC is a bit nebulous and to get it
in the database might not be possible with the semantics you want. For example
getting the time a transaction is committed is going to be hard without
modifying the backend, as any triggers will run before a transaction is
committed and can't know the precise time of the commit.

>
> Thanks,
>
> Andy
>
> On 04/01/07, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> >
> >On Thu, 2007-01-04 at 11:11, Andy Dale wrote:
> >> Hi,
> >>
> >> I need to be able to determine the last time (and date) that a
> >> database was written to.  I know it could be possible just to check
> >> the last modified dates in the PGDATA directory, but i need to compare
> >> the last write time of 3 databases (connecting via JDBC).  Hopefully
> >> the last write date is contained somewhere in a system table
> >> (information schema) but i have no idea of the table(s) i would need
> >> to query.
> >
> >Bad news, it's not generally stored.
> >
> >Good news, it's not that hard to implement.
> >
> >Perhaps if you give us the bigger picture we can make more logical
> >suggestions on how to accomplish it.
> >

Re: Discovering time of last database write

От
Scott Marlowe
Дата:
On Mon, 2007-01-08 at 02:22, Andy Dale wrote:
> Hi,
>
> Sorry for the slight delay in my response.
>
> I am using 3 PostgreSQL databases and writing to them using an SQL
> proxy.  These databases have a high write volume.  On rebooting all 3
> servers for OS/Software updates, i would like to figure out which was
> the last written to DB (this is assuming the DB/Servers are not all
> taken down at the same time), the times are kept in sync with NTP.
>
> I imagine it is possible to get this behaviour with after triggers,
> but this means i have to attach the same trigger to each table ??

Hmmmm.   Still not completely clear on what exactly you're doing, but I
think I am getting an idea.

You could set each table to have a field for a timestamp, and build a
simple rule / trigger that updates it with the current time stamp for
every row as it's inserted / updated.  Then you could select
max(timestampcolumn) from a table to see which db had the latest
version.

I still think there might be a better solution to your problem.  Can we
get a more generic overview of what you're trying to do.  The 10,000
foot high view, so to speak.

Re: Discovering time of last database write

От
Scott Marlowe
Дата:
On Mon, 2007-01-08 at 03:26, Andy Dale wrote:
> Ok.
>
> The SQL Proxy i am using (HA-JDBC) has some limitations with regard to
> getting it's "cluster" back into sync.  If ha-jdbc uses the wrong DB
> (one that has been out of action for a while) as the starting point
> for the cluster it will then try and delete stuff from the other DB's
> on their introduction to the cluster.
>
> I thought the easiest way to control a complete "cluster" restart
> would be to extract the last write date and introduce the one with the
> last write date first, this will make certain the above scenario does
> not happen.

Sorry, I hadn't seen this post when I wrote my lost one.

Yeah, I think having a timestamp column with a rule so it has the
current timestamp when written to and then selecting for the max in each
table would work out.  You could probably get fancier, but I'm guessing
that cluster startup is a pretty rare thing, so it's probably easier to
write a script that selects all the tablenames from pg_tables (???) in
your schema and checks for the highest time in each table and selects
the master from that.

Re: Discovering time of last database write

От
Erik Jones
Дата:
Scott Marlowe wrote:
> On Mon, 2007-01-08 at 03:26, Andy Dale wrote:
>
>> Ok.
>>
>> The SQL Proxy i am using (HA-JDBC) has some limitations with regard to
>> getting it's "cluster" back into sync.  If ha-jdbc uses the wrong DB
>> (one that has been out of action for a while) as the starting point
>> for the cluster it will then try and delete stuff from the other DB's
>> on their introduction to the cluster.
>>
>> I thought the easiest way to control a complete "cluster" restart
>> would be to extract the last write date and introduce the one with the
>> last write date first, this will make certain the above scenario does
>> not happen.
>>
>
> Sorry, I hadn't seen this post when I wrote my lost one.
>
> Yeah, I think having a timestamp column with a rule so it has the
> current timestamp when written to and then selecting for the max in each
> table would work out.  You could probably get fancier, but I'm guessing
> that cluster startup is a pretty rare thing, so it's probably easier to
> write a script that selects all the tablenames from pg_tables (???)
pg_class


--
erik jones <erik@myemma.com>
software development
emma(r)


Re: Discovering time of last database write

От
"Andy Dale"
Дата:
Hi,

I am still not so certain about adding a timestamp column to each table, as within a few months the table will be quite big.  My current thinking is to have a trigger per table that overwrties a single value in a single utility table after every write, this will be far quicker to select when working with large tables ??? HA-JDBC does not care about the individual table last write date/time as it is currently not spphisticated enough to do per table sync with different masters per table, it just loops through each table in the first db activated treating it as being the most up to date. So HA-JDBC (i) just need the last write time of  the database on a whole, as this will hopefully mean it is the most up to date.

Andy

On 08/01/07, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
On Mon, 2007-01-08 at 03:26, Andy Dale wrote:
> Ok.
>
> The SQL Proxy i am using (HA-JDBC) has some limitations with regard to
> getting it's "cluster" back into sync.  If ha-jdbc uses the wrong DB
> (one that has been out of action for a while) as the starting point
> for the cluster it will then try and delete stuff from the other DB's
> on their introduction to the cluster.
>
> I thought the easiest way to control a complete "cluster" restart
> would be to extract the last write date and introduce the one with the
> last write date first, this will make certain the above scenario does
> not happen.

Sorry, I hadn't seen this post when I wrote my lost one.

Yeah, I think having a timestamp column with a rule so it has the
current timestamp when written to and then selecting for the max in each
table would work out.  You could probably get fancier, but I'm guessing
that cluster startup is a pretty rare thing, so it's probably easier to
write a script that selects all the tablenames from pg_tables (???) in
your schema and checks for the highest time in each table and selects
the master from that.

Re: Discovering time of last database write

От
"Michael Nolan"
Дата:
On several occasions I have thought that each row in a table should have a SYSTEM COLUMN which gave the timestamp of the last update of that row.  This could get a bit expensive on space and in some cases might be redundant with (or have a slightly different value from) a user-maintained timestamp field.

I have also thought that each table should have values for:

Last DDL
Last Insert
Last Update
Last Delete
--
Mike Nolan

Re: Discovering time of last database write

От
"Andy Dale"
Дата:
Hi Erik,

Can you elaborate a bit more on what you mean by pg_class, as looking at it i cannot figure out how to get the last write time from the pg_class table.

Cheers,

Andy

On 08/01/07, Erik Jones <erik@myemma.com> wrote:
Scott Marlowe wrote:
> On Mon, 2007-01-08 at 03:26, Andy Dale wrote:
>
>> Ok.
>>
>> The SQL Proxy i am using (HA-JDBC) has some limitations with regard to
>> getting it's "cluster" back into sync.  If ha-jdbc uses the wrong DB
>> (one that has been out of action for a while) as the starting point
>> for the cluster it will then try and delete stuff from the other DB's
>> on their introduction to the cluster.
>>
>> I thought the easiest way to control a complete "cluster" restart
>> would be to extract the last write date and introduce the one with the
>> last write date first, this will make certain the above scenario does
>> not happen.
>>
>
> Sorry, I hadn't seen this post when I wrote my lost one.
>
> Yeah, I think having a timestamp column with a rule so it has the
> current timestamp when written to and then selecting for the max in each
> table would work out.  You could probably get fancier, but I'm guessing
> that cluster startup is a pretty rare thing, so it's probably easier to
> write a script that selects all the tablenames from pg_tables (???)
pg_class


--
erik jones <erik@myemma.com>
software development
emma(r)