Thread: data dump help


data dump help

From:
Terry
Date:
Hello,

Sorry for the poor subject.  Not sure how to describe what I need
here.  I have an application that logs to a single table in pgsql.
To get those logs into our log management system, I need to dump them
out to a file on a periodic basis to pick up new records.  I am not
sure how to tackle this.  I thought about doing a date calculation,
grabbing the previous 6 hours of logs, writing them to a new log
file, and setting up a rotation like that.  Unfortunately, the log
management solution can't read from pgsql directly.  Thoughts?

Thanks!

Re: data dump help

From:
Andy Colson
Date:
On 1/18/2010 4:08 PM, Terry wrote:
> Hello,
>
> Sorry for the poor subject.  Not sure how to describe what I need
> here.  I have an application that logs to a single table in pgsql.
> In order for me to get into our log management, I need to dump it out
> to a file on a periodic basis to get new logs.  I am not sure how to
> tackle this.  I thought about doing a date calculation and just
> grabbing the previous 6 hours of logs and writing that to a new log
> file and setting up a rotation like that.  Unfortunately, the log
> management solution can't go into pgsql directly.  Thoughts?
>
> Thanks!
>

How about a flag column in the db, like: dumped.

Inside one transaction you'd be safe doing:

BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM log WHERE dumped = 0;
-- app code to format/write/etc.
UPDATE log SET dumped = 1 WHERE dumped = 0;
COMMIT;

Even if other transactions insert new records, your existing
transaction won't see them, and the update won't touch them.

-Andy

Re: data dump help

From:
Terry
Date:
On Mon, Jan 18, 2010 at 4:48 PM, Andy Colson <andy@squeakycode.net> wrote:
> On 1/18/2010 4:08 PM, Terry wrote:
>> [...]
>
> How about a flag column in the db, like: dumped.
>
> Inside one transaction you'd be safe doing:
>
> BEGIN;
> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
> SELECT * FROM log WHERE dumped = 0;
> -- app code to format/write/etc.
> UPDATE log SET dumped = 1 WHERE dumped = 0;
> COMMIT;
>
> Even if other transactions insert new records, your existing
> transaction won't see them, and the update won't touch them.
>
> -Andy
>

I like your thinking but I shouldn't add a new column to this
database.  It's a 3rd party application.

Re: data dump help

From:
Terry
Date:
On Mon, Jan 18, 2010 at 5:07 PM, Terry <td3201@gmail.com> wrote:
> On Mon, Jan 18, 2010 at 4:48 PM, Andy Colson <andy@squeakycode.net> wrote:
>> [...]
>
> I like your thinking but I shouldn't add a new column to this
> database.  It's a 3rd party application.
>

Although, I really like your idea, so I might create another table
where I will log whether the data has been dumped or not.  I just
need to come up with a query to check this against the other table.
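
A sketch of what that check could look like, assuming the log table
has a unique id column (all of the names below are made up):

```sql
-- Side table recording which log rows have already been exported.
CREATE TABLE log_dumped (log_id integer PRIMARY KEY);

-- Rows not yet dumped: anti-join the log against the side table.
SELECT l.*
  FROM log l
  LEFT JOIN log_dumped d ON d.log_id = l.id
 WHERE d.log_id IS NULL;

-- After writing those rows out, mark them as dumped.
INSERT INTO log_dumped (log_id)
SELECT l.id
  FROM log l
  LEFT JOIN log_dumped d ON d.log_id = l.id
 WHERE d.log_id IS NULL;
```

Running both statements inside a single REPEATABLE READ transaction,
as Andy suggests, keeps rows inserted mid-export from slipping
between the SELECT and the INSERT.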

Re: data dump help

From:
"Bret S. Lambert"
Date:
On Mon, Jan 18, 2010 at 05:49:32PM -0600, Terry wrote:
> On Mon, Jan 18, 2010 at 5:07 PM, Terry <td3201@gmail.com> wrote:
> > On Mon, Jan 18, 2010 at 4:48 PM, Andy Colson <andy@squeakycode.net> wrote:
> >> [...]
> >
> > I like your thinking but I shouldn't add a new column to this
> > database.  It's a 3rd party application.
> >
>
> Although.  I really like your idea so I might create another table
> where I will log whether the data has been dumped or not.  I just need
> to come up with a query to check this with the other table.

Isn't this just over-engineering? Why not let the database do
the work, and add the column with a default value of false, so that
you don't have to modify whatever 3rd-party app writes the data:

ALTER TABLE log ADD COLUMN dumped boolean DEFAULT FALSE;

(I don't do much ALTER TABLE, so that syntax may be all foobar'ed)
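
Fleshed out a bit (the id column and the index name here are
assumptions), that could look like:

```sql
ALTER TABLE log ADD COLUMN dumped boolean DEFAULT FALSE;

-- Existing rows also get the default, so optionally mark the
-- backlog as already exported before the first scheduled run.
UPDATE log SET dumped = TRUE;

-- A partial index keeps the "what is new" scan cheap as the
-- table grows.
CREATE INDEX log_not_dumped_idx ON log (id) WHERE NOT dumped;
```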

- Bret

Re: data dump help

From:
Andrej
Date:
2010/1/19 Bret S. Lambert <bret.lambert@gmail.com>:

> Isn't this just over-engineering? Why not let the database do
> the work, and add the column with a default value of 0, so that
> you don't have to modify whatever 3rd-party app dumps the data:

But what if his third-party software does something silly like a "select *"
on the table and then throws a hissy fit because the data no longer
matches its expectations?


Cheers,
Andrej


--
Please don't top post, and don't use HTML e-Mail :}  Make your quotes concise.

http://www.american.edu/econ/notes/htmlmail.htm

Re: data dump help

From:
"Bret S. Lambert"
Date:
On Tue, Jan 19, 2010 at 06:35:10PM +1300, Andrej wrote:
> 2010/1/19 Bret S. Lambert <bret.lambert@gmail.com>:
>
> > Isn't this just over-engineering? Why not let the database do
> > the work, and add the column with a default value of 0, so that
> > you don't have to modify whatever 3rd-party app dumps the data:
>
> But what if his third-party software does something silly like a "select *"
> on the table and then gets a hissy fit because the data doesn't match
> the expectations any longer?

He said his app logs there, so I kind of assumed that it's write-only
as far as the app is concerned. If not, then, yes, there could be
issues.

But why not keep things as simple as possible?

Re: data dump help

From:
Johan Nel
Date:
Terry wrote:
> Hello,
>
> Sorry for the poor subject.  Not sure how to describe what I need
> here.  I have an application that logs to a single table in pgsql.
> In order for me to get into our log management, I need to dump it out
> to a file on a periodic basis to get new logs.  I am not sure how to
> tackle this.  I thought about doing a date calculation and just
> grabbing the previous 6 hours of logs and writing that to a new log
> file and setting up a rotation like that.  Unfortunately, the log
> management solution can't go into pgsql directly.  Thoughts?

You do not indicate in your post exactly how the data is stored, but I
would assume there is a timestamp column in this single table.

From my perspective there are three options available:

First, create a table in which you can record when you have made a
dump, typically with a column that stores a timestamp set by now().

1.  If you have access to the DB, schedule a pgAgent job to run every
6 hours that dumps the table into some usable format, e.g. CSV:

SELECT max(dumptimestamp) INTO lastdump FROM dump_log;
currtime := now();
COPY (SELECT * FROM singletable WHERE timestamp > lastdump) TO
'somefile.csv' DELIMITER ',' CSV HEADER ...;
INSERT INTO dump_log (dumptimestamp) VALUES (currtime);

2.  Same as above, but driven by a trigger on your dump_log table:
when you need a dump, insert the current timestamp into dump_log and
let the trigger kick off a process that exports the data.

3.  Have an application with an option to insert the current
timestamp into your dump_log table, and then read the exported file
after completion.
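
Option 1 could be wrapped up as a pl/pgsql function along these lines
(the names singletable, logtime, and the output path are made up; note
that COPY cannot read plpgsql variables directly, hence the EXECUTE,
and that COPY to a server-side file requires superuser rights):

```sql
CREATE OR REPLACE FUNCTION dump_new_logs() RETURNS void AS $$
DECLARE
    lastdump timestamptz;
    currtime timestamptz := now();
BEGIN
    -- Timestamp of the previous dump; start from the epoch if none yet.
    SELECT coalesce(max(dumptimestamp), 'epoch') INTO lastdump
      FROM dump_log;

    -- COPY does not see plpgsql variables, so build the statement by hand.
    EXECUTE 'COPY (SELECT * FROM singletable WHERE logtime > '
            || quote_literal(lastdump)
            || ') TO ''/var/tmp/logdump.csv'' DELIMITER '','' CSV HEADER';

    -- Record this dump so the next run picks up where we left off.
    INSERT INTO dump_log (dumptimestamp) VALUES (currtime);
END;
$$ LANGUAGE plpgsql;
```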

HTH,

Johan Nel
Pretoria, South Africa.

Re: data dump help

From:
Terry
Date:
On Tue, Jan 19, 2010 at 12:06 AM, Johan Nel <johan.nel@xsinet.co.za> wrote:
> Terry wrote:
>> [...]
>
> You do not indicate in your post, exactly how the data is stored, but I
> would assume there is a timestamp inside this single table.
>
> From my perspective there are 3 options available:
>
> Firstly, create a table that you can monitor when you have made dumps,
> typically with a column that will store a datetimestamp with now() in it.
>
> 1.  You have access to the DB and you can schedule a pgAgent job to run
> every 6 hours that dumps the table into some usable format e.g. csv:
>
> SELECT max(dumptimestamp) INTO lastdump FROM dump_log;
> currtime := now();
> COPY (SELECT * FROM singletable WHERE timestamp > lastdump) TO
> 'somefile.csv' DELIMITER ',' CSV HEADER ...;
> INSERT INTO dump_log (dumptimestamp) VALUES (currtime);
>
> 2.  Same as above but run this as a trigger on your dumplog table when you
>  need a dump by inserting the current_datetime into the dumplog table that
> will trigger a process to export the data.
>
> 3.  You have an application that have an option to insert the current
> datetimestamp into your dumplog table and then read the exported table after
> completion.
>
> HTH,
>
> Johan Nel
> Pretoria, South Africa.
>

I appreciate everyone's responses.  My first option was to create a new
column, but since doing so would probably break the app, or at the very
least void my support with the vendor, I chose against it.  I am instead
just doing it in a script that retains a position file to keep track of
the most recent record dumped from the table.  Easy enough, and it works.
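
For the archives, the per-run query behind such a position file could
look like the following, assuming the table has a monotonically
increasing id column (the names and the literal 12345 are placeholders
for whatever the script reads from its position file):

```sql
-- The script reads the last exported id from its position file,
-- substitutes it for 12345 below, exports the new rows, and then
-- writes the new max(id) back to the position file.
COPY (SELECT * FROM log WHERE id > 12345 ORDER BY id)
TO STDOUT WITH CSV;
```

Running it through psql's \copy variant writes the file client-side,
so no superuser privileges are needed on the server.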