Discussion: data dump help
Hello,

Sorry for the poor subject; I'm not sure how to describe what I need
here.  I have an application that logs to a single table in pgsql.  To
get those logs into our log management system, I need to dump them out
to a file on a periodic basis.  I am not sure how to tackle this.  I
thought about doing a date calculation, grabbing the previous 6 hours
of logs, writing that to a new log file, and setting up a rotation like
that.  Unfortunately, the log management solution can't go into pgsql
directly.  Thoughts?

Thanks!
On 1/18/2010 4:08 PM, Terry wrote:
> I have an application that logs to a single table in pgsql.  In order
> for me to get into our log management, I need to dump it out to a
> file on a periodic basis to get new logs.  [...]  Unfortunately, the
> log management solution can't go into pgsql directly.  Thoughts?

How about a flag in the db, like: dumped.

Inside one transaction you'd be safe doing:

  begin;
  SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  select * from log where dumped = 0;
  -- app code to format/write/etc
  update log set dumped = 1 where dumped = 0;
  commit;

Even if other transactions insert new records, your existing
transaction won't see them, and the update won't touch them.

-Andy
On Mon, Jan 18, 2010 at 4:48 PM, Andy Colson <andy@squeakycode.net> wrote:
> How about a flag in the db, like: dumped.
> [...]

I like your thinking, but I shouldn't add a new column to this
database.  It's a 3rd party application.
On Mon, Jan 18, 2010 at 5:07 PM, Terry <td3201@gmail.com> wrote:
> I like your thinking, but I shouldn't add a new column to this
> database.  It's a 3rd party application.

Although, I really like your idea, so I might create another table
where I will log whether the data has been dumped or not.  I just need
to come up with a query to check this against the other table.
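The query Terry describes is essentially an anti-join between the log
table and the tracking table.  A minimal sketch, assuming the log table
has a primary key `id` and the tracking table is called `dump_log`
(both names hypothetical); SQLite is used here only so the example is
self-contained, but the same SQL works in PostgreSQL:

```python
import sqlite3

# In-memory stand-in for the real PostgreSQL database (table and
# column names here are hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE log (id INTEGER PRIMARY KEY, message TEXT);
    CREATE TABLE dump_log (log_id INTEGER PRIMARY KEY);  -- ids already dumped
    INSERT INTO log (id, message)
        VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO dump_log (log_id) VALUES (1);  -- row 1 was dumped earlier
""")

# Anti-join: fetch only the log rows with no matching row in dump_log.
new_rows = conn.execute("""
    SELECT l.id, l.message
    FROM log l
    LEFT JOIN dump_log d ON d.log_id = l.id
    WHERE d.log_id IS NULL
    ORDER BY l.id
""").fetchall()

# ... write new_rows to the log file here ...

# Record the rows as dumped so the next run skips them.
conn.executemany("INSERT INTO dump_log (log_id) VALUES (?)",
                 [(rid,) for rid, _ in new_rows])
conn.commit()
print(new_rows)  # rows 2 and 3 only
```

Each run then sees only the rows not yet recorded in the tracking
table, without touching the third-party schema.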
On Mon, Jan 18, 2010 at 05:49:32PM -0600, Terry wrote:
> Although, I really like your idea, so I might create another table
> where I will log whether the data has been dumped or not.  I just
> need to come up with a query to check this against the other table.

Isn't this just over-engineering?  Why not let the database do the
work, and add the column with a default value of 0, so that you don't
have to modify whatever 3rd-party app dumps the data:

  ALTER TABLE log ADD COLUMN dumped boolean DEFAULT FALSE;

(I don't do much ALTER TABLE, so that syntax may be all foobar'ed)

- Bret
2010/1/19 Bret S. Lambert <bret.lambert@gmail.com>:
> Isn't this just over-engineering?  Why not let the database do
> the work, and add the column with a default value of 0, so that
> you don't have to modify whatever 3rd-party app dumps the data:

But what if his third-party software does something silly like a
"select *" on the table and then gets a hissy fit because the data
doesn't match the expectations any longer?

Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :}  Make your quotes
concise.  http://www.american.edu/econ/notes/htmlmail.htm
On Tue, Jan 19, 2010 at 06:35:10PM +1300, Andrej wrote:
> But what if his third-party software does something silly like a
> "select *" on the table and then gets a hissy fit because the data
> doesn't match the expectations any longer?

He said his app logs there, so I kind of assumed that it's write-only
as far as the app is concerned.

If not, then, yes, there could be issues.  But why not keep things as
simple as possible?
Terry wrote:
> I have an application that logs to a single table in pgsql.  In order
> for me to get into our log management, I need to dump it out to a
> file on a periodic basis to get new logs.  [...]  Unfortunately, the
> log management solution can't go into pgsql directly.  Thoughts?

You do not indicate in your post exactly how the data is stored, but I
would assume there is a timestamp inside this single table.

From my perspective there are 3 options available.

Firstly, create a table in which you can record when you have made
dumps, typically with a column that stores a timestamp set to now().

1. You have access to the DB and can schedule a pgAgent job to run
every 6 hours that dumps the table into some usable format, e.g. csv:

  SELECT max(dumptimestamp) INTO lastdump FROM dump_log;
  currtime := now();
  COPY (SELECT * FROM singletable WHERE timestamp > lastdump)
    TO 'someexternaltable' DELIMITER ',' CSV HEADER ...;
  INSERT INTO dump_log (dumptimestamp) VALUES (currtime);

2. Same as above, but driven by a trigger on your dump_log table: when
you need a dump, insert the current timestamp into dump_log, and the
trigger kicks off a process to export the data.

3. Your application has an option to insert the current timestamp into
your dump_log table and then reads the exported table after completion.

HTH,

Johan Nel
Pretoria, South Africa.
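Johan's first option can also be prototyped outside the database.
Below is a rough Python equivalent of that watermark logic, offered as
a sketch only: the table and column names are hypothetical, SQLite's
in-memory engine stands in for PostgreSQL, and a string buffer stands
in for the CSV file:

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singletable (ts TEXT, message TEXT);
    CREATE TABLE dump_log (dumptimestamp TEXT);
    INSERT INTO singletable VALUES ('2010-01-18 06:00', 'old row'),
                                   ('2010-01-18 13:00', 'new row');
    INSERT INTO dump_log VALUES ('2010-01-18 12:00');  -- last dump
""")

# Watermark: everything newer than the most recent recorded dump.
(lastdump,) = conn.execute(
    "SELECT max(dumptimestamp) FROM dump_log").fetchone()
rows = conn.execute(
    "SELECT ts, message FROM singletable WHERE ts > ? ORDER BY ts",
    (lastdump,)).fetchall()

# Write the new rows as CSV (a buffer here; a real file in practice).
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["ts", "message"])  # header
writer.writerows(rows)

# Record the new high-water mark (assumes at least one new row).
conn.execute("INSERT INTO dump_log VALUES (?)",
             (max(ts for ts, _ in rows),))
conn.commit()
```

Scheduled every 6 hours, each run exports only the rows that arrived
since the previous recorded timestamp.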
On Tue, Jan 19, 2010 at 12:06 AM, Johan Nel <johan.nel@xsinet.co.za> wrote:
> You do not indicate in your post exactly how the data is stored, but
> I would assume there is a timestamp inside this single table.
> [...]

I appreciate everyone's response.
My first option was to create a new column, but since doing so would
probably break the app, or at the very least void my support with them,
I chose against it.  Instead, I am doing it in a script and retaining a
position file that keeps track of the most recent record dumped from
the table.  Easy enough, and it works.
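For the archives, a minimal sketch of that position-file scheme in
Python.  The file path, the helper names, and the assumption of a
monotonically increasing record id are all illustrative; the real
script would read rows from pgsql rather than an in-memory list:

```python
import os

POSITION_FILE = "/var/tmp/log_dump.pos"  # hypothetical path

def read_position(path=POSITION_FILE):
    """Return the id of the last record dumped, or 0 on the first run."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return 0

def write_position(pos, path=POSITION_FILE):
    """Atomically replace the position file with the new high-water mark."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(pos))
    os.replace(tmp, path)

def dump_new_records(records, path=POSITION_FILE):
    """records: iterable of (id, line) pairs with strictly increasing ids.
    Returns the lines not yet dumped and advances the position file."""
    last = read_position(path)
    new = [(rid, line) for rid, line in records if rid > last]
    if new:
        write_position(new[-1][0], path)
    return [line for _, line in new]
```

Each run picks up only the records beyond the stored position, which
matches the rotation scheme described in the original post.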