Re: performance for high-volume log insertion
From | david@lang.hm |
---|---|
Subject | Re: performance for high-volume log insertion |
Date | |
Msg-id | alpine.DEB.1.10.0904211111040.12662@asgard.lang.hm |
In response to | Re: performance for high-volume log insertion (Stephen Frost <sfrost@snowman.net>) |
Responses | Re: performance for high-volume log insertion (david@lang.hm); Re: performance for high-volume log insertion (Stephen Frost <sfrost@snowman.net>) |
List | pgsql-performance |
On Tue, 21 Apr 2009, Stephen Frost wrote:

> * david@lang.hm (david@lang.hm) wrote:
>> I think the key thing is that rsyslog today doesn't know anything about
>> SQL variables, it just creates a string that the user and the database
>> say looks like a SQL statement.
>
> err, what SQL variables? You mean the $NUM stuff? They're just
> placeholders.. You don't really need to *do* anything with them.. Or
> are you worried that users would provide something that would break as a
> prepared query? If so, you just need to figure out how to handle that
> cleanly..
>
>> an added headache is that the rsyslog config does not have the concept of
>> arrays (the closest that it has is one special-case hack to let you
>> specify one variable multiple times)
>
> Argh. The array I'm talking about is a C array, and has nothing to do
> with the actual config syntax.. I swear, I think you're making this
> more difficult by half.

not intentionally, but you may be right.

> Alright, looking at the documentation on rsyslog.com, I see something
> like:
>
> $template MySQLInsert,"insert iut, message, receivedat values
> ('%iut%', '%msg:::UPPERCASE%', '%timegenerated:::date-mysql%')
> into systemevents\r\n", SQL
>
> Ignoring the fact that this is horrible, horrible non-SQL,

that example is for MySQL, nuff said ;-) or are you referring to the modifiers that rsyslog has to manipulate the strings before inserting them? (as opposed to using sql to manipulate the strings)

> I see that
> you use %blah% to define variables inside your string. That's fine.
> There's no reason why you can't use this exact syntax to build a
> prepared query. No user-impact changes are necessary.
> Here's what you do:
<snip pseudocode to replace %blah% with $num>

for some reason I was stuck on the idea of the config specifying the statement and variables separately, so I wasn't thinking this way. however, there are headaches:

doing this will require changes to the structure of rsyslog. today the string manipulation is done before calling the output (database) module, so all the database module currently gets is a string. in a (IMHO misguided) attempt at security in a multi-threaded program, the output modules are not given access to the full data, only to the distilled result.

also, this approach won't work if the user wants to combine fixed text with the variable into a column. an example of doing that would be to have a filter to match specific lines, and then use a slightly different template for those lines. I guess that could be done in SQL instead of in the rsyslog string manipulation (i.e. instead of 'blah-%host%' do 'blah-'||'%host%')

> As I mentioned before, the only obvious issue I
> see with doing this implicitly is that the user might want to put
> variables in places that you can't have variables in prepared queries.

this problem space would be anywhere except the column contents, right?

> You could deal with that by having the user indicate per template, using
> another template option, if the query can be prepared or not. Another
> option is adding to your syntax something like '%*blah%' which would
> tell the system to pre-populate that variable before issuing PQprepare
> on the resultant string. Of course, you might just use PQexecParams
> there, unless you want to be gung-ho and actually keep a hash around of
> prepared queries on the assumption that the variable the user gave you
> doesn't change very often (eg, '%*month%') and it's cheap to keep a
> small list of them around to use when they do match up.
rsyslog supports something similar for writing to disk where you can use variables as part of the filename/path (referred to as 'dynafiles' in the documentation). that's a little easier to deal with as the filename is specified separately from the format of the data to write.

If we end up doing prepared statements I suspect they initially won't support variables outside of the columns.

David Lang
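
For illustration only, the %blah%-to-$num substitution discussed in this message could be sketched roughly as below. This is Python rather than rsyslog's C, and the function name `to_prepared`, the regular expression, and the treatment of the proposed `%*name%` form are all assumptions made for the example; none of it is rsyslog or libpq code. It only handles a property that fills a whole quoted column value, which matches the expectation above that variables outside the column contents would not be supported at first.

```python
import re

# Two hypothetical forms are recognised:
#   '%name%'   a whole quoted column value -> becomes a $n placeholder
#   %*name%    the proposed pre-populate form -> replaced by text up front
_PROP = re.compile(r"'%([^%'*][^%']*)%'|%\*([^%']+)%")


def to_prepared(template, prepopulate):
    """Return (sql, names): sql uses $1..$n, names lists the properties
    in parameter order. Repeated properties reuse the same $n."""
    names = []

    def repl(match):
        name, star_name = match.group(1), match.group(2)
        if star_name is not None:
            # Substituted as plain text before the statement is prepared,
            # so the caller must make sure the value is safe to embed.
            return str(prepopulate[star_name])
        if name not in names:
            names.append(name)
        return "$%d" % (names.index(name) + 1)

    # Anything else, such as 'blah-%host%', is left untouched here; a real
    # implementation would have to reject it or rewrite it as 'blah-'||$n.
    return _PROP.sub(repl, template), names


if __name__ == "__main__":
    sql, names = to_prepared(
        "insert into log_%*month% (host, message) "
        "values ('%host%', '%msg:::UPPERCASE%')",
        {"month": "2009_04"},
    )
    print(sql)
    print(names)
```

The returned string is what would be handed to something like PQprepare, and the list of names gives the order in which the per-message values would have to be supplied. Keying a small cache of prepared statements on the returned string would be one way to cover the '%*month%' case, since the string only changes when the pre-populated value does.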