Re: Performance problems testing with Spamassassin 3.1.0
| From | Andreas Pflug |
|---|---|
| Subject | Re: Performance problems testing with Spamassassin 3.1.0 |
| Date | |
| Msg-id | 42ED161F.1020408@pse-consulting.de |
| In reply to | Re: Performance problems testing with Spamassassin 3.1.0 ("Jim C. Nasby" <decibel@decibel.org>) |
| List | pgsql-performance |
Jim C. Nasby wrote:
> On Sun, Jul 31, 2005 at 08:51:06AM -0800, Matthew Schumacher wrote:
>
>> Ok, here is the current plan.
>>
>> Change the spamassassin API to pass a hash of tokens into the storage
>> module, pass the tokens to the proc as an array, start a transaction,
>> load the tokens into a temp table using copy, select the tokens distinct
>> into the token table for new tokens, update the token table for known
>> tokens, then commit.
>
> You might consider:
>
>     UPDATE tokens
>     FROM temp_table (this updates existing records)
>
>     INSERT INTO tokens
>     SELECT ...
>     FROM temp_table
>     WHERE NOT IN (SELECT ... FROM tokens)
>
> This way you don't do an update to newly inserted tokens, which helps
> keep vacuuming needs in check.

The subselect might be quite a big set, so avoiding a full table scan
and materialization by

    DELETE FROM temp_table WHERE key IN (SELECT key FROM tokens JOIN temp_table USING (key));
    INSERT INTO tokens SELECT * FROM temp_table;

or

    INSERT INTO tokens SELECT temp_table.* FROM temp_table LEFT JOIN tokens USING (key) WHERE tokens.key IS NULL;

might be an additional win, assuming that only a small fraction of
tokens is inserted and updated.

Regards,
Andreas
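
The update-then-insert flow discussed above, with the LEFT JOIN anti-join used for the insert, can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the `hits` column, the `merge_tokens` helper and the table layout are hypothetical, since the thread does not show the real schema. In PostgreSQL the temp table would be loaded with COPY rather than row-by-row inserts.

```python
import sqlite3


def merge_tokens(conn, batch):
    """Merge a {key: hits} batch into tokens: update known keys, insert new ones.

    Hypothetical helper; assumes a table tokens(key PRIMARY KEY, hits).
    """
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.cursor()
        # Staging table, standing in for the COPY-loaded temp table.
        cur.execute(
            "CREATE TEMP TABLE IF NOT EXISTS temp_table "
            "(key TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
        )
        cur.execute("DELETE FROM temp_table")
        cur.executemany(
            "INSERT INTO temp_table (key, hits) VALUES (?, ?)",
            list(batch.items()),
        )
        # Step 1: update tokens that already exist. Running this before the
        # insert means freshly inserted rows are never updated right away.
        cur.execute(
            """
            UPDATE tokens
               SET hits = hits + (SELECT t.hits
                                    FROM temp_table AS t
                                   WHERE t.key = tokens.key)
             WHERE key IN (SELECT key FROM temp_table)
            """
        )
        # Step 2: anti-join insert. Only staged rows with no match in tokens
        # survive the IS NULL filter.
        cur.execute(
            """
            INSERT INTO tokens (key, hits)
            SELECT temp_table.key, temp_table.hits
              FROM temp_table
              LEFT JOIN tokens USING (key)
             WHERE tokens.key IS NULL
            """
        )
```

Whether the anti-join actually beats the NOT IN form depends on the planner and on how many staged tokens are new, so it would be worth comparing both with EXPLAIN ANALYZE on real data.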