Given that doing a massive UPDATE SET foo = bar || ' ' || baz; on a 12 million
row table (with about 100 columns -- the US Census PUMS for the 2005-2009 ACS)
is never going to be that fast, what should one do to make it faster?
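For concreteness, the statement has roughly this shape. This is only a sketch: `pums` is a placeholder table name I am using for illustration, and `foo`, `bar`, and `baz` stand in for the real columns.

```sql
-- "pums" is a placeholder table name; foo, bar and baz stand in for real columns.
-- Note: || yields NULL if either bar or baz is NULL.
UPDATE pums
SET foo = bar || ' ' || baz;
```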
I set work_mem to 2048MB, but the UPDATE is currently using only a little memory
and CPU (3% and 15% respectively, according to top; an earlier
SELECT DISTINCT ... LIMIT used 70% of the memory).
The data is not particularly sensitive; if something happened and it rolled
back, that wouldn't be the end of the world. So I don't know whether I can use
"dangerous" settings for WAL checkpoints etc. There also aren't a lot of
concurrent hits on the DB, though there are a few.
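To illustrate the kind of relaxed-durability setting I have in mind, here is a minimal sketch. It assumes `synchronous_commit` is a relevant knob (my guess, not something I have measured); as far as I understand, turning it off risks losing recently committed transactions after a crash but not corrupting the database. The table name `pums` is again a placeholder.

```sql
-- Assumption: synchronous_commit is one of the "dangerous" settings in question.
-- SET affects only the current session, so other connections are unchanged.
SET synchronous_commit = off;

-- "pums" is a placeholder table name.
UPDATE pums
SET foo = bar || ' ' || baz;

-- Restore the session default afterwards.
RESET synchronous_commit;
```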
I am loath to create a new table from a SELECT, since the indexes themselves
take a really long time to build.
As the title suggests, I will also be doing GROUP BYs on the data, and would
love to speed these up, mostly to satisfy my own impatience...