On Mon, Oct 12, 2015 at 1:28 PM, Andres Freund <andres@anarazel.de> wrote:
> Any chance
> you could provide profiles of such a run?
This is as simple as I could make it reliably. With one copy running,
the thread finishes in about 1 second. With 2, it's 1.5s each, and
with all 4, it's a little over 3s for each according to the logs. I
have log_min_duration_statement set to 1000, so it's pretty obvious.
The scary part is that it's not even scaling linearly; performance is
actually getting *worse* with each subsequent thread.
Regarding performance, all of this fits in memory. The tables are only
100k rows with the COPY statement. The machine itself is 8 CPUs with
32GB of RAM, so it's not an issue of hardware. So far as I can tell,
it happens on every version I've tested on, from 9.2 to 9.4. I also
take back what I said about wal_level. Setting it to minimal does
nothing. Disabling archive_mode and setting max_wal_senders to 0 also
does nothing. With 4 concurrent processes, each takes 3 seconds, for a
total of 12 seconds to import 400k rows when it would take 4 seconds
to do sequentially. Sketchy.
COPY (
SELECT id, id % 100, id % 1000, now() - (id || 's')::INTERVAL
FROM generate_series(1, 100000) a(id)
) TO '/tmp/loadtest1.csv';
CREATE TABLE test_copy (
id SERIAL PRIMARY KEY,
location VARCHAR NOT NULL,
reading BIGINT NOT NULL,
reading_date TIMESTAMP NOT NULL
);
CREATE INDEX idx_test_copy_location ON test_copy (location);
CREATE INDEX idx_test_copy_date ON test_copy (reading_date);
CREATE TABLE test_copy2 (LIKE test_copy INCLUDING INDEXES);
CREATE TABLE test_copy3 (LIKE test_copy INCLUDING INDEXES);
CREATE TABLE test_copy4 (LIKE test_copy INCLUDING INDEXES);
psql -c "COPY test_copy FROM '/tmp/loadtest1.csv'" &>/dev/null &
psql -c "COPY test_copy2 FROM '/tmp/loadtest1.csv'" &>/dev/null &
psql -c "COPY test_copy3 FROM '/tmp/loadtest1.csv'" &>/dev/null &
psql -c "COPY test_copy4 FROM '/tmp/loadtest1.csv'" &>/dev/null &
--
Shaun Thomas
bonesmoses@gmail.com
http://bonesmoses.org/