Обсуждение: Create index concurrently hanging with big table on pgsql 9.3.12
Hi,
We are trying to create a index concurrently but, at least apparently, it hangs in a infinite loop and never ends.
Our version:
flip=# select version();
version
--------------------------------
PostgreSQL 9.3.12 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 2012
0313 (Red Hat 4.4.7-16), 64-bit
(1 row)
Index creation:
CREATE INDEX CONCURRENTLY index_texto
ON flip_pagina_edicao
USING hash
(texto COLLATE pg_catalog."default"); -- texto is a text data type.
Size of the table:
flip=# select pg_size_pretty(pg_total_relation_size('flip_pagina_edicao'));
pg_size_pretty
----------------
956 GB
(1 row)
Process strace:
....
semop(622611, {{6, 1, 0}}, 1) = 0
semop(655380, {{5, -1, 0}}, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(655380, {{5, -1, 0}}, 1) = 0
semop(622611, {{6, 1, 0}}, 1) = 0
semop(655380, {{5, -1, 0}}, 1) = 0
semop(622611, {{6, 1, 0}}, 1) = 0
semop(622611, {{6, 1, 0}}, 1) = 0
semop(622611, {{6, 1, 0}}, 1) = 0
semop(655380, {{5, -1, 0}}, 1) = 0
semop(622611, {{6, 1, 0}}, 1) = 0
semop(655380, {{5, -1, 0}}, 1) = 0
semop(622611, {{6, 1, 0}}, 1) = 0
semop(557073, {{2, 1, 0}}, 1) = 0
semop(655380, {{5, -1, 0}}, 1) = 0
semop(655380, {{5, -1, 0}}, 1) = 0
semop(655380, {{5, -1, 0}}, 1) = 0
semop(655380, {{5, -1, 0}}, 1) = 0
...
Thanks in advance.
Robert
On 05/09/2016 05:04 AM, Robert Anderson wrote: > Hi, > > We are trying to create a index concurrently but, at least apparently, > it hangs in a infinite loop and never ends. Apparently how? How long did you wait? JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them.
Try the following query. See if it shows you if another transaction is blocking the needed locks to create the index.
SELECT
w.query as waiting_query,
w.pid as waiting_pid,
w.usename as w_user,
l.pid as blocking_pid,
l.usename as blocking_user,
t.schemaname || '.' || t.relname as tablename
FROM pg_stat_activity w
JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
JOIN pg_stat_activity l ON (l2.pid = l.pid)
JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
WHERE w.waiting;
SELECT
w.query as waiting_query,
w.pid as waiting_pid,
w.usename as w_user,
l.pid as blocking_pid,
l.usename as blocking_user,
t.schemaname || '.' || t.relname as tablename
FROM pg_stat_activity w
JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
JOIN pg_stat_activity l ON (l2.pid = l.pid)
JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
WHERE w.waiting;
On Mon, May 9, 2016 at 11:50 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
On 05/09/2016 05:04 AM, Robert Anderson wrote:Hi,
We are trying to create a index concurrently but, at least apparently,
it hangs in a infinite loop and never ends.
Apparently how?
How long did you wait?
JD
--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
Hi,
There aren't transactions blocking:
postgres=# SELECT
postgres-# w.query as waiting_query,
postgres-# w.pid as waiting_pid,
postgres-# w.usename as w_user,
postgres-# l.pid as blocking_pid,
postgres-# l.usename as blocking_user,
postgres-# t.schemaname || '.' || t.relname as tablename
postgres-# FROM pg_stat_activity w
postgres-# JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
postgres-# JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
postgres-# JOIN pg_stat_activity l ON (l2.pid = l.pid)
postgres-# JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
postgres-# WHERE w.waiting;
waiting_query | waiting_pid | w_user | blocking_pid | blocking_user | tablename
---------------+-------------+--------+--------------+---------------+-----------
(0 rows)
How long I'm waiting:
postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_activity where query like 'CREATE%';
-[ RECORD 1 ]----+----------------------------------------
datid | 16434
datname | flip
pid | 3990
usesysid | 10
usename | postgres
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2016-05-07 11:48:39.218398-03
xact_start | 2016-05-07 11:48:43.417734-03
query_start | 2016-05-07 11:48:43.417734-03
state_change | 2016-05-07 11:48:43.417742-03
waiting | f
state | active
query | CREATE INDEX CONCURRENTLY index_texto
| ON flip_pagina_edicao
| USING hash
| (texto COLLATE pg_catalog."default");
postgres=#
In attachment follows a strace sample of the running process.
2016-05-09 13:25 GMT-03:00 Melvin Davidson <melvin6925@gmail.com>:
Try the following query. See if it shows you if another transaction is blocking the needed locks to create the index.
SELECT
w.query as waiting_query,
w.pid as waiting_pid,
w.usename as w_user,
l.pid as blocking_pid,
l.usename as blocking_user,
t.schemaname || '.' || t.relname as tablename
FROM pg_stat_activity w
JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
JOIN pg_stat_activity l ON (l2.pid = l.pid)
JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
WHERE w.waiting;On Mon, May 9, 2016 at 11:50 AM, Joshua D. Drake <jd@commandprompt.com> wrote:On 05/09/2016 05:04 AM, Robert Anderson wrote:Hi,
We are trying to create a index concurrently but, at least apparently,
it hangs in a infinite loop and never ends.
Apparently how?
How long did you wait?
JD--
--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
Вложения
Robert Anderson <ranomail@gmail.com> writes: > There aren't transactions blocking: > postgres=# SELECT > postgres-# w.query as waiting_query, > postgres-# w.pid as waiting_pid, > postgres-# w.usename as w_user, > postgres-# l.pid as blocking_pid, > postgres-# l.usename as blocking_user, > postgres-# t.schemaname || '.' || t.relname as tablename > postgres-# FROM pg_stat_activity w > postgres-# JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted) > postgres-# JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted) > postgres-# JOIN pg_stat_activity l ON (l2.pid = l.pid) > postgres-# JOIN pg_stat_user_tables t ON (l1.relation = t.relid) > postgres-# WHERE w.waiting; > waiting_query | waiting_pid | w_user | blocking_pid | blocking_user | > tablename > ---------------+-------------+--------+--------------+---------------+----------- > (0 rows) This test proves little, because that last JOIN will discard locks on non-table objects, and what CREATE INDEX CONCURRENTLY would be most likely to be blocked on is transaction VXIDs. However, since pg_stat_activity claims that "waiting" is false, probably there isn't anything in pg_locks. Still, it'd be better to do "select * from pg_stat_activity where pid = 3990" and be sure. regards, tom lane
Only one line returned:
postgres=# select * from pg_stat_activity where pid=3990;
-[ RECORD 1 ]----+----------------------------------------
datid | 16434
datname | flip
pid | 3990
usesysid | 10
usename | postgres
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2016-05-07 11:48:39.218398-03
xact_start | 2016-05-07 11:48:43.417734-03
query_start | 2016-05-07 11:48:43.417734-03
state_change | 2016-05-07 11:48:43.417742-03
waiting | f
state | active
query | CREATE INDEX CONCURRENTLY index_texto
| ON flip_pagina_edicao
| USING hash
| (texto COLLATE pg_catalog."default");
postgres=#
2016-05-09 14:20 GMT-03:00 Tom Lane <tgl@sss.pgh.pa.us>:
Robert Anderson <ranomail@gmail.com> writes:
> There aren't transactions blocking:
> postgres=# SELECT
> postgres-# w.query as waiting_query,
> postgres-# w.pid as waiting_pid,
> postgres-# w.usename as w_user,
> postgres-# l.pid as blocking_pid,
> postgres-# l.usename as blocking_user,
> postgres-# t.schemaname || '.' || t.relname as tablename
> postgres-# FROM pg_stat_activity w
> postgres-# JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
> postgres-# JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
> postgres-# JOIN pg_stat_activity l ON (l2.pid = l.pid)
> postgres-# JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
> postgres-# WHERE w.waiting;
> waiting_query | waiting_pid | w_user | blocking_pid | blocking_user |
> tablename
> ---------------+-------------+--------+--------------+---------------+-----------
> (0 rows)
This test proves little, because that last JOIN will discard locks on
non-table objects, and what CREATE INDEX CONCURRENTLY would be most
likely to be blocked on is transaction VXIDs. However, since
pg_stat_activity claims that "waiting" is false, probably there isn't
anything in pg_locks. Still, it'd be better to do
"select * from pg_stat_activity where pid = 3990" and be sure.
regards, tom lane
On 05/09/2016 10:32 AM, Robert Anderson wrote: > Only one line returned: > > postgres=# select * from pg_stat_activity where pid=3990; > -[ RECORD 1 ]----+---------------------------------------- > datid | 16434 > datname | flip > pid | 3990 > usesysid | 10 > usename | postgres > application_name | psql > client_addr | > client_hostname | > client_port | -1 > backend_start | 2016-05-07 11:48:39.218398-03 > xact_start | 2016-05-07 11:48:43.417734-03 > query_start | 2016-05-07 11:48:43.417734-03 > state_change | 2016-05-07 11:48:43.417742-03 > waiting | f > state | active > query | CREATE INDEX CONCURRENTLY index_texto > | ON flip_pagina_edicao > | USING hash > | (texto COLLATE pg_catalog."default"); So it isn't blocked by a lock. You still haven't stated why you think it is blocked? How long has it been taking? What is your maintenance_work_mem set to? JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them.
You still haven't stated why you think it is blocked?
Ouput from iotop:
3990 be/4 postgres 0.00 B/s 0.00 B/s 0.00 % 0.00 % postgres: postgres flip [local] CREATE INDEX
The process isn't reading or writing anything for many hours, but it's using almost 90% of CPU.
How long has it been taking?
backend_start | 2016-05-07 11:48:39.218398-03
More than 50 hours.
What is your maintenance_work_mem set to?
maintenance_work_mem = 352MB
2016-05-09 14:34 GMT-03:00 Joshua D. Drake <jd@commandprompt.com>:
On 05/09/2016 10:32 AM, Robert Anderson wrote:Only one line returned:
postgres=# select * from pg_stat_activity where pid=3990;
-[ RECORD 1 ]----+----------------------------------------
datid | 16434
datname | flip
pid | 3990
usesysid | 10
usename | postgres
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2016-05-07 11:48:39.218398-03
xact_start | 2016-05-07 11:48:43.417734-03
query_start | 2016-05-07 11:48:43.417734-03
state_change | 2016-05-07 11:48:43.417742-03
waiting | f
state | active
query | CREATE INDEX CONCURRENTLY index_texto
| ON flip_pagina_edicao
| USING hash
| (texto COLLATE pg_catalog."default");
So it isn't blocked by a lock. You still haven't stated why you think it is blocked? How long has it been taking? What is your maintenance_work_mem set to?
JD
--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Robert Anderson <ranomail@gmail.com> writes: > Only one line returned: > postgres=# select * from pg_stat_activity where pid=3990; Aaah, sorry, that was a brain fade. I meant to ask about rows in pg_locks with that pid. regards, tom lane