Segfault in BackendIdGetTransactionIds with psycopg2

Поиск

Список

Период

Сортировка

От	Vladimir Borodin
Тема	Segfault in BackendIdGetTransactionIds with psycopg2
Дата	30 марта 2015 г. 17:40:07
Msg-id	45AA4CC5-38F4-481A-BF8E-9269C9810517@simply.name обсуждение исходный текст
Список	psycopg

Дерево обсуждения

Hi all.

After upgrading from 9.3.6 to 9.4.1 (both installed from packages on yum.postgresql.org) we have started getting segfaults of different backends. Backtraces of all coredumps look similar:

(gdb) bt

#0 0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426

#1 0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871

#2 0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342

#3 0x00000000006f9d5a in pg_stat_get_db_numbackends (fcinfo=<value optimized out>) at pgstatfuncs.c:1080

#4 0x000000000059c345 in ExecMakeFunctionResultNoSets (fcache=0x1f4c270, econtext=0x1f4bbe0, isNull=0x1f5e588 "", isDone=<value optimized out>) at execQual.c:2023

#5 0x00000000005981a3 in ExecTargetList (projInfo=<value optimized out>, isDone=0x0) at execQual.c:5304

#6 ExecProject (projInfo=<value optimized out>, isDone=0x0) at execQual.c:5519

#7 0x00000000005a458d in advance_aggregates (aggstate=0x1f4bdc0, pergroup=0x1f5e380) at nodeAgg.c:556

#8 0x00000000005a4da5 in agg_retrieve_direct (node=<value optimized out>) at nodeAgg.c:1223

#9 ExecAgg (node=<value optimized out>) at nodeAgg.c:1115

#10 0x0000000000597638 in ExecProcNode (node=0x1f4bdc0) at execProcnode.c:476

#11 0x0000000000596252 in ExecutePlan (queryDesc=0x1eae6d0, direction=<value optimized out>, count=0) at execMain.c:1486

#12 standard_ExecutorRun (queryDesc=0x1eae6d0, direction=<value optimized out>, count=0) at execMain.c:319

#13 0x0000000000686797 in PortalRunSelect (portal=0x1ea5660, forward=<value optimized out>, count=0, dest=<value optimized out>) at pquery.c:946

#14 0x00000000006879c1 in PortalRun (portal=0x1ea5660, count=9223372036854775807, isTopLevel=1 '\001', dest=0x1f5a528, altdest=0x1f5a528, completionTag=0x7fff277b3b80 "") at pquery.c:790

#15 0x000000000068404e in exec_simple_query (query_string=0x1e989d0 "SELECT sum(numbackends) FROM pg_stat_database;") at postgres.c:1072

#16 0x00000000006856c8 in PostgresMain (argc=<value optimized out>, argv=<value optimized out>, dbname=0x1e7f398 "postgres", username=<value optimized out>) at postgres.c:4074

#17 0x0000000000632d7d in BackendRun (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4155

#18 BackendStartup (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:3829

#19 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1597

#20 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1244

#21 0x00000000005cadb8 in main (argc=3, argv=0x1e7e5e0) at main.c:228

(gdb)

Unfortunatelly, I can't give a clear sequence of steps to reproduce the problem, segfaults are happening in quiet random time and under random workloads :( So I'm trying to reproduce it on testing stand where PostgreSQL is built with --enable-debug flag to give you more information.

But the common conditions are:

1. it happens only on master hosts (never on any of the streaming replicas),

2. it happens on simple queries to pg_catalog or system views as shown in the backtrace above,

3. it happens only with direct connecting to PostgreSQL (production-queries go through pgbouncer and no coredumps contain production queries). And it happened only with python-psycopg2 (we have tried versions 2.5.3-1.rhel6 with postgresql93-libs, 2.5.4-1.rhel6 and 2.6-1.rhel6 with postgresql94-libs).

The code that lead to backtrace above is the following:

#!/usr/bin/env python

import socket

import time

import sys

import psycopg2

import os

me = sys.argv[0].split('/')[-1].split('.')[0]

hostname = socket.gethostname()

hostname_s = hostname.replace('.', '_')

current_ts = int(time.time())

gr_prefix = "mail.pg"

metrics = ['numbackends', 'xact_commit', 'xact_rollback', 'blks_read',

'blks_hit', 'tup_returned', 'tup_fetched', 'tup_inserted',

'tup_updated', 'tup_deleted']

user = 'monitor'

password = ''

with open(os.path.expanduser("~/.pgpass")) as pgpass:

for line in pgpass:

tokens = line.rstrip().split(':')

if tokens[3] == user:

password = tokens[4]

break

conn = psycopg2.connect('host=localhost port=5432 dbname=postgres ' +

'user=%s password=%s ' % (user, password) +

'connect_timeout=1')

cur = conn.cursor()

for metric in metrics:

cur.execute("SELECT sum(%s) FROM pg_stat_database;" % metric)

result = cur.fetchone()[0]

print("%s.%s.%s.%s %d %d" % (gr_prefix, hostname_s, me, metric,

result, current_ts))

May this be related to psycopg2 or I should write to pgsql-bugs@?

Thanks.

May the force be with you…

https://simply.name

В списке psycopg по дате отправления:

Предыдущее

От: Vladimir Borodin
Дата: 30 марта 2015 г., 17:11:17
Сообщение: Segfault in BackendIdGetTransactionIds with psycopg2

Следующее

От: Daniele Varrazzo
Дата: 30 марта 2015 г., 17:43:33
Сообщение: Re: Segfault in BackendIdGetTransactionIds with psycopg2

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Segfault in BackendIdGetTransactionIds with psycopg2

Предыдущее

Следующее