[HACKERS] Server crash (FailedAssertion) due to catcache refcount mis-handling

From: Jeevan Chalke
Subject: [HACKERS] Server crash (FailedAssertion) due to catcache refcount mis-handling
Msg-id: CAM2+6=VEE30YtRQCZX7_sCFsEpoUkFBV1gZazL70fqLn8rcvBA@mail.gmail.com
Replies: Re: [HACKERS] Server crash (FailedAssertion) due to catcache refcount mis-handling  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

Hi,

We have observed a random server crash (FailedAssertion) while running a few tests at our end. The stack trace is attached.

After looking at the stack trace and discussing it with my team members, this is what we have observed: in SearchCatCacheList(), we increment the refcount and then decrement it at the end. However, if we are inside the PG_TRY() block (where the refcount is incremented) and are hit by an interrupt, we fail to decrement the refcount, and because of that we later get an assertion failure.
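
To make the suspected failure mode concrete, here is a minimal standalone sketch. It is not PostgreSQL code: CacheEntry, do_work(), lookup_leaky() and lookup_guarded() are hypothetical names, and plain setjmp/longjmp stands in for the server's error handling. It only shows how a non-local exit between the increment and the decrement can leave a reference count raised, and how releasing the reference on the error path avoids that.

```c
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cache entry that is pinned by a refcount. */
typedef struct CacheEntry
{
    int refcount;
} CacheEntry;

/* Jump target used to simulate a non-local error exit. */
static jmp_buf error_context;

/* Simulates work that may be cut short by an interrupt or error. */
static void
do_work(bool interrupt)
{
    if (interrupt)
        longjmp(error_context, 1);
}

/*
 * Pins the entry, does the work, then unpins it. If do_work() exits via
 * longjmp, the decrement is skipped and the entry stays pinned.
 * Returns the refcount left behind.
 */
int
lookup_leaky(CacheEntry *entry, bool interrupt)
{
    if (setjmp(error_context) == 0)
    {
        entry->refcount++;
        do_work(interrupt);
        entry->refcount--;
    }
    return entry->refcount;
}

/*
 * Same as above, but the error path also releases the pin, so the
 * refcount is balanced on both the normal and the error exit.
 */
int
lookup_guarded(CacheEntry *entry, bool interrupt)
{
    if (setjmp(error_context) == 0)
    {
        entry->refcount++;
        do_work(interrupt);
        entry->refcount--;
    }
    else
    {
        /* Error path: undo the pin taken before the failure. */
        entry->refcount--;
    }
    return entry->refcount;
}
```

With a zero-initialized entry, lookup_leaky(&entry, true) leaves the refcount at 1, while lookup_guarded(&entry, true) brings it back to 0. Whether the real crash follows exactly this pattern is an assumption based on the stack trace.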

To mimic the scenario, I added a sleep in SearchCatCacheList() as shown below:

diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index e7e8e3b..eb6d4b5 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1520,6 +1520,9 @@ SearchCatCacheList(CatCache *cache,
            hashValue = CatalogCacheComputeTupleHashValue(cache, ntp);
            hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
 
+           elog(INFO, "Sleeping for 0.1 seconds.");
+           pg_usleep(100000L); /* 0.1 seconds */
+
            bucket = &cache->cc_bucket[hashIndex];
            dlist_foreach(iter, bucket)
            {

I then followed these steps to get a server crash:

-- Terminal 1
DROP TYPE typ;
DROP FUNCTION func(x int);

CREATE TYPE typ AS (X VARCHAR(50), Y INT);

CREATE OR REPLACE FUNCTION func(x int) RETURNS int AS $$
DECLARE
  rec typ;
  var2 numeric;
BEGIN
  RAISE NOTICE 'Function Called.';
  REC.X := 'Hello';
  REC.Y := 0;
 
  IF (rec.Y + var2) = 0 THEN
    RAISE NOTICE 'Check Pass';
  END IF;

  RETURN 1;
END;
$$ LANGUAGE plpgsql;

SELECT pg_backend_pid();

SELECT func(1);


-- Terminal 2: run this while SELECT func(1) is still in progress in terminal 1.
SELECT pg_terminate_backend(<pid of backend obtained in terminal 1>);


I thought it was worth posting here to get others' attention.

I have observed this on the master branch, but it can also be reproduced on the back branches.

Thanks
--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
