Re: WIP: Avoid creation of the free space map for small tables
От | Amit Kapila |
---|---|
Тема | Re: WIP: Avoid creation of the free space map for small tables |
Дата | |
Msg-id | CAA4eK1LYUtHkVknVaKQDF4OjP_LzrbLfTyAG-=5PC_0xn8JiDQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: WIP: Avoid creation of the free space map for small tables (Amit Kapila <amit.kapila16@gmail.com>) |
Список | pgsql-hackers |
On Sat, Feb 2, 2019 at 7:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Jan 28, 2019 at 4:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Jan 28, 2019 at 10:03 AM John Naylor > > <john.naylor@2ndquadrant.com> wrote: > > In the past few days, we have done a further analysis of each problem > and tried to reproduce it. We are successful in generating some form > of reproducer for 3 out of 4 problems in the same way as it was failed > in the buildfarm. For the fourth symptom, we have tried a lot (even > Andrew Dunstan has helped us to run the regression tests with the > faulty commit on Jacana for many hours, but it didn't got reproduced) > but not able to regenerate a failure in a similar way. However, I > have a theory as mentioned below why the particular test could fail > and the fix for the same is done in the patch. I am planning to push > the latest version of the patch [1] which has fixes for all the > symptoms. > Today, I have spent some more time to generate a test which can reproduce the failure though it is with the help of breakpoints. See below: > > > > 4. Failure on jacana: > > --- c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/test/regress/expected/box.out > > 2018-09-26 > > 17:53:33 -0400 > > +++ c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/src/test/regress/results/box.out > > 2019-01-27 23:14:35 > > -0500 > > @@ -252,332 +252,7 @@ > > ('(0,100)(0,infinity)'), > > ('(-infinity,0)(0,infinity)'), > > ('(-infinity,-infinity)(infinity,infinity)'); > > -SET enable_seqscan = false; > > -SELECT * FROM box_temp WHERE f1 << '(10,20),(30,40)'; > > .. > > .. > > TRAP: FailedAssertion("!(!(fsm_local_map.nblocks > 0))", File: > > "c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/storage/freespace/freespace.c", > > Line: > > 1118) > > .. > > 2019-01-27 23:14:35.495 EST [5c4e81a0.2e28:4] LOG: server process > > (PID 14388) exited with exit code 3 > > 2019-01-27 23:14:35.495 EST [5c4e81a0.2e28:5] DETAIL: Failed process > > was running: INSERT INTO box_temp > > VALUES (NULL), > > > > I think the reason for this failure is same as previous (as mentioned > > in point-3), but this can happen in a different way. Say, we have > > searched the local map and then try to extend a relation 'X' and in > > the meantime, another backend has extended such that it creates FSM. > > Now, we will reuse that page and won't clear local map. Now, say we > > try to insert in relation 'Y' which doesn't have FSM. It will try to > > set the local map and will find that it already exists, so will fail. > > Now, the question is how it can happen in this box.sql test. I guess > > that is happening for some system table which is being populated by > > Create Index statement executed just before the failing Insert. > > Based on the above theory, the test is as below: Session-1 --------------- postgres=# create table test_1(c1 int, c2 char(1500)); CREATE TABLE postgres=# create table test_2(c1 int, c2 char(1500)); CREATE TABLE postgres=# insert into test_1 values(generate_series(1,20),'aaaa'); INSERT 0 20 postgres=# insert into test_2 values(1,'aaaa'); INSERT 0 1 Session-2 ---------------- postgres=# analyze test_1; ANALYZE postgres=# analyze test_2; ANALYZE postgres=# select oid, relname, relpages from pg_class where relname like 'test%'; oid | relname | relpages -------+----------------+---------- 41835 | test_1 | 4 41838 | test_2 | 1 Till here we can see that test_1 has 4 pages and test2_1 has 1 page. At this stage, even one more record insertion in test_1 will create a new page. So, now we have to hit the scenario with the help of debugger. For session-1, attach debugger and put breakpoint in RelationGetBufferForTuple(). Session-1 ---------------- postgres=# insert into test_1 values(21,'aaaa'); It will hit the breakpoint in RelationGetBufferForTuple(). Now, add one more breakpoint on line 542 in hio.c, aka below line: RelationGetBufferForTuple { .. else if (!ConditionalLockRelationForExtension(relation, ExclusiveLock)) Press continue and stop the debugger at line 542. Attach the debugger for session-2 and add a breakpoint on line 580 in hio.c, aka below line: RelationGetBufferForTuple { .. buffer = ReadBufferBI(relation, P_NEW, bistate); Session-2 --------------- postgres=# insert into test_1 values(22,'aaaa'); It will hit the breakpoint in RelationGetBufferForTuple(). Now proceed with debugging on session-1 by one step. This is to ensure that session-1 doesn't get a conditional lock. Now, continue the debugger in session-2 and after that run vacuum test_1 from session-2, this will ensure that FSM is created for relation test_1. So the state of session-2 will be as below: Session-2 ---------------- postgres=# insert into test_1 values(22,'aaaa'); INSERT 0 1 postgres=# vacuum test_1; VACUUM postgres=# select oid, relname, relpages from pg_class where relname like 'test%'; oid | relname | relpages -------+----------------+---------- 41835 | test_1 | 5 41838 | test_2 | 1 Continue the debugger in session-1. Now insert one row in test_2 from session-1 and kaboom: Session-1 --------------- postgres=# insert into test_2 values(2,'aaaa'); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !> Server logs is as below: TRAP: FailedAssertion("!(!(fsm_local_map.nblocks > 0))", File: "..\postgresql\src\backend\storage\freespace\freespace.c", Line: 1118) 2019-02-02 16:48:06.216 IST [4044] LOG: server process (PID 3540) exited with exit code 3 2019-02-02 16:48:06.216 IST [4044] DETAIL: Failed process was running: insert into test_2 values(2,'aaaa'); This looks exactly the same as the failure in Jacana. The patch fixes this case. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления:
Следующее
От: rajanДата:
Сообщение: Able to do ALTER DEFAULT PRIVILEGES from a user who is not theowner