Re: Maintenance question / DB size anomaly...

Поиск
Список
Период
Сортировка
От Kurt Overberg
Тема Re: Maintenance question / DB size anomaly...
Дата
Msg-id EC556F3E-E964-441D-A06D-278987F7EB5A@hotdogrecords.com
обсуждение исходный текст
Ответ на Re: Maintenance question / DB size anomaly...  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
Okay,


  select * from _my_cluster.sl_log_1 where ctid = '(1,1)';
  select * from _my_cluster.sl_log_1 where ctid = '(1,2)';
  select * from _my_cluster.sl_log_1 where ctid = '(1,3)';
  select * from _my_cluster.sl_log_1 where ctid = '(1,4)';

all returns zero rows.  When I do a dump of that file, I get:

Block    1 ********************************************************
<Header> -----
Block Offset: 0x00002000         Offsets: Lower     408 (0x0198)
Block: Size 8192  Version    2            Upper    7680 (0x1e00)
LSN:  logid    955 recoff 0x0daed68c      Special  8192 (0x2000)
Items:   97                   Free Space: 7272
Length (including item array): 412

<Data> ------
Item   1 -- Length:  121  Offset: 8068 (0x1f84)  Flags: USED
   XMIN: 1491480520  CMIN: 1  XMAX: 0  CMAX|XVAC: 0
   Block Id: 1  linp Index: 1   Attributes: 6   Size: 32
   infomask: 0x0912 (HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID)

...the fact that they weren't in the table, but in the file (I did
the filedump first,
then the query), then redid the filedump, the results are the same,
the rows are still
in the file.  I have no idea how frequently these files are getting
written to, I assume
frequently.  I also looked at the last block listed in the file,
6445, and also looked for
items 1-4, and also did not find them in the table using a similar
select as above.  That seems
kinda strange, since there's right this second 11,000 items in that
table, but I'll roll with it for awhile.

Intrigued, I wanted to see what a filedump looked like of a row that
WAS in the table:

ctid   | log_origin |  log_xid   | log_tableid | log_actionseq |
log_cmdtype |
  (7,1) |         10 | 1491481037 |           8 |     473490934 |
I           | (memberid,answerid,taskinstanceid) values
('144854','148707','0')



Block    7 ********************************************************
<Header> -----
Block Offset: 0x0000e000         Offsets: Lower     424 (0x01a8)
Block: Size 8192  Version    2            Upper     508 (0x01fc)
LSN:  logid    955 recoff 0x0dc4bcc0      Special  8192 (0x2000)
Items:  101                   Free Space:   84
Length (including item array): 428

<Data> ------
Item   1 -- Length:  129  Offset: 8060 (0x1f7c)  Flags: USED
   XMIN: 1491481037  CMIN: 7  XMAX: 0  CMAX|XVAC: 0
   Block Id: 7  linp Index: 1   Attributes: 6   Size: 32
   infomask: 0x0912 (HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID)


...the NextID was (taken about 5 mins after the previous filedump):
  Latest checkpoint's NextXID:          1491498183


I don't see any real differences in the file entry for a row that is
in the table, and one that I
don't see in the table.  I hope I'm getting this right, its totally
fascinating seeing how
all this works.

About your other questions:

1.  I have pg_clog segments all the way back to the start of the
database, all the way back
to March 14th, 2006 (most likely when the database was first brought
up on this machine).
The numeric names start at 0000 and go to 058E.  I checked the recent
(within last 8 days)
and saw no errors containing the word 'checkpoint'.  In fact, very
few errors at all.
The dang thing seems to be running pretty well, just a little slow.

mydb=# select datname, age(datfrozenxid) from pg_database;
   datname  |    age
-----------+------------
template1 | 1491520270
template0 | 1491520270
postgres  | 1491520270
mydb      | 1076194904


Oooooooo..... thats not good, is it?  Thanks for taking an interest,
Tom.  I'm most likely going to
promote one of my subscribers to be master, then nuke this database,
but I have no problems keeping it
around if you think I may have found some obscure bug that could help
someone debug.  Again, this
DB gets vacuumed every day, and in the beginning, I think I remember
doing a vacuum full every
day.

Thanks,

/kurt


On Jun 20, 2007, at 5:08 PM, Tom Lane wrote:

> Kurt Overberg <kurt@hotdogrecords.com> writes:
>> Drat!  I'm wrong again.  I thought for sure there wouldn't be a
>> wraparound problem.
>
> Well, I'm not sure what it is now.  You showed some invisible tuples
> with XMINs of
>    XMIN: 1489323584  CMIN: 1  XMAX: 0  CMAX|XVAC: 0
>    XMIN: 1489323590  CMIN: 2  XMAX: 0  CMAX|XVAC: 0
>    XMIN: 1489323592  CMIN: 1  XMAX: 0  CMAX|XVAC: 0
> but the nextXID is
>          1490547335
> which is not that far ahead of those --- about 1.2 million
> transactions,
> or less than a day's time according to the clog timestamps, which
> suggest that you're burning several million XIDs a day.  Perhaps
> you've
> wrapped past them since your earlier check --- if you try the same
> "select where ctid = " queries now, do they show rows?
>
> The other thing that's strange here is that an 8.0 installation should
> be pretty aggressive about recycling pg_clog segments, and yet you've
> got a bunch there.  How far back do the files in pg_clog go --- what's
> the numeric range of the filenames, and the date range of their mod
> times?  Have you checked the postmaster log to see if you're
> getting any
> complaints about checkpoint failures or anything like that?  It would
> also be useful to look at the output of
> select datname, age(datfrozenxid) from pg_database;
>
>             regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq


В списке pgsql-performance по дате отправления:

Предыдущее
От: Karl Wright
Дата:
Сообщение: Re: Performance query about large tables, lots of concurrent access
Следующее
От: Stefan Kaltenbrunner
Дата:
Сообщение: Re: PostgreSQL Configuration Tool for Dummies