Re: pg_upgrade and rsync
От | Bruce Momjian |
---|---|
Тема | Re: pg_upgrade and rsync |
Дата | |
Msg-id | 20150122234331.GA15865@momjian.us обсуждение исходный текст |
Ответ на | Re: pg_upgrade and rsync (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Ответы |
Re: pg_upgrade and rsync
(Jim Nasby <Jim.Nasby@BlueTreble.com>)
|
Список | pgsql-hackers |
On Thu, Jan 22, 2015 at 10:48:37PM +0200, Heikki Linnakangas wrote: > >>> * If we need to protect hint bit updates from torn writes, WAL-log a > >>> * full page image of the page. This full page image is only necessary > >>> * if the hint bit update is the first change to the page since the > >>> * last checkpoint. > >>> * > >>> * We don't check full_page_writes here because that logic is included > >>> * when we call XLogInsert() since the value changes dynamically. > >>> */ > >>> if (XLogHintBitIsNeeded() && (bufHdr->flags & BM_PERMANENT)) > >>> { > >>> /* > >>> * If we're in recovery we cannot dirty a page because of a hint. > >>> * We can set the hint, just not dirty the page as a result so the > >>> * hint is lost when we evict the page or shutdown. > >>> * > >>> * See src/backend/storage/page/README for longer discussion. > >>> */ > >>> if (RecoveryInProgress()) > >>> return; > > > >What if XLogHintBitIsNeeded is false? That would be the case if we're not wall logging hints *on the standby*. > > Then the page will be updated without writing a WAL record. Just > like in the master, if wal_log_hints is off. wal_log_hints works the > same on the master or the standby. [ see below for why this entire idea might not work ] OK, I was confused by your previous "no". It means we do update hint bits on read-only slave queries --- we just don't WAL log them. In fact, we can't update hint bits on the standby if we are wal logging them ---- is that right? My text was saying: these differences can be reduced by using a fresh standby and byenabling <xref linkend="guc-wal-log-hints">. (While<varname>wal_log_hints</>transfers hint bits from the primary tostandbys, additional hint bits are still set on thestandbys byread-only queries.) meaning if you don't run any read-only queries on the standby, the files will be same on master/standby because the hint bits will be the same, and rsync will not copy the files. This brings up the other problem that the mod times of the files are likely to be different between master and slave --- should I recommend to only use rsync --checksum? I would really like to get a way to pg_upgrade the standbys but we have never really be able to get a solution. Ideally we would update just the system table files, and if the order of pg_upgrade file renames is exactly the same, everything else would match, but I can't imagine what such an API would look like. Have pg_upgrade spit out a list of files to be copied? In fact, these are the relfilenodes pg_upgrade preserves: * While pg_class.oid and pg_class.relfilenode are initially the same* in a cluster, they can diverge due to CLUSTER, REINDEX,or VACUUM* FULL. In the new cluster, pg_class.oid and pg_class.relfilenode will* be the same and will match theold pg_class.oid value. Because of* this, old/new pg_class.relfilenode values will not match if CLUSTER,* REINDEX,or VACUUM FULL have been performed in the old cluster.** We control all assignments of pg_type.oid because theseoids are stored* in user composite type values.** We control all assignments of pg_enum.oid because these oids arestored* in user tables as enum values.** We control all assignments of pg_authid.oid because these oids are stored* in pg_largeobject_metadata. so if the table/index relfilenodes no longer match the oid on the old cluster, due to CLUSTER, REINDEX, or VACUUM FULL, the file name will not match on the new cluster and rsync will copy the entire file. In fact, rsync is going to copy it to the wrong file name, and delete the right file. I am going to now conclude that rsync is never going to work for this, unless we have pg_upgrade preserve relfilenodes as well. However, I am not even sure that is possible due to conflicts with system table relfilenodes created in the new cluster. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Andreas KarlssonДата:
Сообщение: Re: Using 128-bit integers for sum, avg and statistics aggregates