Обсуждение: PATCH: Warn users about tablespace abuse data loss risk
Hi all I've just seen another case of data loss due to misuse of / misunderstanding of tablespaces: http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres and it's prompted me to write some docs amendments to make it more obvious that *you shouldn't do that*. Not that it'll stop people, but it'll at least mean they can't say we didn't warn them. This is actually quite important, because many users are used to MySQL's MyISAM, where each table contains its own metadata and is readable by simply copying the table into a different MySQL install's data directory. It doesn't even have to be the same version! Users are clearly surprised that PostgreSQL tablespaces don't have the same properties. Thoughts? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
2014-02-12 12:52 GMT+09:00 Craig Ringer <craig@2ndquadrant.com>: > Hi all > > I've just seen another case of data loss due to misuse of / > misunderstanding of tablespaces: > > http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres > > and it's prompted me to write some docs amendments to make it more > obvious that *you shouldn't do that*. > > Not that it'll stop people, but it'll at least mean they can't say we > didn't warn them. > > This is actually quite important, because many users are used to MySQL's > MyISAM, where each table contains its own metadata and is readable by > simply copying the table into a different MySQL install's data > directory. It doesn't even have to be the same version! Users are > clearly surprised that PostgreSQL tablespaces don't have the same > properties. > > Thoughts? People still use MyISAM!? I had a similar issue pop up at work a while back, having something explicit to point to is definitely a good idea. Suggestion for the first paragraph of the patch (sorry I can't provide it in patch form right now): Even if they are located outside the main PostgreSQL data directory, tablespaces are an integral part of the database cluster and <emphasis>cannot</emphasis> be treated as an autonomous collection of data files. They rely on metadata contained in the main data directory, without which they are useless. In particular, tablespaces cannot be reattached to a different database cluster, and backing up individual tablespaces makes no sense as a backup/redundancy method. Similarly, if you lose a tablespace (file deletion, disk failure, etc) the main database may become unreadable or fail to start. Regards Ian Barwick
Ian Lawrence Barwick wrote > 2014-02-12 12:52 GMT+09:00 Craig Ringer < > craig@ > >: >> Hi all >> >> I've just seen another case of data loss due to misuse of / >> misunderstanding of tablespaces: >> >> http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres >> >> and it's prompted me to write some docs amendments to make it more >> obvious that *you shouldn't do that*. >> >> Not that it'll stop people, but it'll at least mean they can't say we >> didn't warn them. >> >> This is actually quite important, because many users are used to MySQL's >> MyISAM, where each table contains its own metadata and is readable by >> simply copying the table into a different MySQL install's data >> directory. It doesn't even have to be the same version! Users are >> clearly surprised that PostgreSQL tablespaces don't have the same >> properties. >> >> Thoughts? > > People still use MyISAM!? > > I had a similar issue pop up at work a while back, having something > explicit to point to is definitely a good idea. > > Suggestion for the first paragraph of the patch (sorry I can't provide it > in > patch form right now): > > Even if they are located outside the main PostgreSQL data directory, > tablespaces > are an integral part of the database cluster and > <emphasis> > cannot > </emphasis> > be > treated as an autonomous collection of data files. They rely on > metadata contained > in the main data directory, without which they are useless. In > particular, tablespaces > cannot be reattached to a different database cluster, and backing up > individual > tablespaces makes no sense as a backup/redundancy method. Similarly, > if you lose a > tablespace (file deletion, disk failure, etc) the main database may > become unreadable > or fail to start. > > Regards > > > Ian Barwick While providing additional warnings is good and necessary it may also help to be more descriptive as to in what situations tablespaces are appropriate and/or necessary so that people leave with a better understanding of why the feature exists and not just trying to know what not to use it for. It goes against the more prescriptive tone of the documentation generally but both approaches work well together to tackle the knowledge/understanding gap some users seem to have. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/PATCH-Warn-users-about-tablespace-abuse-data-loss-risk-tp5791537p5791542.html Sent from the PostgreSQL - docs mailing list archive at Nabble.com.
2014-02-12 14:06 GMT+09:00 David Johnston <polobo@yahoo.com>: > Ian Lawrence Barwick wrote >> 2014-02-12 12:52 GMT+09:00 Craig Ringer < > >> craig@ > >> >: >>> Hi all >>> >>> I've just seen another case of data loss due to misuse of / >>> misunderstanding of tablespaces: >>> >>> http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres >>> >>> and it's prompted me to write some docs amendments to make it more >>> obvious that *you shouldn't do that*. >>> >>> Not that it'll stop people, but it'll at least mean they can't say we >>> didn't warn them. >>> >>> This is actually quite important, because many users are used to MySQL's >>> MyISAM, where each table contains its own metadata and is readable by >>> simply copying the table into a different MySQL install's data >>> directory. It doesn't even have to be the same version! Users are >>> clearly surprised that PostgreSQL tablespaces don't have the same >>> properties. >>> >>> Thoughts? >> >> People still use MyISAM!? >> >> I had a similar issue pop up at work a while back, having something >> explicit to point to is definitely a good idea. >> >> Suggestion for the first paragraph of the patch (sorry I can't provide it >> in >> patch form right now): >> >> Even if they are located outside the main PostgreSQL data directory, >> tablespaces >> are an integral part of the database cluster and >> <emphasis> >> cannot >> </emphasis> >> be >> treated as an autonomous collection of data files. They rely on >> metadata contained >> in the main data directory, without which they are useless. In >> particular, tablespaces >> cannot be reattached to a different database cluster, and backing up >> individual >> tablespaces makes no sense as a backup/redundancy method. Similarly, >> if you lose a >> tablespace (file deletion, disk failure, etc) the main database may >> become unreadable >> or fail to start. >> > While providing additional warnings is good and necessary it may also help > to be more descriptive as to in what situations tablespaces are appropriate > and/or necessary so that people leave with a better understanding of why the > feature exists and not just trying to know what not to use it for. It goes > against the more prescriptive tone of the documentation generally but both > approaches work well together to tackle the knowledge/understanding gap some > users seem to have. The warning would appear on this page: http://www.postgresql.org/docs/current/static/manage-ag-tablespaces.html which describes what tablespaces *can* do, but unless you're familiar with the structure of the PostgreSQL data directories, it's not obvious what you *can't* do. I recall reading a blog post a while back about tablespaces being "archived" to the cloud with disastrous results, and a quick search pulls up stuff like this: http://stackoverflow.com/questions/3534415/moving-postgres-tablespaces-and-tables-across-ec2-instance so it's definitely not a niche issue. Something "official" to link to would be very useful in this kind of situation. That doesn't preclude the general documentation being improved of course. Regards Ian Barwick
2014-02-12 15:16 GMT+09:00 Ian Lawrence Barwick <barwick@gmail.com>: > 2014-02-12 14:06 GMT+09:00 David Johnston <polobo@yahoo.com>: >> Ian Lawrence Barwick wrote >>> 2014-02-12 12:52 GMT+09:00 Craig Ringer < >> >>> craig@ >> >>> >: >>>> Hi all >>>> >>>> I've just seen another case of data loss due to misuse of / >>>> misunderstanding of tablespaces: >>>> >>>> http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres >>>> >>>> and it's prompted me to write some docs amendments to make it more >>>> obvious that *you shouldn't do that*. >>>> >>>> Not that it'll stop people, but it'll at least mean they can't say we >>>> didn't warn them. >>>> >>>> This is actually quite important, because many users are used to MySQL's >>>> MyISAM, where each table contains its own metadata and is readable by >>>> simply copying the table into a different MySQL install's data >>>> directory. It doesn't even have to be the same version! Users are >>>> clearly surprised that PostgreSQL tablespaces don't have the same >>>> properties. >>>> >>>> Thoughts? >>> >>> People still use MyISAM!? >>> >>> I had a similar issue pop up at work a while back, having something >>> explicit to point to is definitely a good idea. >>> >>> Suggestion for the first paragraph of the patch (sorry I can't provide it >>> in >>> patch form right now): >>> >>> Even if they are located outside the main PostgreSQL data directory, >>> tablespaces >>> are an integral part of the database cluster and >>> <emphasis> >>> cannot >>> </emphasis> >>> be >>> treated as an autonomous collection of data files. They rely on >>> metadata contained >>> in the main data directory, without which they are useless. In >>> particular, tablespaces >>> cannot be reattached to a different database cluster, and backing up >>> individual >>> tablespaces makes no sense as a backup/redundancy method. Similarly, >>> if you lose a >>> tablespace (file deletion, disk failure, etc) the main database may >>> become unreadable >>> or fail to start. >>> > >> While providing additional warnings is good and necessary it may also help >> to be more descriptive as to in what situations tablespaces are appropriate >> and/or necessary so that people leave with a better understanding of why the >> feature exists and not just trying to know what not to use it for. It goes >> against the more prescriptive tone of the documentation generally but both >> approaches work well together to tackle the knowledge/understanding gap some >> users seem to have. > > The warning would appear on this page: > > http://www.postgresql.org/docs/current/static/manage-ag-tablespaces.html > > which describes what tablespaces *can* do, but unless you're familiar with the > structure of the PostgreSQL data directories, it's not obvious what you *can't* > do. I recall reading a blog post a while back about tablespaces being "archived" > to the cloud with disastrous results, and a quick search pulls up > stuff like this: > > http://stackoverflow.com/questions/3534415/moving-postgres-tablespaces-and-tables-across-ec2-instance > > so it's definitely not a niche issue. Something "official" to link to > would be very useful in this kind of situation. That doesn't preclude the general > documentation being improved of course. And taking a look at the page in question I see this prominent example: CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data'; As '/mnt' is usually a temporary mount point for detachable media and all that [*] maybe it's not the best impression to give for a suitable tablespace location. [*] http://www.pathname.com/fhs/pub/fhs-2.3.html#MNTMOUNTPOINTFORATEMPORARILYMOUNT "This directory is provided so that the system administrator may temporarily mount a filesystem as needed. " Ian Barwick
On 02/12/2014 02:35 PM, Ian Lawrence Barwick wrote: > And taking a look at the page in question I see this prominent example: > > CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data'; Here's an updated patch, adopting your wording for the first par and tweaking the example. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
On Thu, Feb 13, 2014 at 08:15:59AM +0800, Craig Ringer wrote: > On 02/12/2014 02:35 PM, Ian Lawrence Barwick wrote: > > And taking a look at the page in question I see this prominent example: > > > > CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data'; > > Here's an updated patch, adopting your wording for the first par and > tweaking the example. I have applied an adjusted version of this patch that is more compact. Thanks. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +