Обсуждение: PATCH: Warn users about tablespace abuse data loss risk

Поиск
Список
Период
Сортировка

PATCH: Warn users about tablespace abuse data loss risk

От
Craig Ringer
Дата:
Hi all

I've just seen another case of data loss due to misuse of /
misunderstanding of tablespaces:

http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres

and it's prompted me to write some docs amendments to make it more
obvious that *you shouldn't do that*.

Not that it'll stop people, but it'll at least mean they can't say we
didn't warn them.

This is actually quite important, because many users are used to MySQL's
MyISAM, where each table contains its own metadata and is readable by
simply copying the table into a different MySQL install's data
directory. It doesn't even have to be the same version! Users are
clearly surprised that PostgreSQL tablespaces don't have the same
properties.

Thoughts?

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: PATCH: Warn users about tablespace abuse data loss risk

От
Ian Lawrence Barwick
Дата:
2014-02-12 12:52 GMT+09:00 Craig Ringer <craig@2ndquadrant.com>:
> Hi all
>
> I've just seen another case of data loss due to misuse of /
> misunderstanding of tablespaces:
>
> http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres
>
> and it's prompted me to write some docs amendments to make it more
> obvious that *you shouldn't do that*.
>
> Not that it'll stop people, but it'll at least mean they can't say we
> didn't warn them.
>
> This is actually quite important, because many users are used to MySQL's
> MyISAM, where each table contains its own metadata and is readable by
> simply copying the table into a different MySQL install's data
> directory. It doesn't even have to be the same version! Users are
> clearly surprised that PostgreSQL tablespaces don't have the same
> properties.
>
> Thoughts?

People still use MyISAM!?

I had a similar issue pop up at work a while back, having something
explicit to point to is definitely a good idea.

Suggestion for the first paragraph of the patch (sorry I can't provide it in
patch form right now):

  Even if they are located outside the main PostgreSQL data directory,
tablespaces
  are an integral part of the database cluster and
<emphasis>cannot</emphasis> be
  treated as an autonomous collection of data files. They rely on
metadata contained
  in the main data directory, without which they are useless. In
particular, tablespaces
  cannot be reattached to a different database cluster, and backing up
individual
  tablespaces makes no sense as a backup/redundancy method. Similarly,
if you lose a
  tablespace (file deletion, disk failure, etc) the main database may
become unreadable
  or fail to start.

Regards


Ian Barwick


Re: PATCH: Warn users about tablespace abuse data loss risk

От
David Johnston
Дата:
Ian Lawrence Barwick wrote
> 2014-02-12 12:52 GMT+09:00 Craig Ringer <

> craig@

> >:
>> Hi all
>>
>> I've just seen another case of data loss due to misuse of /
>> misunderstanding of tablespaces:
>>
>> http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres
>>
>> and it's prompted me to write some docs amendments to make it more
>> obvious that *you shouldn't do that*.
>>
>> Not that it'll stop people, but it'll at least mean they can't say we
>> didn't warn them.
>>
>> This is actually quite important, because many users are used to MySQL's
>> MyISAM, where each table contains its own metadata and is readable by
>> simply copying the table into a different MySQL install's data
>> directory. It doesn't even have to be the same version! Users are
>> clearly surprised that PostgreSQL tablespaces don't have the same
>> properties.
>>
>> Thoughts?
>
> People still use MyISAM!?
>
> I had a similar issue pop up at work a while back, having something
> explicit to point to is definitely a good idea.
>
> Suggestion for the first paragraph of the patch (sorry I can't provide it
> in
> patch form right now):
>
>   Even if they are located outside the main PostgreSQL data directory,
> tablespaces
>   are an integral part of the database cluster and
> <emphasis>
> cannot
> </emphasis>
>  be
>   treated as an autonomous collection of data files. They rely on
> metadata contained
>   in the main data directory, without which they are useless. In
> particular, tablespaces
>   cannot be reattached to a different database cluster, and backing up
> individual
>   tablespaces makes no sense as a backup/redundancy method. Similarly,
> if you lose a
>   tablespace (file deletion, disk failure, etc) the main database may
> become unreadable
>   or fail to start.
>
> Regards
>
>
> Ian Barwick

While providing additional warnings is good and necessary it may also help
to be more descriptive as to in what situations tablespaces are appropriate
and/or necessary so that people leave with a better understanding of why the
feature exists and not just trying to know what not to use it for.  It goes
against the more prescriptive tone of the documentation generally but both
approaches work well together to tackle the knowledge/understanding gap some
users seem to have.

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PATCH-Warn-users-about-tablespace-abuse-data-loss-risk-tp5791537p5791542.html
Sent from the PostgreSQL - docs mailing list archive at Nabble.com.


Re: PATCH: Warn users about tablespace abuse data loss risk

От
Ian Lawrence Barwick
Дата:
2014-02-12 14:06 GMT+09:00 David Johnston <polobo@yahoo.com>:
> Ian Lawrence Barwick wrote
>> 2014-02-12 12:52 GMT+09:00 Craig Ringer <
>
>> craig@
>
>> >:
>>> Hi all
>>>
>>> I've just seen another case of data loss due to misuse of /
>>> misunderstanding of tablespaces:
>>>
>>> http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres
>>>
>>> and it's prompted me to write some docs amendments to make it more
>>> obvious that *you shouldn't do that*.
>>>
>>> Not that it'll stop people, but it'll at least mean they can't say we
>>> didn't warn them.
>>>
>>> This is actually quite important, because many users are used to MySQL's
>>> MyISAM, where each table contains its own metadata and is readable by
>>> simply copying the table into a different MySQL install's data
>>> directory. It doesn't even have to be the same version! Users are
>>> clearly surprised that PostgreSQL tablespaces don't have the same
>>> properties.
>>>
>>> Thoughts?
>>
>> People still use MyISAM!?
>>
>> I had a similar issue pop up at work a while back, having something
>> explicit to point to is definitely a good idea.
>>
>> Suggestion for the first paragraph of the patch (sorry I can't provide it
>> in
>> patch form right now):
>>
>>   Even if they are located outside the main PostgreSQL data directory,
>> tablespaces
>>   are an integral part of the database cluster and
>> <emphasis>
>> cannot
>> </emphasis>
>>  be
>>   treated as an autonomous collection of data files. They rely on
>> metadata contained
>>   in the main data directory, without which they are useless. In
>> particular, tablespaces
>>   cannot be reattached to a different database cluster, and backing up
>> individual
>>   tablespaces makes no sense as a backup/redundancy method. Similarly,
>> if you lose a
>>   tablespace (file deletion, disk failure, etc) the main database may
>> become unreadable
>>   or fail to start.
>>

> While providing additional warnings is good and necessary it may also help
> to be more descriptive as to in what situations tablespaces are appropriate
> and/or necessary so that people leave with a better understanding of why the
> feature exists and not just trying to know what not to use it for.  It goes
> against the more prescriptive tone of the documentation generally but both
> approaches work well together to tackle the knowledge/understanding gap some
> users seem to have.

The warning would appear on this page:

  http://www.postgresql.org/docs/current/static/manage-ag-tablespaces.html

which describes what tablespaces *can* do, but unless you're familiar with the
structure of the PostgreSQL data directories, it's not obvious what you *can't*
do. I recall reading a blog post a while back about tablespaces being "archived"
to the cloud with disastrous results, and a quick search pulls up
stuff like this:

  http://stackoverflow.com/questions/3534415/moving-postgres-tablespaces-and-tables-across-ec2-instance

so it's definitely not a niche issue. Something "official" to link to
would be very
useful in this kind of situation. That doesn't preclude the general
documentation
being improved of course.

Regards

Ian Barwick


Re: PATCH: Warn users about tablespace abuse data loss risk

От
Ian Lawrence Barwick
Дата:
2014-02-12 15:16 GMT+09:00 Ian Lawrence Barwick <barwick@gmail.com>:
> 2014-02-12 14:06 GMT+09:00 David Johnston <polobo@yahoo.com>:
>> Ian Lawrence Barwick wrote
>>> 2014-02-12 12:52 GMT+09:00 Craig Ringer <
>>
>>> craig@
>>
>>> >:
>>>> Hi all
>>>>
>>>> I've just seen another case of data loss due to misuse of /
>>>> misunderstanding of tablespaces:
>>>>
>>>> http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres
>>>>
>>>> and it's prompted me to write some docs amendments to make it more
>>>> obvious that *you shouldn't do that*.
>>>>
>>>> Not that it'll stop people, but it'll at least mean they can't say we
>>>> didn't warn them.
>>>>
>>>> This is actually quite important, because many users are used to MySQL's
>>>> MyISAM, where each table contains its own metadata and is readable by
>>>> simply copying the table into a different MySQL install's data
>>>> directory. It doesn't even have to be the same version! Users are
>>>> clearly surprised that PostgreSQL tablespaces don't have the same
>>>> properties.
>>>>
>>>> Thoughts?
>>>
>>> People still use MyISAM!?
>>>
>>> I had a similar issue pop up at work a while back, having something
>>> explicit to point to is definitely a good idea.
>>>
>>> Suggestion for the first paragraph of the patch (sorry I can't provide it
>>> in
>>> patch form right now):
>>>
>>>   Even if they are located outside the main PostgreSQL data directory,
>>> tablespaces
>>>   are an integral part of the database cluster and
>>> <emphasis>
>>> cannot
>>> </emphasis>
>>>  be
>>>   treated as an autonomous collection of data files. They rely on
>>> metadata contained
>>>   in the main data directory, without which they are useless. In
>>> particular, tablespaces
>>>   cannot be reattached to a different database cluster, and backing up
>>> individual
>>>   tablespaces makes no sense as a backup/redundancy method. Similarly,
>>> if you lose a
>>>   tablespace (file deletion, disk failure, etc) the main database may
>>> become unreadable
>>>   or fail to start.
>>>
>
>> While providing additional warnings is good and necessary it may also help
>> to be more descriptive as to in what situations tablespaces are appropriate
>> and/or necessary so that people leave with a better understanding of why the
>> feature exists and not just trying to know what not to use it for.  It goes
>> against the more prescriptive tone of the documentation generally but both
>> approaches work well together to tackle the knowledge/understanding gap some
>> users seem to have.
>
> The warning would appear on this page:
>
>   http://www.postgresql.org/docs/current/static/manage-ag-tablespaces.html
>
> which describes what tablespaces *can* do, but unless you're familiar with the
> structure of the PostgreSQL data directories, it's not obvious what you *can't*
> do. I recall reading a blog post a while back about tablespaces being "archived"
> to the cloud with disastrous results, and a quick search pulls up
> stuff like this:
>
>   http://stackoverflow.com/questions/3534415/moving-postgres-tablespaces-and-tables-across-ec2-instance
>
> so it's definitely not a niche issue. Something "official" to link to
> would be very useful in this kind of situation. That doesn't preclude the general
> documentation being improved of course.

And taking a look at the page in question I see this prominent example:

  CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';

As '/mnt' is usually a temporary mount point for detachable media and
all that [*]
maybe it's not the best impression to give for a suitable tablespace location.

[*] http://www.pathname.com/fhs/pub/fhs-2.3.html#MNTMOUNTPOINTFORATEMPORARILYMOUNT
"This directory is provided so that the system administrator may
temporarily mount a
filesystem as needed. "


Ian Barwick


Re: PATCH: Warn users about tablespace abuse data loss risk

От
Craig Ringer
Дата:
On 02/12/2014 02:35 PM, Ian Lawrence Barwick wrote:
> And taking a look at the page in question I see this prominent example:
>
>   CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';

Here's an updated patch, adopting your wording for the first par and
tweaking the example.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: PATCH: Warn users about tablespace abuse data loss risk

От
Bruce Momjian
Дата:
On Thu, Feb 13, 2014 at 08:15:59AM +0800, Craig Ringer wrote:
> On 02/12/2014 02:35 PM, Ian Lawrence Barwick wrote:
> > And taking a look at the page in question I see this prominent example:
> >
> >   CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';
>
> Here's an updated patch, adopting your wording for the first par and
> tweaking the example.

I have applied an adjusted version of this patch that is more compact.
Thanks.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Вложения