Обсуждение: Question on Fragmentations

Поиск
Список
Период
Сортировка

Question on Fragmentations

От
"Moiz Kothari"
Дата:
Hi All,

What are the reasons of data getting fragmented in postgres? Do we have any page which explains different scenarios of data getting fragmented?

Regards,
Moiz Kothari

Re: Question on Fragmentations

От
Peter Eisentraut
Дата:
Moiz Kothari wrote:
> What are the reasons of data getting fragmented in postgres?

What do you mean by that?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Question on Fragmentations

От
"Joshua D. Drake"
Дата:
Peter Eisentraut wrote:
> Moiz Kothari wrote:
>> What are the reasons of data getting fragmented in postgres?
>
> What do you mean by that?
>

I am guessing he means filesystem fragmentation.

Joshua D. Drake

--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Question on Fragmentations

От
Peter Koczan
Дата:
Moiz Kothari wrote:
> Hi All,
>
> What are the reasons of data getting fragmented in postgres? Do we
> have any page which explains different scenarios of data getting
> fragmented?
>
> Regards,
> Moiz Kothari
I guess there are two types of fragmentation in play with postgres,
internal database fragmentation and external filesystem fragmentation.

Internal fragmentation is caused by holes in disk blocks when records
are deleted, or a record is *just* a little too big that it can't fit in
a remaining empty space in a particular disk block. This can be fixed
(or at least reduced quite a bit) by using VACUUM/VACUUM ANALYZE
(especially using the FULL option) in psql or the command vacuumdb (and
its myriad options), as is a very helpful, oft-suggested strategy by
people on these lists, for reasons other than just defragmentation.

External fragmentation occurs in postgres for the same reason it occurs
in other types of files. Lots of appends and generally continual file
growth mean that the filesystem is less likely to find adjacent disk
blocks (which is the source of fragmentation). Even though Unix file
systems generally do a good job of preventing fragmentation, some
workloads are just not good. For instance, large mail spools (where the
policy is one file per spool instead of one file per message) are prone
to fragmentation.

Case in point, I use xfs as the filesystem running under postgres, and
after a few days the "major" database clusters showed ~90% fragmentation
on their respective partitions (which is about a 10 to 1 ratio of file
fragments to files). After running a defragmenter, the fragmentation
went under 5% (which is under 1.2 fragments per file), and performance
generally doubled. Other, more application specific databases which
still get hit and appended quite often, are often at no more than 3%-5%
fragmentation.

I poked around and sadly couldn't find any references to fragmentation
in the official docs, but I found a couple references to the psql
command CLUSTER in other archived lists. This may help fix
fragmentation, though I haven't tried it out myself to know if it works
(or what implications there are for performance in general).

I've also read accounts of people with *HUGE* databases with almost no
fragmentation, so yet another possibility in the fight against
fragmentation may be to tweak your filesystem or postgres configs. If
anyone has any specific parameters to try or experience doing this, I
would consider it more than welcome, because I do not have that knowledge.

Sorry for the relative verbosity.

Peter

Re: Question on Fragmentations

От
Michael Monnerie
Дата:
On Freitag, 9. Februar 2007 04:08 Peter Koczan wrote:
> Case in point, I use xfs as the filesystem running under postgres,
> and after a few days the "major" database clusters showed ~90%
> fragmentation on their respective partitions (which is about a 10 to
> 1 ratio of file fragments to files). After running a defragmenter

Does xfs have such stats, and defragmenter included? It could be a good
idea for me to use that, then. Currently I use reiserfs.

mfg zmi
--
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0676/846 914 666                      .network.your.ideas.
// PGP Key:        "curl -s http://zmi.at/zmi4.asc | gpg --import"
// Fingerprint: EA39 8918 EDFF 0A68 ACFB  11B7 BA2D 060F 1C6F E6B0
// Keyserver: www.keyserver.net                   Key-ID: 1C6FE6B0

Вложения

Re: Question on Fragmentations

От
Peter Koczan
Дата:
Michael Monnerie wrote:
> On Freitag, 9. Februar 2007 04:08 Peter Koczan wrote:
>
>> Case in point, I use xfs as the filesystem running under postgres,
>> and after a few days the "major" database clusters showed ~90%
>> fragmentation on their respective partitions (which is about a 10 to
>> 1 ratio of file fragments to files). After running a defragmenter
>>
>
> Does xfs have such stats, and defragmenter included? It could be a good
> idea for me to use that, then. Currently I use reiserfs.
>
> mfg zmi
>
xfs comes with It does have built-in, xfs-approved utilities for stats
and defragmenting built-in.

xfs_db gives stats (for fragmentation use xfs_db -c frag -r /dev/XXX).
This works even if the filesystem is mounted and active, but I believe
that old stats are cached until said filesystem is remounted or until
some stat collection process runs.

xfs_fsr is the defragmenter (simply use xfs_fsr /dev/XXX). It's safe to
run this on an active filesystem/database partition, because it throws
away the fragmented data if files are changed. So, for full
defragmentation, you'll either want to run it offline, unmounted, or
during idle times.

Peter