Обсуждение: Curious question about physical files to store database
As I have seen, some database created or pre-allocate large physical files on the file system to as the backend of the database tablespace.
For Postgresql, I have observed that it created several files in the base and global directory.
It may be by design, what is the pros and cons of this behavior?
Thanks and regards,
Patrick
Patrick
Patrick Dung wrote: > As I have seen, some database created or pre-allocate large physical files on the file system to as > the backend of the database tablespace. > > For Postgresql, I have observed that it created several files in the base and global directory. > > It may be by design, what is the pros and cons of this behavior? You are a bit unspecific; are you talking about Oracle? The disk layout is of course by design. Oracle uses large container files and keeps its data in those. As far as I know, this is to bypass file system functionality. Oracle usually recommends direct I/O and bypasses file system functionality (caching etc.) as much as possible. I guess one reason for this is that, historically, file system implementations incurred more overhead than they do now and had all sorts of other problems with larger amounts of data. These days, filesystems perform much better, so this is no longer necessary, but Oracle is quite old software. Another reason may be Oracle's quest to rule the world, and the storage layer is part of that. Lately, Oracle tries to get everybody to use ASM, their storage layer, which completely bypasses file system functionality. PostgreSQL, on the other hand, does not have the resources or intentions to write a better file system and actually uses file system capabilities like caching to improve performance. PostgreSQL keeps what Oracle calls segments as individual files in a directory structure. Yours, Laurenz Albe
On Monday, November 4, 2013 10:09 PM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
Patrick Dung wrote:
> As I have seen, some database created or pre-allocate large physical files on the file system to as
> the backend of the database tablespace.
>
> For Postgresql, I have observed that it created several files in the base and global directory.
>
> It may be by design, what is the pros and cons of this behavior?
You are a bit unspecific; are you talking about Oracle?
The disk layout is of course by design.
Oracle uses large container files and keeps its data in those.
As far as I know, this is to bypass file system functionality.
Oracle usually recommends direct I/O and bypasses file system
functionality (caching etc.) as much as possible.
I guess one reason for this is that, historically, file system
implementations incurred more overhead than they do now and had
all sorts of other problems with larger amounts of data.
These days, filesystems perform much better, so this is no longer
necessary, but Oracle is quite old software.
Another reason may be Oracle's quest to rule the world, and the
storage layer is part of that. Lately, Oracle tries to get everybody
to use ASM, their storage layer, which completely bypasses
file system functionality.
PostgreSQL, on the other hand, does not have the resources or
intentions to write a better file system and actually uses
file system capabilities like caching to improve performance.
PostgreSQL keeps what Oracle calls segments as individual files
in a directory structure.
Yours,
Laurenz Albe
--------------------------------------------------------------------------------------
I have seen some databases product that allocate small number of large files.
Please correct me if I am wrong:
MSSQL (one file is the data and another file for the transaction log)
MySQL with InnoDB
Oracle
DB2
Thanks and regards,
Patrick
> As I have seen, some database created or pre-allocate large physical files on the file system to as
> the backend of the database tablespace.
>
> For Postgresql, I have observed that it created several files in the base and global directory.
>
> It may be by design, what is the pros and cons of this behavior?
You are a bit unspecific; are you talking about Oracle?
The disk layout is of course by design.
Oracle uses large container files and keeps its data in those.
As far as I know, this is to bypass file system functionality.
Oracle usually recommends direct I/O and bypasses file system
functionality (caching etc.) as much as possible.
I guess one reason for this is that, historically, file system
implementations incurred more overhead than they do now and had
all sorts of other problems with larger amounts of data.
These days, filesystems perform much better, so this is no longer
necessary, but Oracle is quite old software.
Another reason may be Oracle's quest to rule the world, and the
storage layer is part of that. Lately, Oracle tries to get everybody
to use ASM, their storage layer, which completely bypasses
file system functionality.
PostgreSQL, on the other hand, does not have the resources or
intentions to write a better file system and actually uses
file system capabilities like caching to improve performance.
PostgreSQL keeps what Oracle calls segments as individual files
in a directory structure.
Yours,
Laurenz Albe
--------------------------------------------------------------------------------------
I have seen some databases product that allocate small number of large files.
Please correct me if I am wrong:
MSSQL (one file is the data and another file for the transaction log)
MySQL with InnoDB
Oracle
DB2
Thanks and regards,
Patrick
Patrick Dung wrote: > I have seen some databases product that allocate small number of large files. > > Please correct me if I am wrong: > > MSSQL (one file is the data and another file for the transaction log) > MySQL with InnoDB > Oracle > DB2 I don't know enough about DB2 and MSSQL, but you are correct with regard to InnoDB and Oracle. Yours, Laurenz Albe
Em 05/11/2013 12:51, Albe Laurenz escreveu: > Patrick Dung wrote: >> I have seen some databases product that allocate small number of large files. >> >> Please correct me if I am wrong: >> >> MSSQL (one file is the data and another file for the transaction log) >> MySQL with InnoDB >> Oracle >> DB2 > I don't know enough about DB2 and MSSQL, but you are correct > with regard to InnoDB and Oracle. > > Yours, > Laurenz Albe > MSSQL can use several data files, but in general, you will have one for database data, and one for database log. Advanced administrator would choose to split database among several files (specially in sparse disks). For instance, you would have one data file in one SCSI disk for data, and one data file in another SCSI disk for indexes, and a third SCSI disk for log. You can have several (hundreds? thousands? I can't remember) data files. regards, Edson Richter
On Tue, Nov 05, 2013 at 10:42:36PM +0800, Patrick Dung wrote: > I have seen some databases product that allocate small number of large files. > > Please correct me if I am wrong: > > MySQL with InnoDB Actually, InnoDB has a file-per-table mode which I tend to prefer. It means that when I drop a partition I actually see the disk usage drop. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer