Обсуждение: How does Postgres store a B-Tree on disk while using the OS file system?

Поиск
Список
Период
Сортировка

How does Postgres store a B-Tree on disk while using the OS file system?

От
Siddharth Jain
Дата:
I am trying to sharpen my understanding of Postgres. As I understand, Postgres does not write directly to disk blocks. It uses the file system provided by the OS:
My question: How can it then store a B Tree on disk? I would think storing a B Tree requires storing disk offset addresses and so on (for a node to navigate to another etc.). For this, one would need to write directly to the disk using low-level disk access functions and not use file system API.

Re: How does Postgres store a B-Tree on disk while using the OS file system?

От
Ron
Дата:
On 3/6/23 18:24, Siddharth Jain wrote:
I am trying to sharpen my understanding of Postgres. As I understand, Postgres does not write directly to disk blocks. It uses the file system provided by the OS:
My question: How can it then store a B Tree on disk? I would think storing a B Tree requires storing disk offset addresses and so on (for a node to navigate to another etc.). For this, one would need to write directly to the disk using low-level disk access functions and not use file system API.

Disk offsets... file offsets... one is just a step up on the abstraction layer.

How can that be? The OS only knows about Logical Block Addresses, which are a huge list of block numbers (just like userland software sees files as long lists of bytes).  It hasn't see cylinders, heads and platters in almost FORTY (more, probably, for SCSI and mainframe systems) years.

--
Born in Arizona, moved to Babylonia.

Re: How does Postgres store a B-Tree on disk while using the OS file system?

От
Christophe Pettus
Дата:

> On Mar 6, 2023, at 16:24, Siddharth Jain <siddhsql@gmail.com> wrote:
> My question: How can it then store a B Tree on disk? I would think storing a B Tree requires storing disk offset
addressesand so on (for a node to navigate to another etc.). For this, one would need to write directly to the disk
usinglow-level disk access functions and not use file system API. 

All of PostgreSQL's relations (tables and indexes) are stored in files.  (They're actually stored in a set of files if
it'slarger than 1GB, but each relation is treated as one logical file.)  The "pointers" in this case are just offsets
fromthe start of that file. 

There's some additional information here:

    https://www.postgresql.org/docs/current/storage.html

and here:

    https://www.postgresql.org/docs/current/btree-implementation.html


Re: How does Postgres store a B-Tree on disk while using the OS file system?

От
Siddharth Jain
Дата:
thanks Christophe. will try to go through it. its bit difficult to grasp.

On Mon, Mar 6, 2023 at 5:08 PM Christophe Pettus <xof@thebuild.com> wrote:


> On Mar 6, 2023, at 16:24, Siddharth Jain <siddhsql@gmail.com> wrote:
> My question: How can it then store a B Tree on disk? I would think storing a B Tree requires storing disk offset addresses and so on (for a node to navigate to another etc.). For this, one would need to write directly to the disk using low-level disk access functions and not use file system API.

All of PostgreSQL's relations (tables and indexes) are stored in files.  (They're actually stored in a set of files if it's larger than 1GB, but each relation is treated as one logical file.)  The "pointers" in this case are just offsets from the start of that file.

There's some additional information here:

        https://www.postgresql.org/docs/current/storage.html

and here:

        https://www.postgresql.org/docs/current/btree-implementation.html