Discussion: PostgreSQL as a filesystem
This isn’t a high-priority question.
I’m developing a hobby OS and I’m looking into file systems. I’ve thought about writing my own, and that appeals, but I’m also very interested in the database-as-a-filesystem paradigm. It would be nice to not have to write all of the stuff that goes into the DBMS (e.g. parsers, query schedulers, etc) myself.
So I was wondering what sort of filesystem requirements Postgre has. For example, could I write a simple interface layer that just requests blocks from the physical device and translates those into byte sets, or does the DB actually require multiple files mapped by a larger file system that maintains names, etc.?
I guess my real question is how much file system support is really required by the DBMS’s disk routines. Please reply to nadiasvertex@gmail.com since I’m not subscribed to this list. Thanks in advance!
-={C}=-
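For readers following the thread, the "simple interface layer" asked about above can be sketched roughly as follows. This is only an illustration of translating byte ranges into whole-block reads and writes; `BlockDevice`, `read_bytes`, and `write_bytes` are hypothetical names, the device is an in-memory stand-in, and nothing here reflects how PostgreSQL itself does I/O.

```python
class BlockDevice:
    """Hypothetical in-memory stand-in for a device with fixed-size blocks."""

    def __init__(self, block_count, block_size=512):
        self.block_count = block_count
        self.block_size = block_size
        self._blocks = [bytes(block_size) for _ in range(block_count)]

    def read_block(self, index):
        return self._blocks[index]

    def write_block(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("a block write must be exactly one block long")
        self._blocks[index] = bytes(data)


def _check_range(dev, offset, length):
    if offset < 0 or length < 0 or offset + length > dev.block_count * dev.block_size:
        raise ValueError("byte range lies outside the device")


def read_bytes(dev, offset, length):
    """Read every block the byte range touches, then slice out the range."""
    _check_range(dev, offset, length)
    if length == 0:
        return b""
    first = offset // dev.block_size
    last = (offset + length - 1) // dev.block_size
    raw = b"".join(dev.read_block(i) for i in range(first, last + 1))
    start = offset - first * dev.block_size
    return raw[start:start + length]


def write_bytes(dev, offset, data):
    """Read-modify-write each block the byte range touches."""
    _check_range(dev, offset, len(data))
    pos = 0
    while pos < len(data):
        index, within = divmod(offset + pos, dev.block_size)
        chunk = data[pos:pos + dev.block_size - within]
        block = bytearray(dev.read_block(index))
        block[within:within + len(chunk)] = chunk
        dev.write_block(index, bytes(block))
        pos += len(chunk)
```

A layer like this knows nothing about names, directories, or growing files, which is the part the replies below say a DBMS expects from the file system underneath it.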
"Christopher Nelson" <paradox@BBHC.ORG> writes: > I'm developing a hobby OS and I'm looking into file systems. I've > thought about writing my own, and that appeals, but I'm also very > interested in the database-as-a-filesystem paradigm. It would be nice > to not have to write all of the stuff that goes into the DBMS (e.g. > parsers, query schedulers, etc) myself. > So I was wondering what sort of filesystem requirements Postgre has. There are DB's you could use for this, but Postgres (not "Postgre", please, there is no such animal) isn't one of them :-(. We really assume we are sitting on top of a full-spec file system --- we want space management for variable-size files, robust storage of directory information, etc. Also, the things you typically expect to do with a filesystem, such as drop many-megabytes files into it without blinking, don't match up very well with the stuff that's fast in Postgres. Bottom line is that it'd probably be doable, but it'd be a pain and probably not perform real well... regards, tom lane
Sorry for the misnomer. :-D Thanks for answering my question so quickly!
>> "Christopher Nelson" <paradox@BBHC.ORG> writes:
>>> I'm developing a hobby OS and I'm looking into file systems. I've
>>> thought about writing my own, and that appeals, but I'm also very
>>> interested in the database-as-a-filesystem paradigm. It would be

You may want to take a look at SQLite. It would probably need some tweaking to make it as reliable as you like, but it is public domain. You could also look at Sleepycat's Berkeley DB.

Sincerely,

Joshua D. Drake
Command Prompt, Inc.
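To make the SQLite suggestion concrete, here is a minimal sketch of a path-to-contents store using Python's standard `sqlite3` module. The table layout and the helper names (`open_store`, `write_file`, `read_file`, `list_dir`) are assumptions made up for illustration, not anything proposed in the thread, and a real filesystem would need far more (permissions, partial writes, large files).

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a database holding one row per file."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " path TEXT PRIMARY KEY,"
        " data BLOB NOT NULL)"
    )
    return conn


def write_file(conn, path, data):
    """Create or overwrite a file inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR REPLACE INTO files (path, data) VALUES (?, ?)",
            (path, data),
        )


def read_file(conn, path):
    row = conn.execute(
        "SELECT data FROM files WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        raise FileNotFoundError(path)
    return row[0]


def list_dir(conn, prefix):
    """Return all stored paths that start with `prefix`, sorted."""
    rows = conn.execute(
        "SELECT path FROM files WHERE substr(path, 1, ?) = ? ORDER BY path",
        (len(prefix), prefix),
    )
    return [row[0] for row in rows]
```

Storing whole files as single BLOBs keeps the sketch short, but it also runs into the concern raised earlier in the thread: dropping many-megabyte files into a database is not what databases are tuned for.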
On Monday 18 April 2005 01:42 pm, Christopher Nelson wrote:
> I guess my real question is how much file system support is really
> required by the DBMS's disk routines.

You might be interested in the following site. It is a Python DBAPI driver that uses the file system as a database.

http://fssdb.sourceforge.net/

--
Adrian Klaver
aklaver@comcast.net
On 4/18/05, Christopher Nelson <paradox@bbhc.org> wrote:
> This isn't a high-priority question.

And if I can latch on to this non-priority question with another in a similar vein: what sort of RDBMS do huge transactional systems like Tandy's use? I've read that on those systems everything is a database, similar to the Unix paradigm "everything is a file".

Just curious,
aaron.glenn
On Mon, 2005-04-18 at 17:18 -0400, Tom Lane wrote:
> We really
> assume we are sitting on top of a full-spec file system --- we want
> space management for variable-size files, robust storage of directory
> information, etc.

I've been thinking about it, too. I think no filesystem out there is really optimized for a steady write load with many fsyncs, that is, really transaction-oriented on the data side (journalled ones may implement real transactions for metadata, but only for that). Out of curiosity, do you have any feedback from filesystem people? Are they interested in optimizing for the kind of workload (especially on write) a database generates? I ask because it seems to me it's a corner case to them, or even a degenerate one. I'm not aware of _any_ comparative benchmark among different filesystems that is based on a write+fsync load, for one.

Using a DB as a filesystem at the OS level is a different matter, of course.

Christopher, you may want to have a look at FUSE.
http://fuse.sourceforge.net/

It may help both in developing a new filesystem and in understanding how one works under Linux (with a nice separation of userspace and kernelspace). I think you could even write one based on PostgreSQL, but it won't help much, since PostgreSQL needs a filesystem to work. But if your OS has TCP/IP, it could be interesting anyway.

Note that I'm not aware of any way to access PostgreSQL other than sockets, so you need those at least. There's no standalone library you can link to in order to access database files, AFAIK.

.TM.
--
Marco Colombo
Technical Manager
ESI s.r.l.
Colombo@ESI.it
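As a rough idea of what a write+fsync load looks like in practice, the sketch below times small appends that are each followed by an `fsync`. The function name `fsync_write_rate` and its default parameters are assumptions chosen for illustration; it is a toy measurement, not a real filesystem benchmark, and the numbers depend heavily on the hardware and mount options.

```python
import os
import tempfile
import time


def fsync_write_rate(directory, writes=200, size=8192):
    """Return synced writes per second for small appends in `directory`."""
    payload = b"\0" * size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)  # append one small record
            os.fsync(fd)           # force it to stable storage before continuing
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)            # remove the scratch file again
    return writes / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(f"{fsync_write_rate(scratch):.0f} synced writes per second")
```

Running the same function against directories on different filesystems would give a crude version of the comparison described above.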
> I've been thinking of it, too. I think no filesystem out there is really
> optimized for a steady write load with many fsyncs, that is, is really
> transaction-oriented on the data side (journalled ones may implement
> real transactions for meta-data, but only for it). Out of curiosity,
> do you have any feedback from filesystem people, are they interested in
> optimizing for the kind of workload (especially on write) a database
> generates? I ask for it seems to me it's a corner case to them, or even
> a degenerated one. I'm not aware of _any_ comparative benchmark among
> different filesystems that is based on write+fsync load, for one.

I don't know of any filesystem people who have a desire to explicitly support that sort of traffic. I have looked at the internals of systems like BFS, and those journaled systems support transactions for all data... not just metadata. For example, on BFS there is an area where all data is journaled; then, once it has been verified that the data journaling is done, the log is rolled forward.

XFS has an interesting alternative. They journal only metadata, but no file data is overwritten until the transaction succeeds. So what they do is write the transaction metadata, allocate new storage for the block, write the block, copy the extents map with the new block, commit the new extents map, and then commit the metadata. So during all parts of the process, up until the final commit of the metadata, two copies of everything exist for that context.

> Using a DB as filesystem at OS level is a different matter, of course.

Which is what I'm trying to accomplish.

> Christopher, you may have a look at FUSE.
> http://fuse.sourceforge.net/

Thanks for the link. It's not exactly what I'm looking for, since I'm using the spoon microkernel and the file system is going to be a user space agent in any case. But the information is interesting.

> It may help in both developing a new filesystem and in understanding
> how it works under Linux (with a nice separation of userspace and
> kernelspace). I think you could even write one based on PostgreSQL,
> but it won't help much, since PostgreSQL needs a filesystem to work.
> But if your OS has TCP/IP, it could be interesting anyway.
>
> Note that I'm not aware of any other way to access PostgreSQL than
> sockets, so you need those at least. There's no standalone library
> you can link to in order to access database files, AFAIK.

Hmm. So it would be a LOT of work to use it. Obviously I wouldn't be using sockets, but I would be using an IPC primitive similar to sockets. It would be relatively simple to create a basic filesystem abstraction that kept track of large blocks of data, and nothing else, and then mount the database layer on top of that.

I suppose it would make more sense to have both raw data streams and associated relational object data: streams for data performance, and the relational data for information about the stream.

-={C}=-
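The split suggested at the end of this message, raw streams for the data and relational records for information about them, could look roughly like the sketch below. Everything in it is hypothetical: `ExtentStore` stands in for the "basic filesystem abstraction that kept track of large blocks of data", `Catalog` for the database layer, and SQLite (via Python's standard `sqlite3` module) is used only because it is easy to run, not because the thread settled on it.

```python
import sqlite3


class ExtentStore:
    """Hypothetical flat store that only understands (start, size) extents."""

    def __init__(self):
        self._space = bytearray()

    def append(self, data):
        """Write data at the end of the store and return its extent."""
        start = len(self._space)
        self._space += data
        return start, len(data)

    def read(self, start, size):
        return bytes(self._space[start:start + size])


class Catalog:
    """Relational metadata describing where each named stream lives."""

    def __init__(self, store):
        self.store = store
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE streams ("
            " name TEXT PRIMARY KEY,"
            " start INTEGER NOT NULL,"
            " size INTEGER NOT NULL,"
            " kind TEXT NOT NULL)"
        )

    def put(self, name, data, kind):
        # The bytes go to the raw store; only the extent is recorded here.
        # Replacing a stream leaves its old extent unreclaimed in this sketch.
        start, size = self.store.append(data)
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO streams (name, start, size, kind)"
                " VALUES (?, ?, ?, ?)",
                (name, start, size, kind),
            )

    def get(self, name):
        row = self.db.execute(
            "SELECT start, size FROM streams WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(name)
        return self.store.read(*row)

    def find(self, kind):
        """Query the metadata without touching any stream data."""
        rows = self.db.execute(
            "SELECT name FROM streams WHERE kind = ? ORDER BY name", (kind,)
        )
        return [row[0] for row in rows]
```

The point of the split is that queries such as `find` never read stream contents, while `get` reads the stream as one contiguous extent without going through the database's storage format.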