Re: large dataset with write vs read clients

From Craig Ringer
Subject Re: large dataset with write vs read clients
Date
Msg-id 4CB16080.1050406@postnewspapers.com.au
In reply to Re: large dataset with write vs read clients (Mladen Gogala <mladen.gogala@vmsinfo.com>)
Responses Re: large dataset with write vs read clients
List pgsql-performance
On 10/10/2010 5:35 AM, Mladen Gogala wrote:
> I have a logical problem with asynchronous commit. The "commit" command
> should instruct the database to make the outcome of the transaction
> permanent. The application should wait to see whether the commit was
> successful or not. Asynchronous behavior in the commit statement breaks
> the ACID rules and should not be used in a RDBMS system. If you don't
> need ACID, you may not need RDBMS at all. You may try with MongoDB.
> MongoDB is web scale: http://www.youtube.com/watch?v=b2F-DItXtZs

That argument makes little sense to me.

Because you can afford a clearly defined and bounded loosening of the
durability guarantee provided by the database - you know and accept that
you might lose the last x seconds of work if your OS crashes or your UPS
fails - that means you don't really need durability guarantees at all?
Let alone all that atomic commit silliness, transaction isolation, or
the guarantee of a consistent on-disk state?
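
For what it's worth, asynchronous commit in PostgreSQL isn't
all-or-nothing: it can be enabled per session or even per transaction,
so only the high-volume insert path gives up the flush-before-acknowledge
guarantee while everything else stays fully durable. A minimal sketch
(the table and column names are made up for illustration):

```sql
BEGIN;
-- Applies to this transaction only; other sessions still flush
-- WAL to disk before their COMMIT returns.
SET LOCAL synchronous_commit TO OFF;
INSERT INTO sensor_log (ts, reading) VALUES (now(), 42);
COMMIT;  -- returns before the WAL hits disk; a crash can lose this
         -- commit (a window of roughly 3 * wal_writer_delay), but
         -- never leaves the database in an inconsistent state.
```

Note that atomicity, consistency and isolation are untouched; only the
size of the durability window changes.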

Some of the other flavours of non-SQL databases, both those that've been
around forever (PICK/UniVerse/etc, Berkeley DB, Cache, etc) and those
that're new and fashionable (Cassandra, CouchDB, etc), provide some ACID
properties anyway. If you don't need or want an SQL interface to your
database, you don't have to throw out all that other database-y goodness
- so long as you haven't been drinking too much of the NoSQL kool-aid.

There *are* situations in which it's necessary to switch to distributed,
eventually-consistent databases with non-traditional approaches to data
management. It's awfully nice not to have to, though: going without an
RDBMS can force you to do a lot more wheel reinvention when it comes to
querying, analysing and reporting on your data.

FWIW, a common approach in this sort of situation has historically been
- accepting that RDBMSs aren't great at continuous fast loading of
individual records - to log the records in batches to a flat file,
Berkeley DB, etc as a staging point. You periodically rotate that file
out and bulk-load its contents into the RDBMS for analysis and
reporting. This doesn't have to be every hour - every minute is usually
pretty reasonable, and still gives your database a much easier time
without forcing you to modify your app to batch inserts into
transactions or anything like that.
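
A minimal sketch of that staging approach in Python (the class name and
record format are invented for illustration; it writes tab-separated
lines, a format PostgreSQL's COPY can bulk-load directly):

```python
import os
import time
import threading


class StagingLog:
    """Append incoming records to a flat staging file; rotate it
    periodically so the rotated file can be bulk-loaded (e.g. with
    PostgreSQL's COPY) while new writes go to a fresh file."""

    def __init__(self, path):
        self.path = path
        self.lock = threading.Lock()
        self.fh = open(path, "a", encoding="utf-8")

    def append(self, record):
        # One tab-separated line per record -- a COPY-friendly format.
        line = "\t".join(str(v) for v in record) + "\n"
        with self.lock:
            self.fh.write(line)

    def rotate(self):
        # Close the current file, rename it out of the way, and start
        # a fresh one. The renamed file is ready for bulk loading.
        rotated = "%s.%d" % (self.path, int(time.time()))
        with self.lock:
            self.fh.close()
            os.rename(self.path, rotated)
            self.fh = open(self.path, "a", encoding="utf-8")
        return rotated
```

A periodic job would then call rotate() and feed the returned file to
the database, e.g. COPY events FROM '/path/to/rotated/file'; the
application never blocks on the RDBMS at insert time.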

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/
