Re: Urgent: 10K or more connections

From Gianni Mariani
Subject Re: Urgent: 10K or more connections
Date
Msg-id 3F1974D9.2050108@mariani.ws
In reply to Re: Urgent: 10K or more connections  (Sean Chittenden <sean@chittenden.org>)
List pgsql-general
Sean Chittenden wrote:

>>>PostgreSQL will never be single proc, multi-threaded, and I don't
>>>think it should be for reliability's sake.  See my above post,
>>>however, as I think I may have a better way to handle "lots of
>>>connections" without using threads.  -sc
>>>
>>>
>>never is a VERY long time ...  Also, the single proc/multiple proc
>>thing does not have to be exclusive.  Meaning you could "tune" the
>>system so that it could do either.
>>
>>
>
>True.  This topic has come up a zillion times in the past though.  The
>memory segmentation and reliability that independent processes give
>you is huge and the biggest reason why _if_ PostgreSQL does
>spontaneously wedge itself (like MySQL does all too often), you're
>only having to cope with a single DB connection being corrupt,
>invalid, etc.  Imagine a threaded model where the process was horked
>and you lose 1000 connections' worth of data in a SEGV.  *shudder*
>Unix is reliable at the cost of memory segmentation... something that
>I dearly believe in.  If that weren't worth anything, then I'd run
>everything in kernel and avoid the context switching, which is pretty
>expensive.
>
>
Yep, but if you design it right, you can have both.  A rare occasion
where you can have the cake and eat it too.

>>I have developed a single process server that handled thousands of
>>connections.  I've also developed a single process database (a while
>>back) that handled multiple connections but I'm not sure I would do
>>it the "hard" way again as the cost of writing the code for keeping
>>context was not insignificant, although there are much better ways
>>of doing it than how I did it 15 years ago.
>>
>>
>
>Not saying it's not possible, just that at this point, reliability is
>more paramount than handling additional connections.  With copy on
>write VM's being abundant these days, a lot of the size that you see
>with PostgreSQL is shared.  Memory profiling and increasing the number
>of read only pages would be an extremely interesting exercise that
>could yield some slick results in terms of reducing the memory
>footprint of PG's children.
>
>
Context switching and cache thrashing are the killers in a multiple
process model.  There is a 6-10x performance penalty for running in
separate processes vs running in a single process (and single thread)
which I observed when doing benchmarking on a streaming server.  Perhaps
a better scheduler (like the O(1) scheduler in Linux 2.6.*) would improve
that, but I just don't know.

>>What you talk about is very fundamental and I would love to have
>>another go at it ....  however you're right that this won't happen
>>any time soon.  Connection pooling is a fundamentally flawed way of
>>overcoming this problem.  A different design could render a
>>significantly higher feasible connection count.
>>
>>
>
>Surprisingly, it's not that complex at least handling a large number
>of FDs and figuring out which ones have data on them and need to be
>passed to a backend.  I'm actually using the model for monitoring FDs
>from thttpd and reapplying bits where appropriate.  Its abstraction
>of kqueue()/poll()/select() is nice enough to not want to reinvent the
>wheel (same with its license).  Hopefully ripping through the incoming
>data and figuring out which backend pool to send a connection to won't
>be that bad, but I have next to no experience with writing that kind
>of code and my Stevens is hidden away in one of 23 boxes from a move
>earlier this month.  I only know that Apache 1.3 does this with
>obviously huge success on basically every *nix so it can't be too
>hard.
>
>
No epoll?
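
For illustration only, here is a minimal sketch of the "watch a lot of FDs and find the ones with data" step discussed above. It is written in Python rather than C and is neither thttpd's code nor anything from PostgreSQL. The standard `selectors.DefaultSelector` picks the best mechanism the platform offers (epoll on Linux, kqueue on the BSDs, otherwise poll or select). The names `ready_connections` and `demo` are hypothetical, and `socket.socketpair()` only stands in for real client connections.

```python
import selectors
import socket


def ready_connections(sel, timeout=0.1):
    """Return the labels of registered sockets that currently have data waiting."""
    return sorted(key.data for key, _events in sel.select(timeout))


def demo(n_connections=100, active=(3, 42, 77)):
    """Register many idle connections, make a few active, and find the active ones."""
    # DefaultSelector resolves to epoll, kqueue, poll or select, depending on the OS.
    sel = selectors.DefaultSelector()
    pairs = []
    try:
        for i in range(n_connections):
            # Stand-in for an accepted client connection (assumption for the demo).
            client, server = socket.socketpair()
            server.setblocking(False)
            sel.register(server, selectors.EVENT_READ, data=i)
            pairs.append((client, server))

        # Only a handful of clients actually send something.
        for i in active:
            pairs[i][0].sendall(b"query")

        ready = ready_connections(sel)
        # A real dispatcher would hand these off to a backend here;
        # the sketch just reads what arrived.
        payloads = {i: pairs[i][1].recv(64) for i in ready}
        return type(sel).__name__, ready, payloads
    finally:
        sel.close()
        for client, server in pairs:
            client.close()
            server.close()


if __name__ == "__main__":
    backend, ready, payloads = demo()
    print(backend, ready)
```

The point of the sketch is that the cost of finding the readable descriptors is handled by the kernel interface, so one process can sit on a large number of mostly idle connections and only touch the few that need attention.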



