Re: getting the most of out multi-core systems for repeated complex SELECT statements

From Greg Smith
Subject Re: getting the most of out multi-core systems for repeated complex SELECT statements
Msg-id 4D4B79D8.9040402@2ndquadrant.com
In response to Re: getting the most of out multi-core systems for repeated complex SELECT statements  (Andy Colson <andy@squeakycode.net>)
Responses Re: getting the most of out multi-core systems for repeated complex SELECT statements  (Scott Marlowe <scott.marlowe@gmail.com>)
Re: getting the most of out multi-core systems for repeated complex SELECT statements  (Andy Colson <andy@squeakycode.net>)
List pgsql-performance
Andy Colson wrote:
> Cpu's wont get faster, but HD's and SSD's will.  To have one database
> connection, which runs one query, run fast, it's going to need
> multi-core support.

My point was that situations where people need to run one query on one
database connection, and that query isn't in fact limited by disk I/O,
are far less common than people think.  My troublesome database servers
aren't the ones with a single CPU maxed out and wishing for more
workers; they're the ones spending >25% of their time waiting for I/O.
And even that crowd is still a subset, distinct from the people who
don't care about the speed of any one core because they need lots of
connections to run at once.
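
To see which of those groups a server falls into, a rough sketch along
these lines can help.  It assumes a Linux box where the first line of
/proc/stat is the aggregate "cpu" counter line; the function names are
made up for illustration, and tools like vmstat or iostat report the
same figure.

```python
import time


def parse_cpu_line(line):
    """Return the first eight counters (user .. steal) of a /proc/stat cpu line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Order: user nice system idle iowait irq softirq steal.
    # guest/guest_nice are skipped because they are already counted in user/nice.
    return [int(value) for value in fields[1:9]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two cpu lines."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read the cpu line twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = f.readline()
    time.sleep(interval)
    with open(path) as f:
        after = f.readline()
    return iowait_percent(before, after)
```

Calling sample_iowait(5.0) a few times while the slow query runs gives
a first impression of whether the box is waiting on disk or on CPU.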


> That's not to say we need "parallel query's".  Or we need multiple
> backends to work on one query.  We need one backend, working on one
> query, using mostly the same architecture, to just use more than one
> core.

That's exactly what we mean when we say "parallel query" in the context
of a single server.

> My point is, there must be levels of threading, yes?  If a backend has
> data to sort, has it collected, nothing locked, what would it hurt to
> use multi-core sorting?

The plan nodes the optimizer produces don't run that way.  The executor
"pulls" rows out of the top of the node tree, and each node in turn
pulls from its children.  If you just blindly ran off and executed
every individual node to completion in parallel, that's not always
going to be faster.  It could be a lot slower if the original query
never even needed to execute portions of the tree.
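
As a toy illustration of the pull model (this is not PostgreSQL code;
the node names and the `stats` counter are invented for the example),
Python generators behave much like a demand-driven node tree.  The
sketch compares pulling three rows through a LIMIT with running the
scan to completion first.

```python
def seq_scan(table, stats):
    """Leaf node: yields rows one at a time and counts how many it produced."""
    for row in table:
        stats["scanned"] += 1
        yield row


def filter_node(child, predicate):
    """Pulls from its child and passes along only the matching rows."""
    for row in child:
        if predicate(row):
            yield row


def limit_node(child, n):
    """Stops pulling from its child as soon as n rows have been returned."""
    if n <= 0:
        return
    for count, row in enumerate(child, 1):
        yield row
        if count >= n:
            return


def is_even(row):
    return row % 2 == 0


def run_pull(table, n):
    """Demand-driven: the top node pulls, so the scan stops early."""
    stats = {"scanned": 0}
    plan = limit_node(filter_node(seq_scan(table, stats), is_even), n)
    return list(plan), stats["scanned"]


def run_eager(table, n):
    """Every node runs to completion before its parent starts."""
    stats = {"scanned": 0}
    scanned = list(seq_scan(table, stats))
    filtered = [row for row in scanned if is_even(row)]
    return filtered[:n], stats["scanned"]
```

With a 100,000 row table and a limit of 3, run_pull touches 5 rows
while run_eager touches all 100,000 to return the same answer.  Running
the eager version on more cores does that extra work faster, but it is
still extra work.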

When you start dealing with all of the types of nodes that are out there
it gets very messy in a hurry.  Decomposing the nodes of the query tree
into steps that can be executed in parallel usefully is the hard problem
hiding behind the simple idea of "use all the cores!"
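
Even the sort case, which looks like the easy one, has to be broken
into pieces that can run independently plus a step that puts them back
together.  A minimal sketch of that decomposition, assuming the rows
are already collected in memory and are picklable; the function names
are hypothetical and this has nothing to do with how the PostgreSQL
sort code is written:

```python
import heapq
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split_into_chunks(rows, parts):
    """Split rows into at most `parts` contiguous chunks of similar size."""
    size = max(1, -(-len(rows) // parts))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def parallel_sort(rows, workers=4, executor_cls=ProcessPoolExecutor):
    """Sort each chunk in its own worker, then merge the sorted runs."""
    chunks = split_into_chunks(rows, workers)
    if not chunks:
        return []
    with executor_cls(max_workers=workers) as pool:
        # Step 1: independent work, one sorted run per chunk.
        sorted_runs = list(pool.map(sorted, chunks))
    # Step 2: a single-threaded k-way merge of the runs.
    return list(heapq.merge(*sorted_runs))
```

Passing executor_cls=ThreadPoolExecutor is handy for trying it without
starting processes, though threads won't speed up a CPU-bound sort in
CPython.  Note that the merge at the end runs on one core, and that
shipping rows to the workers and back has its own cost, so the speedup
is well short of the number of cores.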

> I thought I read a paper someplace that said shared cache (L1/L2/etc)
> multicore cpu's would start getting really slow at 16/32 cores, and
> that message passing was the way forward past that.  If PG started
> aiming for 128 core support right now, it should use some kinda
> message passing with queues thing, yes?

There already is a TupleStore type that is going to serve as the message
being sent between the client backends.  Unfortunately we won't get
anywhere near 128 cores without addressing the known scalability issues
that are in the code right now, ones you can easily run into even with 8
cores.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

