Re: Is there a cache consistent interface to tables ?

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: Is there a cache consistent interface to tables ?
Дата
Msg-id CAMsr+YGYuGEA0CPw89Hj2SAQoDG8B5SDfJ7=ue14=3GwLenFmA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Is there a cache consistent interface to tables ?  (Gary M <garym@oedata.com>)
Ответы Re: Is there a cache consistent interface to tables ?
Список pgsql-hackers
On 10 February 2018 at 14:26, Gary M <garym@oedata.com> wrote:
Thanks for the replies.  

@Craig

Realtime ? well not really. I guess it's your definition of realtime. I usually think of micro to nano seconds as real-time. If I were still designing chips, I'd be calling picosecs real-time these days.  

Hence "soft realtime".
 
"How sure are you that it's not viable for SQL queries?"
I'm topping off at 20K/s inserts due to locking and scheduling issues. There are peaks reaching 50K inserts, but not sustained. I also placed the login on a separate disk to prevent collisions. I have backing storage with 50us latency and 400mb/s bandwidth, so backing storage is not the issue.

Sure, but none of that has anything to do with SQL vs other access methods.

Where's your indication that you're tackling a bottleneck, not something unrelated?

I expect you're probably having more issues around the relation extension lock, wal lock etc than anything else. But you'll need to do some profiling and diagnostic work to determine that.

Otherwise this is all just a variant on premature optimisation.

"And, if not, what makes you think that a lower level interface will help you? "
Well, I was thinking sql parsing and planning is an unnecessary step that can be removed from the pipeline. Instead of parsing the data stream to another format and creating a query, I'll just have the stream parser(s) generate the table structure and write directly to the tables.

A variant of that is certainly practical - and not that hard.  Again, see heap_open, heap_form_tuple, etc. But beware index maintenance. 

You will NOT bypass various write bottlenecking locks this way.

 "Has profiling and tracing/timing shown that significant time/delays are arising from layers you can bypass in a sensible way?"
"Sensible" is why I'm posting the question here.  I'm not familiar enough with the code and processing pipelines to understand subtleties effecting high volume insert performance.  There is a lot of "stuff" going on, much of it event driven.  My first impulse is a pipeline scheduler taking advantage of processor affinity.  Its an ugly, brute force approach, but it does work.

Do you have any indication CPU contention is even the issue?

Establishing CPU affinity probably does make sense, as does optimising the number of executors for minimal contention / max throughput. But if your performance is actually limited by lock contention, I/O, memory throughput, cache churn, or whatever, it may not help much.
 
The final design will likely require some type of shared block storage.  I've always liked the Solaris ZFS LVM layer, although is not distributed.  Fire and forget is not an option here. I was hoping to leverage existing postgres facilities for that heavy lifting. That's why I had originally looked at the wal interface. 

My point is that you'll want something in front that you can use to buffer work during short DB outages / failovers / whatever. Unless you're OK with dropping trace data when the DB is inaccessible, or having it buffered sender-side.

I'm also looking at approaches from a project called "Bottled-Water" https://www.confluent.io/blog/bottled-water-real-time-integration-of-postgresql-and-kafka/  

Not sure how that's relevant / will help you. It's about getting a change stream *out* of PostgreSQL to drive Kafka.

pglogical3 will also have some features related to feeding changes to Kafka by the way.
 
One more fact I forgot to add.. The insert load into the database is about 2kb/record or about 200MB/s.  

So you're presumably rotating partitions or something; ageing things out.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Disabling src/test/[ssl|ldap] when not building with SSL/LDAPsupport
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: CALL stmt, ERROR: unrecognized node type: 113 bug