Discussion: The ultimate extension hook.

The ultimate extension hook.

From
Daniel Wood
Date:
Hooks exist all over PG for extensions to cover various specific usages.

The hook I'd like to see would be in the PostgresMain() loop
for the API "firstchar" messages.

I started out just wanting the hook as the absolute minimum-overhead way to execute a function, even faster than fastpath, but in brainstorming with David Rowley other use cases became apparent:

API tracing within the engine.  I've heard of client tools for this.
API filtering.  Block/ignore manual checkpoints for instance.
API message altering.
Anything you want to hook into at the highest level well above ExecutorRun.

Originally I just wanted a lightweight mechanism to capture some system counters like /proc/stat without going through the SQL execution machinery.  I'm picky about implementing stuff in the absolute fastest way.  :-)  But I think there are other practical things that I haven't even thought of yet.

There are a few implementation mechanisms which achieve slightly different possibilities:

1) The generic mechanism would let one or more API filters be installed to directly call functions in an extension.  There would be no SQL arg processing overhead based on the specific function.  You'd just pass it the StringInfoData 'msg' itself.  Multiple extensions might use the hook, so you'd need to rewind the StringInfo buffer between calls.  The hook could return a boolean to indicate either no further processing of this message or a fall-through to the normal "switch (firstchar)" processing.

2) switch (firstchar) { case 'A': // New trivial API message for extensions
which would call a single extension installed function to do whatever I wanted based on the message payload.  And, yes, I know this can be done just using SQL.  It is simply a variation.  But this would require client support and I prefer the below.

3) case 'Q':  /* simple query */
if (pq_peekbyte() == '!' && APIHook != NULL) {
   (*APIHook)(msg);
  <return something>
  ...
  continue;
}
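
Mechanism (1) above could be sketched roughly as follows. This is a standalone illustration, not PostgreSQL code: MsgBuf is a stand-in for StringInfoData (only data, len and cursor are modeled), and api_filter_fn, register_api_filter, run_api_filters and the two example filters are hypothetical names that do not exist in the server. It only shows the rewind-between-filters and consume-or-fall-through behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for PostgreSQL's StringInfoData; only the fields used here. */
typedef struct MsgBuf
{
    const char *data;       /* message payload */
    int         len;        /* payload length in bytes */
    int         cursor;     /* read position, advanced by whoever parses it */
} MsgBuf;

/* Hypothetical filter signature: return true to consume the message. */
typedef bool (*api_filter_fn) (char firstchar, MsgBuf *msg);

#define MAX_API_FILTERS 8

static api_filter_fn api_filters[MAX_API_FILTERS];
static int  n_api_filters = 0;

/* Register a filter; returns false when the table is full. */
bool
register_api_filter(api_filter_fn fn)
{
    assert(fn != NULL);
    if (n_api_filters >= MAX_API_FILTERS)
        return false;
    api_filters[n_api_filters++] = fn;
    return true;
}

/*
 * Run every filter in registration order.  The cursor is rewound before
 * each call and again before falling through, so neither later filters
 * nor the normal "switch (firstchar)" code see a half-read buffer.
 */
bool
run_api_filters(char firstchar, MsgBuf *msg)
{
    int         saved_cursor = msg->cursor;

    for (int i = 0; i < n_api_filters; i++)
    {
        msg->cursor = saved_cursor;
        if (api_filters[i] (firstchar, msg))
            return true;    /* consumed: skip normal processing */
    }
    msg->cursor = saved_cursor;
    return false;           /* fall through to normal processing */
}

/* Example tracing filter: counts messages by type, never consumes. */
static int  trace_counts[256];

bool
trace_filter(char firstchar, MsgBuf *msg)
{
    trace_counts[(unsigned char) firstchar]++;
    msg->cursor = msg->len; /* simulate having read the whole payload */
    return false;
}

/* Example blocking filter: swallow a manual CHECKPOINT simple query. */
bool
block_checkpoint_filter(char firstchar, MsgBuf *msg)
{
    return firstchar == 'Q' && msg->len >= 10 &&
        strncmp(msg->data, "CHECKPOINT", 10) == 0;
}
```

In this sketch the main loop would call run_api_filters(firstchar, &msg) just before the switch and "continue" when it returns true. A real filter would also have to send some reply to the client for a consumed message, which is left out here.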

I've used this last technique, (3), to do things like:
if (!strncmp(query_string, "DIEDIEDIE", 9)) {
    /* deliberately dereference NULL to crash the backend */
    char *np = NULL;
    *np = 1;
} else if (!strncmp(query_string, "PING", 4)) {
    static const char *pong = "PONG";
    /* reply with a CommandComplete ('C') message carrying "PONG" */
    pq_putmessage('C', pong, strlen(pong) + 1);
    send_ready_for_query = true;
    continue;
} else if (...)

Then I can simply type PING into psql and get back a PONG.
Or during a stress test on a remote box I can execute the simple query "DIEDIEDIE" and crash the server.  I did this inline for experimentation before but it would be nice if I had the mechanism to use a "statement" to invoke a hook function in an extension.  A single check for "!" in the 'Q' processing would allow user defined commands in extensions.  The dispatcher would be in the extension.  I just need the "!" check.

Another example where ultimate performance might be a goal: if you are familiar with why redis/memcached/etc. exist, then imagine loading SQL results into a cache in an extension and executing as a 'simple' query something like:  !LOOKUP <KEY>
and getting the value faster than SQL could do.
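
A minimal sketch of what the extension-side dispatcher for such "!" commands might look like. Everything here is hypothetical and standalone: bang_dispatch, cache_lookup and the tiny static table are illustrative names and data, and the function just returns the reply text instead of sending a protocol message. A real extension would populate the cache from SQL results and reply through the backend's own message routines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory cache; contents are made up for illustration. */
typedef struct CacheEntry
{
    const char *key;
    const char *value;
} CacheEntry;

static const CacheEntry cache[] = {
    {"region", "us-west"},
    {"tier", "gold"},
};

/* Linear scan of the cache; returns NULL when the key is absent. */
const char *
cache_lookup(const char *key)
{
    assert(key != NULL);
    for (size_t i = 0; i < sizeof(cache) / sizeof(cache[0]); i++)
    {
        if (strcmp(cache[i].key, key) == 0)
            return cache[i].value;
    }
    return NULL;
}

/*
 * Dispatcher for simple-query text.  Returns the reply text for a
 * recognized "!" command, or NULL when the text does not start with '!'
 * or names an unknown command, in which case the caller would fall
 * through to normal SQL processing.
 */
const char *
bang_dispatch(const char *query_string)
{
    const char *cmd;

    assert(query_string != NULL);
    if (query_string[0] != '!')
        return NULL;
    cmd = query_string + 1;

    if (strcmp(cmd, "PING") == 0)
        return "PONG";
    if (strncmp(cmd, "LOOKUP ", 7) == 0)
    {
        const char *value = cache_lookup(cmd + 7);

        /* an empty reply stands for "key not found" in this sketch */
        return value != NULL ? value : "";
    }
    return NULL;
}
```

With this shape the server side stays at a single check for "!" in the 'Q' path, and all command parsing lives in the extension.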

Before I prototype I want to get some feedback.  Why not have a hook at the API level?

Re: The ultimate extension hook.

From
Tom Lane
Date:
Daniel Wood <hexexpert@comcast.net> writes:
> Hooks exist all over PG for extensions to cover various specific usages.
> The hook I'd like to see would be in the PostgresMain() loop
> for the API "firstchar" messages.

What, to invent your own protocol?  Where will you find client libraries
buying into that?

I'm not really convinced that any of the specific use-cases you suggest
are untenable to approach via the existing function fastpath mechanism,
anyway.

            regards, tom lane



Re: The ultimate extension hook.

From
David Rowley
Date:
On Thu, 24 Sep 2020 at 16:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Wood <hexexpert@comcast.net> writes:
> > Hooks exist all over PG for extensions to cover various specific usages.
> > The hook I'd like to see would be in the PostgresMain() loop
> > for the API "firstchar" messages.
>
> What, to invent your own protocol?  Where will you find client libraries
> buying into that?

Well, Dan did mention other use cases.  It's certainly questionable if
people wanted to use it to invent their own message types as they'd
need client support.  However, when it comes to newly proposed hooks,
I thought we should be asking ourselves questions like, are there
legitimate use cases for this?  Is it safe to expose this?  It seems a
bit backwards to consider illegitimate uses of a hook unless they
relate to security.

> I'm not really convinced that any of the specific use-cases you suggest
> are untenable to approach via the existing function fastpath mechanism,
> anyway.

I wondered if there was much in the way of use-cases like a traffic
filter, or statement replication. I wasn't sure if it was a solution
looking for a problem or not, but it seems like it could be productive
to talk about possibilities here and make a judgement call based on if
any alternatives exist today that will allow that problem to be solved
sufficiently in another way.

David



Re: The ultimate extension hook.

From
Daniel Wood
Date:
> On 09/23/2020 9:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ...
> > The hook I'd like to see would be in the PostgresMain() loop
> > for the API "firstchar" messages.
> 
> What, to invent your own protocol?  Where will you find client libraries
> buying into that?

No API/client changes are needed for:
    1) API tracing/filtering; or
    3) custom SQL-like commands through a trivial modification to Simple Query 'Q'.  Purely optional, as you'll see at
the end.
 

Yes, (2) API extension "case 'A'" could be used to roll one's own protocol.  When pondering API hooking in general, I
thought of this also, but don't let it be a distraction.
 

> I'm not really convinced that any of the specific use-cases you suggest
> are untenable to approach via the existing function fastpath mechanism,
> anyway.

Certainly (3) is just a command-level way to execute a function instead of 'select myfunc()'.  But the latter does go through
the SQL machinery and SQL argument type lookup and processing.  I like fast and direct things.  And (3) is so trivial to
implement.

However, even fastpath doesn't provide a protocol hook function where tracing could be done.  If I had that alone, I
could do my own 'Q' hook and do the "!cmd" processing in my extension, even if I sold the idea just based on
tracing/filtering.

We hook all kinds of things in PG.  Think big.  Why should the protocol processing not have a hook?  I'll bet some
others will think of things I haven't even yet thought of that would leverage this.
 

- Dan Wood



Re: The ultimate extension hook.

From
Jehan-Guillaume de Rorthais
Date:
On Thu, 24 Sep 2020 17:08:44 +1200
David Rowley <dgrowleyml@gmail.com> wrote:
[...]
> I wondered if there was much in the way of use-cases like a traffic
> filter, or statement replication. I wasn't sure if it was a solution
> looking for a problem or not, but it seems like it could be productive
> to talk about possibilities here and make a judgement call based on if
> any alternatives exist today that will allow that problem to be solved
> sufficiently in another way.

If I understand the proposal correctly, this might enable traffic capture using
a loadable extension.

This kind of usage would allow replaying and validating any kind of traffic from
one major version to another, e.g. to look for regressions from the application's
point of view before a major upgrade.

I did such regression tests in the past. We were capturing production traffic
using libpcap and replaying it using pgshark on an upgraded test env. Very handy.
However:

* libpcap can drop network packets during high load. This makes the capture
  painful to recover past the hole.
* it is useless with encrypted traffic

So, +1 for such hooks.
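
As a rough illustration of server-side capture, a hook could append each message to a file in the same framing the frontend/backend protocol uses for regular messages: one type byte, a big-endian int32 length that counts itself but not the type byte, then the payload. The function name capture_message and the choice of a stdio FILE are assumptions made for this sketch, not anything proposed in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Append one frontend message to 'out' using protocol-style framing:
 * type byte, int32 length in network byte order (payload length + 4),
 * then the payload bytes.  Returns 0 on success, -1 on a write error.
 */
int
capture_message(FILE *out, char firstchar, const void *payload,
                uint32_t payload_len)
{
    uint32_t    wire_len = payload_len + 4;
    unsigned char header[5];

    assert(out != NULL);

    header[0] = (unsigned char) firstchar;
    header[1] = (unsigned char) (wire_len >> 24);
    header[2] = (unsigned char) (wire_len >> 16);
    header[3] = (unsigned char) (wire_len >> 8);
    header[4] = (unsigned char) wire_len;

    if (fwrite(header, 1, sizeof(header), out) != sizeof(header))
        return -1;
    if (payload_len > 0 &&
        fwrite(payload, 1, payload_len, out) != payload_len)
        return -1;
    return 0;
}
```

Because the hook would see messages after decryption and cannot miss any, a capture like this would avoid both libpcap problems listed above, at the cost of extra work in the backend for every message.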

Regards,



Re: The ultimate extension hook.

From
Daniel Wood
Date:
> On 10/23/2020 9:31 AM Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> [...]
> * useless with encrypted traffic
> 
> So, +1 for such hooks.
> 
> Regards,

Ultimately, PostgreSQL is supposed to be extensible.
I don't see an API hook as being some crazy idea, even if some may not like what I might want to use it for.  It can be
useful for a number of things.
 

- Dan