Обсуждение: Improve OAuth discovery logging
Hello Currently when the client sends an empty OAuth token to request the issuer URL, the server logs the attempt with FATAL: OAuth bearer authentication failed for user Which is quite confusing, as this is an expected part of the OAuth authentication flow and not an error at all, there's also a TODO message saying that this needs improvement. In practice this results in the server spamming the log with these messages, which are difficult to separate from real (OAuth) authentication failures. This patch improves it by handling the situation properly in the SASL/Oauth code, by introducing a new SASL authentication status, PG_SASL_EXCHANGE_RESTART. The expectation is that authentication mechanisms can set this if they request a restart of the authentication flow. Restart currently requires starting with a new connection, so this simply sets STATUS_EOF. Not sure if this is the best way to handle it or not, but it seems the cleanest to me, as the SASL code already had these return codes and this way the patch doesn't introduce anything OAuth specific to the logic.
Вложения
> On 11 Feb 2026, at 20:24, Zsolt Parragi <zsolt.parragi@percona.com> wrote:
> Not sure if this is the best way to handle it or not, but it seems the
> cleanest to me, as the SASL code already had these return codes and
> this way the patch doesn't introduce anything OAuth specific to the
> logic.
Off the cuff this seems reasonable from technical standpoint. However, is the
below message of LOG level interest to an admin?
+ ereport(LOG,
+ errmsg("OAuth issuer discovery requested by user \"%s\"",
Is this valuable to administrators in production, or should this perhaps be a
DEBUGx level logging?
--
Daniel Gustafsson
Thank you for the quick review! > Is this valuable to administrators in production, or should this perhaps be a > DEBUGx level logging? No, I was even thinking about removing that completely, and then forgot about it before sending my email. I changed it to DEBUG1, it's definitely not needed outside of debugging oauth issues.
Вложения
On Thu, Feb 12, 2026 at 10:51 AM Zsolt Parragi
<zsolt.parragi@percona.com> wrote:
>
> Thank you for the quick review!
Thanks for tackling this TODO! :D
I have a lot of little nitpicky feedback, but please push back on
anything you disagree with (trying to avoid Primary Author Syndrome).
I've also copied Michael for opinions on the new API.
= Mechanism-Independent Changes =
> +#define PG_SASL_EXCHANGE_RESTART 3
I think the "restart" nomenclature is maybe a subtle layering
violation. From the point of view of the server (and especially the
mechanism-independent code in auth-sasl.c), we have no idea if the
client's going to retry or not. All we know is that the client doesn't
intend to authenticate on this connection. (There's also maybe some
relationship with SASL's concept of client-aborted exchanges, which we
don't support AFAIK, and would be handled at the SASL layer rather
than the mech layer.)
Maybe something like PG_SASL_EXCHANGE_ABANDONED?
> + if (result == PG_SASL_EXCHANGE_RESTART)
> + return STATUS_EOF;
> +
> /* Oops, Something bad happened */
> if (result != PG_SASL_EXCHANGE_SUCCESS)
Let's keep these two cases together, either as an if/else-if, or by
switching STATUS_ERROR to STATUS_EOF as needed.
= Mechanism-Specific Changes =
> + if (auth[0] == '\0')
> + ctx->discovery = true;
> +
> if (!validate(ctx->port, auth))
> {
I think this might be more straightforward as a variant of the
OAUTH_STATE_ERROR state (OAUTH_STATE_ERROR_DISCOVERY?) rather than a
separate `discovery` flag.
> - * TODO: should we find a way to return STATUS_EOF at the top level,
> - * to suppress the authentication error entirely?
> + * The caller detects this case and returns
> + * PG_SASL_EXCHANGE_RESTART to suppress the authentication FATAL.
> */
IMO this shows that there's no reason for me to have called validate()
from oauth_exchange() for this case -- there's nothing to validate.
validate() should just Assert that it's been passed a non-empty token.
> + errmsg("OAuth issuer discovery requested by user \"%s\"",
> + ctx->port->user_name));
Since authentication isn't complete, we don't really know "who"
requested discovery. I think we should rely on log_[dis]connections to
provide debugging info to the DBA in these situations; this can just
log the fact that discovery was requested.
Thanks!
--Jacob
These all are good suggestions, attached updated patch.
> Maybe something like PG_SASL_EXCHANGE_ABANDONED?
This is the only one I wasn't sure of, I used RESTART because I was
focusing more on the intention of the server ("please restart
authentication with this additional information"), and a bit also on
the idea that later restart could stay even within the same
connection, both in this case and if we add support for
reauthentication on token expiration.
On the other hand I'm not 100% sure how the other two would work, and
ABANDONED is a better description for the current situation, so I
adjusted the patch to use that.
Вложения
> On Feb 13, 2026, at 21:13, Zsolt Parragi <zsolt.parragi@percona.com> wrote:
>
> These all are good suggestions, attached updated patch.
>
>> Maybe something like PG_SASL_EXCHANGE_ABANDONED?
>
> This is the only one I wasn't sure of, I used RESTART because I was
> focusing more on the intention of the server ("please restart
> authentication with this additional information"), and a bit also on
> the idea that later restart could stay even within the same
> connection, both in this case and if we add support for
> reauthentication on token expiration.
>
> On the other hand I'm not 100% sure how the other two would work, and
> ABANDONED is a better description for the current situation, so I
> adjusted the patch to use that.
> <v3-0001-Improve-OAuth-discovery-logging.patch>
Hi Zsolt,
Thanks for the patch. A few small comments:
1 - commit message
```
SASL/Oauth code, by introducing a new SASL authentication status,
PG_SASL_EXCHANGE_RESTART. The expectation is that authentication
```
Looks like you forgot to update the commit message to change PG_SASL_EXCHANGE_RESTART to PG_SASL_EXCHANGE_ABANDONED.
2 - auth-oauth.c
```
/* The (failed) handshake is now complete. */
+ if (ctx->state == OAUTH_STATE_ERROR_DISCOVERY)
+ {
+ ctx->state = OAUTH_STATE_FINISHED;
+ ereport(DEBUG1,
+ errmsg("OAuth issuer discovery requested"));
+ return PG_SASL_EXCHANGE_ABANDONED;
+ }
+
ctx->state = OAUTH_STATE_FINISHED;
return PG_SASL_EXCHANGE_FAILURE;
```
"ctx->state = OAUTH_STATE_FINISHED;" is duplicated in the “if” and after the “if”, so it can be pull up to before the
“if”.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
> On Feb 13, 2026, at 21:13, Zsolt Parragi <zsolt.parragi@percona.com> wrote:
> > Maybe something like PG_SASL_EXCHANGE_ABANDONED?
>
> This is the only one I wasn't sure of, I used RESTART because I was
> focusing more on the intention of the server ("please restart
> authentication with this additional information"), and a bit also on
> the idea that later restart could stay even within the same
> connection, both in this case and if we add support for
> reauthentication on token expiration.
I think "abandoned" would still work as a descriptor if we eventually
supported multiple SASL exchanges per connection.
On Mon, Feb 23, 2026 at 7:01 PM Chao Li <li.evan.chao@gmail.com> wrote:
> Looks like you forgot to update the commit message to change PG_SASL_EXCHANGE_RESTART to PG_SASL_EXCHANGE_ABANDONED.
Yes -- though keep in mind that committers will often rewrite commit
messages from scratch. So while keeping it accurate and well-written
should be the goal, perfection isn't required to move something into
RfC.
Speaking of which: Zsolt, would you mind adding this to the Commitfest?
> "ctx->state = OAUTH_STATE_FINISHED;" is duplicated in the “if” and after the “if”, so it can be pull up to before the
“if”.
+1
--Jacob
> "ctx->state = OAUTH_STATE_FINISHED;" is duplicated in the “if” and after the “if”, so it can be pull up to before the “if”. It can't, because the if is based on ctx->state. If I move it to before the if, I have to save the previous value, which just makes the code longer. I attached v4 with an edited commit message, nothing else changed. Commitfest: https://commitfest.postgresql.org/patch/6529/
Вложения
This looks like nice patch addressing real issue in log analyzing.
Basic idea seems correct to me WRT OAuth, but I'm not a real expert in auth.
> On 25 Feb 2026, at 18:14, Zsolt Parragi <zsolt.parragi@percona.com> wrote:
>
> It can't, because the if is based on ctx->state. If I move it to
> before the if, I have to save the previous value, which just makes the
> code longer.
Well, you can do something in a line with
bool was_discovery = (ctx->state == OAUTH_STATE_ERROR_DISCOVERY);
ctx->state = OAUTH_STATE_FINISHED;
if (was_discovery)
{
}
But it's a matter of taste. Your code is correct anyway.
We can tweak comments a bit in sasl.h:
/*---------
* exchange()
*
* Produces a server challenge to be sent to the client. The callback
* must return one of the PG_SASL_EXCHANGE_* values, depending on
* whether the exchange continues, has finished successfully, or has
* failed. <---- , or was abandoned by the client.
* a successful outcome). The callback should set this to
* NULL if the exchange is over and no output should be sent,
* which should correspond to either PG_SASL_EXCHANGE_FAILURE
* or a PG_SASL_EXCHANGE_SUCCESS with no outcome data. <----- or ABANDONED
* failure message.) Ignored if the exchange is completed
* with PG_SASL_EXCHANGE_SUCCESS. <------ or ABANDONED
That's all what I could grep.
And thanks for your review in my thread!
Best regards, Andrey Borodin.
On Wed, Feb 25, 2026 at 5:15 AM Zsolt Parragi <zsolt.parragi@percona.com> wrote: > Commitfest: https://commitfest.postgresql.org/patch/6529/ This is in my commit queue. (I didn't read adequately and ended up duplicating the patch in CF; I've withdrawn that. Sorry for the confusion, if you received any notification emails.) Thanks! --Jacob
> Well, you can do something in a line with
Yes, but I either have to declare was_discovery at the beginning of
the function, or add { ... } for the case and declare it at the
beginning of the case branch. Neither of those seems to be more
readable to me. With different coding conventions I would definitely
do a `const bool was_discovery` directly before the if, but I can't do
that here.
Thanks for the comment suggestions, I attached an updated version with
an edited comment for exchange().
Вложения
On Fri, Feb 27, 2026 at 11:51 AM Zsolt Parragi <zsolt.parragi@percona.com> wrote: > Thanks for the comment suggestions, I attached an updated version with > an edited comment for exchange(). Cirrus seems to have noticed an intermittent failure [1]; what's that about? --Jacob [1] https://cirrus-ci.com/task/5855947479842816
> Cirrus seems to have noticed an intermittent failure [1]; what's that about? Looks like we have some out of order logging, because of the multiple backends involved. DISCOVERY 2026-02-27 20:11:12.104 UTC postmaster[43175] DEBUG: forked new client backend, pid=43267 socket=7 DISCOVERY 2026-02-27 20:11:12.118 UTC client backend[43267] [unknown] LOG: connection received: host=[local] 2026-02-27 20:11:12.179 UTC postmaster[43175] DEBUG: assigned pm child slot 3 for client backend 2026-02-27 20:11:12.179 UTC postmaster[43175] DEBUG: forked new client backend, pid=43281 socket=7 LOGIN 2026-02-27 20:11:12.179 UTC client backend[43281] [unknown] LOG: connection received: host=[local] LOGIN 2026-02-27 20:11:12.180 UTC client backend[43281] [unknown] LOG: oauth_validator: token="9243959234", role="test" LOGIN 2026-02-27 20:11:12.180 UTC client backend[43281] [unknown] LOG: oauth_validator: issuer="http://127.0.0.1:44122", scope="openid postgres" LOGIN 2026-02-27 20:11:12.180 UTC client backend[43281] [unknown] LOG: connection authenticated: identity="test" method=oauth (/tmp/cirrus-ci-build/build/testrun/oauth_validator/001_server/data/t_001_server_primary_data/pgdata/pg_hba.conf:2) LOGIN 2026-02-27 20:11:12.180 UTC client backend[43281] [unknown] LOG: connection authorized: user=test database=postgres application_name=001_server.pl LOGIN 2026-02-27 20:11:12.180 UTC client backend[43281] 001_server.pl LOG: connection ready: setup total=1.804 ms, fork=0.424 ms, authentication=0.287 ms LOGIN 2026-02-27 20:11:12.181 UTC client backend[43281] 001_server.pl LOG: statement: SELECT $$connected with user=test dbname=postgres oauth_issuer=http://127.0.0.1:44122 oauth_client_id=f02c6361-0635$$ ... we read the log around this part ... DISCOVERY 2026-02-27 20:11:12.189 UTC client backend[43267] [unknown] DEBUG: OAuth issuer discovery requested 2026-02-27 20:11:12.190 UTC postmaster[43175] DEBUG: releasing pm child slot 1 DISCOVERY 2026-02-27 20:11:12.190 UTC postmaster[43175] DEBUG: client backend (PID 43267) exited with exit code 0 So it's a scheduling issue, since we log the "oauth discovery requested" after we already sent the issuer info to the client, so the other connection attempt can already be in progress (at least with simple setups like this test) Also related that connect_fails uses wait_for_log, noticing log messages that appear later, while connect_ok simply checks the logs at that moment. I'm not sure which solution is better for this: removing the check for this message from the test or modifying connect_ok to wait for all backends that started since the connection attempt to finish? Modifying it seems the better choice to me, but that is also a separate change.
[cc'ing Tom for awareness] On Wed, Mar 4, 2026 at 12:40 PM Zsolt Parragi <zsolt.parragi@percona.com> wrote: > Looks like we have some out of order logging, because of the multiple > backends involved. Ugh. In retrospect, then, my commit ab8af1db4 was probably useless. It doesn't improve any existing usage and it can't help future usage. > I'm not sure which solution is better for this: removing the check for > this message from the test The log_unlike() part is the important bit, so that would be my preference. Unfortunately that means the intermittent false failure becomes an intermittent false success, but unless I'm missing something, we might not be able to do better with the current tools. > or modifying connect_ok Well, connect_fails() would seem to suffer the same problem, right? Tom's solution in e0f373ee4 was never meant to handle concurrent backends. I hadn't really considered that all "normal" OAuth connections can have two concurrent backends in practice. (I think of them as serial, but they're not.) We're just getting lucky that we haven't made use of log_[un]like in many problematic cases yet. > to wait for all > backends that started since the connection attempt to finish? Easier said than done, I think. I've wanted to teach the server how to bracket logs of interest for testing purposes for a while now; I don't mind using this as a catalyst. But I don't think it should be done as part of this thread. Thanks, --Jacob
Attached v6 with the problematic log expectation removed. > Easier said than done, I think. I've wanted to teach the server how to > bracket logs of interest for testing purposes for a while now; I don't > mind using this as a catalyst. But I don't think it should be done as > part of this thread. Yes, I want to look at how to make connect_ok/connect_fails more reliable for oidc scenarios, but let's not make this dependent on that, my first idea to do it wasn't 100% reliable based on some targeted testing.
Вложения
On Thu, Mar 5, 2026 at 12:11 PM Zsolt Parragi <zsolt.parragi@percona.com> wrote: > Attached v6 with the problematic log expectation removed. Okay, I was doing some final pre-commit review today and... unfortunately, using STATUS_EOF like my "TODO" suggested breaks our de facto SASL profile. The server hasn't completed its side of the exchange until it sends either [AuthenticationSASLFinal+]AuthenticationOk or ErrorResponse. Since STATUS_EOF suppresses not only the log message but the entire ereport(FATAL), we'll never send that last packet, so a more polite client can't tell whether the server finished the exchange or just crashed. v6 doesn't fail any tests because of a shortcut I took in PQconnectPoll() in libpq, which skips reading the final message from a known-doomed OAuth discovery connection. But you can see it if you apply the attached patch. (It's not a correct patch; it just shows the problem.) I'm experimenting with an ereport(FATAL_CLIENT_ONLY) option, in the same vein as WARNING_CLIENT_ONLY, to try to cover this. --Jacob P.S. I would eventually like to record our undocumented SASL profile in a test suite (he said, staring at pg-pytest)...
Вложения
> I'm experimenting with an ereport(FATAL_CLIENT_ONLY) option, in the > same vein as WARNING_CLIENT_ONLY, to try to cover this. I attached v7 that uses that and removes the abandoned handling as it is no longer needed with it. > P.S. I would eventually like to record our undocumented SASL profile > in a test suite (he said, staring at pg-pytest)... That would be definitely useful, with the todo comment and this not being documented I thought that this is a proper way to handle the issue. Even a proper documentation about it would be a good starting point.
Вложения
> On 16 Mar 2026, at 11:24, Zsolt Parragi <zsolt.parragi@percona.com> wrote: > >> I'm experimenting with an ereport(FATAL_CLIENT_ONLY) option, in the >> same vein as WARNING_CLIENT_ONLY, to try to cover this. > > I attached v7 that uses that and removes the abandoned handling as it > is no longer needed with it. > >> P.S. I would eventually like to record our undocumented SASL profile >> in a test suite (he said, staring at pg-pytest)... > > That would be definitely useful, with the todo comment and this not > being documented I thought that this is a proper way to handle the > issue. Even a proper documentation about it would be a good starting > point. > <v7-0001-Improve-OAuth-discovery-logging.patch> I've took a look into v7. FATAL_CLIENT_ONLY approach LGTM. pg_stat_database.sessions_fatal seems to be still incremented, but, probably, we can live with it. But also we can fix it. Changes to send_message_to_server_log() seems unreachable to me. I think is_log_level_output() returns false for FATAL_CLIENT_ONLY, so edata->output_to_server is never set to true for this level, and these functions are never called. FATAL_CLIENT_ONLY = 23 sits between FATAL (22) and PANIC (24). Consider swapping FATAL and FATAL_CLIENT_ONLY, so that code like this will have more sense: elevel = Max(elevel, errordata[i].elevel); Does this assignment have an effect? + ctx->state = OAUTH_STATE_FINISHED; + ereport(FATAL_CLIENT_ONLY, Thanks! Best regards, Andrey Borodin.
Thanks for the quick review! > pg_stat_database.sessions_fatal seems to be still incremented, but, probably, > we can live with it. But also we can fix it. We are still doing a fatal disconnect, so it seems appropriate to me? > Changes to send_message_to_server_log() > ... > FATAL_CLIENT_ONLY = 23 sits between FATAL (22) and PANIC (24). I handled these the same way as the existing WARNING_CLIENT_ONLY. We can change it, but then we probably should also update the warning case. > Does this assignment have an effect? No, but that's also true for the other already existing assignment in this branch, I think these are mostly there for internal bookkeeping/consistency?
> On 16 Mar 2026, at 12:32, Zsolt Parragi <zsolt.parragi@percona.com> wrote:
>
> Thanks for the quick review!
>
>> pg_stat_database.sessions_fatal seems to be still incremented, but, probably,
>> we can live with it. But also we can fix it.
>
> We are still doing a fatal disconnect, so it seems appropriate to me?
Makes sense. We must see this activity somewhere. Even establishing connection is
a big deal, let alone TLS handshake.
>
>> Changes to send_message_to_server_log()
>> ...
>> FATAL_CLIENT_ONLY = 23 sits between FATAL (22) and PANIC (24).
>
> I handled these the same way as the existing WARNING_CLIENT_ONLY. We
> can change it, but then we probably should also update the warning
> case.
Ahh, OK.
>
>> Does this assignment have an effect?
>
> No, but that's also true for the other already existing assignment in
> this branch, I think these are mostly there for internal
> bookkeeping/consistency?
OK.
One more nit: errdetail("Empty request, discovery requested?").
Question marks are uncommon in errdetail and errmsg.
I have no more comments about the patch, feel free to flip it to RfC.
Best regards, Andrey Borodin.
On Mon, Mar 16, 2026 at 1:30 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > I have no more comments about the patch, feel free to flip it to RfC. Thanks all, v7 looks very similar to one of my local patches. I don't want to escape the authentication flow from inside a SASL mech, though (it's unusual/invisible to other maintainers, plus it bypasses the ClientAuthentication_hook). I'm working on a three-patch set to add FATAL_CLIENT_ONLY, the new abandoned state, and the log fix making use of both. Should have something posted today if things go my way. --Jacob
> I don't > want to escape the authentication flow from inside a SASL mech, though > (it's unusual/invisible to other maintainers, plus it bypasses the > ClientAuthentication_hook). I tried to figure out if this is fine or not, but isn't it the same as the existing ereport(ERROR, ...) calls everywhere in the sasl/scram code? I didn't see any clear pattern, for example the LDAP code clearly uses ereport(LOG, ...); return STATUS_ERROR; even for internal/configuration errors, while the scram/sasl code uses ereport(ERROR, ...) for those errors.
On Mon, Mar 16, 2026 at 12:45 PM Zsolt Parragi <zsolt.parragi@percona.com> wrote: > I tried to figure out if this is fine or not, but isn't it the same as > the existing ereport(ERROR, ...) calls everywhere in the sasl/scram > code? Those are *also* not good, IMHO; they're what I had in mind when I said "it's unusual/invisible". (ERROR is upgraded to FATAL here, so they're also misleading.) OAuth inherited a few of those from SCRAM to avoid divergent behavior for protocol violations, but I don't really want to lock that usage into the SASL architecture by myself, especially not for normal operation. CheckSASLAuth should ideally have control over the logic flow. (It might be nice to make it possible to throw ERRORs from inside authentication code without bypassing the top level. Then maybe we could square that with our treatment of logdetail et al. But we'd have to decide how we want protocol errors to interact with the hook.) On Mon, Mar 16, 2026 at 11:14 AM Jacob Champion <jacob.champion@enterprisedb.com> wrote: > I'm working on a three-patch set to add FATAL_CLIENT_ONLY, the new > abandoned state, and the log fix making use of both. Attached as v8. --Jacob
Вложения
> On Mar 17, 2026, at 08:24, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
>
> On Mon, Mar 16, 2026 at 12:45 PM Zsolt Parragi
> <zsolt.parragi@percona.com> wrote:
>> I tried to figure out if this is fine or not, but isn't it the same as
>> the existing ereport(ERROR, ...) calls everywhere in the sasl/scram
>> code?
>
> Those are *also* not good, IMHO; they're what I had in mind when I
> said "it's unusual/invisible". (ERROR is upgraded to FATAL here, so
> they're also misleading.) OAuth inherited a few of those from SCRAM to
> avoid divergent behavior for protocol violations, but I don't really
> want to lock that usage into the SASL architecture by myself,
> especially not for normal operation. CheckSASLAuth should ideally have
> control over the logic flow.
>
> (It might be nice to make it possible to throw ERRORs from inside
> authentication code without bypassing the top level. Then maybe we
> could square that with our treatment of logdetail et al. But we'd have
> to decide how we want protocol errors to interact with the hook.)
>
> On Mon, Mar 16, 2026 at 11:14 AM Jacob Champion
> <jacob.champion@enterprisedb.com> wrote:
>> I'm working on a three-patch set to add FATAL_CLIENT_ONLY, the new
>> abandoned state, and the log fix making use of both.
>
> Attached as v8.
>
> --Jacob
>
<v8-0001-Add-FATAL_CLIENT_ONLY-to-ereport-elog.patch><v8-0003-oauth-Don-t-log-discovery-connections-by-default.patch><v8-0002-sasl-Allow-backend-mechanisms-to-abandon-exchange.patch>
A few review comments:
1 - 0001
```
@@ -3800,6 +3801,7 @@ send_message_to_server_log(ErrorData *edata)
syslog_level = LOG_WARNING;
break;
case FATAL:
+ case FATAL_CLIENT_ONLY:
syslog_level = LOG_ERR;
```
As is_log_level_output() returns false against FATAL_CLIENT_ONLY, so that FATAL_CLIENT_ONLY should not reach
send_message_to_server_log().Should we assert edata->elevel != FATAL_CLIENT_ONLY?
2 - 0002
```
+ if (!abandoned)
+ {
+ /*
+ * Programmer error: caller needs to track the abandoned state for
+ * this mechanism.
+ */
+ elog(ERROR, "SASL exchange was abandoned, but CheckSASLAuth isn't tracking it");
+ }
```
As far as I understand, for a programmer error, Assert should be used. Why do we use elog(ERROR) here?
3 - 0002 - auth.c
```
cdetail = psprintf(_("Connection matched file \"%s\" line %d: \"%s\""),
port->hba->sourcefile, port->hba->linenumber,
port->hba->rawline);
if (logdetail)
logdetail = psprintf("%s\n%s", logdetail, cdetail);
else
logdetail = cdetail;
ereport(elevel,
(errcode(errcode_return),
errmsg(errstr, port->user_name),
logdetail ? errdetail_log("%s", logdetail) : 0));
```
This comment is not against the code introduced by this patch, just as this patch is touching the code block.
Based on https://www.postgresql.org/docs/current/error-style-guide.html, a detail message should end with a period.
003 LGTM.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
> As is_log_level_output() returns false against FATAL_CLIENT_ONLY, so that FATAL_CLIENT_ONLY should not reach send_message_to_server_log().Should we assert edata->elevel != FATAL_CLIENT_ONLY? Andrey asked the same question upthread, this mirrors how WARNING_CLIENT_ONLY is implemented. > As far as I understand, for a programmer error, Assert should be used. Why do we use elog(ERROR) here? +1, I would also say that for CheckSASLAuth, specifying abandoned is always required, since the caller can't know when it will result in an error. So the assert/if should be at the beginning of the function, not in the error path. Or instead: + /* + * "Abandoned" is a SASL-specific state similar to STATUS_EOF, in that we + * don't want to generate any server logs. But it's caused by an in-band + * client action that requires a server response, not an out-of-band + * connection closure, so we can't just proc_exit() like we do with + * STATUS_EOF. + */ + bool abandoned = false; + Have you considered adding the error level here instead, the same way as in auth_failed, explicitly defaulted to normal fatal by the caller, so existing code don't have to change it? That wouldn't need an SASL-specific explanation or flag in the generic code.
> On Mar 17, 2026, at 13:29, Zsolt Parragi <zsolt.parragi@percona.com> wrote: > >> As is_log_level_output() returns false against FATAL_CLIENT_ONLY, so that FATAL_CLIENT_ONLY should not reach send_message_to_server_log().Should we assert edata->elevel != FATAL_CLIENT_ONLY? > > Andrey asked the same question upthread, this mirrors how > WARNING_CLIENT_ONLY is implemented. > Do you mean that we do the same as WARNING_CLIENT_ONLY in this patch, and use a separate patch to fix them together? Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
On Mon, Mar 16, 2026 at 11:19 PM Chao Li <li.evan.chao@gmail.com> wrote: > Do you mean that we do the same as WARNING_CLIENT_ONLY in this patch, and use a separate patch to fix them together? I'm not sure I want to fix it at all; it keeps the code coherent even if someone later decides they really want to override the CLIENT_ONLY directive for some reason. On the WARNING_CLIENT_ONLY thread [1], Andres said > I don't think it needs to be done right now, but I again want to suggest > it'd be nice if we split log levels into a bitmask. If we bits, separate > from the log level, for do-not-log-to-client and do-not-log-to-server > some of this code would imo look nicer. and I think I agree that would be a good way for future improvement. --Jacob [1] https://postgr.es/m/20201228191428.p5bhcvd4ixsuyifd%40alap3.anarazel.de
On Mon, Mar 16, 2026 at 10:29 PM Zsolt Parragi <zsolt.parragi@percona.com> wrote: > > As far as I understand, for a programmer error, Assert should be used. Why do we use elog(ERROR) here? > > +1 A couple reasons: - to keep it parallel with the similar elog directly above it - to strengthen the API boundary for an esoteric edge case without requiring assertions to be enabled I don't mind relying on assertions for (not an exhaustive list!) well-exercised code, or when performance is critical, or when the callers and callees are in the same source file, or when it's not a security-critical context... This would seem to fail all of those tests. I imagine some of the existing assertions in OAuth should ideally be strengthened into production checks, too. I've considered adding a "production assert" variant for our security code in general, client- and server-side. > I would also say that for CheckSASLAuth, specifying abandoned is > always required, since the caller can't know when it will result in an > error. That's not really true, because the caller hardcodes the mechanism descriptor. The layers above and below CheckSASLAuth are coupled via injection -- ClientAuthentication knows full well that a uaOAuth HBA entry can't be satisfied by the SCRAM mechanisms, and vice-versa. If that were to change in a meaningful way, then sure, the caller would need to always provide the currently-OAuth-specific-edge-case flag. (But in that case, if the caller somehow doesn't have to know the mechanism in use, we could presumably also centralize a single call to CheckSASLAuth().) > Have you considered adding the error level here instead, the same way > as in auth_failed, explicitly defaulted to normal fatal by the caller, > so existing code don't have to change it? That wouldn't need an > SASL-specific explanation or flag in the generic code. I don't think I can visualize what you're proposing. If you mean that CheckSASLAuth should set the elevel with an output parameter, I'd rather not; that moves the responsibility for a very critical assumption (we're ending the process now) across a bunch of different files. (If more things than OAuth need this eventually, maybe it becomes STATUS_SILENT_ERROR or something, to make it even more generic?) Thanks, --Jacob
> That's not really true, because the caller hardcodes the mechanism > descriptor. I meant that the caller shouldn't depend on the implementation details of the mechanism. The abandoned comment says that '"Abandoned" is a SASL-specific state similar to STATUS_EOF ...', yet later it also depends on an implementation detail of which sasl mechanism actually use it. > (If more things than OAuth need this eventually, maybe it becomes > STATUS_SILENT_ERROR or something, to make it even more generic?) That's a good idea, better than my error level suggestion. The code would actually shorter, because you could remove the programmer error check from CheckSASLAuth. The diff also, because it would work without modifying the calls to it. The patch is also good as-is, all these comments in the last few messages are just very minor details, I probably spent way too much time thinging about how to make this not oauth specific in the generic part of the code.
On Tue, Mar 17, 2026 at 2:19 PM Zsolt Parragi <zsolt.parragi@percona.com> wrote: > > That's not really true, because the caller hardcodes the mechanism > > descriptor. > > I meant that the caller shouldn't depend on the implementation details > of the mechanism. The abandoned comment says that '"Abandoned" is a > SASL-specific state similar to STATUS_EOF ...', yet later it also > depends on an implementation detail of which sasl mechanism actually > use it. I don't disagree, I'm just trying to point out that this coupling is already part of CheckSASLAuth. See e.g. the handling of shadow_pass. (I'm not very worried about this, because we're free to improve this API at any time, and there are only two callers. Michael was very receptive to prefactoring patches here prior to the addition of OAUTHBEARER, and I expect we'll continue to refactor it if/when more mechanisms show up. It's just hard to pull a general interface out of two mechanisms as dissimilar as SCRAM and OAuth.) > The patch is also good as-is, all these comments in the last few > messages are just very minor details, I probably spent way too much > time thinging about how to make this not oauth specific in the generic > part of the code. I appreciate the review! I'm not in a rush to get this patch pushed, and I want to give Michael ample time to weigh in. (Personally, I don't think anyone is likely to argue against the behavior change here, only against how it's being done. We have alternative implementations available if there are strong opinions late in the cycle. So I feel pretty confident we can land a fix for 19.) Thanks, --Jacob
On Fri, Mar 20, 2026 at 11:14 AM Jacob Champion <jacob.champion@enterprisedb.com> wrote: > I'm not in a rush to get this patch pushed, and I want to give Michael > ample time to weigh in. (Personally, I don't think anyone is likely to > argue against the behavior change here, only against how it's being > done. We have alternative implementations available if there are > strong opinions late in the cycle. So I feel pretty confident we can > land a fix for 19.) Pushed (but post-commit review is very welcome :D). --Jacob