Обсуждение: setBinaryStream can abandon connection
Currently using setBinaryStream over a v3 protocol connection can completely break your connection if you happen to get the length wrong or your InputStream throws an Exception the driver will simply close your connection giving you no chance to rollback. This patch fixes that by continuing to pump fake data into the Bind message and simply not issuing the Execute message. This isn't ideal because it doesn't actually force a backend error so it is possible to issue a commit later that will succeed instead of failing, but I still think it's better than the current behavior. I suppose we could deliberately introduce a syntax error or something if in not in autocommit mode. Finally do we need a special -1 length argument to tell the driver we don't know the length. This will force a copy of the stream instead of the current direct streaming to the backend, but a memory hog is better than not working at all. Kris Jurka
Вложения
Kris Jurka wrote: > This isn't ideal because it doesn't actually force a backend error so it > is possible to issue a commit later that will succeed instead of failing, > but I still think it's better than the current behavior. I suppose we > could deliberately introduce a syntax error or something if in not in > autocommit mode. My gut feeling is that this is a bad idea, but I guess that given you're not actually Executing the portal, and the driver is throwing an error to the application, then there's not a huge risk of the app missing the error. I think we have a secondary problem here in that we've seen at least one instance where app code passes Integer.MAX_VALUE as the length argument when it does not know the actual length beforehand. a) we are going to overflow a signed 32-bit int for the Bind message's length and b) even if we have a large-but-not-too-large length, isn't the backend going to run out of memory if we pass that on directly? We should probably fix (a) at a minimum. For (b) we may just have to let the backend complain and abandon the connection, unless we want to set some arbitary limit on stream length. > Finally do we need a special -1 length argument to tell the driver we > don't know the length. This will force a copy of the stream instead of > the current direct streaming to the backend, but a memory hog is better > than not working at all. That's a nonstandard extension though.. I don't really see the point, given that the app can just buffer internally anyway. The only win would be that we wouldn't need to clone the bytearray as you need to with setBytes(). The app can get the same effect right now by buffering to a bytearray/ByteArrayInputStream before calling setBinaryStream(). Are there existing apps that pass -1? -O
On Mon, 18 Oct 2004, Oliver Jowett wrote: > I think we have a secondary problem here in that we've seen at least one > instance where app code passes Integer.MAX_VALUE as the length argument > when it does not know the actual length beforehand. a) we are going to > overflow a signed 32-bit int for the Bind message's length and b) even > if we have a large-but-not-too-large length, isn't the backend going to > run out of memory if we pass that on directly? Some details on what actually happens: When overflow happens the backend logs "invalid message length" and the connection is broken. The message length will not cause the backend to allocate large amounts of memory, the individual parameter lengths are what actually allocates memory. In the case of very large parameter values the error, "invalid string enlargement request size XXXXX", occurs. This makes sense because the largest a text/bytea field can be is 1 GB in size. So it is not an arbitrary length check depending on the number of parameters, but instead an exact check on each parameter. So if we don't have a problem with a broken connection we don't actually need to check for these errors because they don't do anything terrible, but obviously I believe a broken connection is bad. Kris Jurka
Kris Jurka wrote: > The message length will not cause the backend to allocate large amounts of > memory, the individual parameter lengths are what actually allocates > memory. Are you sure about this? I see in pq_getmessage (called from the main backend loop via SocketBackend): if (len > 0) { /* Allocate space for message */ enlargeStringInfo(s, len); /* And grab the message */ if (pq_getbytes(s->data, len) == EOF) // ... } 'len' is the total packet length, not a per-parameter length. > So if we don't have a problem with a broken connection we don't actually > need to check for these errors because they don't do anything terrible, > but obviously I believe a broken connection is bad. I'd still like to see a more useful error message in the overflow case. We really shouldn't be generating a negative message length in the first place. -O
On Mon, 18 Oct 2004, Oliver Jowett wrote: > Kris Jurka wrote: > > > The message length will not cause the backend to allocate large amounts of > > memory, the individual parameter lengths are what actually allocates > > memory. > > Are you sure about this? I see in pq_getmessage (called from the main > backend loop via SocketBackend): > > 'len' is the total packet length, not a per-parameter length. Right you are. I was just looking in enlargeStringInfo and not where it was called from. > > > So if we don't have a problem with a broken connection we don't actually > > need to check for these errors because they don't do anything terrible, > > but obviously I believe a broken connection is bad. > > I'd still like to see a more useful error message in the overflow case. > We really shouldn't be generating a negative message length in the first > place. OK. So we should calculate the total message length using a "long" variable to guard against overflow and then check that it doesn't exceed MaxAllocSize. If it does, do not issue the bind at all and throw an Exception in a fashion that keeps the connection in working order. I suppose we could also put direct checks on the setXXXStream calls to give a better error message to the user. Instead of a "total message length exceeded" it would tell you the failing call immediately. We'd still need the total message length check, but this would probably catch the majority of the cases. Yeah, we need to guard against negative message lengths somewhere. Kris Jurka
Kris Jurka wrote: > OK. So we should calculate the total message length using a "long" > variable to guard against overflow and then check that it doesn't exceed > MaxAllocSize. If it does, do not issue the bind at all and throw an > Exception in a fashion that keeps the connection in working order. Sounds reasonable. I was wondering if it was worth teaching PGstream (or a subclass) about the protocol format. Then it could automatically check for oversize messages, that we wrote as much data we claimed, and that when reading a server message we consumed the entire message. I don't know what the overhead of doing this would be, though. -O