Обсуждение: Stream Copy for 8.1 - 8.3dev

Поиск
Список
Период
Сортировка

Stream Copy for 8.1 - 8.3dev

От
"Kalle Hallivuori"
Дата:
Stream Copy extension for 8.1 - 8.3dev drivers is available with examples at

http://kato.iki.fi/sw/db/postgresql/jdbc/copy/

Cheers,

--
Kalle Hallivuori +358-41-5053073 http://korpiq.iki.fi/

Re: Stream Copy for 8.1 - 8.3dev

От
Kris Jurka
Дата:

On Wed, 27 Jun 2007, Kalle Hallivuori wrote:

> Stream Copy extension for 8.1 - 8.3dev drivers is available with examples at
>
> http://kato.iki.fi/sw/db/postgresql/jdbc/copy/
>

Some thoughts after giving this a read (but not a test).

1) Your CopyObjects example pays no attention to encoding or escaping.
Calling getBytes() will give you bytes in the JVM's character set which
can easily be different than the client_encoding which the driver will
always set to UTF-8 (for 7.3 and up).  You also have to think about what
happens when your data contains a null escape, delimiter, or newline
itself.

2) I'm not sure it's helpful to have the copy methods throw both an IO and
SQL exception.  Why not just wrap any IOExceptions inside a SQLException?

3) What is the purpose of the reuse_buffer parameter?  What is the use
case for someone wanting a "fresh" buffer every time?

4) buildCopyQuery doesn't handle quoting/escaping of identifiers.  By
providing something like this you're now responsible for all the "hard"
stuff.  I would leave it out unless you're prepared to do a lot more
thinking about it.  It also doesn't handle all the possible copy options.
Perhaps you should split this into core copy functionality and a helper
class that builds upon it and can provide other useful things (escaping /
conversion of java objects to pg datatypes).

5) The coding in QueryExecutorImpl is unsafe because once you get
CopyInResponse you loop firing away data without listening for any return
data.  Consider a table which had a trigger on it which issued a notice
for each row it received.  If you don't read from the server the server
will block and then you'll block sending it data.  See the comments in
core/v3/QueryExecutorImpl near MAX_BUFFERED_QUERIES for more details.

6) You mix warnings and errors together.  They should be kept separate.

7) When you get CopyResponses and the user hasn't provided the appropriate
stream you bail leaving the protocol in an unknown state.  You should
issue a CopyFail and wait for ReadyForQuery so the whole connection isn't
lost.

Kris Jurka

Re: Stream Copy for 8.1 - 8.3dev

От
"Kalle Hallivuori"
Дата:
Hi Kris!

2007/7/16, Kris Jurka <books@ejurka.com>:
> > http://kato.iki.fi/sw/db/postgresql/jdbc/copy/
>
> Some thoughts after giving this a read (but not a test).

Thanks a lot!

> 1) Your CopyObjects example pays no attention to encoding or escaping.
> Calling getBytes() will give you bytes in the JVM's character set which
> can easily be different than the client_encoding which the driver will
> always set to UTF-8 (for 7.3 and up).  You also have to think about what
> happens when your data contains a null escape, delimiter, or newline
> itself.

Yes, it's very quick and dirty example. I'll look into providing a
more thorough mediator class as you propose below.

> 2) I'm not sure it's helpful to have the copy methods throw both an IO and
> SQL exception.  Why not just wrap any IOExceptions inside a SQLException?

I thought SQLException means that the transaction failed, but
connection is still usable, whereas an IOException leaves the
connection in an unknown state so that it must be dropped. But since
that is not consistent with the rest of the driver, it's obviously
wrong thought :) I'll wrap'em.

> 3) What is the purpose of the reuse_buffer parameter?  What is the use
> case for someone wanting a "fresh" buffer every time?

A custom stream might store them internally. Obviously I should
document such details before giving the code out for review.

> 4) buildCopyQuery doesn't handle quoting/escaping of identifiers.  By
> providing something like this you're now responsible for all the "hard"
> stuff.  I would leave it out unless you're prepared to do a lot more
> thinking about it.  It also doesn't handle all the possible copy options.

I'll just drop the query building functionality. It doesn't really
offer much value, as the caller has to be familiar with the syntax
anyway.

> Perhaps you should split this into core copy functionality and a helper
> class that builds upon it and can provide other useful things (escaping /
> conversion of java objects to pg datatypes).

I'll look into providing a couple of helper classes to target these.

On one hand I think we should honor the copy philosophy of keeping the
raw data as flat as possible. On the other hand, providing a full
object interface on top of the former would be definitely sexy :)

> 5) The coding in QueryExecutorImpl is unsafe because once you get
> CopyInResponse you loop firing away data without listening for any return
> data.  Consider a table which had a trigger on it which issued a notice
> for each row it received.  If you don't read from the server the server
> will block and then you'll block sending it data.  See the comments in
> core/v3/QueryExecutorImpl near MAX_BUFFERED_QUERIES for more details.

I'll fix that. Thanks!

> 6) You mix warnings and errors together.  They should be kept separate.

Ok, I'll fix that too.

> 7) When you get CopyResponses and the user hasn't provided the appropriate
> stream you bail leaving the protocol in an unknown state.  You should
> issue a CopyFail and wait for ReadyForQuery so the whole connection isn't
> lost.

Oops, that's really lousy of me. Will fix.

I'll let you know once I have a fixed version available.

Cheers,

--
Kalle Hallivuori +358-41-5053073 http://korpiq.iki.fi/

Re: Stream Copy for 8.1 - 8.3dev

От
"Kalle Hallivuori"
Дата:
Hi again.

Fixed versions of copy patches and drivers are now available at
http://kato.iki.fi/sw/db/postgresql/jdbc/copy/

Unit tests do pass, but I had to postpone any more extensive tests for now.

2007/7/16, Kris Jurka <books@ejurka.com>:
> 1) Your CopyObjects example pays no attention to encoding or escaping.
> Calling getBytes() will give you bytes in the JVM's character set which
> can easily be different than the client_encoding which the driver will
> always set to UTF-8 (for 7.3 and up).  You also have to think about what
> happens when your data contains a null escape, delimiter, or newline
> itself.

Example dropped for now. Once the core implementation is considered
acceptable, I'll add documentation and helper classes.

> 2) I'm not sure it's helpful to have the copy methods throw both an IO and
> SQL exception.  Why not just wrap any IOExceptions inside a SQLException?

Wrapped as suggested.

> 3) What is the purpose of the reuse_buffer parameter?  What is the use
> case for someone wanting a "fresh" buffer every time?

Dropped as unnecessary.

> 4) buildCopyQuery doesn't handle quoting/escaping of identifiers.  By
> providing something like this you're now responsible for all the "hard"
> stuff.  I would leave it out unless you're prepared to do a lot more
> thinking about it.  It also doesn't handle all the possible copy options.
> Perhaps you should split this into core copy functionality and a helper
> class that builds upon it and can provide other useful things (escaping /
> conversion of java objects to pg datatypes).

Dropped as unnecessary.

> 5) The coding in QueryExecutorImpl is unsafe because once you get
> CopyInResponse you loop firing away data without listening for any return
> data.  Consider a table which had a trigger on it which issued a notice
> for each row it received.  If you don't read from the server the server
> will block and then you'll block sending it data.  See the comments in
> core/v3/QueryExecutorImpl near MAX_BUFFERED_QUERIES for more details.

Changed so that data is sent only while the server stays quiet.
(Apparently I should write a test for this.)

> 6) You mix warnings and errors together.  They should be kept separate.

Separated warnings from errors. After succesful recovery from copy
subprotocol state:

- if any errors were catched, they're thrown and warnings get dropped
as a side effect;
- if any warnings got collected, they're thrown, since I'm unsure
about how to handle them.

> 7) When you get CopyResponses and the user hasn't provided the appropriate
> stream you bail leaving the protocol in an unknown state.  You should
> issue a CopyFail and wait for ReadyForQuery so the whole connection isn't
> lost.

Fixed as suggested.

--
Kalle Hallivuori +358-41-5053073 http://korpiq.iki.fi/

Re: Stream Copy for 8.1 - 8.3dev

От
Oliver Jowett
Дата:
Kalle Hallivuori wrote:

> - if any warnings got collected, they're thrown, since I'm unsure
> about how to handle them.

Usually SQLWarnings get added to the warning chain of the generating
object -- I suppose the Connection in this case.

-O

Re: Stream Copy for 8.1 - 8.3dev

От
"Kalle Hallivuori"
Дата:
Hi!

2007/7/17, Oliver Jowett <oliver@opencloud.com>:
> Usually SQLWarnings get added to the warning chain of the generating
> object -- I suppose the Connection in this case.

Thanks for the hint. Applied. New version available at

http://kato.iki.fi/sw/db/postgresql/jdbc/copy/

Cheers,

--
Kalle Hallivuori +358-41-5053073 http://korpiq.iki.fi/