Обсуждение: New significance of holdable result sets in Java 8

Поиск
Список
Период
Сортировка

New significance of holdable result sets in Java 8

От
Marko Topolnik
Дата:
As of the release of Java 8 and its Streams API a new door has opened for many things, including an important
improvementin the way RESTful services can be implemented. Let me briefly describe the architecture with Spring's REST
support:an MVC framework is used where the Controller dispatches the HTTP request to a Service method; the Service
methodcontacts the database and returns a Model-oriented representation of the response; the View layer then transforms
itinto the actual HTTP response bytes. 

Data is passed from Controller to View as the return value of a method. Traditionally, if you wanted a
collection-shapedresponse, you would return a List. This meant eager loading of all data needed for the response, which
causedscalability issues related to the JVM heap space. 

With the Streams API it is now very convenient to return a lazily-evaluated stream of Model objects. It is also very
convenientto make this stream pull data directly from an underlying ResultSet, tronsforming each row on-the-fly into a
Modelobject. This, however, calls for holdable result sets because the transaction commits when program control leaves
theService layer. 

The Spring team has recognized the relevance of the above use case and with release 4.1.2 they have introduced a
specificenhancement needed to support result sets holdable into the View layer (albeit only when JDBC is used over
Hibernate).This is described in the issue SPR-12349 [1]. Spring also plans to support this use case with additional
helpercode which turns Hibernate's ScrollableResults into a Stream (SPR-12388 [2]). 

The above could raise the level of interest of the PostgreSQL JDBC team in implementing holdable result sets backed by
nativeholdable cursors instead of the current client-side cursors, which don't allow the space complexity to be reduced
fromO(n) to O(1) on the JVM side. I am aware that this is not a trivial endeavor as it requires intervention into the
FE/BEprotocol, but I would nevertheless propose that this concern be reassessed in the light of new developments in the
Javaecosystem. 

Regards,
Marko Topolnik



 [1] https://jira.spring.io/browse/SPR-12349
 [2] https://jira.spring.io/browse/SPR-12388

Re: New significance of holdable result sets in Java 8

От
Steven Schlansker
Дата:
Thanks for bringing this up!  I'm not an expert on the PG FE/BE protocol, so please excuse my ignorance below :)

On Nov 12, 2014, at 7:22 AM, Marko Topolnik <marko.topolnik@gmail.com> wrote:
>
> Data is passed from Controller to View as the return value of a method. Traditionally, if you wanted a
collection-shapedresponse, you would return a List. This meant eager loading of all data needed for the response, which
causedscalability issues related to the JVM heap space. 

We solved this problem (pre-Streams) by inverting the flow of control.  In pseudo-JAX-RS with Jackson:

@Path("/objects")
public StreamingOutput getObjects() {
  return os -> {
    JsonGenerator jg = new JsonGenerator(os);
    ResultSet rs = findObjects(); // make sure to do this in a txn with fetch size set sensibly or you will collect all
rowseagerly 
    while (rs.hasNext()) { jg.writeObject(rs.getString("whatever")); }
    jg.close(); rs.close();
  };
}

> With the Streams API it is now very convenient to return a lazily-evaluated stream of Model objects. It is also very
convenientto make this stream pull data directly from an underlying ResultSet, tronsforming each row on-the-fly into a
Modelobject. This, however, calls for holdable result sets because the transaction commits when program control leaves
theService layer. 

Having a holdable result set escape the service layer feels like a similar layering violation as having an open
transactionescape the service layer.  Either way a heavyweight database object escapes and must be cleaned up at a
laterdate.  Naïvely I would expect the cost of holding the two objects to be similar, since the actual underlying
transactioncannot commit/rollback until all results are read from the cursor? 

> The above could raise the level of interest of the PostgreSQL JDBC team in implementing holdable result sets backed
bynative holdable cursors instead of the current client-side cursors, which don't allow the space complexity to be
reducedfrom O(n) to O(1) on the JVM side. I am aware that this is not a trivial endeavor as it requires intervention
intothe FE/BE protocol, but I would nevertheless propose that this concern be reassessed in the light of new
developmentsin the Java ecosystem. 

As an alternate thought, Stream is AutoCloseable.  What if we introduce a StatementStream, which takes a
(Prepared)Statement,executes in a transaction with small fetch size, and then returns a Stream<DTO>.  Again very
pseudo-codey:

class DAO {
  Stream<DTO> findObjects() {
    return new StatementResultStream(conn, "SELECT * FROM objects"); // sets fetch size, ensure auto commit off
  }
}

class JsonStreamOutput<T> implements StreamingOutput {
  StreamOutput(Supplier<Stream<T>> streamSupplier) { this.streamSupplier = streamSupplier; }
  public void write(OutputStream os) {
    try (JsonGenerator jg = new JsonGenerator(os);
         Stream<T> str = streamSupplier.get()) {
      str.forEach(obj -> jg.writeObject(obj));
    }
  }
}

@Path("/objects")
class Resource {
  @GET
  public StreamOutput<TransformedDTO> getObjects() {
    return new JsonStreamOutput(() -> dto.findObjects().filter(...).map(dto -> new TransformedDto(dto)));
  }
}

The StatementResultStream and StreamOutput classes are entirely reusable.  The DAO is no more complicated than before,
exceptinstead of using JDBC directly you punt the work until the Stream undergoes a terminal operation.  The main
difficultyhere is figuring out a sensible API for StatementResultStream construction. 

The only bit I'm not happy with is the fact that you have to lift your intermediate stream operations into a lambda.
Youmight be able to avoid this by instead writing the whole bit in terms of a Spliterator instead and using
StreamSupport.stream(spliterator),but that introduces questions about held resources since Spliterator is not
AutoCloseable.

This has the advantage that it's implementable today without changes to either JDBC or PGJDBC, and introduces only
reusableoptional components to e.g. Spring or JDBI or whatever library wishes to do this. 

Thoughts?
Steven



Re: New significance of holdable result sets in Java 8

От
Dave Cramer
Дата:
Marko,

When you say holdable cursors are you referring to a holdable cursor outside of a transaction? It seems so because the transaction commits after leaving the service layer ?

If so these are not without significant cost on the server side.

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 12 November 2014 10:22, Marko Topolnik <marko.topolnik@gmail.com> wrote:
As of the release of Java 8 and its Streams API a new door has opened for many things, including an important improvement in the way RESTful services can be implemented. Let me briefly describe the architecture with Spring's REST support: an MVC framework is used where the Controller dispatches the HTTP request to a Service method; the Service method contacts the database and returns a Model-oriented representation of the response; the View layer then transforms it into the actual HTTP response bytes.

Data is passed from Controller to View as the return value of a method. Traditionally, if you wanted a collection-shaped response, you would return a List. This meant eager loading of all data needed for the response, which caused scalability issues related to the JVM heap space.

With the Streams API it is now very convenient to return a lazily-evaluated stream of Model objects. It is also very convenient to make this stream pull data directly from an underlying ResultSet, tronsforming each row on-the-fly into a Model object. This, however, calls for holdable result sets because the transaction commits when program control leaves the Service layer.

The Spring team has recognized the relevance of the above use case and with release 4.1.2 they have introduced a specific enhancement needed to support result sets holdable into the View layer (albeit only when JDBC is used over Hibernate). This is described in the issue SPR-12349 [1]. Spring also plans to support this use case with additional helper code which turns Hibernate's ScrollableResults into a Stream (SPR-12388 [2]).

The above could raise the level of interest of the PostgreSQL JDBC team in implementing holdable result sets backed by native holdable cursors instead of the current client-side cursors, which don't allow the space complexity to be reduced from O(n) to O(1) on the JVM side. I am aware that this is not a trivial endeavor as it requires intervention into the FE/BE protocol, but I would nevertheless propose that this concern be reassessed in the light of new developments in the Java ecosystem.

Regards,
Marko Topolnik



 [1] https://jira.spring.io/browse/SPR-12349
 [2] https://jira.spring.io/browse/SPR-12388

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: New significance of holdable result sets in Java 8

От
Marko Topolnik
Дата:
Yes, that is correct, Dave. The transaction would have to commit and the cursor would have to live on until the JDBC
connectionis released back to the pool. I am aware that this transfers the cost from the JVM into PG; however I also
thinkthat PG is in a much better position to fine-tune its policies around this concern.  

Even if comparing RAM-only solutions at either end, PG would almost surely take less RAM to hold the data in its native
formthan the JVM uses to hold the client-side cursor; and further, PG would have much better opportunities to overflow
todisk than the JDBC driver, which probably doesn't have that option at all. 

--
Marko Topolnik


On 12. stu. 2014., at 19:51, Dave Cramer <pg@fastcrypt.com> wrote:

> Marko,
>
> When you say holdable cursors are you referring to a holdable cursor outside of a transaction? It seems so because
thetransaction commits after leaving the service layer ? 
>
> If so these are not without significant cost on the server side.
>
> Dave Cramer
>
> dave.cramer(at)credativ(dot)ca
> http://www.credativ.ca
>
> On 12 November 2014 10:22, Marko Topolnik <marko.topolnik@gmail.com> wrote:
> As of the release of Java 8 and its Streams API a new door has opened for many things, including an important
improvementin the way RESTful services can be implemented. Let me briefly describe the architecture with Spring's REST
support:an MVC framework is used where the Controller dispatches the HTTP request to a Service method; the Service
methodcontacts the database and returns a Model-oriented representation of the response; the View layer then transforms
itinto the actual HTTP response bytes. 
>
> Data is passed from Controller to View as the return value of a method. Traditionally, if you wanted a
collection-shapedresponse, you would return a List. This meant eager loading of all data needed for the response, which
causedscalability issues related to the JVM heap space. 
>
> With the Streams API it is now very convenient to return a lazily-evaluated stream of Model objects. It is also very
convenientto make this stream pull data directly from an underlying ResultSet, tronsforming each row on-the-fly into a
Modelobject. This, however, calls for holdable result sets because the transaction commits when program control leaves
theService layer. 
>
> The Spring team has recognized the relevance of the above use case and with release 4.1.2 they have introduced a
specificenhancement needed to support result sets holdable into the View layer (albeit only when JDBC is used over
Hibernate).This is described in the issue SPR-12349 [1]. Spring also plans to support this use case with additional
helpercode which turns Hibernate's ScrollableResults into a Stream (SPR-12388 [2]). 
>
> The above could raise the level of interest of the PostgreSQL JDBC team in implementing holdable result sets backed
bynative holdable cursors instead of the current client-side cursors, which don't allow the space complexity to be
reducedfrom O(n) to O(1) on the JVM side. I am aware that this is not a trivial endeavor as it requires intervention
intothe FE/BE protocol, but I would nevertheless propose that this concern be reassessed in the light of new
developmentsin the Java ecosystem. 
>
> Regards,
> Marko Topolnik
>
>
>
>  [1] https://jira.spring.io/browse/SPR-12349
>  [2] https://jira.spring.io/browse/SPR-12388
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc
>



Re: New significance of holdable result sets in Java 8

От
Kevin Wooten
Дата:
I’m inclined to agree with Dave, your usage of holdable cursors sounds like an extremely burdensome (server wise) solution to a “nice to have” application architecture solution.

Why not make a stream adapter that fetches results in groups using LIMIT/OFFSET.  This would work in any situation regardless of cursors, transactions, etc and would *only* cost for large result sets that need to extra round trips.

That being said… pgjdbc-ng uses real cursors when asked for them and respects the foldability requirement.  There are limitations based on postgres’s feature set but I believe what you are asking for works. 

On Nov 12, 2014, at 11:51 AM, Dave Cramer <pg@fastcrypt.com> wrote:

Marko,

When you say holdable cursors are you referring to a holdable cursor outside of a transaction? It seems so because the transaction commits after leaving the service layer ?

If so these are not without significant cost on the server side.

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 12 November 2014 10:22, Marko Topolnik <marko.topolnik@gmail.com> wrote:
As of the release of Java 8 and its Streams API a new door has opened for many things, including an important improvement in the way RESTful services can be implemented. Let me briefly describe the architecture with Spring's REST support: an MVC framework is used where the Controller dispatches the HTTP request to a Service method; the Service method contacts the database and returns a Model-oriented representation of the response; the View layer then transforms it into the actual HTTP response bytes.

Data is passed from Controller to View as the return value of a method. Traditionally, if you wanted a collection-shaped response, you would return a List. This meant eager loading of all data needed for the response, which caused scalability issues related to the JVM heap space.

With the Streams API it is now very convenient to return a lazily-evaluated stream of Model objects. It is also very convenient to make this stream pull data directly from an underlying ResultSet, tronsforming each row on-the-fly into a Model object. This, however, calls for holdable result sets because the transaction commits when program control leaves the Service layer.

The Spring team has recognized the relevance of the above use case and with release 4.1.2 they have introduced a specific enhancement needed to support result sets holdable into the View layer (albeit only when JDBC is used over Hibernate). This is described in the issue SPR-12349 [1]. Spring also plans to support this use case with additional helper code which turns Hibernate's ScrollableResults into a Stream (SPR-12388 [2]).

The above could raise the level of interest of the PostgreSQL JDBC team in implementing holdable result sets backed by native holdable cursors instead of the current client-side cursors, which don't allow the space complexity to be reduced from O(n) to O(1) on the JVM side. I am aware that this is not a trivial endeavor as it requires intervention into the FE/BE protocol, but I would nevertheless propose that this concern be reassessed in the light of new developments in the Java ecosystem.

Regards,
Marko Topolnik



 [1] https://jira.spring.io/browse/SPR-12349
 [2] https://jira.spring.io/browse/SPR-12388

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc


Re: New significance of holdable result sets in Java 8

От
Dave Cramer
Дата:
Marko,

While PG might take less RAM per cursor than the client side, we will very quickly run out if there are N clients where N is some non-trivial number and the cursors are large. This would potentially be catastrophic since PostgreSQL will spill to disk and potentially run out of disk space.

You may argue that it might not happen but I'd prefer the client crashed than the server.

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 12 November 2014 13:58, Kevin Wooten <kdubb@me.com> wrote:
I’m inclined to agree with Dave, your usage of holdable cursors sounds like an extremely burdensome (server wise) solution to a “nice to have” application architecture solution.

Why not make a stream adapter that fetches results in groups using LIMIT/OFFSET.  This would work in any situation regardless of cursors, transactions, etc and would *only* cost for large result sets that need to extra round trips.

That being said… pgjdbc-ng uses real cursors when asked for them and respects the foldability requirement.  There are limitations based on postgres’s feature set but I believe what you are asking for works. 

On Nov 12, 2014, at 11:51 AM, Dave Cramer <pg@fastcrypt.com> wrote:

Marko,

When you say holdable cursors are you referring to a holdable cursor outside of a transaction? It seems so because the transaction commits after leaving the service layer ?

If so these are not without significant cost on the server side.

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 12 November 2014 10:22, Marko Topolnik <marko.topolnik@gmail.com> wrote:
As of the release of Java 8 and its Streams API a new door has opened for many things, including an important improvement in the way RESTful services can be implemented. Let me briefly describe the architecture with Spring's REST support: an MVC framework is used where the Controller dispatches the HTTP request to a Service method; the Service method contacts the database and returns a Model-oriented representation of the response; the View layer then transforms it into the actual HTTP response bytes.

Data is passed from Controller to View as the return value of a method. Traditionally, if you wanted a collection-shaped response, you would return a List. This meant eager loading of all data needed for the response, which caused scalability issues related to the JVM heap space.

With the Streams API it is now very convenient to return a lazily-evaluated stream of Model objects. It is also very convenient to make this stream pull data directly from an underlying ResultSet, tronsforming each row on-the-fly into a Model object. This, however, calls for holdable result sets because the transaction commits when program control leaves the Service layer.

The Spring team has recognized the relevance of the above use case and with release 4.1.2 they have introduced a specific enhancement needed to support result sets holdable into the View layer (albeit only when JDBC is used over Hibernate). This is described in the issue SPR-12349 [1]. Spring also plans to support this use case with additional helper code which turns Hibernate's ScrollableResults into a Stream (SPR-12388 [2]).

The above could raise the level of interest of the PostgreSQL JDBC team in implementing holdable result sets backed by native holdable cursors instead of the current client-side cursors, which don't allow the space complexity to be reduced from O(n) to O(1) on the JVM side. I am aware that this is not a trivial endeavor as it requires intervention into the FE/BE protocol, but I would nevertheless propose that this concern be reassessed in the light of new developments in the Java ecosystem.

Regards,
Marko Topolnik



 [1] https://jira.spring.io/browse/SPR-12349
 [2] https://jira.spring.io/browse/SPR-12388

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc



Re: New significance of holdable result sets in Java 8

От
Steven Schlansker
Дата:
On Nov 12, 2014, at 10:58 AM, Kevin Wooten <kdubb@me.com> wrote:

> I’m inclined to agree with Dave, your usage of holdable cursors sounds like an extremely burdensome (server wise)
solutionto a “nice to have” application architecture solution. 
>
> Why not make a stream adapter that fetches results in groups using LIMIT/OFFSET.  This would work in any situation
regardlessof cursors, transactions, etc and would *only* cost for large result sets that need to extra round trips. 

IMO LIMIT/OFFSET is generally strictly inferior to holding a transaction open (like I proposed in my earlier mail) for
anythingthat isn't reading rows in database order (unordered) or index order.  This is because you end up paying the
startupcost once per batch instead of once per logical query.  Additionally it is now sensitive to concurrent inserts
andyou may observe e.g. duplicated or missing rows unless you are *very* careful. 

>
> That being said… pgjdbc-ng uses real cursors when asked for them and respects the foldability requirement.  There are
limitationsbased on postgres’s feature set but I believe what you are asking for works.  
>
>> On Nov 12, 2014, at 11:51 AM, Dave Cramer <pg@fastcrypt.com> wrote:
>>
>> Marko,
>>
>> When you say holdable cursors are you referring to a holdable cursor outside of a transaction? It seems so because
thetransaction commits after leaving the service layer ? 
>>
>> If so these are not without significant cost on the server side.
>>
>> Dave Cramer
>>
>> dave.cramer(at)credativ(dot)ca
>> http://www.credativ.ca
>>
>> On 12 November 2014 10:22, Marko Topolnik <marko.topolnik@gmail.com> wrote:
>> As of the release of Java 8 and its Streams API a new door has opened for many things, including an important
improvementin the way RESTful services can be implemented. Let me briefly describe the architecture with Spring's REST
support:an MVC framework is used where the Controller dispatches the HTTP request to a Service method; the Service
methodcontacts the database and returns a Model-oriented representation of the response; the View layer then transforms
itinto the actual HTTP response bytes. 
>>
>> Data is passed from Controller to View as the return value of a method. Traditionally, if you wanted a
collection-shapedresponse, you would return a List. This meant eager loading of all data needed for the response, which
causedscalability issues related to the JVM heap space. 
>>
>> With the Streams API it is now very convenient to return a lazily-evaluated stream of Model objects. It is also very
convenientto make this stream pull data directly from an underlying ResultSet, tronsforming each row on-the-fly into a
Modelobject. This, however, calls for holdable result sets because the transaction commits when program control leaves
theService layer. 
>>
>> The Spring team has recognized the relevance of the above use case and with release 4.1.2 they have introduced a
specificenhancement needed to support result sets holdable into the View layer (albeit only when JDBC is used over
Hibernate).This is described in the issue SPR-12349 [1]. Spring also plans to support this use case with additional
helpercode which turns Hibernate's ScrollableResults into a Stream (SPR-12388 [2]). 
>>
>> The above could raise the level of interest of the PostgreSQL JDBC team in implementing holdable result sets backed
bynative holdable cursors instead of the current client-side cursors, which don't allow the space complexity to be
reducedfrom O(n) to O(1) on the JVM side. I am aware that this is not a trivial endeavor as it requires intervention
intothe FE/BE protocol, but I would nevertheless propose that this concern be reassessed in the light of new
developmentsin the Java ecosystem. 
>>
>> Regards,
>> Marko Topolnik
>>
>>
>>
>>  [1] https://jira.spring.io/browse/SPR-12349
>>  [2] https://jira.spring.io/browse/SPR-12388
>>
>> --
>> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-jdbc
>>
>



Re: New significance of holdable result sets in Java 8

От
Marko Topolnik
Дата:
On 12. stu. 2014., at 19:45, Steven Schlansker <stevenschlansker@gmail.com> wrote:

> Thanks for bringing this up!  I'm not an expert on the PG FE/BE protocol, so please excuse my ignorance below :)
>
> On Nov 12, 2014, at 7:22 AM, Marko Topolnik <marko.topolnik@gmail.com> wrote:
>>
>> Data is passed from Controller to View as the return value of a method. Traditionally, if you wanted a
collection-shapedresponse, you would return a List. This meant eager loading of all data needed for the response, which
causedscalability issues related to the JVM heap space. 
>
> We solved this problem (pre-Streams) by inverting the flow of control.  In pseudo-JAX-RS with Jackson:
>
> @Path("/objects")
> public StreamingOutput getObjects() {
>  return os -> {
>    JsonGenerator jg = new JsonGenerator(os);
>    ResultSet rs = findObjects(); // make sure to do this in a txn with fetch size set sensibly or you will collect
allrows eagerly 
>    while (rs.hasNext()) { jg.writeObject(rs.getString("whatever")); }
>    jg.close(); rs.close();
>  };
> }

I am aware of this approach; I even discuss it as a typical workaround in my blog post [1]. The problem is, this
completelydispenses with the detailed and complex setup present in the View layer: the response format can be driven by
theAccept HTTP header, encoding may be adapted to the request, and so on. I did pretty much that in my Clojure projects
becauseClojure's ecosystem isn't as well-developed so roll-your-own solutions are the norm. With Spring, however, you
losea great deal when you give your Controller method access to the raw response. 

I have already developed a Stream-based REST service along the lines I discuss in the mentioned blog post and the
advantagesare quite clear to me. It is a direction that Spring would probably like to move in as well. 

 [1] http://www.airpair.com/v1/posts/spring-streams-memory-efficiency

>
>> With the Streams API it is now very convenient to return a lazily-evaluated stream of Model objects. It is also very
convenientto make this stream pull data directly from an underlying ResultSet, tronsforming each row on-the-fly into a
Modelobject. This, however, calls for holdable result sets because the transaction commits when program control leaves
theService layer. 
>
> Having a holdable result set escape the service layer feels like a similar layering violation as having an open
transactionescape the service layer.  Either way a heavyweight database object escapes and must be cleaned up at a
laterdate.  Naïvely I would expect the cost of holding the two objects to be similar, since the actual underlying
transactioncannot commit/rollback until all results are read from the cursor? 

I do not propose this as a solution that reduces the overall cost and you are right that the cost would just migrate.
Onthe other hand, note that Oracle and MS SQL Server provide holdable cursors by default; in fact, Oracle supports
nothing_but_ natively holdable result sets. 

>
>> The above could raise the level of interest of the PostgreSQL JDBC team in implementing holdable result sets backed
bynative holdable cursors instead of the current client-side cursors, which don't allow the space complexity to be
reducedfrom O(n) to O(1) on the JVM side. I am aware that this is not a trivial endeavor as it requires intervention
intothe FE/BE protocol, but I would nevertheless propose that this concern be reassessed in the light of new
developmentsin the Java ecosystem. 
>
> As an alternate thought, Stream is AutoCloseable.  What if we introduce a StatementStream, which takes a
(Prepared)Statement,executes in a transaction with small fetch size, and then returns a Stream<DTO>.  Again very
pseudo-codey:
>
> class DAO {
>  Stream<DTO> findObjects() {
>    return new StatementResultStream(conn, "SELECT * FROM objects"); // sets fetch size, ensure auto commit off
>  }
> }
>
> class JsonStreamOutput<T> implements StreamingOutput {
>  StreamOutput(Supplier<Stream<T>> streamSupplier) { this.streamSupplier = streamSupplier; }
>  public void write(OutputStream os) {
>    try (JsonGenerator jg = new JsonGenerator(os);
>         Stream<T> str = streamSupplier.get()) {
>      str.forEach(obj -> jg.writeObject(obj));
>    }
>  }
> }
>
> @Path("/objects")
> class Resource {
>  @GET
>  public StreamOutput<TransformedDTO> getObjects() {
>    return new JsonStreamOutput(() -> dto.findObjects().filter(...).map(dto -> new TransformedDto(dto)));
>  }
> }

Here I'm not sure I follow you. Do you mean to have the transaction boundaries move from the controller into the View
layer?Your StreamOutput-returning controller method would just set up the fetch, then the framework layer above it
(whichI call the View) would consume that, and only upon consumption would the Statement execute. 

You still seem to dictate the output format directly from the Controller, the avoidance of which is motivating my
approach.

> The StatementResultStream and StreamOutput classes are entirely reusable.  The DAO is no more complicated than
before,except instead of using JDBC directly you punt the work until the Stream undergoes a terminal operation.  The
maindifficulty here is figuring out a sensible API for StatementResultStream construction. 
>
> The only bit I'm not happy with is the fact that you have to lift your intermediate stream operations into a lambda.
Youmight be able to avoid this by instead writing the whole bit in terms of a Spliterator instead and using
StreamSupport.stream(spliterator),but that introduces questions about held resources since Spliterator is not
AutoCloseable.

This is easily solved (I already do it for my ScrollableResultsSpliterator): Stream has an onClose hook, so you just
introducea close() method to the spliterator and attach it to the hook. 

> This has the advantage that it's implementable today without changes to either JDBC or PGJDBC, and introduces only
reusableoptional components to e.g. Spring or JDBI or whatever library wishes to do this. 
>
> Thoughts?
> Steven
>



Re: New significance of holdable result sets in Java 8

От
Marko Topolnik
Дата:
On 12. stu. 2014., at 20:19, Steven Schlansker <stevenschlansker@gmail.com> wrote:

>
> On Nov 12, 2014, at 10:58 AM, Kevin Wooten <kdubb@me.com> wrote:
>
>> I’m inclined to agree with Dave, your usage of holdable cursors sounds like an extremely burdensome (server wise)
solutionto a “nice to have” application architecture solution. 
>>
>> Why not make a stream adapter that fetches results in groups using LIMIT/OFFSET.  This would work in any situation
regardlessof cursors, transactions, etc and would *only* cost for large result sets that need to extra round trips. 
>
> IMO LIMIT/OFFSET is generally strictly inferior to holding a transaction open (like I proposed in my earlier mail)
foranything that isn't reading rows in database order (unordered) or index order.  This is because you end up paying
thestartup cost once per batch instead of once per logical query.  Additionally it is now sensitive to concurrent
insertsand you may observe e.g. duplicated or missing rows unless you are *very* careful. 

This is exactly the workaround I mention in my blog post, and those are exactly the concerns I raise about it :)

http://www.airpair.com/v1/posts/spring-streams-memory-efficiency


>
>>
>> That being said… pgjdbc-ng uses real cursors when asked for them and respects the foldability requirement.  There
arelimitations based on postgres’s feature set but I believe what you are asking for works.  
>>
>>> On Nov 12, 2014, at 11:51 AM, Dave Cramer <pg@fastcrypt.com> wrote:
>>>
>>> Marko,
>>>
>>> When you say holdable cursors are you referring to a holdable cursor outside of a transaction? It seems so because
thetransaction commits after leaving the service layer ? 
>>>
>>> If so these are not without significant cost on the server side.
>>>
>>> Dave Cramer
>>>
>>> dave.cramer(at)credativ(dot)ca
>>> http://www.credativ.ca
>>>
>>> On 12 November 2014 10:22, Marko Topolnik <marko.topolnik@gmail.com> wrote:
>>> As of the release of Java 8 and its Streams API a new door has opened for many things, including an important
improvementin the way RESTful services can be implemented. Let me briefly describe the architecture with Spring's REST
support:an MVC framework is used where the Controller dispatches the HTTP request to a Service method; the Service
methodcontacts the database and returns a Model-oriented representation of the response; the View layer then transforms
itinto the actual HTTP response bytes. 
>>>
>>> Data is passed from Controller to View as the return value of a method. Traditionally, if you wanted a
collection-shapedresponse, you would return a List. This meant eager loading of all data needed for the response, which
causedscalability issues related to the JVM heap space. 
>>>
>>> With the Streams API it is now very convenient to return a lazily-evaluated stream of Model objects. It is also
veryconvenient to make this stream pull data directly from an underlying ResultSet, tronsforming each row on-the-fly
intoa Model object. This, however, calls for holdable result sets because the transaction commits when program control
leavesthe Service layer. 
>>>
>>> The Spring team has recognized the relevance of the above use case and with release 4.1.2 they have introduced a
specificenhancement needed to support result sets holdable into the View layer (albeit only when JDBC is used over
Hibernate).This is described in the issue SPR-12349 [1]. Spring also plans to support this use case with additional
helpercode which turns Hibernate's ScrollableResults into a Stream (SPR-12388 [2]). 
>>>
>>> The above could raise the level of interest of the PostgreSQL JDBC team in implementing holdable result sets backed
bynative holdable cursors instead of the current client-side cursors, which don't allow the space complexity to be
reducedfrom O(n) to O(1) on the JVM side. I am aware that this is not a trivial endeavor as it requires intervention
intothe FE/BE protocol, but I would nevertheless propose that this concern be reassessed in the light of new
developmentsin the Java ecosystem. 
>>>
>>> Regards,
>>> Marko Topolnik
>>>
>>>
>>>
>>> [1] https://jira.spring.io/browse/SPR-12349
>>> [2] https://jira.spring.io/browse/SPR-12388
>>>
>>> --
>>> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-jdbc
>>>
>>
>



Re: New significance of holdable result sets in Java 8

От
Steven Schlansker
Дата:
On Nov 12, 2014, at 11:20 AM, Marko Topolnik <marko.topolnik@gmail.com> wrote:
>
> I am aware of this approach; I even discuss it as a typical workaround in my blog post [1]. The problem is, this
completelydispenses with the detailed and complex setup present in the View layer: the response format can be driven by
theAccept HTTP header, encoding may be adapted to the request, and so on. I did pretty much that in my Clojure projects
becauseClojure's ecosystem isn't as well-developed so roll-your-own solutions are the norm. With Spring, however, you
losea great deal when you give your Controller method access to the raw response. 
>
> I have already developed a Stream-based REST service along the lines I discuss in the mentioned blog post and the
advantagesare quite clear to me. It is a direction that Spring would probably like to move in as well. 
>
> [1] http://www.airpair.com/v1/posts/spring-streams-memory-efficiency
> ...
> You still seem to dictate the output format directly from the Controller, the avoidance of which is motivating my
approach.
>

Indeed, after reading your blog post, it looks like we have 95% the exact same solution.

The essential difference I propose is that instead of having a holdable ResultSet (which requires PG-JDBC changes)
you hold the Connection/Statement/ResultSet with transaction open for the duration.

This way the transaction stays open for the duration of your request.  So you establish a stream all the way through
from end client -> PG.

With your suggestion, a holdable ResultSet, you run into what Dave mentioned:

http://www.postgresql.org/docs/9.3/static/sql-declare.html
> In the current implementation, the rows represented by a held cursor are copied into a temporary file or memory area
sothat they remain available for subsequent transactions. 

So in fact you are just materializing the result set server side.  This IMO defeats the entire purpose - you cannot
beginstreaming out results until the materialize finishes (the holdable cursor will not be available until said
materializeis done, I believe?  or at least the transaction will not commit until that point?)  Therefore my proposal
issignificantly more efficient and will never surprisingly materialize gigabytes of temporary disk usage. 

Makes more sense?



Re: New significance of holdable result sets in Java 8

От
Marko Topolnik
Дата:
Yes, your approach makes perfect sense---as long as you don't care about transactional semantics.

If the transaction boundary is moved all the way to the end of serving the response, the following happens:

1. 200 OK status code reported, but transaction still not committed;
2. transaction may roll back due to errors in transfer;
3. thanks to buffering, the transfer may also fail _after_ the transaction has committed.

So, not only have you established quite messy and illogical transaction semantics for your client (for example, a
typicalonSuccess callback on the HTTP request cannot assume the transaction succeeded); the client has no idea _in
principle_what happened to the transaction if transfer is aborted for any reason. 

In an earlier incarnation of my project I did indeed have setup exactly as you describe it, but after realizing the
aboveproblems, I had to back off and revert to standard transaction boundaries. 


On 12. stu. 2014., at 23:04, Steven Schlansker <stevenschlansker@gmail.com> wrote:

>
> On Nov 12, 2014, at 11:20 AM, Marko Topolnik <marko.topolnik@gmail.com> wrote:
>>
>> I am aware of this approach; I even discuss it as a typical workaround in my blog post [1]. The problem is, this
completelydispenses with the detailed and complex setup present in the View layer: the response format can be driven by
theAccept HTTP header, encoding may be adapted to the request, and so on. I did pretty much that in my Clojure projects
becauseClojure's ecosystem isn't as well-developed so roll-your-own solutions are the norm. With Spring, however, you
losea great deal when you give your Controller method access to the raw response. 
>>
>> I have already developed a Stream-based REST service along the lines I discuss in the mentioned blog post and the
advantagesare quite clear to me. It is a direction that Spring would probably like to move in as well. 
>>
>> [1] http://www.airpair.com/v1/posts/spring-streams-memory-efficiency
>> ...
>> You still seem to dictate the output format directly from the Controller, the avoidance of which is motivating my
approach.
>>
>
> Indeed, after reading your blog post, it looks like we have 95% the exact same solution.
>
> The essential difference I propose is that instead of having a holdable ResultSet (which requires PG-JDBC changes)
> you hold the Connection/Statement/ResultSet with transaction open for the duration.
>
> This way the transaction stays open for the duration of your request.  So you establish a stream all the way through
> from end client -> PG.
>
> With your suggestion, a holdable ResultSet, you run into what Dave mentioned:
>
> http://www.postgresql.org/docs/9.3/static/sql-declare.html
>> In the current implementation, the rows represented by a held cursor are copied into a temporary file or memory area
sothat they remain available for subsequent transactions. 
>
> So in fact you are just materializing the result set server side.  This IMO defeats the entire purpose - you cannot
beginstreaming out results until the materialize finishes (the holdable cursor will not be available until said
materializeis done, I believe?  or at least the transaction will not commit until that point?)  Therefore my proposal
issignificantly more efficient and will never surprisingly materialize gigabytes of temporary disk usage. 
>
> Makes more sense?
>



Re: New significance of holdable result sets in Java 8

От
Marko Topolnik
Дата:
Dave,

when you say this, is the hat you are wearing that of a database implementor or that of a system designer?

A system designer may very reasonably want his system designed along the following guidelines:

1. the database is about storage: give it plenty of that, and make it fast (SSD);
2. the middle tier is about business logic: focus on CPU power and internet bandwith, dimension RAM as needed to serve
asmany concurrent requests as the CPU can take. 

--
Marko Topolnik

On 12. stu. 2014., at 20:18, Dave Cramer <pg@fastcrypt.com> wrote:

> Marko,
>
> While PG might take less RAM per cursor than the client side, we will very quickly run out if there are N clients
whereN is some non-trivial number and the cursors are large. This would potentially be catastrophic since PostgreSQL
willspill to disk and potentially run out of disk space. 
>
> You may argue that it might not happen but I'd prefer the client crashed than the server.
>
> Dave
>
> Dave Cramer
>
> dave.cramer(at)credativ(dot)ca
> http://www.credativ.ca
>
> On 12 November 2014 13:58, Kevin Wooten <kdubb@me.com> wrote:
> I’m inclined to agree with Dave, your usage of holdable cursors sounds like an extremely burdensome (server wise)
solutionto a “nice to have” application architecture solution. 
>
> Why not make a stream adapter that fetches results in groups using LIMIT/OFFSET.  This would work in any situation
regardlessof cursors, transactions, etc and would *only* cost for large result sets that need to extra round trips. 
>
> That being said… pgjdbc-ng uses real cursors when asked for them and respects the foldability requirement.  There are
limitationsbased on postgres’s feature set but I believe what you are asking for works.  
>
>> On Nov 12, 2014, at 11:51 AM, Dave Cramer <pg@fastcrypt.com> wrote:
>>
>> Marko,
>>
>> When you say holdable cursors are you referring to a holdable cursor outside of a transaction? It seems so because
thetransaction commits after leaving the service layer ? 
>>
>> If so these are not without significant cost on the server side.
>>
>> Dave Cramer
>>
>> dave.cramer(at)credativ(dot)ca
>> http://www.credativ.ca
>>
>> On 12 November 2014 10:22, Marko Topolnik <marko.topolnik@gmail.com> wrote:
>> As of the release of Java 8 and its Streams API a new door has opened for many things, including an important
improvementin the way RESTful services can be implemented. Let me briefly describe the architecture with Spring's REST
support:an MVC framework is used where the Controller dispatches the HTTP request to a Service method; the Service
methodcontacts the database and returns a Model-oriented representation of the response; the View layer then transforms
itinto the actual HTTP response bytes. 
>>
>> Data is passed from Controller to View as the return value of a method. Traditionally, if you wanted a
collection-shapedresponse, you would return a List. This meant eager loading of all data needed for the response, which
causedscalability issues related to the JVM heap space. 
>>
>> With the Streams API it is now very convenient to return a lazily-evaluated stream of Model objects. It is also very
convenientto make this stream pull data directly from an underlying ResultSet, tronsforming each row on-the-fly into a
Modelobject. This, however, calls for holdable result sets because the transaction commits when program control leaves
theService layer. 
>>
>> The Spring team has recognized the relevance of the above use case and with release 4.1.2 they have introduced a
specificenhancement needed to support result sets holdable into the View layer (albeit only when JDBC is used over
Hibernate).This is described in the issue SPR-12349 [1]. Spring also plans to support this use case with additional
helpercode which turns Hibernate's ScrollableResults into a Stream (SPR-12388 [2]). 
>>
>> The above could raise the level of interest of the PostgreSQL JDBC team in implementing holdable result sets backed
bynative holdable cursors instead of the current client-side cursors, which don't allow the space complexity to be
reducedfrom O(n) to O(1) on the JVM side. I am aware that this is not a trivial endeavor as it requires intervention
intothe FE/BE protocol, but I would nevertheless propose that this concern be reassessed in the light of new
developmentsin the Java ecosystem. 
>>
>> Regards,
>> Marko Topolnik
>>
>>
>>
>>  [1] https://jira.spring.io/browse/SPR-12349
>>  [2] https://jira.spring.io/browse/SPR-12388
>>
>> --
>> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-jdbc
>>
>
>



Re: New significance of holdable result sets in Java 8

От
Florent Guillaume
Дата:
On Thu, Nov 13, 2014 at 1:08 PM, Marko Topolnik
<marko.topolnik@gmail.com> wrote:
> Yes, your approach makes perfect sense---as long as you don't care about transactional semantics.
>
> If the transaction boundary is moved all the way to the end of serving the response, the following happens:
>
> 1. 200 OK status code reported, but transaction still not committed;
> 2. transaction may roll back due to errors in transfer;
> 3. thanks to buffering, the transfer may also fail _after_ the transaction has committed.
>
> So, not only have you established quite messy and illogical transaction semantics for your client (for example, a
typicalonSuccess callback on the HTTP request cannot assume the transaction succeeded); the client has no idea _in
principle_what happened to the transaction if transfer is aborted for any reason. 
>
> In an earlier incarnation of my project I did indeed have setup exactly as you describe it, but after realizing the
aboveproblems, I had to back off and revert to standard transaction boundaries. 
>

Another slightly subtler problem is the following:
1. 200 OK reported but transaction still not committed due to various
async delays
2. client makes another request assuming the transaction is committed
3. result of 2. returned fast to client, transaction in 1. still not committed
4. client doesn't see committed data

Florent

--
Florent Guillaume, Director of R&D, Nuxeo
Open Source Content Management Platform for Business Apps
http://www.nuxeo.com   http://community.nuxeo.com


Re: New significance of holdable result sets in Java 8

От
Steven Schlansker
Дата:
On Nov 13, 2014, at 4:08 AM, Marko Topolnik <marko.topolnik@gmail.com> wrote:

> Yes, your approach makes perfect sense---as long as you don't care about transactional semantics.
>
> If the transaction boundary is moved all the way to the end of serving the response, the following happens:
>
> 1. 200 OK status code reported, but transaction still not committed;
> 2. transaction may roll back due to errors in transfer;
> 3. thanks to buffering, the transfer may also fail _after_ the transaction has committed.
>
> So, not only have you established quite messy and illogical transaction semantics for your client (for example, a
typicalonSuccess callback on the HTTP request cannot assume the transaction succeeded); the client has no idea _in
principle_what happened to the transaction if transfer is aborted for any reason. 
>
> In an earlier incarnation of my project I did indeed have setup exactly as you describe it, but after realizing the
aboveproblems, I had to back off and revert to standard transaction boundaries. 
>

This is entirely correct.

We decided that this was OK for our use case, as in general queries that returned enough data to be worth streaming
weresimple SELECTs which were not expected to commingle with UPDATEs or otherwise promise any transactional guarantees
(otherthan the read was a consistent snapshot). 

We also changed our API such that after all results streamed, it emitted a status report.  e.g.

HTTP 200 OK
{ "results" : [..., ...], "status": {"success": true} }

A little janky, but it worked well for us.

>
> On 12. stu. 2014., at 23:04, Steven Schlansker <stevenschlansker@gmail.com> wrote:
>
>>
>> On Nov 12, 2014, at 11:20 AM, Marko Topolnik <marko.topolnik@gmail.com> wrote:
>>>
>>> I am aware of this approach; I even discuss it as a typical workaround in my blog post [1]. The problem is, this
completelydispenses with the detailed and complex setup present in the View layer: the response format can be driven by
theAccept HTTP header, encoding may be adapted to the request, and so on. I did pretty much that in my Clojure projects
becauseClojure's ecosystem isn't as well-developed so roll-your-own solutions are the norm. With Spring, however, you
losea great deal when you give your Controller method access to the raw response. 
>>>
>>> I have already developed a Stream-based REST service along the lines I discuss in the mentioned blog post and the
advantagesare quite clear to me. It is a direction that Spring would probably like to move in as well. 
>>>
>>> [1] http://www.airpair.com/v1/posts/spring-streams-memory-efficiency
>>> ...
>>> You still seem to dictate the output format directly from the Controller, the avoidance of which is motivating my
approach.
>>>
>>
>> Indeed, after reading your blog post, it looks like we have 95% the exact same solution.
>>
>> The essential difference I propose is that instead of having a holdable ResultSet (which requires PG-JDBC changes)
>> you hold the Connection/Statement/ResultSet with transaction open for the duration.
>>
>> This way the transaction stays open for the duration of your request.  So you establish a stream all the way through
>> from end client -> PG.
>>
>> With your suggestion, a holdable ResultSet, you run into what Dave mentioned:
>>
>> http://www.postgresql.org/docs/9.3/static/sql-declare.html
>>> In the current implementation, the rows represented by a held cursor are copied into a temporary file or memory
areaso that they remain available for subsequent transactions. 
>>
>> So in fact you are just materializing the result set server side.  This IMO defeats the entire purpose - you cannot
beginstreaming out results until the materialize finishes (the holdable cursor will not be available until said
materializeis done, I believe?  or at least the transaction will not commit until that point?)  Therefore my proposal
issignificantly more efficient and will never surprisingly materialize gigabytes of temporary disk usage. 
>>
>> Makes more sense?
>>
>