Re: new String(byte[]) performance
From | Aaron Mulder |
---|---|
Subject | Re: new String(byte[]) performance |
Date | |
Msg-id | Pine.LNX.4.44.0210221224440.13321-100000@www.princetongames.org |
In response to | Re: new String(byte[]) performance (Barry Lind <blind@xythos.com>) |
List | pgsql-jdbc |
Barry,

Are you saying that the server returns everything as strings/characters
no matter what? For example, if it sends the number "123456", will that
be 7 bytes (6 + null) rather than 4 bytes (an int)? Can we make it send
the 4-byte int value instead?

Aaron

On Mon, 21 Oct 2002, Barry Lind wrote:
> Teofilis,
>
> I don't think the problem you are seeing is a result of using Java.
> It is more the result of the architecture of the JDBC driver. I know
> other followups to this email have suggested fixes at the IO level, and
> while I think that may be interesting to look into, I think there is a
> lot that can be done to improve performance within the existing code
> that can work on all JDKs (1.1, 1.2, 1.3 and 1.4).
>
> If you look at what is happening in the driver when you do something as
> simple as 'select 1', you can see many areas for improvement.
>
> The first thing that the driver does is allocate byte[] objects for each
> value being selected (a two-dimensional array of rows and columns).
> This makes sense since the values are being read raw off of the socket
> and need to be stored somewhere, and byte[] seems a reasonable datatype.
> However, this results in allocating many, many, many small objects off
> of the Java heap, which then need to be garbage collected later (and
> garbage collection isn't free; it takes a lot of CPU that could be used
> for other things). One design pattern to deal with this problem is to
> use an object pool and reuse byte[] objects to avoid the excessive
> overhead of object creation and garbage collection. There have been
> two attempts at this in the past: one I did (but lost due to a hard
> drive crash) and another that was checked into CVS but had so many
> issues that it was never used.
>
> However, the byte[] objects are only the first problem. For example, take
> a call to the getInt() method. It converts the raw byte[] data to a
> String, and then that String is converted to an int.
> So a bunch more String objects are created (and then later garbage
> collected). These String objects are there because the Java API doesn't
> provide any methods to convert from byte[] to other types like int, long,
> BigDecimal, Timestamp, etc. So a String intermediary is used.
>
> So to get the int returned by getInt(), both a byte[] and a String object
> get created only to be garbage collected later, as they are just
> temporary objects.
>
> Now using object pools can help the allocation of byte[] objects, but
> doesn't help with String objects. However, if the driver started using
> char[] objects internally instead of Strings, these could be pooled as
> well. But this would probably mean that code like
> Integer.parseInt(String) would need to be reimplemented in the driver,
> since there is no corresponding Integer.parseInt(char[]).
>
>
> Now while I realize that there is a lot of room for improvement, I find
> that the overall performance of the PostgreSQL JDBC driver is similar to
> the drivers I have used for other databases (i.e. Oracle and MSSQL). So
> I wouldn't characterize the performance as bad, but it could be improved.
>
> thanks,
> --Barry
>
>
> Teofilis Martisius wrote:
> > On Sat, Oct 19, 2002 at 08:02:37PM -0700, Barry Lind wrote:
> >
> >>Teofilis,
> >>
> >>I have applied this patch. I also made the change so that when
> >>connected to a 7.3 database this optimization will always be used. This
> >>is done by having the server do the character set encoding/decoding and
> >>always using UTF-8 when dealing with the jdbc client.
> >>
> >>thanks,
> >>--Barry
> >>
> >
> >
> > Hi,
> >
> > Ok, thanks for applying that. Well, after doing some benchmarks, I can
> > say that Java sucks. Don't get me wrong, it is still my language of
> > choice and it is better than many other alternatives, but I have yet to
> > see a JVM that has good performance and no strange bottlenecks.
> > I was quite annoyed to see that executing a query via JDBC and iterating
> > over it from Java took 6x the time it takes to execute it with psql. This
> > patch helps a bit, but the performance overhead is still huge. Well, I
> > looked over the PostgreSQL JDBC driver code several times, and now I
> > don't see anything more that can be optimized. The things that take up
> > the most time now are transferring everything over the network
> > (PG_Stream.receiveTuple, if I remember correctly) and allocating memory
> > for byte[] arrays. But I don't know any way to speed them up.
> >
> > Teofilis Martisius,
> > teo@mediaworks.lt
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
> >
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>
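
Editorial sketch: the quoted message describes getInt() turning the raw byte[] into a String and then parsing that String, and notes that avoiding the intermediate would mean reimplementing something like Integer.parseInt in the driver. The following is a minimal illustration of that idea, not the driver's actual code. It assumes the value arrives as ASCII decimal digits (as the thread describes for "123456"); the class name AsciiIntParser and its parseInt(byte[], int, int) method are hypothetical names chosen for this example, and it uses a modern JDK rather than the 1.1 to 1.4 range discussed above.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: parses an int straight from ASCII bytes, with no String in between. */
public class AsciiIntParser {

    /**
     * Parses a signed decimal int from the ASCII digits in
     * buf[offset .. offset + length - 1] without allocating any objects
     * on the success path.
     */
    public static int parseInt(byte[] buf, int offset, int length) {
        if (buf == null || length <= 0 || offset < 0 || length > buf.length - offset) {
            throw new NumberFormatException("empty or out-of-range input");
        }
        int i = offset;
        int end = offset + length;
        boolean negative = false;
        if (buf[i] == '-' || buf[i] == '+') {
            negative = (buf[i] == '-');
            i++;
            if (i == end) {
                throw new NumberFormatException("sign without digits");
            }
        }
        // Accumulate as a negative number so that Integer.MIN_VALUE fits.
        int limit = negative ? Integer.MIN_VALUE : -Integer.MAX_VALUE;
        int multMin = limit / 10;
        int result = 0;
        for (; i < end; i++) {
            int digit = buf[i] - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("non-digit byte at index " + i);
            }
            if (result < multMin) {
                throw new NumberFormatException("value out of int range");
            }
            result *= 10;
            if (result < limit + digit) {
                throw new NumberFormatException("value out of int range");
            }
            result -= digit;
        }
        return negative ? result : -result;
    }

    public static void main(String[] args) {
        // Stand-in for a column value read off the socket as text.
        byte[] raw = "123456".getBytes(StandardCharsets.US_ASCII);
        System.out.println(parseInt(raw, 0, raw.length));
    }
}
```

Taking an offset and a length rather than a whole array means the same method could read a value out of a larger, reused buffer, which is the kind of pooling the message suggests for byte[] objects. Whether this yields a measurable gain in the driver would need benchmarking.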