Re: ODBC driver for Windows & future...
From:        Marko Ristola
Subject:     Re: ODBC driver for Windows & future...
Msg-id:      41C7F33F.2070106@kolumbus.fi
In reply to: ODBC driver for Windows & future... ("Philippe Lang" <philippe.lang@attiksystem.ch>)
Responses:   Re: ODBC driver for Windows & future... (markw@mohawksoft.com)
List:        pgsql-odbc
Hi,

I have investigated the ODBC driver's behaviour when the query result has very many rows. It seems that the whole query result is stored in a memory buffer before any further processing. That buffer is reallocated when needed: if the buffer is, for example, 50 MB, its size becomes 100 MB after reallocation. These malloc() or realloc() calls take a very long time. I investigated this bottleneck in the Linux ODBC driver.

The procedure without an ODBC cursor is as follows:

1. Read all query result data from the backend into the huge buffer
   (sometimes restructuring the buffer, if some column size in the
   buffer is exceeded). This seems to be the bottleneck, because of
   the large malloc() operation.
2. Read (and convert) the requested results from the buffer for the
   given row.

Allocating huge buffers is inefficient. The Linux operating system handles big files much better than big memory areas, so it would be more efficient to use a temporary file: sequential file scans are rather fast.

One way of solving the problem: maybe the key to removing the bottleneck is to tune the operating system to free enough memory beforehand. If the operating system has 100 MB of unused memory, allocation is a lot faster than if it has only 2 MB of unused memory ready for fast allocations.

A good way of solving the problem: the bottleneck can be avoided on the program side by using an ODBC cursor. With an ODBC cursor one can fetch, for example, 1000 rows per batch from the database server; you get the next 1000 rows with a new fetch. This way there is no limit on the number of rows fetched, on any database. (A sketch of such a batched fetch follows at the end of this message.)

On large result sets there is always a memory limit on 32-bit systems. On 64-bit systems this limit goes away, but the slowdown from non-local CPU memory won't go away even on high-end machines (NUMA machines have about 2 GB of memory near each CPU; other memory is behind a slower bus), so memory allocations over 2 GB are not good for speed. The ODBC cursor therefore scales well for any huge query result, on any client operating system. It works even in Java, where memory is extremely limited.

Other ways to solve the problem, in the ODBC code itself: how about allocating memory in 4 MB chunks? The operating system handles small memory allocations more easily and can free up memory in the background while the ODBC driver fills the current chunk. (A sketch of such a chunked buffer also follows at the end of this message.)

Marko Ristola

Shachar Shemesh wrote:
> Also of interest is that this very same client is also interested in
> the ODBC driver for a different project. We have already done some
> porting of their application, and have spotted a serious performance
> issue with ODBC when long query results are retrieved. It is possible
> (though it would be best not to count on it) that we will do some
> work in that direction on ODBC in the foreseeable future.
>
> The reason we did not step forward and offer ourselves as full
> maintainers of the code is that we don't feel we have the resources
> for that. It is good to know, however, that the facilities for
> sending patches and having them committed exist.
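Here is a minimal sketch of the batched fetching described above, using the standard ODBC block-cursor attributes. The DSN name "pgtest", the table "big_table", and the batch size are made-up illustrations, not part of the driver; error handling is omitted for brevity.

    #include <stdio.h>
    #include <sql.h>
    #include <sqlext.h>

    #define BATCH 1000          /* rows fetched per round trip */

    int main(void)
    {
        SQLHENV    env;
        SQLHDBC    dbc;
        SQLHSTMT   stmt;
        SQLINTEGER ids[BATCH];  /* one batch of column values */
        SQLLEN     ind[BATCH];  /* per-row indicator values   */
        SQLULEN    nfetched = 0;

        SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
        SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION,
                      (SQLPOINTER)SQL_OV_ODBC3, 0);
        SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);
        /* "pgtest" is a hypothetical DSN pointing at PostgreSQL. */
        SQLConnect(dbc, (SQLCHAR *)"pgtest", SQL_NTS, NULL, 0, NULL, 0);
        SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);

        /* Ask for up to BATCH rows per SQLFetchScroll() call. */
        SQLSetStmtAttr(stmt, SQL_ATTR_ROW_ARRAY_SIZE,
                       (SQLPOINTER)BATCH, 0);
        SQLSetStmtAttr(stmt, SQL_ATTR_ROWS_FETCHED_PTR, &nfetched, 0);
        SQLBindCol(stmt, 1, SQL_C_SLONG, ids, 0, ind);

        SQLExecDirect(stmt, (SQLCHAR *)"SELECT id FROM big_table",
                      SQL_NTS);

        /* Each iteration processes one batch of up to BATCH rows. */
        while (SQL_SUCCEEDED(SQLFetchScroll(stmt, SQL_FETCH_NEXT, 0))) {
            for (SQLULEN i = 0; i < nfetched; i++)
                printf("%ld\n", (long)ids[i]);
        }

        SQLFreeHandle(SQL_HANDLE_STMT, stmt);
        SQLDisconnect(dbc);
        SQLFreeHandle(SQL_HANDLE_DBC, dbc);
        SQLFreeHandle(SQL_HANDLE_ENV, env);
        return 0;
    }

Note that with psqlODBC specifically, the UseDeclareFetch and Fetch connection options (if I remember them correctly) make the driver run the query through a server-side DECLARE/FETCH cursor, so the full result set is never buffered in client memory at all.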
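And a sketch of the 4 MB chunk idea: a list of fixed-size chunks grows by appending a new chunk instead of realloc()ing one huge buffer, so existing data is never copied. The names here (ChunkBuffer, chunkbuf_append) are hypothetical, not taken from the driver source.

    #include <stdlib.h>
    #include <string.h>

    #define CHUNK_SIZE (4 * 1024 * 1024)    /* 4 MB per chunk */

    typedef struct Chunk {
        struct Chunk *next;
        size_t        used;
        char          data[CHUNK_SIZE];
    } Chunk;

    typedef struct {
        Chunk *head;
        Chunk *tail;
    } ChunkBuffer;

    /* Append bytes, allocating a new 4 MB chunk only when the current
     * one is full.  Unlike realloc(), no existing data is ever moved. */
    static int chunkbuf_append(ChunkBuffer *buf, const char *src,
                               size_t len)
    {
        while (len > 0) {
            if (buf->tail == NULL || buf->tail->used == CHUNK_SIZE) {
                Chunk *c = malloc(sizeof(Chunk));
                if (c == NULL)
                    return -1;
                c->next = NULL;
                c->used = 0;
                if (buf->tail)
                    buf->tail->next = c;
                else
                    buf->head = c;
                buf->tail = c;
            }
            size_t room = CHUNK_SIZE - buf->tail->used;
            size_t n = len < room ? len : room;
            memcpy(buf->tail->data + buf->tail->used, src, n);
            buf->tail->used += n;
            src += n;
            len -= n;
        }
        return 0;
    }

chunkbuf_append() could be called once per network packet as the result set arrives; reading the rows back is then a sequential walk over the chunk list, which is cheap compared with one enormous allocation.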