Unicode database + JDBC driver performance
From | Jan Ploski |
---|---|
Subject | Unicode database + JDBC driver performance |
Date | |
Msg-id | 13799530.1040481150216.JavaMail.jpl@remotejava |
Replies |
Re: Unicode database + JDBC driver performance
Re: Unicode database + JDBC driver performance |
List | pgsql-general |
Hello, I have some questions regarding PostgreSQL's handling of Unicode databases and their performance. I am using version 7.2.1 and running two benchmarks against a database set up with LATIN1 encoding and against the same database set up with UNICODE encoding.

The database consists of a single "test" table:

     Column |  Type   | Modifiers
    --------+---------+-----------
     id     | integer | not null
     txt    | text    | not null
    Primary key: test_pkey

The client is written in Java, relies on the official JDBC driver, and runs on the same machine as the database.

Benchmark 1: insert 10,000 rows into table "test" (in 10 transactions, 1,000 rows per transaction). Each row contains 674 characters, most of which are ASCII.

Benchmark 2: select * from test, repeated 10 times in a loop.

I am measuring the disk space taken by the database in each case (LATIN1 vs UNICODE) and the time it takes to run the benchmarks. I don't understand the results.

Disk space change (after the inserts and vacuumdb -f):

    LATIN1   UNICODE
    764K     640K

I would rather have assumed that the Unicode database takes more space, perhaps even twice as much. Apparently not (and that's nice).

Average benchmark execution times (obtained with the 'time' command, repeatedly):

    Benchmark 1:
    LATIN1   UNICODE
    11.5s    14.5s

    Benchmark 2:
    LATIN1   UNICODE
    4.7s     8.6s

The Unicode database is slower on INSERTs and especially on SELECTs, and I am wondering why. Since Java uses Unicode internally, shouldn't it actually be more efficient to store/retrieve character data in that format, with no recoding? Maybe it is an issue with the JDBC driver? Or is handling Unicode inherently much slower on the backend side?

Take care - JPL
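[Editor's note: the observation that the UNICODE database is not noticeably larger is consistent with how UTF-8 works: ASCII characters occupy a single byte, so a mostly-ASCII 674-character payload is nearly the same size in UTF-8 as in LATIN1. The sketch below is not from the original post; the class name, the synthetic payload, and the 670/4 character split are illustrative assumptions, chosen to mimic the "674 characters, mostly ASCII" rows from Benchmark 1.]

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSize {
    // Number of bytes needed to store s in the given charset.
    static int byteLen(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        // Build a 674-character payload, mostly ASCII, with a few
        // non-ASCII characters mixed in (a hypothetical stand-in for
        // the benchmark's row data).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 670; i++) sb.append('a');
        sb.append("äöüß");
        String row = sb.toString();

        // LATIN1 stores every character in 1 byte; UTF-8 stores ASCII
        // in 1 byte and these Latin-1 accented characters in 2 bytes.
        System.out.println("LATIN1 bytes: " + byteLen(row, StandardCharsets.ISO_8859_1)); // 674
        System.out.println("UTF-8  bytes: " + byteLen(row, StandardCharsets.UTF_8));      // 678
    }
}
```

So for this kind of data the on-disk size difference between the two encodings is only a few bytes per row, which matches the disk-space numbers above; the runtime difference is more likely dominated by the encoding conversions performed by the backend and the driver than by data volume.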