behavior of PGtokenizer w/ escaped delim
От | Ivonne Lopez |
---|---|
Тема | behavior of PGtokenizer w/ escaped delim |
Дата | |
Msg-id | 887B3E51-6ABF-40D2-89B7-D2BECF8CF124@me.com обсуждение исходный текст |
Ответы |
Re: behavior of PGtokenizer w/ escaped delim
|
Список | pgsql-jdbc |
I have been working on a project that requires using array_agg of composites with left joins in a one to many relationship.
Array cars = rs.getArray("cars");
assertNotNull(cars);
ResultSet rsCars = cars.getResultSet();
assertNotNull(rsCars);
while (rsCars.next()) {
String comp = rsCars.getString(2);
PGtokenizer token = new PGtokenizer(PGtokenizer.removePara(comp),',');
for (int i = 0; i < token.getSize(); i++) {
System.out.println(token.getToken(i));
}
My issue comes when parsing that results back in jdbc using PGtokenizer. Initially after reading the comments in the file I still believed that it would be able to sparse my result correctly, however on further inspection the only nesting it supports is '(' ')' '[' ']' '<' and '>'. If there is a comma within the text, even though pg "escapes" it by surrounding that column in quotes, PGtokenizer doesn't honor this.
That being said, I add that support to my forked copy of pgjdbc https://github.com/ivlo11/pgjdbc/tree/nest_quotes_pgtokenizer and so far it works perfectly. I still have to commit in the junit test cases I used.
So here are my questions:
1. Am I naive to think this approach will work?
2. And if not, what are the possibilities of this getting pulled into pgjdbc's master?
If you want to reproduce this, a sample setup is below.
Thanks,
Ivonne
CREATE TABLE person
(
id serial,
name character(10),
CONSTRAINT pkey_person PRIMARY KEY (id)
);
CREATE TABLE car
(
id serial,
name character(10),
description character varying,
owner integer,
CONSTRAINT pkey_car PRIMARY KEY (id),
CONSTRAINT fkey_car FOREIGN KEY (owner) REFERENCES person (id) ON UPDATE CASCADE ON DELETE CASCADE
);
INSERT INTO person (name) VALUES ('ivonne');
INSERT INTO car (name,description,owner) VALUES ('lexus','December 20, 2011',1),('honda','April 1, 2008',1),('mitsubishi','August 12, 1998',1);
SELECT person.name, array_agg(DISTINCT (car.id, car.name, car.description)) AS cars
FROM person
LEFT JOIN car ON (car.owner = person.id)
WHERE person.id = 1
GROUP BY person.name LIMIT 1
"ivonne ";"{"(1,\"lexus \",\"December 20, 2011\")","(2,\"honda \",\"April 1, 2008\")","(3,mitsubishi,\"August 12, 1998\")"}"
As you can see with the composite type some columns that can have quotes to deal with spaces and commas within them. Considering how escaping works in pg, I assumed that using PGTokenizer would be most efficient versus StringTokenizer.
@Test
public void testSelectPersonWithArrayCompositeLeftJoin() {
DBConnection dbconn = new DBConnection();
ResultSet rs;
try (Connection conn = dbconn.getConnection();)
{
rs = ORMPerson.selectPerson(conn, 1); // uses a prepared statement of the above select query
assertNotNull(rs);
assertTrue(rs.next());
public void testSelectPersonWithArrayCompositeLeftJoin() {
DBConnection dbconn = new DBConnection();
ResultSet rs;
try (Connection conn = dbconn.getConnection();)
{
rs = ORMPerson.selectPerson(conn, 1); // uses a prepared statement of the above select query
assertNotNull(rs);
assertTrue(rs.next());
Array cars = rs.getArray("cars");
assertNotNull(cars);
ResultSet rsCars = cars.getResultSet();
assertNotNull(rsCars);
while (rsCars.next()) {
String comp = rsCars.getString(2);
PGtokenizer token = new PGtokenizer(PGtokenizer.removePara(comp),',');
for (int i = 0; i < token.getSize(); i++) {
System.out.println(token.getToken(i));
}
System.out.println();
}
} catch (SQLException e) {
e.printStackTrace();
fail("SQLException returned, couldn't select person");
}
}
But this returns this:
}
} catch (SQLException e) {
e.printStackTrace();
fail("SQLException returned, couldn't select person");
}
}
But this returns this:
1
"lexus "
"December 20
2011"
"lexus "
"December 20
2011"
2
"honda "
"April 1
2008"
"honda "
"April 1
2008"
3
mitsubishi
"August 12
1998"
Which as you can see isn't right… the date is split into two "columns"
mitsubishi
"August 12
1998"
Which as you can see isn't right… the date is split into two "columns"
В списке pgsql-jdbc по дате отправления: