Thread: COPY command returns "ERROR: invalid XML content"

COPY command returns "ERROR: invalid XML content"

From
Konstantin Izmailov
Date:
Hi,
I'm using libpq (v10) to import a large number of XML files into a PG10 table. I noticed that if the number of imported records exceeds 2100, the following error is returned:
ERROR:  invalid XML content
DETAIL:  line 1: Couldn't find end of Start Tag timeBasedFileNamingAndTriggerin line 1
logFile.%d{yyyy-MM-dd}.%i.html</fileNamePattern><timeBasedFileNamingAndTriggerin
                                                                               ^
line 1: Premature end of data in tag rollingPolicy line 1
line 1: Premature end of data in tag appender line 1
line 1: Premature end of data in tag configuration line 1
CONTEXT:  COPY xmltest, line 2098, column Roles

The target table was created as:
CREATE TABLE "xmltest" ("id" int NOT NULL primary key,"Roles" xml)

And the code that imports the files is very simple:
/* conn is the open PGconn; ti_buffer holds ti_size bytes of COPY BINARY data */
PQexec(conn, "COPY xmltest(id,\"Roles\") FROM stdin WITH BINARY");
PQputCopyData(conn, ti_buffer, ti_size);
PQputCopyEnd(conn, NULL);

If I call PQputCopyEnd() after every 2100 or fewer records placed into ti_buffer, I can import tens of thousands of XML files. But an attempt to import 2101 or more records causes the error. I inspected ti_buffer and confirmed that it contains all the expected data. It appears that libpq does not transmit the buffer to the server after about 9,100,000 characters. I'm running the application on Windows Server 2012R2, compiled with VS2013. The PG10 server is running on the same workstation on port 5410. I also tried to copy the records to Postgres 11 and got the same results.
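For readers who want to check what such a buffer should contain, here is a minimal sketch of the COPY BINARY layout for the two-column table above. The helper names (put_u16, put_u32, copy_header, copy_tuple, copy_trailer) are hypothetical and are not taken from the poster's application. It assumes the xml value is sent as the raw text bytes of the document and that the caller provides an output buffer that is large enough.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store integers in network byte order (big-endian), as COPY BINARY expects. */
static size_t put_u16(unsigned char *p, uint16_t v) {
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xFF);
    return 2;
}

static size_t put_u32(unsigned char *p, uint32_t v) {
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)((v >> 16) & 0xFF);
    p[2] = (unsigned char)((v >> 8) & 0xFF);
    p[3] = (unsigned char)(v & 0xFF);
    return 4;
}

/* File header: 11-byte signature, 32-bit flags field, 32-bit extension length. */
static size_t copy_header(unsigned char *out) {
    static const unsigned char sig[11] =
        {'P', 'G', 'C', 'O', 'P', 'Y', '\n', 0xFF, '\r', '\n', 0};
    size_t n = 0;
    memcpy(out, sig, sizeof sig);
    n += sizeof sig;
    n += put_u32(out + n, 0);   /* flags: no bits set */
    n += put_u32(out + n, 0);   /* header extension area is empty */
    return n;
}

/* One tuple for (id int, "Roles" xml): field count, then length + data per field. */
static size_t copy_tuple(unsigned char *out, int32_t id,
                         const char *xml, size_t xml_len) {
    size_t n = 0;
    assert(xml_len <= INT32_MAX);            /* field lengths are signed 32-bit */
    n += put_u16(out + n, 2);                /* two fields in this tuple */
    n += put_u32(out + n, 4);                /* id occupies 4 bytes */
    n += put_u32(out + n, (uint32_t)id);
    n += put_u32(out + n, (uint32_t)xml_len);
    memcpy(out + n, xml, xml_len);           /* assumption: xml sent as its text */
    n += xml_len;
    return n;
}

/* File trailer: a 16-bit word containing -1. */
static size_t copy_trailer(unsigned char *out) {
    return put_u16(out, 0xFFFF);
}
```

Every length word must match the number of bytes that follow it exactly. If a single tuple is off by even one byte, the server reads the remainder of the stream at the wrong offsets, which can surface as truncated or invalid XML in a later row.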

I would appreciate any suggestions on troubleshooting or resolving the issue. Thanks!

Re: COPY command returns "ERROR: invalid XML content"

From
Tomas Vondra
Date:
On Sun, Oct 06, 2019 at 08:45:40PM -0700, Konstantin Izmailov wrote:
>Hi,
>I'm using libpq (v10) to import lots of xml files into a PG10 table. I
>noticed if number of records imported exceeds 2100 then the following error
>is returned:
>ERROR:  invalid XML content
>DETAIL:  line 1: Couldn't find end of Start Tag
>timeBasedFileNamingAndTriggerin line 1
>logFile.%d{yyyy-MM-dd}.%i.html</fileNamePattern><timeBasedFileNamingAndTriggerin
>

My guess is this is an issue/limitation in libxml2, which we use to
parse and process XML. Which libxml2 version do you have installed? Can you
share an example of an XML document that reproduces the issue?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: COPY command returns "ERROR: invalid XML content"

From
Konstantin Izmailov
Date:
Tomas, thank you for your reply! I cannot upload 2100+ XML files. Some of them are huge.

I'm not sure whether libpq uses libxml2 on Windows. In the debugger I see very strange behavior in pqsecure_write: it seems to stop sending data from the provided buffer after 9,100,000 bytes.

I hoped that someone had come across a similar issue and could offer some insight. I will continue researching the issue.

On Mon, Oct 7, 2019 at 5:13 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
My guess is this is an issue/limitation in libxml2, which we use to
parse and process XML. Which libxml2 version do you have installed? Can you
share an example of an XML document that reproduces the issue?

Re: COPY command returns "ERROR: invalid XML content"

From
Konstantin Izmailov
Date:
Please ignore this thread. After several days of debugging I found a bug in my application: data became misaligned when internal buffers were reallocated. After fixing the application, everything works as expected. Sorry for the false alarm.
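The post does not show the faulty code, so the following is only a sketch of one common way to avoid this class of bug, not the poster's actual fix. It assumes the records are accumulated in a heap buffer that grows with realloc(). The GrowBuf type and its functions are hypothetical names. The key point is that positions inside the buffer are kept as offsets, because any pointer into the old block may be invalid after the buffer moves.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* A growable byte buffer; zero-initialize it before first use. */
typedef struct {
    unsigned char *data;  /* may change on every growth: never cache it */
    size_t len;           /* bytes currently used */
    size_t cap;           /* bytes currently allocated */
} GrowBuf;

/* Make room for `extra` more bytes. Returns 0 on success, -1 on failure. */
static int growbuf_reserve(GrowBuf *b, size_t extra) {
    if (extra > SIZE_MAX - b->len)
        return -1;                       /* requested size would overflow */
    size_t need = b->len + extra;
    if (need <= b->cap)
        return 0;
    size_t cap = b->cap ? b->cap : 64;
    while (cap < need) {
        if (cap > SIZE_MAX / 2) {        /* doubling would overflow */
            cap = need;
            break;
        }
        cap *= 2;
    }
    unsigned char *p = realloc(b->data, cap);
    if (p == NULL)
        return -1;                       /* the old block is still valid */
    b->data = p;                         /* the block may have moved */
    b->cap = cap;
    return 0;
}

/* Append n bytes. Returns the offset where they were stored,
 * or (size_t)-1 on failure. Callers keep the offset, not a pointer. */
static size_t growbuf_append(GrowBuf *b, const void *src, size_t n) {
    if (growbuf_reserve(b, n) != 0)
        return (size_t)-1;
    size_t off = b->len;
    assert(off + n <= b->cap);
    memcpy(b->data + off, src, n);
    b->len += n;
    return off;
}
```

A caller that needs to patch an earlier field, for example a length word written before its payload, would recompute the address as b.data + off after all appends are done instead of holding a pointer across calls.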

On Mon, Oct 7, 2019 at 7:03 PM Konstantin Izmailov <pgfizm@gmail.com> wrote:
Tomas, thank you for your reply! I cannot upload 2100+ XML files. Some of them are huge.

I'm not sure whether libpq uses libxml2 on Windows. In the debugger I see very strange behavior in pqsecure_write: it seems to stop sending data from the provided buffer after 9,100,000 bytes.

I hoped that someone had come across a similar issue and could offer some insight. I will continue researching the issue.
