Re: [GENERAL] Insertion of large xml files into PostgreSQL 10beta1

From: David G. Johnston
Subject: Re: [GENERAL] Insertion of large xml files into PostgreSQL 10beta1
Msg-id: CAKFQuwZ5UAtSjq=hR6rXfq6V++xOhomADEvOf+7i45D8DD_1sA@mail.gmail.com
In reply to: [GENERAL] Insertion of large xml files into PostgreSQL 10beta1  (Alain Toussaint <atoussaint1976@gmail.com>)
Responses: Re: [GENERAL] Insertion of large xml files into PostgreSQL 10beta1
List: pgsql-general
On Fri, Jun 23, 2017 at 8:19 AM, Alain Toussaint <atoussaint1976@gmail.com> wrote:
Hello,

I am building a PostgreSQL server into which I intend to load the
entire PubMed database (23GB bzip2 compressed, 216GB unpacked), which
is available in XML form. Here is an example:

https://www.ncbi.nlm.nih.gov/pubmed/21833294?report=xml&format=text

I looked at the documentation as well as several code examples for
loading the data, and the one example that nearly succeeded is this
procedure:

/usr/bin/psql medline

\set :largexmlfile: 'cat /srv/pgsql/pubmed/medline17n0001.xml'
INSERT INTO samples (xmldata) VALUES :largexmlfile:

I'll assume you've just mis-keyed this from memory, since the syntax of the above doesn't look right.

(from reading the list post here:
https://www.postgresql.org/message-id/20160624083757.GA5459%40msg.df7cb.de)

With this procedure, about 334MB of data from medline17n0001.xml floods
the monitor.

If the above general command sequence is done right, and echoing of commands is turned off, you should not see any of the XML file content echoed to the output.
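For reference, the psql manual documents a backtick form of \set for reading a file's content into a variable, and the :'name' form for interpolating that variable as a quoted SQL literal. The Python sketch below only generates such a command sequence as text, so the syntax can be inspected. The helper name build_psql_script is hypothetical, the table and column names are copied from the quoted INSERT, and the file path is assumed to contain no characters that are special to the shell or to psql. It illustrates the general shape of the sequence and is not necessarily the exact set of commands from the linked post.

```python
import shlex


def build_psql_script(xml_path, table="samples", column="xmldata"):
    """Return psql commands that load one file into one row (hypothetical helper)."""
    # Quote the path for the shell that psql invokes for the backtick command.
    quoted_path = shlex.quote(xml_path)
    lines = [
        # ECHO defaults to "none"; set it explicitly in case a psqlrc changed it.
        r"\set ECHO none",
        # Backticks run a shell command and store its output in the variable.
        f"\\set content `cat {quoted_path}`",
        # :'content' interpolates the variable as a properly quoted SQL literal.
        f"INSERT INTO {table} ({column}) VALUES (:'content');",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_psql_script("/srv/pgsql/pubmed/medline17n0001.xml"), end="")
```

The generated text could be saved to a file and run with psql -q -f against the target database. Whether a file of this size fits comfortably into a single value is a separate question that this sketch does not address.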
 

Is it possible to turn off validation of the content between the XML
tags of the files?


You can either turn off validation for the entire file or leave it on - PostgreSQL isn't recognizing tags here (you haven't defined the samples table for us...).

Narrowing down the entire file to a small problem region and posting a self-contained example, or at least providing the error messages and content, might help elicit good responses. Even if you could load the data without incident, using it may end up proving problematic. That said, character encodings and sets are not my strong suit.

David J.
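
One possible way to narrow a large file down to a small problem region, assuming (as the closing remark about character encodings hints, though the thread does not confirm it) that the failure is encoding-related, is to locate the first byte sequence that is not valid UTF-8 and print a little context around it. The function name first_invalid_utf8 and the context size are hypothetical, and the sketch reads the whole input into memory, which is workable for a file of a few hundred megabytes but not for the full data set at once.

```python
def first_invalid_utf8(data, context=20):
    """Return (offset, nearby_bytes) for the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start and exc.end delimit the offending bytes within data.
        start = max(0, exc.start - context)
        end = min(len(data), exc.end + context)
        return exc.start, data[start:end]
    return None


if __name__ == "__main__":
    # A Latin-1 encoded "é" (0xE9) is not valid UTF-8 when followed by "<".
    sample = b"<a>caf\xe9</a>"
    print(first_invalid_utf8(sample))
```

For a real file, the bytes could be obtained with open(path, "rb").read(), and the reported offset and context could then be posted along with the server's error message.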
