Re: large xml database

From: Rob Sargent
Subject: Re: large xml database
Date:
Msg-id: 4CCDDEE9.9010803@gmail.com
In reply to: Re: large xml database  (Viktor Bojović <viktor.bojovic@gmail.com>)
Responses: Re: large xml database  (Viktor Bojović <viktor.bojovic@gmail.com>)
List: pgsql-sql

Viktor Bojović wrote:
>
> On Sun, Oct 31, 2010 at 9:42 PM, Rob Sargent <robjsargent@gmail.com> wrote:
>
>     Viktor Bojović wrote:
>
>         On Sun, Oct 31, 2010 at 2:26 AM, James Cloos
>         <cloos@jhcloos.com> wrote:
>
>            >>>>> "VB" == Viktor Bojović <viktor.bojovic@gmail.com> writes:
>
>            VB> I have a very big XML document, larger than 50GB, and
>            VB> want to import it into the database and transform it
>            VB> into a relational schema.
>
>            Were I doing such a conversion, I'd use perl to convert
>            the xml into something which COPY can grok. Any other
>            language, script or compiled, would work just as well.
>            The goal is to avoid having to slurp the whole xml
>            structure into memory.
>
>            -JimC
>            --
>            James Cloos <cloos@jhcloos.com>
>            OpenPGP: 1024D/ED7DAEA6
>
>
>         The insertion into the database is not a big problem. I
>         insert the data as XML documents, as varchar lines, or as XML
>         documents in varchar format. I usually use a transaction and
>         commit after each block of 1000 inserts, and it goes very
>         fast, so the insertion is done after a few hours.
>         But the problem occurs when I want to transform it inside the
>         database, from XML (varchar or XML format) into tables, by
>         parsing. That processing takes too much time in the database,
>         no matter whether it is stored as varchar lines, varchar
>         nodes, or the XML data type.
>
>         -- 
>         ---------------------------------------
>         Viktor Bojović
>         ---------------------------------------
>         Wherever I go, Murphy goes with me
>
>
>     Are you saying you first load the xml into the database, then
>     parse that xml into instances of objects (rows in tables)?
>
>
> Yes. That way takes less RAM than using XML::Twig or XML::Simple, so I
> tried using the PostgreSQL XML functions or regexes.
>
> -- 
> ---------------------------------------
> Viktor Bojović
> ---------------------------------------
> Wherever I go, Murphy goes with me
Is the entire load a set of "entry" elements, as your example contains?
If so, I believe it would parse nicely, and directly, into a tidy but
non-trivial schema, without the "middle-man" of having the xml in the db
(unless of course you prefer xpath to sql ;) ; see the snippet just below).
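
If you do keep the xml in the db, the shredding you describe would
presumably look something like this; the staging table, target columns,
and element paths here are invented for illustration, so adjust them to
whatever your entries actually carry:

    -- one document (or one <entry> fragment) per row in xml_staging
    INSERT INTO entries (accession, name)
    SELECT (xpath('/entry/accession/text()', doc))[1]::text,
           (xpath('/entry/name/text()', doc))[1]::text
    FROM xml_staging;

Row-at-a-time xpath evaluation over a 50GB corpus is, as you found,
exactly where the time goes, which is why I'd lean toward parsing
outside the database.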

The single most significant caveat I would offer is: beware, biologists
are involved. Inconsistency (or at least overloaded concepts) is almost
assured :). EMBL too is suspect, imho, but I've been out of that arena
for a while.
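
As for the direct route James suggested (stream the xml, emit
COPY-friendly lines), it might be as simple as the sketch below. The
<entry> children ('accession', 'name') and the target table are made up
here; substitute your real fields:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    # Stream the document entry by entry; memory stays flat because each
    # parsed <entry> is purged as soon as its row has been printed.
    my $twig = XML::Twig->new(
        twig_handlers => { entry => \&emit_entry },
    );
    $twig->parsefile($ARGV[0]);

    sub emit_entry {
        my ($t, $entry) = @_;
        # 'accession' and 'name' are hypothetical child elements.
        print join("\t",
                   $entry->first_child_text('accession'),
                   $entry->first_child_text('name')), "\n";
        $t->purge;    # discard everything parsed so far
    }

Piped straight into the server it becomes a one-pass load, something
like:

    perl shred.pl big.xml | psql mydb -c "\copy entries (accession, name) from stdin"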




