Re: Ideas for building a system that parses medical research publications/articles

From Laura Smith
Subject Re: Ideas for building a system that parses medical research publications/articles
Date
Msg-id -e-eLsFysFFf83c19vAO3HQIXL2-1FJThcuoshjfKjr10-UBDxp7TujDwtTpB4eSi-KkJ2lFGNOamjcBAxbQmPGizCDv34trju3eyEO1oFU=@protonmail.ch
In response to Ideas for building a system that parses medical research publications/articles  (Achilleas Mantzios <achill@matrix.gatewaynet.com>)
Responses Re: Ideas for building a system that parses medical research publications/articles  (Achilleas Mantzios <achill@matrix.gatewaynet.com>)
List pgsql-general
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Saturday, 5 June 2021 10:49, Achilleas Mantzios <achill@matrix.gatewaynet.com> wrote:

> Hello
>
> I am imagining a system that can parse papers from various sources
> (web/files/etc) and in various formats (text, pdf, etc) and can store
> metadata for this paper ,some kind of global ID if applicable, authors,
> areas of research, whether the paper is "new", "highlighted",
> "historical", type (e.g. Case reports, Clinical trials), symptoms (e.g.
> tics, GI pain, psychological changes, anxiety, ), and other key
> attributes (I guess dynamic), it must be full text searchable, etc.
>
> I am at the very beginning in this and it is done on a fully volunteer
> basis.
>
> Lots of questions : is there any scientific/scholar analysis software
> already available? If yes and is really good and open source , then this
> will influence the rest of decisions. Otherwise , I'll have to form a
> team that can write one, in this case I'll have to decide DB, language,
> etc. I work 20 years with pgsql so it is the natural choice for any kind
> of data, I just ask this for the sake of completeness.
>
> All ideas welcome.

Hello Achilleas

Not wishing to be discouraging, but you have very ambitious goals for what sounds like a one-person project?

You are effectively looking at competing with platforms such as Elsevier Scopus/Scival which are market-leaders in the
area for good reason (i.e. it takes a lot of manpower to write algorithms, manage metadata etc., and the only way to
consistently maintain that manpower is to employ people, lots of them). There are also things like Google Scholar
around the place.

I think before starting on the technical side of Postgres etc., the honest truth is that you need to do more planning,
both in terms of implementation and long-term sustainability.

For example, before we even get to metadata, you talk of various sources and formats.  Have you considered licensing
issues?  Have you considered how to keep the dataset clean? (If you are thinking you can just scrape the web, then
you'll be in for a surprise).

Laura


