Discussion: kind of a bag of attributes in a DB . . .
Say, you get lots of data and their corresponding metadata, which in some cases may be undefined or undeclared (left as an empty string). Think of youtube json files or the result of the "file" command.

I need to be able to "instantly" search that metadata and I think DBs are best for such jobs and get some metrics out of it.

I know this is not exactly a kosher way to deal with data which can't be represented in a nice tabular form, but I don't find the idea that half way off either.

What is the pattern, anti-pattern or whatever relating to such design?

Do you know of such implementations with such data?

lbrtchx
On 9/7/19 5:45 AM, Albretch Mueller wrote:
> Say, you get lots of data and their corresponding metadata, which in
> some cases may be undefined or undeclared (left as an empty string).
> Think of youtube json files or the result of the "file" command.
>
> I need to be able to "instantly" search that metadata and I think DBs
> are best for such jobs and get some metrics out of it.

Is the metadata uniform or are you dealing with a variety of different data?

> I know this is not exactly a kosher way to deal with data which can't
> be represented in a nice tabular form, but I don't find the idea that
> half way off either.
>
> What is the pattern, anti-pattern or whatever relating to such design?
>
> Do you know of such implementations with such data?
>
> lbrtchx

--
Adrian Klaver
adrian.klaver@aklaver.com
On Sat, Sep 7, 2019 at 5:17 PM Albretch Mueller <lbrtchx@gmail.com> wrote:
> Say, you get lots of data and their corresponding metadata, which in
> some cases may be undefined or undeclared (left as an empty string).
> Think of youtube json files or the result of the "file" command.
>
> I need to be able to "instantly" search that metadata and I think DBs
> are best for such jobs and get some metrics out of it.
>
> I know this is not exactly a kosher way to deal with data which can't
> be represented in a nice tabular form, but I don't find the idea that
> half way off either.
>
> What is the pattern, anti-pattern or whatever relating to such design?
>
> Do you know of such implementations with such data?

We store debug logs as JSONB with some indexing. It works in some limited cases, but you need a good sense of the index possibilities and of how the indexes actually work.

> lbrtchx
Best Wishes,
Chris Travers
Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.
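Editorial sketch of the JSONB-plus-index approach mentioned in the reply above. It is only an illustration under stated assumptions: the table name `file_meta`, its columns, and the index name are hypothetical and do not come from the thread. It shows a GIN index on a jsonb column and a containment (`@>`) query, which is one of the operators such an index can serve. The function takes any DB-API style cursor (for example one from psycopg2), so no particular driver is imposed.

```python
import json

# Hypothetical schema: table, column, and index names are assumptions
# made for this sketch, not anything described in the thread.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS file_meta (
    id       bigserial PRIMARY KEY,
    filename text   NOT NULL,
    size     bigint NOT NULL,
    meta     jsonb  NOT NULL DEFAULT '{}'
);
CREATE INDEX IF NOT EXISTS file_meta_meta_gin
    ON file_meta USING gin (meta);
"""

# Containment query: matches rows whose metadata includes every
# key/value pair of the supplied JSON object.
CONTAINS_SQL = "SELECT id, filename FROM file_meta WHERE meta @> %s::jsonb"


def find_by_attributes(cursor, attrs):
    """Return rows whose metadata contains all key/value pairs in attrs."""
    cursor.execute(CONTAINS_SQL, (json.dumps(attrs),))
    return cursor.fetchall()
```

A call such as `find_by_attributes(cur, {"uploader": "someone"})` would look up every file whose metadata carries that exact pair. Whether the index is actually used depends on the query shape, which is the point made above about knowing how the indexes work.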
On 9/7/19, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> Is the metadata uniform or are you dealing with a variety of different
> data?

You can expect all files to have a filename and size, but their kinds (the metadata describing them) can be really colorful and wild when it comes to formatting.

lbrtchx
On 9/10/19 9:59 AM, Albretch Mueller wrote:
> On 9/7/19, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> Is the metadata uniform or are you dealing with a variety of different
>> data?
>
> You can expect for all files to have a filename and size, but their
> kinds (the metadata describing them) can be really colorful and wild
> when it comes to formatting.

If there is no rhyme or reason to the metadata I am not sure how you could come up with an efficient search strategy. Seems it would be a brute search over everything.

> lbrtchx

--
Adrian Klaver
adrian.klaver@aklaver.com
On 9/10/19, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> If there is no rhyme or reason to the metadata I am not sure how you
> could come up with an efficient search strategy. Seems it would be a
> brute search over everything.

Not exactly. Say some things have colours but no weight. You could still group things as being "weighty" and then tell how heavy they are; with the colourful ones you could specify the colours and then see if there is some correlation between weights and colours ...

lbrtchx
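Editorial sketch of the grouping idea in the message above: records are grouped by which attributes they actually define, and pairs of attributes are counted when they appear together. This is a plain in-memory illustration, not a database implementation. The function names are hypothetical, and treating `None` and the empty string as "undefined" is an assumption taken from the original post's "left as an empty string".

```python
from collections import Counter
from itertools import combinations


def is_defined(value):
    """Treat None and empty strings as undefined (assumption from the post)."""
    return value is not None and value != ""


def attribute_presence(records):
    """Count how many records define each attribute."""
    counts = Counter()
    for record in records:
        counts.update(k for k, v in record.items() if is_defined(v))
    return counts


def co_occurrence(records):
    """Count how often each pair of attributes is defined on the same record."""
    pairs = Counter()
    for record in records:
        keys = sorted(k for k, v in record.items() if is_defined(v))
        pairs.update(combinations(keys, 2))
    return pairs
```

With records like `{"colour": "red", "weight": 3}` and `{"weight": 5}`, `attribute_presence` gives the size of the "weighty" and "colourful" groups, and `co_occurrence` shows how many items could take part in a weight/colour comparison at all.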
On 9/11/19 9:46 AM, Albretch Mueller wrote:
> On 9/10/19, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> If there is no rhyme or reason to the metadata I am not sure how you
>> could come up with an efficient search strategy. Seems it would be a
>> brute search over everything.
>
> Not exactly. Say some things have colours but now weight. You could
> still Group them as being "weighty" and then tell about how heavy they
> are, with the colorful ones you could specify the colours and then see
> if there is some correlation between weights and colours ...

It would help to see some sample data, otherwise any answer would be pure speculation.

> lbrtchx

--
Adrian Klaver
adrian.klaver@aklaver.com
just download a bunch of json info files from youtube data feeds

Actually, does postgresql have a json driver or import feature?

the metadata contained in json files would require more than one small database, but such an import feature should be trivial

C
On 9/14/19 2:06 AM, Albretch Mueller wrote:
> just download a bunch of json info files from youtube data Feeds
>
> Actually, does postgresql has a json Driver of import feature?

Not sure what you mean by above? Postgres has json(b) data types that you can import JSON into:

https://www.postgresql.org/docs/11/datatype-json.html

> the metadata contained in json files would require more than one
> small databases, but such an import feature should be trivial

Again, not sure I understand why small databases are required?

> C

--
Adrian Klaver
adrian.klaver@aklaver.com
On 9/14/19 2:06 AM, Albretch Mueller wrote:
> just download a bunch of json info files from youtube data Feeds
>
> Actually, does postgresql has a json Driver of import feature?

I'm working without a net (coffee) and so I forgot to mention that for Python there is:

http://initd.org/psycopg/docs/extras.html?highlight=json

Not sure if this is what you are looking for or not.

> the metadata contained in json files would require more than one
> small databases, but such an import feature should be trivial
>
> C

--
Adrian Klaver
adrian.klaver@aklaver.com
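Editorial sketch of what a small import along these lines might look like from Python. It assumes a hypothetical table `file_meta(filename text, size bigint, meta jsonb)`; that table and the function names are illustrations, not anything from the thread. The JSON is passed as text and cast to jsonb in SQL, so the code works with any DB-API cursor that uses `%s` placeholders. With psycopg2, wrapping the parsed object in `psycopg2.extras.Json` (documented at the link above) would be an alternative.

```python
import json
import os

# Hypothetical target table: file_meta(filename text, size bigint, meta jsonb).
INSERT_SQL = (
    "INSERT INTO file_meta (filename, size, meta) VALUES (%s, %s, %s::jsonb)"
)


def build_row(path):
    """Read one JSON file and return (filename, size in bytes, JSON text)."""
    with open(path, encoding="utf-8") as handle:
        document = json.load(handle)  # raises ValueError on malformed JSON
    return (os.path.basename(path), os.path.getsize(path), json.dumps(document))


def import_json_files(cursor, paths):
    """Insert one row per file through a DB-API cursor; return the row count."""
    rows = [build_row(p) for p in paths]
    cursor.executemany(INSERT_SQL, rows)
    return len(rows)
```

Parsing each file before inserting means malformed files fail in Python, before anything is sent to the database. Committing the transaction is left to the caller.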
On Sat, Sep 14, 2019 at 5:11 PM Albretch Mueller <lbrtchx@gmail.com> wrote:
> just download a bunch of json info files from youtube data Feeds
>
> Actually, does postgresql has a json Driver of import feature?

Sort of. There are a bunch of features around the JSON and JSONB data types which could be useful.

> the metadata contained in json files would require more than one
> small databases, but such an import feature should be trivial

A general import feature is not at all trivial, for a bunch of reasons inherent to the JSON specification: how to handle duplicate keys, for example.

However, writing an import for JSON objects into a particular database is indeed trivial.

> C
Best Wishes,
Chris Travers
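Editorial sketch of the duplicate-key problem raised above. Python's standard `json.loads` silently keeps the last value when an object repeats a key, and PostgreSQL's jsonb type likewise keeps only the last value, while the json type stores the text as given. An importer that wants to reject such input instead has to check explicitly. The names `DuplicateKeyError`, `reject_duplicates`, and `strict_loads` are hypothetical helpers written for this illustration.

```python
import json


class DuplicateKeyError(ValueError):
    """Raised when a JSON object repeats a key."""


def reject_duplicates(pairs):
    """object_pairs_hook that fails on repeated keys instead of keeping the last."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise DuplicateKeyError(f"duplicate key: {key!r}")
        result[key] = value
    return result


def strict_loads(text):
    """Parse JSON text, raising DuplicateKeyError if any object repeats a key."""
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

Which policy is right (reject, keep first, keep last) depends on the data source, which is part of why a one-size-fits-all import is harder than a purpose-built one.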