Log Data Analytics : Confused about the choice of Database

Поиск
Список
Период
Сортировка
От Peeyush Agarwal
Тема Log Data Analytics : Confused about the choice of Database
Дата
Msg-id CA+-6-YJq69Py7hsFGL=4k5wKXsa5UsNEkvRfmrfxstfiUYDaAw@mail.gmail.com
обсуждение исходный текст
Список pgsql-general
Hi,

I have log data of the following format:


Session    Timestamp    Event                Parameters
1          1            Started Session      
1          2            Logged In            Username:"user1"
2          3            Started Session
1          3            Started Challenge    title:"Challenge 1", level:"2"
2          4            Logged In            Username:"user2"
Now, a person wants to carry out analytics on this log data (And would like to receive it as a JSON blob after appropriate transformations). For example, he may want to receive a JSON blob where the Log Data is grouped by Session and TimeFromSessionStart and CountOfEvents are added before the data is sent so that he can carry out meaningful analysis. Here I should return:


[ {   "session":1,"CountOfEvents":3,"Actions":[{"TimeFromSessionStart":0,"Event":"Session Started"}, {"TimeFromSessionStart":1, "Event":"Logged In", "Username":"user1"}, {"TimeFromSessionStart":2, "Event":"Startd Challenge", "title":"Challenge 1", "level":"2" }] }, {    "session":2, "CountOfEvents":2,"Actions":[{"TimeFromSessionStart":0,"Event":"Session     Started"}, {"TimeFromSessionStart":2, "Event":"Logged In", "Username":"user2"}]  }
]

Here, TimeFromSessionStartCountOfEvents etc. [Let's call it synthetic additional data] will not be hard coded and I will make a web interface to allow the person to decide what kind of synthetic data he requires in the JSON blob. I would like to provide a good amount of flexibility to the person to decide what kind of synthetic data he wants in the JSON blob.

If I use PostgreSQL, I can store the data in the following manner: Session and Event can be stringTimestamp can be date and Parameters can be hstore(key value pairs available in PostgreSQL). After that, I can use SQL queries to compute the synthetic (or additional) data, store it temporarily in variables in a Rails Application (which will interact with PostgreSQL database and act as interface for the person who wants the JSON blob) and create JSON blob from it.

However I am not sure if PostgreSQL is the best choice for this use case. I have put the detailed question on SO at http://stackoverflow.com/questions/23544604/log-data-analytics

Looking for some help from the community.

Peeyush Agarwal

В списке pgsql-general по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: Question about synchronous replication
Следующее
От: Vegard Bønes
Дата:
Сообщение: better performance on poorer machine?