Help normalizing table(s)
From | Kevin Old
---|---
Subject | Help normalizing table(s)
Date | 
Msg-id | 1034867525.3249.31.camel@oc
Replies | Re: Help normalizing table(s)
List | pgsql-general
Hello everyone,

First let me start off by saying that the situation I'm dealing with is very complex. Here's the scoop:

I receive a stream of data from one of our internal servers, which I put into a text file. The fields are separated by pipes and the records by newline characters. As of now, I'm loading the data into a table via Perl DBI. The catch is that each record has 502 fields, and I receive about 100,000 records per hour. I need to make these records searchable via a web interface for the user.

With that said, I have almost everything working, but have hit a brick wall with the searching. I am currently using MySQL to store the data (not by choice... the users left to maintain this only know how to use MySQL).

Because of the type of data this is (cellular calls), it has become very difficult to keep the records unique. Most of the 502 fields cannot be "unique" fields because of the replicated data that must exist in each record. There are 3 fields that, combined, can make a unique key, something like 2002101411-54738-107: the first 10 digits are the date, the second part is a call sequence number assigned to the call, and the third is the tower the call took place on. There's a catch to just using the call sequence number: it starts over again after it hits 1 million.

My solution to this is to store the data in tables by hour. This keeps the tables relatively small (50-100K records) and helps with searching.

Now on to the real questions: I need to be able to search more than one table at a time, but just UNIONing the queries isn't a very good option, since some users want to search 48 hours' worth of data... that would result in 48 unioned queries!

My solution so far is to take the fields that the users actually search on (only about 7) and stuff them into a "master" table along with a unique identifier (bigint). I attach the same unique key to the table holding the remaining data so I can join them and retrieve the full record later. This lets the actual searching be done on the smaller table, and then, if results are found, the rest of the data is pulled from the larger table.

How do I keep this master table rather small (1 million records or so) and still be able to search archived data? Are there any other ways that I can do this?

Any help is appreciated!

Kevin
kold@carolina.rr.com
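For readers following along, the master/detail layout Kevin describes can be sketched in SQL. This is a minimal sketch only: the post doesn't give the actual schema, so every table and column name below (`calls_2002101411`, `call_search`, `caller`, `detail_tbl`, and so on) is a hypothetical stand-in, and the search fields are invented placeholders for the "about 7" he mentions.

```sql
-- Hypothetical wide detail table, one per hour (e.g. for 2002-10-14, hour 11).
-- A synthetic bigint id ties each row to the small search table.
CREATE TABLE calls_2002101411 (
    call_id    BIGINT PRIMARY KEY,    -- same id stored in the master table
    call_date  CHAR(10) NOT NULL,     -- YYYYMMDDHH from the feed
    call_seq   INTEGER  NOT NULL,     -- wraps around after 1,000,000
    tower_id   INTEGER  NOT NULL,
    -- ... the remaining ~499 fields from the pipe-delimited feed ...
    UNIQUE (call_date, call_seq, tower_id)  -- the 3-field composite key
);

-- Small "master" table holding only the handful of searchable fields.
CREATE TABLE call_search (
    call_id    BIGINT PRIMARY KEY,
    call_date  CHAR(10) NOT NULL,
    caller     VARCHAR(20),           -- placeholder search field
    callee     VARCHAR(20),           -- placeholder search field
    tower_id   INTEGER,
    -- ... a few more searched fields ...
    detail_tbl VARCHAR(32) NOT NULL   -- which hourly table holds the rest
);
CREATE INDEX call_search_caller_idx ON call_search (caller, call_date);

-- A 48-hour search touches only the small indexed table; the wide hourly
-- tables are consulted only for the rows that match.
SELECT s.call_id, s.detail_tbl
FROM call_search s
WHERE s.caller = '5551234567'
  AND s.call_date BETWEEN '2002101400' AND '2002101523';
```

Recording which hourly table owns each row (the `detail_tbl` column above, an assumption on my part) is one way to fetch full records after a hit without UNIONing 48 queries up front: only the hourly tables named in the result set need to be queried.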