custom crosstab question

Поиск
Список
Период
Сортировка
От Michael Swierczek
Тема custom crosstab question
Дата
Msg-id 68b5b5880704250736v41261ffg5b506ecb7978af0d@mail.gmail.com
обсуждение исходный текст
Ответы Re: custom crosstab question
Список pgsql-novice
Ladies and Gentlemen,
This will be long, I apologize.   I'm mostly looking for someone to
tell me there's an obvious solution I'm missing.

I have a table 'answer' with medical survey answers.  Periodically the
data must be exported to CSV files in a quasi-crosstab format so they
can be examined by statisticians.

The complication is that many of the questions in the medical surveys
can be skipped entirely (if you report not using drugs at all, we
don't ask how recently you used marijuana, and so forth...)  For
analysis, the exported data needs a 'category' column for each survey
question even if it has no corresponding answer and not just the
survey questions with answers that actually exist in the database.

Here is a simplified example of the tables, showing sample data for one survey:
survey_question: (id |  survey_id  |  question_code  | question_order)
1  |   1  |   'drug'  |   1
2  |   1  |   'marijuana' |  2
3  |   1  |   'sick'  |  3

answer:  (id |  survey_event_id |  question_code  |  answer_order |
answer_value)
1  |  1  |   'drug' |  1  |   1
2  |  1  |   'sick' |  2  |   1

Using tablefunc/crosstab, the resulting export would be
(survey_event_id |  'drug' | 'sick'  ....  )
But I need  (survey_event_id | 'drug' | 'marijuana' | 'sick' .... )

I wrote a program that takes a 'survey_id' input, queries PostgreSQL
for the set of question_codes, and then generates a massive query in
the form:
--- begin huge query
select an0.survey_event_id, an0.answer_value, an1.answer_value,
an2.answer_value,
an3.answer_value .....
from answer an0
left join answer an1 where an0.survey_event_id = an1.survey_event_id
and an1.question_code = 'drug'
left join answer an2 where an0.survey_event_id = an2.survey_event_id and
an2.question_code = 'marijuana'
left join....
--- end huge query

Depending upon the survey, the resulting queries have between 10 and
250 joins on the answer_table.  I run them through psql and pipe the
results to a csv file.   They work, but they're pretty slow.  There
are 14,300 total survey_event entries from 50 different surveys in the
test database I'm using, and exporting all of the data in this way
takes 3 hours.   I'm using PostgreSQL 8.1.9 and 8.2.4 (the latter is
faster, although I don't have exact figures how much).

Thanks,
Mike

В списке pgsql-novice по дате отправления:

Предыдущее
От: Francois Deliege
Дата:
Сообщение: moving data from windows to linux
Следующее
От: Alan Hodgson
Дата:
Сообщение: Re: moving data from windows to linux