Обсуждение: Re: Speeding up a query.

Поиск
Список
Период
Сортировка

Re: Speeding up a query.

От
Alberto Dalmaso
Дата:
yes, I have to make that because the data on the table need to be
pivoted so it is joined many times with different filter on the column
that describe the meaning of the column called numeric_value I'm going
to show.
That could be very ineffective, event because that table contains
something like 25000000 rows...
There are two tables in this condition (as you can se in the explain)
and both are the table with the higher number of rows in the database.
But I don's see any other choice to obtain that information.

P.S.: i'm trying with all enable_* to on and pumping to higher values
from_collapse_limit and join_collapse_limit that I've put to 30.
The result is that the query, after an hour of work, goes out of memory
(SQL State 53200)...


Re: Speeding up a query.

От
"Kevin Grittner"
Дата:
Alberto Dalmaso <dalmaso@clesius.it> wrote:

> P.S.: i'm trying with all enable_* to on and pumping to higher
> values from_collapse_limit and join_collapse_limit that I've put to
> 30.

Tom suggested that you set those numbers higher than the number of
tables joined in the query.  I don't think 30 will do that.

> The result is that the query, after an hour of work, goes out of
> memory (SQL State 53200)...

Ouch!  Can you provide more details?  All information from the
PostgreSQL log about that event would be good.  If there's anything
which might be related in the OS logs from around that time, please
include that, too.

Also, with those settings at a high value, try running just an EXPLAIN
(no ANALYZE) of the query, to see how long that takes, and whether you
have a memory issue during the planning phase.  (You can use \timing
in psql to get a report of the run time of the EXPLAIN.)

-Kevin

Re: Speeding up a query.

От
"Hartman, Matthew"
Дата:

I promised to provide more details of the query (or the function as it is). Here goes.

 

Scenario:

A chemotherapy regimen requires chair time and nursing time. A patient might sit in the chair for three hours but the nurse only has to be with them for the first hour. Therefore, nurses can manage multiple chairs at a time. Each regimen has a different time requirement.

 

To efficiently manage our chair and nursing resources, we want to schedule against these constraints. Our room currently has 17 chairs and around 8 nurses per day. We administer several hundred different regimens and the time for each regimen varies based on the day of the regimen as well as the course. All of these variables are entered and maintained through a web application I wrote.

 

Scheduling algorithm:

            Written in PostgreSQL (naturally), the algorithm is a single function call. It gathers the data for a day into a temporary table and cycles through each appointment. Appointments are scheduled in the following order: locked appointments (previously scheduled and assigned to a nurse and chair), reserved appointments (a desired time slot has been selected), open appointments (ordered by the required chair time descending and the required nurse time descending). Here’s the busy part that loops through each appointment. The table definition follows. Anything beginning with an underscore is a declared variable.

 

 

-- Reserved and unscheduled appointments.

FOR _APPOINTMENT IN SELECT * FROM MATRIX_UNSCHEDULED WHERE APPT_STATUS <> 'L' ORDER BY ROW_NUM

LOOP

            -- Initialize the variables for this record.

        RAISE NOTICE 'Status ''%'' - %', _APPOINTMENT.APPT_STATUS, _APPOINTMENT;

            _AVAILABLE := null;

            select into _UNIT_INTERVALS, _NURSE_INTERVALS, _UNIT_REQUIRED, _NURSE_REQUIRED

                        _APPOINTMENT.total_unit_time / 5,

                        _APPOINTMENT.total_nurse_time / 5,

                        (_APPOINTMENT.total_unit_time || ' minutes')::INTERVAL,

                        (_APPOINTMENT.total_nurse_time || ' minutes')::INTERVAL;

           

 

            -- Find the first available row for the required unit and nurse time.

            select into _AVAILABLE unit.row_num

            from     (

                                    select   m1.row_num

                                    from     matrix m1,

                                                matrix m2

                                    where    m1.unit_id = m2.unit_id

                                                and m1.nurse_id = m2.nurse_id

                                                and m1.unit_scheduled = false

                                                and m2.unit_scheduled = false

                                                and (_APPOINTMENT.reserved_time is null or m1.timeslot = _APPOINTMENT.reserved_time)

                                                and m2.timeslot between m1.timeslot and (m1.timeslot + _UNIT_REQUIRED)

                                    group by m1.row_num

                                    having count(m2.row_num) = _UNIT_INTERVALS + 1

                        ) unit,

                        (

                                    select   m1.row_num

                                    from     matrix m1,

                                                matrix m2

                                    where    m1.unit_id = m2.unit_id

                                                and m1.nurse_id = m2.nurse_id

                                                and m1.nurse_scheduled = false

                                                and m2.nurse_scheduled = false

                                                and (_APPOINTMENT.reserved_time is null or m1.timeslot = _APPOINTMENT.reserved_time)

                                                and m2.timeslot between m1.timeslot and (m1.timeslot + _NURSE_REQUIRED)

                                    group by m1.row_num

                                    having count(m1.row_num) = _NURSE_INTERVALS + 1

                        ) nurse

            where    nurse.row_num = unit.row_num

            order by unit.row_num

            limit 1;

 

            -- Assign the time, unit, and nurse to the unscheduled appointment.

            update matrix_unscheduled set

                        appt_time = matrix.timeslot,

                        unit_id = matrix.unit_id,

                        nurse_id = matrix.nurse_id,

                        appt_status = 'S'

            from     matrix

            where    schedule_appt_id = _APPOINTMENT.schedule_appt_id

                        and matrix.row_num = _AVAILABLE;

 

            -- Mark the unit as scheduled for that time.

            update matrix set

                        unit_scheduled = true

            from     (select timeslot, unit_id from matrix where row_num = _AVAILABLE) m2

            where    matrix.unit_id = m2.unit_id

                        and matrix.timeslot between m2.timeslot and (m2.timeslot + _UNIT_REQUIRED);

           

            -- Mark the nurse as scheduled for that time.

            update matrix set

                        nurse_scheduled = true

            from     (select timeslot, nurse_id from matrix where row_num = _AVAILABLE) m2

            where    matrix.nurse_id = m2.nurse_id

                        and matrix.timeslot between m2.timeslot and (m2.timeslot + _NURSE_REQUIRED);

 

END LOOP;

 

 

CREATE TABLE matrix_unscheduled

(

  row_num serial NOT NULL,

  schedule_appt_id integer NOT NULL,

  appt_time timestamp without time zone,

  reserved_time timestamp without time zone,

  appt_status character(1) NOT NULL,

  unit_id integer,

  nurse_id integer,

  total_unit_time integer NOT NULL,

  total_nurse_time integer NOT NULL,

  CONSTRAINT pk_matrix_unscheduled PRIMARY KEY (row_num)

)

WITH (OIDS=FALSE);

 

CREATE TABLE matrix

(

  row_num serial NOT NULL,

  timeslot timestamp without time zone NOT NULL,

  unit_id integer NOT NULL,

  nurse_id integer NOT NULL,

  unit_scheduled boolean NOT NULL,

  nurse_scheduled boolean NOT NULL,

  CONSTRAINT pk_matrix PRIMARY KEY (row_num)

)

WITH (OIDS=FALSE);

 

There are indexes on “matrix” for “timeslot,unit_id”, “timeslot,nurse_id”, and “unit_id,nurse_id”.


Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital
(613) 549-6666 x4294

 

 

Re: Speeding up a query.

От
Tom Lane
Дата:
Alberto Dalmaso <dalmaso@clesius.it> writes:
> P.S.: i'm trying with all enable_* to on and pumping to higher values
> from_collapse_limit and join_collapse_limit that I've put to 30.
> The result is that the query, after an hour of work, goes out of memory
> (SQL State 53200)...

Hmm, is that happening during planning (ie, do you get the same error
if you just try to EXPLAIN the query with those settings)?  If not,
please show the EXPLAIN output.

            regards, tom lane