Re: Highly Efficient Custom Sorting

Поиск
Список
Период
Сортировка
От Eliot Gable
Тема Re: Highly Efficient Custom Sorting
Дата
Msg-id AANLkTimaUPkOn6OkCspsZ7iTiY9S6MFTrKTgFXKckl2u@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Highly Efficient Custom Sorting  (Craig Ringer <craig@postnewspapers.com.au>)
Ответы Re: Highly Efficient Custom Sorting  (Merlin Moncure <mmoncure@gmail.com>)
Список pgsql-performance
Well, I re-wrote the algorithm in Perl. However, it did not solve the speed issue. Running time now is a whopping 240+ ms instead of the 31.8ms I was getting before (15.2 of which is sorting). Here is the Perl code on the sorting. I won't post the pl/pgsql code, because this is far more clear (in my opinion) on what the algorithm does:

DROP TYPE IF EXISTS glbtype CASCADE;
CREATE TYPE glbtype AS (
id INTEGER,
"group" TEXT,
priority INTEGER,
weight INTEGER
);

DROP TYPE IF EXISTS glbtype_result CASCADE;
CREATE TYPE glbtype_result AS (
id INTEGER,
priority INTEGER,
weight INTEGER,
"order" BIGINT
);

CREATE OR REPLACE FUNCTION GroupedRandomWeightedLB(glbtype[]) RETURNS SETOF glbtype_result AS
$BODY$
# Input is an array of a composite type
my ($input) = @_;
my %groups;
$input =~ s/^{|}$//g;
$input =~ s/[)(]//g;
my @rows;
my $count = 0;
while ($input && $count < 10000) {
my ($id, $group, $prio, $weight, @rest) = split(/,/, $input);
push(@rows, {id => $id, group => $group, priority => $prio, weight => $weight});
$count++;
$input = join(',', @rest);
}

if(scalar @rows < 1) {
elog(NOTICE, '  No rows sent for sorting.');
return undef;
} else {
elog(NOTICE, '  '.(scalar @rows).' rows sent for sorting.');
}

foreach $rw (@rows) {
if($rw->{group} && $rw->{priority} && $rw->{weight}) {
push( @{ $groups{$rw->{group}}{$rw->{priority}} }, $rw);
elog(NOTICE, '  Pushing '.$rw->{group}.' with prio ('.$rw->{priority}.'), weight ('.$rw->{weight}.') onto array.');
} else {
elog(NOTICE, '  Invalid sort row: Group ('.$rw->{group}.'), Prio ('.$rw->{priority}.'), Weight ('.$rw->{weight}.')');
}
}

foreach $group (keys %groups) {
elog(NOTICE, '  Sorting group '.$group.'...');
foreach $prio (keys %{$groups{$group}}) {
my @rows = @{ $groups{$group}{$prio} };
elog(NOTICE, '    Sorting '.(scalar @rows).' rows in priority '.$prio.'...');
my @zeros;
my @nonzeros;
my $total_weight = 0;
my $row_order = 1;
for($row_id = 0; $row_id < scalar @rows; $row_id++) {
my $row = $rows[$row_id];
$total_weight += $row->{weight};
elog(NOTICE, '    Total Weight ('.$total_weight.')');
if($row->{weight} == 0) {
push(@zeros, $row);
} else {
push(@nonzeros, $row);
}
}
my @first_order = (@zeros, @nonzeros);
undef(@zeros);
undef(@nonzeros);
while(scalar @first_order) {
elog(NOTICE, '      '.(scalar @first_order).' items remaining ...');
my $rand = int(rand($total_weight));
elog(NOTICE, '      Random weight ('.$rand.')');
my $running_weight = 0;
for($row_id = 0; $row_id < scalar @first_order; $row_id++) {
my $row = $first_order[$row_id];
$running_weight += $row->{weight};
elog(NOTICE, '      Running weight ('.$running_weight.') Current Weight ('.$row->{weight}.')');
if($running_weight >= $rand) {
elog(NOTICE, '        : Priority ('.($row->{priority}).') Weight ('.($row->{weight}).')');
return_next(
{ id => int($row->{id}),
 priority => int($row->{priority}),
 weight => int($row->{weight}),
 order => int($row_order) }
);
$row_order++;
splice(@first_order, $row_id, 1);
# Recalculate total weight
$total_weight = 0;
foreach $row (@first_order) {
$total_weight += $row->{weight};
}
elog(NOTICE, '        : Remaining Weight ('.$total_weight.')');
break;
}
}
}
}
}
return undef;
$BODY$
LANGUAGE plperl VOLATILE;

5 rows sent for sorting.
Pushing GROUP_7 with prio (1), weight (0) onto array.
Pushing GROUP_7 with prio (1), weight (5) onto array.
Pushing GROUP_8 with prio (1), weight (1) onto array.
Pushing GROUP_8 with prio (1), weight (5) onto array.
Pushing GROUP_8 with prio (1), weight (5) onto array.
Sorting group GROUP_7...
Sorting 2 rows in priority 1...
Total Weight (0)
Total Weight (5)
2 items remaining ...
Random weight (0)
Running weight (0) Current Weight (0)
: Priority (1) Weight (0)
: Remaining Weight (5)
1 items remaining ...
Random weight (0)
Running weight (5) Current Weight (5)
: Priority (1) Weight (5)
: Remaining Weight (0)
Sorting group GROUP_8...
Sorting 3 rows in priority 1...
Total Weight (1)
Total Weight (6)
Total Weight (11)
3 items remaining ...
Random weight (8)
Running weight (1) Current Weight (1)
Running weight (6) Current Weight (5)
Running weight (11) Current Weight (5)
: Priority (1) Weight (5)
: Remaining Weight (6)
2 items remaining ...
Random weight (2)
Running weight (1) Current Weight (1)
Running weight (6) Current Weight (5)
: Priority (1) Weight (5)
: Remaining Weight (1)
1 items remaining ...
Random weight (0)
Running weight (1) Current Weight (1)
: Priority (1) Weight (1)
: Remaining Weight (0)

2 rows sent for sorting.
Pushing GROUP_1 with prio (1), weight (0) onto array.
Pushing GROUP_1 with prio (2), weight (4) onto array.
Sorting group GROUP_1...
Sorting 1 rows in priority 1...
Total Weight (0)
1 items remaining ...
Random weight (0)
Running weight (0) Current Weight (0)
: Priority (1) Weight (0)
: Remaining Weight (0)
Sorting 1 rows in priority 2...
Total Weight (4)
1 items remaining ...
Random weight (2)
Running weight (4) Current Weight (4)
: Priority (2) Weight (4)
: Remaining Weight (0)

Total runtime: 244.101 ms


On Fri, Jul 2, 2010 at 9:44 PM, Craig Ringer <craig@postnewspapers.com.au> wrote:
On 03/07/10 00:36, Craig James wrote:

> Perl itself is written in C, and some of it's operations are extremely
> fast.

The same is true of PL/PgSQL, though ;-)

The main advantage of PL/Perl is that it doesn't rely on the SPI to do
everything. It's interpreted not compiled, but it has a much faster
approach to interpretation than PL/PgSQL.

Really, the right choice depends on exactly what the OP is doing and
how, which they're not saying.

Where's the code?

--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



--
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero

В списке pgsql-performance по дате отправления:

Предыдущее
От: Craig Ringer
Дата:
Сообщение: Re: Highly Efficient Custom Sorting
Следующее
От: Merlin Moncure
Дата:
Сообщение: Re: Highly Efficient Custom Sorting