Обсуждение: counting words in a text or char varying field

Поиск
Список
Период
Сортировка

counting words in a text or char varying field

От
"Thomas T. Thai"
Дата:
i'm looking for a way to count the words in text field or char varying
separated by white spaces within postgresql itself.


--
Thomas T. Thai
Minnesota.com, Inc.




Re: counting words in a text or char varying field

От
"Greg Sabino Mullane"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> i'm looking for a way to count the words in text field or char varying
> separated by white spaces within postgresql itself.

That depends on what you mean by "postgresql itself". You cannot do this
with ordinary SQL: you are going to need a procedural language. Perl
would make an excellent choice, but that would be too easy :), so I'll
try and come up with a quick plpgsql example. This is only a test script,
and 'whitespace' in this case means, literally, one or more space
characters. It's a start. :)


CREATE FUNCTION countwords(TEXT) RETURNS INTEGER AS '
DECLARE
  mystring ALIAS FOR $1;
  words    INTEGER := 0;
  inspace  BOOL := true;       -- ## Catch the first word
BEGIN
  FOR pos IN 1 .. CHAR_LENGTH(mystring) LOOP

    -- There are 2 single quotes, a space, and 2 single quotes below:
    IF SUBSTRING(mystring,pos,1) =  '' '' THEN
      inspace := true;
    ELSE
      IF inspace THEN
        inspace := false;
        words := words+1;
      END IF;
    END IF;
  END LOOP;
  IF inspace is false THEN     -- ## Catch the last word
    words := words+1;
  END IF;

  RETURN words;
END;
' LANGUAGE 'plpgsql';

Let's give it a whirl:

select to_char(cdate, 'YYYYMMDD'), countwords(body)
from turnstep_mail where whofrom like '%tom@minnesota.com%'
order by 1 desc limit 5;

 to_char  | countwords
- ----------+------------
 20020304 |        493
 20020304 |         35
 20020303 |        194
 20020303 |         74
 20020302 |         44


Greg Sabino Mullane  greg@turnstep.com
PGP Key: 0x14964AC8 200203041730

-----BEGIN PGP SIGNATURE-----
Comment: http://www.turnstep.com/pgp.html

iD8DBQE8g/m/vJuQZxSWSsgRAqpfAKDSh7WYjF6l3cWfIw+8O/UPeKgjmwCfRlsc
OI69IIGHlMKMIbR0+Mc/q/g=
=0ral
-----END PGP SIGNATURE-----