Обсуждение: tesearch2 question
Hi All,<br /><br />I'm trying to udpate a table containing <span style="font-family: monospace;"></span>13149741
records.And its taking forever to complete this process. <br /><br />The update query i'm trying to run is for full
textindexing similiar to <br /><br /><pre>UPDATE tblMessages SET idxFTI=to_tsvector(strMessage);</pre><br />Below are
someof the stats which might be helpful for analyzing this<br /><br />$top<br /><br /> PID USERNAME LWP PRI NICE
SIZE RES STATE TIME CPU COMMAND <br /> 3091 postgres 1 43 0 46M 38M cpu/1 200:06 3.20% postgres<br
/> 5052 postgres 1 60 0 149M 134M sleep 0:17 3.12% postgres<br /><br /><<<here are the top 2
processes,out of which the first process i have been running almost for a day and a half and it is still running, <br
/><br/>This table which i'm trying to update has 10 indexes<br /><br
/>=========================================================<br/> "a_article_pk" PRIMARY KEY, btree (id)<br />
"a_article_uk_pmid"UNIQUE, btree (pmid) <br /> "a_article_idx_abstract" btree ("substring"(abstract::text, 0,
255))<br/> "a_article_idx_date_cr_year" btree (date_cr_year)<br /> "a_article_idx_ml_journal_info_medline_ta"
btree(ml_journal_info_a_ta) <br /> "a_article_idx_owner" btree ("owner")<br /> "a_article_idx_pmid" btree
(pmid)<br/> "a_article_idx_status" btree (status)<br /> "a_article_idx_title" btree (article_title) <br />
"a_master_t_idx_year_published"btree (published_year)<br />========================================================<br
/>Butno indexes on the field i'm trying to update. The field i'm trying to add is a new field. <br />Can anyone help me
outto figure out why is it taking so much time to update the table.<br /><br />Also as u see in the above indexes, I
havesome indexes on some varchar column which i feel are totally useless unless u so a exact string match. <br />But
doesthat help in any sense for improving the speed of retreiving the string just normally without any search on it?<br
/><br/>Thanks,<br />Sumeet.<br />
On Wed, 7 Mar 2007, Sumeet wrote:
> Hi All,
>
> I'm trying to udpate a table containing 13149741 records. And its taking
> forever to complete this process.
>
> The update query i'm trying to run is for full text indexing similiar to
>
> UPDATE tblMessages SET idxFTI=to_tsvector(strMessage);
>
How big are your strMessage ? and what's your tsearch2 configuration ?
Can you estimate how long takes updating, for example, 1000 rows ?
It looks like your system is IO bound. What's your hardware ?
>
> Below are some of the stats which might be helpful for analyzing this
>
> $top
>
> PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
> 3091 postgres 1 43 0 46M 38M cpu/1 200:06 3.20% postgres
> 5052 postgres 1 60 0 149M 134M sleep 0:17 3.12% postgres
>
> <<<here are the top 2 processes, out of which the first process i have been
> running almost for a day and a half and it is still running,
>
> This table which i'm trying to update has 10 indexes
>
> =========================================================
> "a_article_pk" PRIMARY KEY, btree (id)
> "a_article_uk_pmid" UNIQUE, btree (pmid)
> "a_article_idx_abstract" btree ("substring"(abstract::text, 0, 255))
> "a_article_idx_date_cr_year" btree (date_cr_year)
> "a_article_idx_ml_journal_info_medline_ta" btree (ml_journal_info_a_ta)
> "a_article_idx_owner" btree ("owner")
> "a_article_idx_pmid" btree (pmid)
> "a_article_idx_status" btree (status)
> "a_article_idx_title" btree (article_title)
> "a_master_t_idx_year_published" btree (published_year)
> ========================================================
> But no indexes on the field i'm trying to update. The field i'm trying to
> add is a new field.
> Can anyone help me out to figure out why is it taking so much time to update
> the table.
>
> Also as u see in the above indexes, I have some indexes on some varchar
> column which i feel are totally useless unless u so a exact string match.
> But does that help in any sense for improving the speed of retreiving the
> string just normally without any search on it?
>
> Thanks,
> Sumeet.
>
Regards, Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Thanks Oleg,
My String message are Abstracts of papers, I did a
$ select avg(len) from (select length(abstract) as len from master_table limit 500) E;
avg
-----------------------
1355.5907859078590786
(1 row)
so length is approx 1400.
I couldn't find any appropriate way to analyze the time for update queries, but what i did was a explain analyze
$ explain analyze select to_tsvector(article_title) from master_table limit 1000;
The total runtime was approx 500ms.
The server is Sun OS 5.10, with around 8gigs of RAM and 6cpus.
Thanks,
Sumeet.
--
Thanks,
Sumeet Ambre
Master of Information Science Candidate,
Indiana University.
My String message are Abstracts of papers, I did a
$ select avg(len) from (select length(abstract) as len from master_table limit 500) E;
avg
-----------------------
1355.5907859078590786
(1 row)
so length is approx 1400.
I couldn't find any appropriate way to analyze the time for update queries, but what i did was a explain analyze
$ explain analyze select to_tsvector(article_title) from master_table limit 1000;
The total runtime was approx 500ms.
The server is Sun OS 5.10, with around 8gigs of RAM and 6cpus.
Thanks,
Sumeet.
On 3/7/07, Oleg Bartunov <oleg@sai.msu.su> wrote:
On Wed, 7 Mar 2007, Sumeet wrote:
> Hi All,
>
> I'm trying to udpate a table containing 13149741 records. And its taking
> forever to complete this process.
>
> The update query i'm trying to run is for full text indexing similiar to
>
> UPDATE tblMessages SET idxFTI=to_tsvector(strMessage);
>
How big are your strMessage ? and what's your tsearch2 configuration ?
Can you estimate how long takes updating, for example, 1000 rows ?
It looks like your system is IO bound. What's your hardware ?
>
> Below are some of the stats which might be helpful for analyzing this
>
> $top
>
> PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
> 3091 postgres 1 43 0 46M 38M cpu/1 200:06 3.20% postgres
> 5052 postgres 1 60 0 149M 134M sleep 0:17 3.12% postgres
>
> <<<here are the top 2 processes, out of which the first process i have been
> running almost for a day and a half and it is still running,
>
> This table which i'm trying to update has 10 indexes
>
> =========================================================
> "a_article_pk" PRIMARY KEY, btree (id)
> "a_article_uk_pmid" UNIQUE, btree (pmid)
> "a_article_idx_abstract" btree ("substring"(abstract::text, 0, 255))
> "a_article_idx_date_cr_year" btree (date_cr_year)
> "a_article_idx_ml_journal_info_medline_ta" btree (ml_journal_info_a_ta)
> "a_article_idx_owner" btree ("owner")
> "a_article_idx_pmid" btree (pmid)
> "a_article_idx_status" btree (status)
> "a_article_idx_title" btree (article_title)
> "a_master_t_idx_year_published" btree (published_year)
> ========================================================
> But no indexes on the field i'm trying to update. The field i'm trying to
> add is a new field.
> Can anyone help me out to figure out why is it taking so much time to update
> the table.
>
> Also as u see in the above indexes, I have some indexes on some varchar
> column which i feel are totally useless unless u so a exact string match.
> But does that help in any sense for improving the speed of retreiving the
> string just normally without any search on it?
>
> Thanks,
> Sumeet.
>
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet ( www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
--
Thanks,
Sumeet Ambre
Master of Information Science Candidate,
Indiana University.
On Wed, 7 Mar 2007, Sumeet wrote:
> Thanks Oleg,
>
> My String message are Abstracts of papers, I did a
>
> $ select avg(len) from (select length(abstract) as len from master_table
> limit 500) E;
>
> avg
> -----------------------
> 1355.5907859078590786
> (1 row)
>
> so length is approx 1400.
that'is about 18 Gb of text ! What's your hardware ?
>
>
> I couldn't find any appropriate way to analyze the time for update queries,
> but what i did was a explain analyze
>
> $ explain analyze select to_tsvector(article_title) from master_table limit
> 1000;
>
> The total runtime was approx 500ms.
just issue \timing in psql before executing update command.
Then you could estimate total time.
>
> The server is Sun OS 5.10, with around 8gigs of RAM and 6cpus.
>
> Thanks,
> Sumeet.
>
>
>
> On 3/7/07, Oleg Bartunov <oleg@sai.msu.su> wrote:
>>
>> On Wed, 7 Mar 2007, Sumeet wrote:
>>
>> > Hi All,
>> >
>> > I'm trying to udpate a table containing 13149741 records. And its
>> taking
>> > forever to complete this process.
>> >
>> > The update query i'm trying to run is for full text indexing similiar to
>> >
>> > UPDATE tblMessages SET idxFTI=to_tsvector(strMessage);
>> >
>>
>> How big are your strMessage ? and what's your tsearch2 configuration ?
>> Can you estimate how long takes updating, for example, 1000 rows ?
>> It looks like your system is IO bound. What's your hardware ?
>>
>> >
>> > Below are some of the stats which might be helpful for analyzing this
>> >
>> > $top
>> >
>> > PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
>> > 3091 postgres 1 43 0 46M 38M cpu/1 200:06 3.20% postgres
>> > 5052 postgres 1 60 0 149M 134M sleep 0:17 3.12% postgres
>> >
>> > <<<here are the top 2 processes, out of which the first process i have
>> been
>> > running almost for a day and a half and it is still running,
>> >
>> > This table which i'm trying to update has 10 indexes
>> >
>> > =========================================================
>> > "a_article_pk" PRIMARY KEY, btree (id)
>> > "a_article_uk_pmid" UNIQUE, btree (pmid)
>> > "a_article_idx_abstract" btree ("substring"(abstract::text, 0, 255))
>> > "a_article_idx_date_cr_year" btree (date_cr_year)
>> > "a_article_idx_ml_journal_info_medline_ta" btree
>> (ml_journal_info_a_ta)
>> > "a_article_idx_owner" btree ("owner")
>> > "a_article_idx_pmid" btree (pmid)
>> > "a_article_idx_status" btree (status)
>> > "a_article_idx_title" btree (article_title)
>> > "a_master_t_idx_year_published" btree (published_year)
>> > ========================================================
>> > But no indexes on the field i'm trying to update. The field i'm trying
>> to
>> > add is a new field.
>> > Can anyone help me out to figure out why is it taking so much time to
>> update
>> > the table.
>> >
>> > Also as u see in the above indexes, I have some indexes on some varchar
>> > column which i feel are totally useless unless u so a exact string
>> match.
>> > But does that help in any sense for improving the speed of retreiving
>> the
>> > string just normally without any search on it?
>> >
>> > Thanks,
>> > Sumeet.
>> >
>>
>> Regards,
>> Oleg
>> _____________________________________________________________
>> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
>> Sternberg Astronomical Institute, Moscow University, Russia
>> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
>> phone: +007(495)939-16-83, +007(495)939-23-83
>>
>
>
>
>
Regards, Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83