Re: robots.txt on git.postgresql.org

From: Magnus Hagander
Subject: Re: robots.txt on git.postgresql.org
Date:
Msg-id: CABUevEy9pS6ERtg3xqzo31wv_93br=AzEHfbeM5m4kidypgTRA@mail.gmail.com
In response to: Re: robots.txt on git.postgresql.org (Greg Stark <stark@mit.edu>)
List: pgsql-hackers
On Thu, Jul 11, 2013 at 3:43 PM, Greg Stark <stark@mit.edu> wrote:
> On Wed, Jul 10, 2013 at 9:36 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> We already run this, that's what we did to make it survive at all. The
>> problem is there are so many thousands of different URLs you can get
>> to on that site, and google indexes them all by default.
>
> There's also https://support.google.com/webmasters/answer/48620?hl=en
> which lets us control how fast the Google crawler crawls. I think it's
> adaptive though, so if the pages are slow it should be crawling slowly.

Sure, but there are plenty of other search engines as well, not just
google... Google is actually "reasonably good" at scaling back its own
speed, in my experience, which is not true of all the others. Of
course, throttling also means it takes a long time to actually crawl
the site, since there are so many different URLs...
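
For the crawlers that don't auto-throttle, a Crawl-delay hint in
robots.txt is one way to ask them to slow down; Google ignores
Crawl-delay (its rate is set through the Webmaster Tools page linked
above), but e.g. Bing and Yandex honor it. A minimal sketch, with an
illustrative path rather than the site's actual gitweb URL layout:

    User-agent: *
    # Seconds between fetches; a non-standard but widely honored hint,
    # ignored by Google
    Crawl-delay: 10
    # Hypothetical path: keep crawlers out of the expensive dynamic views
    Disallow: /gitweb/

Crawl-delay is purely advisory, of course, so misbehaving bots still
need rate limiting at the HTTP server.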

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


