Re: robots.txt on git.postgresql.org

From: Andrew Dunstan
Subject: Re: robots.txt on git.postgresql.org
Date:
Msg-id: 51DC315C.4080806@dunslane.net
In reply to: robots.txt on git.postgresql.org  (Greg Stark <stark@mit.edu>)
List: pgsql-hackers
On 07/09/2013 11:24 AM, Greg Stark wrote:
> I note that git.postgresql.org's robots.txt refuses permission to crawl
> the git repository:
>
> http://git.postgresql.org/robots.txt
>
> User-agent: *
> Disallow: /
>
>
> I'm curious what motivates this. It's certainly useful to be able to
> search for commits. I frequently type git commit hashes into Google to
> find the commit in other projects. I think I've even done it in
> Postgres before and not had a problem. Maybe Google brought up github
> or something else.
>
> Fwiw the reason I noticed this is because I searched for "postgresql
> git log" and the first hit was for "see the commit that fixed the
> issue, with all the gory details" which linked to
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6e0cd7b76c04acc8c8f868a3bcd0f9ff13e16c8
>
> This was indexed despite the robots.txt because it was linked to from
> elsewhere (hence the interesting link title). There are ways to ask
> Google not to index pages if that's really what we're after but I
> don't see why we would be.



It's certainly not universal. For example, the only reason I found 
buildfarm client commit d533edea5441115d40ffcd02bd97e64c4d5814d9, whose 
repo is hosted on GitHub, is that Google has indexed the buildfarm 
commits mailing list on pgfoundry. Do we have a robots.txt on the 
postgres mailing list archives site?
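For reference, exclusion doesn't have to be all-or-nothing: a robots.txt can keep crawlers off the expensive gitweb operations (snapshots, blame, tree browsing) while still permitting commit pages to be indexed. A rough sketch, assuming Google-style wildcard and Allow support (the patterns below are illustrative, not the site's actual configuration):

```
User-agent: *
# Block resource-heavy gitweb actions (illustrative patterns)
Disallow: /gitweb/*a=snapshot*
Disallow: /gitweb/*a=blame*
Disallow: /gitweb/*a=search*
# Everything else, including commitdiff pages, stays crawlable
Allow: /
```

Note that `Allow` and `*` wildcards are extensions honored by the major search engines but are not part of the original robots exclusion protocol, so behavior with other crawlers may vary.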

cheers

andrew



In the pgsql-hackers list, by date:

Previous message
From: Fabien COELHO
Date:
Subject: Re: Patch to add regression tests for SCHEMA
Next message
From: Dimitri Fontaine
Date:
Subject: Re: robots.txt on git.postgresql.org