robots.txt on git.postgresql.org

From: Greg Stark
Subject: robots.txt on git.postgresql.org
Msg-id: CAM-w4HOzgXu9WVOygrXRpUn4dh9PL7idD1TDTnLb4HKzwfiHdA@mail.gmail.com
Replies: Re: robots.txt on git.postgresql.org  (Andres Freund <andres@2ndquadrant.com>)
         Re: robots.txt on git.postgresql.org  (Andrew Dunstan <andrew@dunslane.net>)
List: pgsql-hackers

I note that git.postgresql.org's robots.txt refuses permission to crawl
the git repository:

http://git.postgresql.org/robots.txt

User-agent: *
Disallow: /
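
To make the effect of those two lines concrete, here is a minimal sketch
using Python's standard urllib.robotparser. The rules are copied from
above; the helper name can_crawl and the "Googlebot" user-agent string
are only illustrative assumptions, and any other agent name gives the
same answer because the rule applies to "*".

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted above, verbatim.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def can_crawl(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the quoted robots.txt lets user_agent fetch url.

    can_crawl is a hypothetical helper for this sketch, not part of any
    existing tool.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Every path starts with "/", so every URL on the host is disallowed.
    print(can_crawl("http://git.postgresql.org/gitweb/"))
```

A well-behaved crawler that honours robots.txt will therefore never
fetch anything under gitweb.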


I'm curious what motivates this. It's certainly useful to be able to
search for commits. I frequently type git commit hashes into Google to
find the commit in other projects. I think I've even done it in
Postgres before and not had a problem. Maybe Google brought up GitHub
or something else.

FWIW, the reason I noticed this is that I searched for "postgresql
git log" and the first hit was for "see the commit that fixed the
issue, with all the gory details", which linked to
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6e0cd7b76c04acc8c8f868a3bcd0f9ff13e16c8

This was indexed despite the robots.txt because it was linked to from
elsewhere (hence the interesting link title). There are ways to ask
Google not to index pages if that's really what we're after, but I
don't see why we would be.
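
For reference, one such way is the "X-Robots-Tag: noindex" response
header (the HTML equivalent is a robots meta tag with content
"noindex"). The sketch below is a generic WSGI wrapper written purely
for illustration: add_noindex_header and demo_app are hypothetical
names, and nothing here reflects how git.postgresql.org is actually
served. Note that a crawler has to be allowed to fetch the page in order
to see the header, so this does not combine with "Disallow: /".

```python
def add_noindex_header(app):
    """Wrap a WSGI app so every response carries X-Robots-Tag: noindex.

    Hypothetical helper for illustration only.
    """
    def wrapped(environ, start_response):
        def start_with_noindex(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag so ours is the only one sent.
            headers = [(k, v) for k, v in headers
                       if k.lower() != "x-robots-tag"]
            headers.append(("X-Robots-Tag", "noindex"))
            return start_response(status, headers, exc_info)
        return app(environ, start_with_noindex)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in for a page we would rather keep out of search results."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"commit page\n"]


# The wrapped app can be handed to any WSGI server.
noindex_app = add_noindex_header(demo_app)
```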

-- 
greg


