Re: robots.txt on git.postgresql.org
| From | Andres Freund |
|---|---|
| Subject | Re: robots.txt on git.postgresql.org |
| Date | |
| Msg-id | 20130711135058.GG27898@alap2.anarazel.de |
| In reply to | Re: robots.txt on git.postgresql.org (Greg Stark <stark@mit.edu>) |
| List | pgsql-hackers |
On 2013-07-11 14:43:21 +0100, Greg Stark wrote:
> On Wed, Jul 10, 2013 at 9:36 AM, Magnus Hagander <magnus@hagander.net> wrote:
> > We already run this, that's what we did to make it survive at all. The
> > problem is there are so many thousands of different URLs you can get
> > to on that site, and google indexes them all by default.
>
> There's also https://support.google.com/webmasters/answer/48620?hl=en
> which lets us control how fast the Google crawler crawls. I think it's
> adaptive though so if the pages are slow it should be crawling slowly

The problem is that gitweb gives you access to more than a million pages...

Revisions: git rev-list --all origin/master|wc -l => 77123
Branches: git branch --all|grep origin|wc -l
Views per commit: commit, commitdiff, tree

So, slow crawling isn't going to help very much.

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
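A back-of-the-envelope sketch of why slow crawling doesn't help: multiplying the 77123 revisions counted above by the three gitweb views listed per commit gives a lower bound on the URL count, before branches, tags, tree, and per-file blob pages are even considered (the arithmetic below is illustrative, not from the original message):

```shell
#!/bin/sh
# Rough lower bound on gitweb URLs exposed by commits alone.
REVISIONS=77123        # from: git rev-list --all origin/master | wc -l
VIEWS_PER_COMMIT=3     # commit, commitdiff, tree
echo $((REVISIONS * VIEWS_PER_COMMIT))   # => 231369
```

Even at a throttled one request every ten seconds, crawling those 231369 URLs once would take close to a month, which is why rate limiting alone can't make the site survivable.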