Re: web archiving
| From | Matt Price |
|---|---|
| Subject | Re: web archiving |
| Date | |
| Msg-id | 1026409963.17825.72.camel@anarres |
| In reply to | Re: web archiving (Philip Hallstrom <philip@adhesivemedia.com>) |
| Responses | Re: web archiving |
| List | pgsql-novice |
Hi Philip, et al.,

Well, wget is nice, and htdig/mnoGoSearch both seem great; but I want to be able to enter extra data about the web pages (author names, comments, subject/keyword entries...) so that the database starts to resemble a bibliographic database. That is, I want other people to be able to take advantage of the work that I and other data-entry slaves do when we enter the URLs.

Does that seem silly?

matt

On Wed, 2002-07-10 at 18:21, Philip Hallstrom wrote:
> Not to discourage you from using postgresql or writing it yourself, but
> you might want to take a look at wget (for downloading the web pages) and
> mnoGoSearch or htdig for searching them.
>
> mnoGoSearch supports postgresql and has a PHP interface so you can have fun
> with that...
>
> On 10 Jul 2002, Matt Price wrote:
>
> > Hi there,
> >
> > I've just moved up from non-free OSes to Debian Linux, and installed
> > postgresql, with the hope of getting started on some projects I've been
> > thinking about. Several of these projects involve web archives. The
> > idea is, a URL is entered with a bunch of bibliographic-type data in
> > other fields (keywords, author, date, etc.). The HTML (and hopefully,
> > accompanying images, CSS, etc.) is then grabbed using curl, and archived
> > in a postgresql database. A web or other GUI interface then provides
> > fully-searchable access to the archive for later use.
> >
> > So my question: does anyone know of a similar tool which already
> > exists? I'm a complete novice at database programming (and at PHP, too,
> > which is what I figured I'd use as the scripting language, though I'd
> > consider learning Perl or Java if folks think that's a much better
> > idea), and I'd rather work with some pre-existing code than start from
> > the ground up. Any suggestions? Is this the right list to be asking
> > this question on?
> >
> > Thanks loads,
> > Matt
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> >
> >
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
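
The archive idea described in the quoted message (a URL entered together with bibliographic fields, with the fetched HTML stored next to them and made searchable) might be sketched roughly as below. This is only an illustration under stated assumptions, not something proposed in the thread: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so it runs without a server, and the table name `archived_page`, its columns, and the helpers `open_archive`, `add_page`, and `search` are all hypothetical. Downloading the page (with curl or wget) is left out; the HTML is passed in as a string.

```python
import sqlite3

# Hypothetical schema: one row per archived page, with the bibliographic
# fields stored next to the page's HTML. sqlite3 stands in for PostgreSQL.
SCHEMA = """
CREATE TABLE IF NOT EXISTS archived_page (
    id       INTEGER PRIMARY KEY,
    url      TEXT NOT NULL UNIQUE,
    author   TEXT,
    keywords TEXT,
    comments TEXT,
    html     TEXT NOT NULL
)
"""


def open_archive(path=":memory:"):
    """Open (or create) the archive database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_page(conn, url, html, author=None, keywords=None, comments=None):
    """Store one page plus its hand-entered metadata; return the new row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO archived_page (url, author, keywords, comments, html) "
            "VALUES (?, ?, ?, ?, ?)",
            (url, author, keywords, comments, html),
        )
    return cur.lastrowid


def search(conn, term):
    """Return URLs whose metadata or HTML contains the term (substring match)."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT url FROM archived_page "
        "WHERE author LIKE ? OR keywords LIKE ? OR comments LIKE ? OR html LIKE ? "
        "ORDER BY url",
        (pattern, pattern, pattern, pattern),
    )
    return [row[0] for row in rows]
```

For example, after `conn = open_archive()` and a few `add_page(...)` calls, `search(conn, "postgres")` lists the URLs of matching pages. Keeping the metadata in ordinary columns is what lets other people reuse the data-entry work; a plain `LIKE` scan is only a placeholder for real full-text search.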