I’ve just recreated my list of magazines from Google Books for the University’s e-journals site.
Google now hosts 199 digitised magazine titles, and for the sake of 10 minutes’ work every few months it would be a shame to miss out on the extra full-text coverage, which often complements the “library” sources for a title.
E.g. for the frankly un-put-downable Estonian Journal of Archaeology (available as an Open Access (OA) journal from 2006 onwards, and indexed in Art Full Text), Google provides the missing articles from 1997 (vol.1) up to 2006.
I’d like to be able to harvest the Google Books content to build my list using the standard mashlib toolkit (Google spreadsheets; Yahoo! Pipes; some coffee)… but because use of Google’s =ImportHtml() function is limited to 50 per spreadsheet, and because Google’s search pages block robots through their robots.txt file, I can’t figure out a way of doing so.
Instead, I’ve been copying-and-pasting the search results pages into an ordinary Microsoft Excel spreadsheet (thanks, again, Google, for making this possible through your magazine browse page), then using a custom Excel function to ‘unmask’ the URL hidden behind each hyperlinked magazine title.
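For anyone without Excel to hand, the same ‘unmasking’ idea can be sketched in Python. This is not my Excel function: it swaps in a different technique and assumes you have saved the results page as HTML rather than pasted it into a spreadsheet. The names LinkCollector and unmask_links are hypothetical, and the sample markup is made up rather than real Google Books output.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs from every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (title, url) pairs
        self._href = None  # href of the anchor currently open, if any
        self._text = []    # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside an open anchor.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the visible link text.
            title = " ".join("".join(self._text).split())
            if title:  # skip image-only links with no text
                self.links.append((title, self._href))
            self._href = None


def unmask_links(html):
    """Return the (title, url) pairs hidden behind the hyperlinks in html."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample markup (assumption), standing in for a saved results page.
sample = (
    '<p><a href="https://example.org/books?id=AAA">Example Magazine</a></p>'
    '<p><a href="https://example.org/books?id=BBB">Another <b>Title</b></a></p>'
)
for title, url in unmask_links(sample):
    print(f"{title}\t{url}")
```

Each printed line pairs a magazine title with the address behind it, which is the same two-column shape the Excel route gives me.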
Finally, I use a bit of text-to-column splitting, search/replace, and filling-in of package-wide fields to give me a compatible, tab-delimited text file, which I then upload to our e-journals knowledge base (which happens to be EBSCO A-to-Z). I use EBSCO’s custom notes feature to link Google’s cover image to each entry in the file.
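The clean-up and export step could be scripted along these lines. It is only a rough sketch in Python, not what I actually do in Excel. The column names and package-wide values are placeholders I have invented for illustration, not the layout EBSCO A-to-Z expects, and to_tab_delimited is a hypothetical helper.

```python
import csv
import io

# Hypothetical package-wide values (assumption): the real fields depend on
# the upload template used by the knowledge base.
PACKAGE_FIELDS = {"Package": "Google Books Magazines", "Type": "Magazine"}
COLUMNS = ["Title", "URL", "Package", "Type"]


def to_tab_delimited(links):
    """Turn (title, url) pairs into tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for title, url in links:
        # Search/replace step: undo HTML-escaped ampersands left in the URL.
        clean_url = url.replace("&amp;", "&")
        # Fill in the package-wide fields, identical for every row.
        writer.writerow({"Title": title.strip(), "URL": clean_url, **PACKAGE_FIELDS})
    return buffer.getvalue()


# Made-up example row (assumption), not a real Google Books address.
rows = [(" Example Magazine ", "https://example.org/books?id=AAA&amp;x=1")]
print(to_tab_delimited(rows), end="")
```

Writing through the csv module rather than joining strings by hand means any title that happens to contain a tab or a quote mark gets quoted properly instead of silently shifting the columns.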