https://drive.google.com/file/d/0BxlbeayLWxeRVE5jeXNWczJKamc/view?usp=sharing
Tag Archives: data harvesting
VALA2014 Session 7 Barwick
Hunters and collectors: seeking social media content for cultural heritage collections
VALA2014 CONCURRENT SESSION 7: Think Social
Kathryn Barwick and Mylee Joseph, State Library of New South Wales
Cecile Paris and Stephen Wan, Commonwealth Scientific and Industrial Research Organisation, Australia
Please tag your comments, tweets, and blog posts about this session: #vala14 and #s19
Abstract
A novel approach to collecting digital content for heritage collections is being explored and assessed in a trial of Vizie, an innovative social media tool researched and developed by the Commonwealth Scientific and Industrial Research Organisation (CSIRO). Collecting digital content for heritage collections is a priority for research libraries and other cultural institutions. This paper reports on the progress and lessons learned to date in the ongoing collaboration between the CSIRO and the State Library of New South Wales. The aim of the collaboration is to gather and curate online content centred on significant events and everyday life in Australia and New South Wales.
This work is licensed under a Creative Commons Attribution-NonCommercial License.
VALA2014 Session 2 Kreunen
Hacking the library catalogue: a voyage of discovery
VALA2014 CONCURRENT SESSION 2: It’s All About the Data
Ben Kreunen and Joe Arthur, The University of Melbourne, Vic
Please tag your comments, tweets, and blog posts about this session: #vala14 and #s4
Abstract
The University Digitisation Centre (UDC) at the University of Melbourne has been working towards implementing the Embedded Metadata Manifesto since its inception in 2009. This paper follows the evolution of the in-house information systems developed by UDC to incorporate the embedding of descriptive metadata as part of a standard digitisation process. Central to this has been the development of novel ways of accessing metadata from the various library catalogues via their public interfaces. Challenges arising from the re-use of catalogue metadata in non-library systems may provide additional insights as libraries attempt to re-invent the catalogue.
This work is licensed under a Creative Commons Attribution-NonCommercial License.
VALA2014 Session 2 Balnaves
Complex harvesting for content from public sources and email
VALA2014 CONCURRENT SESSION 2: It’s All About the Data
Edmund Balnaves, Prosentient Systems, NSW
Please tag your comments, tweets, and blog posts about this session: #vala14 and #s6
Abstract
This paper presents the results of a project to build a complex harvesting system for web and email sources, integrated with open source platforms, to improve discovery of information about or relevant to the organisation from public internet sources. The paper discusses methods of harvesting, drawing on a mix of RSS, Google API search and simple web parsing. The paper presents the results of automated metadata allocation and subsequent manual curation. The project highlights the need to use multiple web scanning techniques, so as to be sufficiently exhaustive to catch relevant references, but also sufficiently specific to avoid an unduly large number of false-positive candidates for selection.
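As a rough illustration of the RSS part of such a harvest, the sketch below filters feed items by keyword. It is not the system described in the paper: the sample feed, the function name harvest_candidates, and the include/exclude term lists are all assumptions made for this example. The include terms stand in for the "sufficiently exhaustive" side of the trade-off, and the exclude terms for the "sufficiently specific" side.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, inlined so the sketch runs without network access.
# A real harvester would fetch feed text over HTTP instead.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Example news</title>
<item><title>Acme Library opens new branch</title>
<link>http://example.org/1</link>
<description>The Acme Library service expands.</description></item>
<item><title>Acme Hardware sale</title>
<link>http://example.org/2</link>
<description>Discount tools this week.</description></item>
<item><title>Weather update</title>
<link>http://example.org/3</link>
<description>Rain expected.</description></item>
</channel></rss>"""


def harvest_candidates(rss_text, include_terms, exclude_terms=()):
    """Return feed items that mention any include term and no exclude term.

    Matching is a case-insensitive substring test over title + description.
    Each result records which include terms matched, which could feed a
    later manual curation step.
    """
    root = ET.fromstring(rss_text)
    candidates = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        description = item.findtext("description", default="")
        haystack = f"{title} {description}".lower()

        matched = [t for t in include_terms if t.lower() in haystack]
        if not matched:
            continue  # not relevant to the organisation
        if any(t.lower() in haystack for t in exclude_terms):
            continue  # likely false positive
        candidates.append({"title": title, "link": link, "matched": matched})
    return candidates


if __name__ == "__main__":
    for hit in harvest_candidates(SAMPLE_RSS, ["acme"], ["hardware"]):
        print(hit["title"], hit["link"])
```

Broad include terms alone would also return the "Acme Hardware" item; adding an exclude term drops it, which is one simple way to trade recall for precision before candidates reach a human curator.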
This work is licensed under a Creative Commons Attribution-NonCommercial License.