The Language Data Commons of Australia

VALA2022 Lightning Talk

Simon Musgrave
  • Senior Project Officer
  • University of Queensland
K Kaiser
  • University of Queensland
Leah Gustafson
  • Griffith University

Please tag your comments, tweets, and blog posts about this session: #vala2022

Abstract

Australia is a nation of great linguistic diversity, including the languages of Indigenous Australians and those of later arrivals. There is also a strong tradition of research on language in Australia which has produced significant collections of language data and continues to do so. These materials are held in different geographic locations by a variety of institutions as well as in digital repositories, meaning that discovery and access procedures are inconsistent and, at least in some cases, difficult.

The Language Data Commons of Australia (LDaCA) project, which commenced in June 2021, aims to make it easier to find and use resources for research and study based on languages in Australia and to ensure long-lasting access to these invaluable collections for analysis and reuse in a culturally, ethically and legally appropriate manner. The project is primarily intended to meet the needs of researchers, but it will provide access to language resources for other groups. For example, people with a non-academic interest in languages will be able to use it to explore materials, and teachers and students at different educational levels will be able to find data relevant to their interests and needs.

This lightning talk will give an introduction to the project, explaining its structure and aims. LDaCA is not primarily a repository in its own right; rather, it is a point of aggregation. It will make existing resources available through a single portal, while also partnering with data stewards to work towards making more of Australia’s language data visible and usable. The work of LDaCA will be guided by both FAIR and CARE principles, making language data available within an ethically responsible framework. As its coverage increases, LDaCA will become a valuable resource for libraries, providing a reliable tool for discovering and accessing language resources.

Biography

Simon Musgrave was a member of the linguistics program at Monash University from 2003 until 2020. His research interests included the use of computational tools in linguistic research and the relationship between linguistics and digital humanities. He was involved in the Australian National Corpus project, an important piece of digital research infrastructure, and has been a member of the executive of the Australasian Association for Digital Humanities since 2015. Simon is currently part of the team delivering various language-related infrastructures, including the Australian Text Analytics Platform and the Language Data Commons of Australia.

 

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License