OLAC Record oai:scholarspace.manoa.hawaii.edu:10125/4999 |
Metadata | ||
Title: | OLAC: Accessing the world's language resources | |
Bibliographic Citation: | Bird, Steven, Simons, Gary; 2009-03-12; Kaipuleohone University of Hawai'i Digital Language Archive; http://hdl.handle.net/10125/4999. | |
Contributor (speaker): | Bird, Steven | |
Simons, Gary | ||
Creator: | Bird, Steven | |
Simons, Gary | ||
Date (W3CDTF): | 2009-03-14 | |
Description: | Language resources are the bread and butter of language documentation and linguistic investigation. They include the primary objects of study such as texts and recordings, the outputs of research such as dictionaries and grammars, and the enabling technologies such as software tools and interchange standards. Increasingly, these resources are maintained and distributed in digital form. Searching on the web for language resources in many languages is a hit-and-miss affair for three reasons: (i) resources are housed in archives that have never put their catalog online, (ii) resources are exposed online but are hidden behind form-based interfaces such that search engines cannot find them, or (iii) resources are exposed to online search engines but they are described in ad hoc ways so that searches do not retrieve desired results with precision. The Open Language Archives Community (OLAC) is addressing these problems by building on digital library standards to provide a standard format for describing language resources, which makes use of standardized identifiers for languages, linguistic data types, and other things of particular interest to linguists. For instance, all resources from all archives that are in or about the same language use the same three-letter language code from the ISO 639-3 standard. OLAC also provides a portal that permits users to simultaneously query the holdings of the three dozen participating language archives in a single search. Since resource description uses precise language identifiers, a search for a particular language returns all and only the relevant resources. However, the current usage and coverage of OLAC are only the tip of the iceberg. Many more linguists should be using it to find many more resources. This paper describes research that is being done to make language resources maximally accessible to linguists.
We describe new methods for greatly improving search access to archived language resources, new services that encourage language archives to use best common practices to produce resource descriptions that are maximally useful for searching, and new data providers that use digital library services and web-mining technologies to find language resources in the library, institutional repository, and web domains. | |
Identifier (URI): | http://hdl.handle.net/10125/4999 | |
Language: | English | |
Language (ISO639): | eng | |
Rights: | Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported | |
Table Of Contents: | 4999-01.JPG | |
4999-02.jpg | ||
4999.mp3 | ||
4999.pdf | ||
OLAC Info |
||
Archive: | Language Documentation and Conservation | |
Description: | http://www.language-archives.org/archive/ldc.scholarspace.manoa.hawaii.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:scholarspace.manoa.hawaii.edu:10125/4999 | |
DateStamp: | 2024-09-01 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Bird, Steven; Simons, Gary. 2009. Language Documentation and Conservation. | |
Terms: | area_Europe country_GB iso639_eng |