OLAC Record: Computing in the field: language modeling for elicitation and documentation of Shughni

OLAC Record
oai:scholarspace.manoa.hawaii.edu:10125/5066

Metadata

Title: Computing in the field: language modeling for elicitation and documentation of Shughni

Bibliographic Citation: Hippisley, Andrew, Stump, Gregory, Raphael, Finkel; 2009-03-12; Kaipuleohone University of Hawai'i Digital Language Archive; http://hdl.handle.net/10125/5066.

Contributor (speaker): Hippisley, Andrew; Stump, Gregory; Raphael, Finkel

Creator: Hippisley, Andrew; Stump, Gregory; Raphael, Finkel

Date (W3CDTF): 2009-03-14

Description: We propose a way of enhancing computer-based approaches to language documentation by making use not only of the engineering capability of computing but also of its modeling capacity. Our proposal arises from a documentation pilot project in which we used computational modeling as an elicitation tool for documenting the complex verbal morphology of the underdocumented East Iranian Pamir language Shughni. Using the computable lexical knowledge representation language DATR (Evans & Gazdar 1996) and its variant KATR (Author et al. 2002), we wrote a theory of a fragment of the Shughni verb system based on what little we knew about the language. We then presented its theorem to our group of Shughni consultants, refined the model based on their responses, consulted them on the new theorem, and so on to the next refinement. Cycling through these steps allowed us to refine our model and so led to a more accurate account of the data. Equally important, this method gave us an automated ‘questionnaire generator’, i.e. the model's theorem. This provided not only elicitation queries that, given enough time, we might have thought of ourselves but also those that might never have occurred to us. Both types of query were available to us precisely because our understanding of the grammar was formal and computationally implemented, and could thereby automatically generate theorems. Computing plays a key language engineering role in language documentation and in its accessibility to a wider audience, from standard mark-up of data to its storage in a relational database for query-based retrieval. But computing serves a second purpose for linguists, that of language modeling: this is “the instrumental use of computation in the pursuit of linguistic goals” (Thompson 1983: 23). As we develop new methods for documentation, we need to explore the possibility of harnessing this other, language modeling capacity of computing.
We demonstrate through our work on Shughni that computer modeling can furnish the field-worker with elicitation tasks whose results feed into an enhanced understanding of the data, which in turn shows the path to the next stage of elicitation, ultimately leading to a well-informed and robust account of the data that is already digitized and therefore exchangeable. Advances in technology, such as palm-held computers, mean that an automated model-theorem-refinement method is both a practical and a potentially highly valuable addition to the field-worker’s toolkit, both in the field and back in the lab.

Identifier (URI): http://hdl.handle.net/10125/5066

Language: English

Language (ISO639): eng

Rights: Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported

Table Of Contents: 5066-01.jpg; 5066-03.jpg; 5066.mp3; 5066.pdf

OLAC Info

Archive: Language Documentation and Conservation

Description: http://www.language-archives.org/archive/ldc.scholarspace.manoa.hawaii.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:scholarspace.manoa.hawaii.edu:10125/5066

DateStamp: 2024-08-27

GetRecord: OAI-PMH request for simple DC format

Search Info
Citation: Hippisley, Andrew; Stump, Gregory; Raphael, Finkel. 2009. Language Documentation and Conservation.
Terms: area_Europe country_GB iso639_eng

http://www.language-archives.org/item.php/oai:scholarspace.manoa.hawaii.edu:10125/5066
Up-to-date as of: Mon Nov 18 7:28:50 EST 2024
