OLAC Record
oai:lindat.mff.cuni.cz:11234/1-4639

Metadata
Title:GECCC Grammar Error Correction Corpus for Czech
Bibliographic Citation:http://hdl.handle.net/11234/1-4639
Creator:Náplava, Jakub
Straka, Milan
Straková, Jana
Rosen, Alexandr
Date (W3CDTF):2022-01-17T09:19:51Z
Date Available:2022-01-17T09:19:51Z
Description:The Grammar Error Correction Corpus for Czech (GECCC) consists of 83,058 sentences and covers four diverse domains: essays written by native students, informal website texts, essays written by Romani ethnic minority children and teenagers, and essays written by nonnative speakers. All domains are professionally annotated for GEC errors in a unified manner, and errors were automatically categorized with a Czech-specific version of ERRANT released at https://github.com/ufal/errant_czech . The dataset was introduced in the paper "Czech Grammar Error Correction with a Large and Diverse Corpus", which was accepted to TACL. Until it is published in TACL, see the arXiv version: https://arxiv.org/pdf/2201.05590.pdf
Identifier (URI):http://hdl.handle.net/11234/1-4639
Is Replaced By (URI):http://hdl.handle.net/11234/1-4861
Language:Czech
Language (ISO639):ces
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Rights:Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
http://creativecommons.org/licenses/by-sa/4.0/
Subject:gec
grammatical error correction
dataset
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-4639
DateStamp:  2023-06-28
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Náplava, Jakub; Straka, Milan; Straková, Jana; Rosen, Alexandr. 2022. GECCC Grammar Error Correction Corpus for Czech. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Europe country_CZ dcmi_Text iso639_ces olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-4639
Up-to-date as of: Thu Oct 5 0:43:09 EDT 2023