Projektowanie metadanych w korpusie tekstów polskich do 1500 roku – wielopoziomowa struktura informacji

Mariusz Leńczuk

Projektowanie metadanych w korpusie tekstów polskich do 1500 roku – wielopoziomowa struktura informacji
Metadata Creation in the Language Corpus of Polish Texts until 1500 – a Multi-level Data Structure

Author(s): Mariusz Leńczuk
Subject(s): Language studies, Language and Literature Studies, Theoretical Linguistics, Applied Linguistics, Western Slavic Languages, Philology
Published by: Wydawnictwo Uniwersytetu Śląskiego
Keywords: language corpus; metadata; text; glosses; 13th–15th century

Summary/Abstract: The article examines selected metadata that should characterize the texts collected in a corpus of the oldest attestations of the Polish language. The author compares and analyses the factors that shape the basic data structure used in synchronic and diachronic corpora (author, title, date of the text, text channel, text classification, source of citation). Unless these factors are taken into account, disambiguating an object in the database becomes impossible, and the grammatical information becomes unreliable and impractical to use. The analysis results in a proposal to extend the level of description for individual markers.
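
The basic structure named in the abstract can be pictured as a record with six markers, each of which may carry a finer level of description. The following is only a minimal Python sketch under stated assumptions, not the schema proposed in the article: the class names `Marker` and `TextMetadata`, the `detail` field, the helper functions, and all sample values are hypothetical placeholders. It shows how two records that look identical at the top level can be told apart once a marker is described at a second level.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Marker:
    """One metadata marker; `detail` is a hypothetical second level of description."""
    value: str
    detail: Optional[str] = None


@dataclass(frozen=True)
class TextMetadata:
    # The six field names follow the abstract; the types are assumptions.
    author: Marker
    title: Marker
    date: Marker
    channel: Marker
    classification: Marker
    source: Marker

    def markers(self):
        return (self.author, self.title, self.date,
                self.channel, self.classification, self.source)


def top_level_key(record):
    """Key built from the top-level marker values only."""
    return tuple(m.value for m in record.markers())


def full_key(record):
    """Key built from marker values together with their second-level detail."""
    return tuple((m.value, m.detail) for m in record.markers())


def colliding_keys(records, key_fn):
    """Return every key shared by more than one record."""
    counts = Counter(key_fn(r) for r in records)
    return [key for key, n in counts.items() if n > 1]


def placeholder_record(source_detail):
    # Placeholder values for illustration only; not data from the corpus.
    return TextMetadata(
        author=Marker("anonymous"),
        title=Marker("untitled gloss"),
        date=Marker("15th century"),
        channel=Marker("written"),
        classification=Marker("gloss"),
        source=Marker("manuscript X", detail=source_detail),
    )


records = [placeholder_record("leaf A"), placeholder_record("leaf B")]

# The two records collide at the top level but not with the extra detail.
print(len(colliding_keys(records, top_level_key)))
print(len(colliding_keys(records, full_key)))
```

In this sketch the extra level of description is what makes each record unique, which is one way to read the abstract's point about disambiguation.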

Details

Journal: Forum Lingwistyczne

Issue Year: 2020
Issue No: 7
Page Range: 59-69
Page Count: 11
Language: Polish
