FORMANT COMBINATORICS IN SLOVENIAN
Code | Science | Field | Subfield
6.05.00 | Humanities | Linguistics |
Code | Science | Field
6.02 | Humanities | Languages and Literature
Keywords: Word Formation, Word Families, Semantic Functions of Morphemes, Morphotactics, Digital Humanities, Deep Learning, Natural Language Learning, Slovenian Language
Organisations (3), Researchers (11)
0618 Research Centre of the Slovenian Academy of Sciences and Arts
No. | Code | Name and surname | Research area | Role | Period | No. of publications
1. | 32204 | PhD Boris Kern | Linguistics | Head | 2021 - 2025 | 313
2. | 29395 | PhD Nina Ledinek | Linguistics | Researcher | 2021 - 2025 | 150
3. | 12845 | PhD Andreja Žele | Linguistics | Researcher | 2021 - 2025 | 906
0106 Jožef Stefan Institute
No. | Code | Name and surname | Research area | Role | Period | No. of publications
1. | 05023 | PhD Tomaž Erjavec | Linguistics | Researcher | 2021 - 2025 | 694
2. | 50070 | PhD Matej Martinc | Linguistics | Researcher | 2021 - 2025 | 97
3. | 53800 | Andraž Pelicon | Linguistics | Researcher | 2021 - 2023 | 53
4. | 29539 | PhD Vid Podpečan | Computer science and informatics | Researcher | 2023 - 2025 | 114
5. | 31844 | PhD Senja Pollak | Linguistics | Researcher | 2021 - 2025 | 338
6. | 56524 | Marko Pranjić | Linguistics | Researcher | 2023 - 2025 | 28
2565 University of Maribor Faculty of Arts
No. | Code | Name and surname | Research area | Role | Period | No. of publications
1. | 00299 | PhD Irena Stramljič Breznik | Humanities | Researcher | 2021 - 2025 | 664
2. | 33264 | PhD Ines Voršič | Linguistics | Researcher | 2021 - 2025 | 119
Abstract
Slovenian, like other Slavic languages, has an extremely rich morphemic word structure, the result of multistage word formation. For example, the adjective mlad ‘young’ yields the noun mladost ‘youth’ in the first stage, which yields the adjective mladosten ‘youthful’ in the second, the noun mladostnik ‘adolescent’ in the third, and the possessive adjective mladostnikov ‘adolescent’s’ in the fourth. This example shows the compatibility of four suffixal formants: -ost + -en + -ik + -ov. Formant compatibility is understood as the ability of different word-formation morphemes to coexist within multistage formation, taking into account the semantic-extension aspect. The proposed project analyses and describes the compatibility of word-formation morphemes (formants) within multistage formation (mlad › mladost › mladosten › mladostnik › mladostnikov), thereby establishing morphotactics as a new research field in Slovenian linguistics, where it does not yet exist. By determining (a) the systemic predictability of formation in terms of the compatibility of suffixal formants and (b) its limitations, the analysis will make it possible to present the word-formation and semantic-extension mechanisms of Slovenian on contemporary language material, including all contemporary Slovenian dictionaries and corpora, using state-of-the-art research methods in linguistics and language technologies (including deep learning). This will be the first comprehensive analysis of Slovenian word formation in 70 years. The language technology objective of the project is to compile the first training set and build the first language technology application for automatic morpheme segmentation of Slovenian words.
This is also of key importance for the development of semantic language resources and language technologies for Slovenian, as well as for linguistics itself.
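
To make the notion of a formant combination concrete, the derivation chain from the abstract can be written down as data. The following is a minimal illustrative sketch, not the project's method or data format: the names `CHAIN`, `formant_sequence` and `adjacent_pairs` are hypothetical, and the segmentation is copied by hand from the example above rather than computed. It extracts the ordered formant sequence and counts which formant directly follows which, a simple way one might begin tabulating compatibility.

```python
from collections import Counter

# Derivation chain from the abstract, entered by hand (an assumption of this
# sketch): each entry is (derived word, suffixal formant added at that stage).
# The base adjective "mlad" carries no formant and is therefore not listed.
CHAIN = [
    ("mladost", "-ost"),
    ("mladosten", "-en"),
    ("mladostnik", "-ik"),
    ("mladostnikov", "-ov"),
]


def formant_sequence(chain):
    """Return the formants of a multistage derivation in order of attachment."""
    return [formant for _, formant in chain]


def adjacent_pairs(chains):
    """Count how often one formant is directly followed by another."""
    counts = Counter()
    for chain in chains:
        seq = formant_sequence(chain)
        counts.update(zip(seq, seq[1:]))
    return counts


sequence = formant_sequence(CHAIN)
pairs = adjacent_pairs([CHAIN])
print(" + ".join(sequence))  # -ost + -en + -ik + -ov
```

The formants are stored explicitly rather than derived by string concatenation because stems can change between stages (mladosten loses its -e- before -ik in mladostnik), so naive suffix stripping would not recover the chain.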