Projects / Programmes source: ARIS

Digital humanities: resources, tools and methods

Periods
January 1, 2022 - December 31, 2027
Research activity

Code Science Field Subfield
6.05.00  Humanities  Linguistics   
2.07.00  Engineering sciences and technologies  Computer science and informatics   

Code Science Field
6.02  Humanities  Languages and Literature 
1.02  Natural Sciences  Computer and information sciences 
Keywords
Digital Humanities, Digital Editions, Historical Collections, Oral History, Corpus Linguistics, Machine Learning, Mixed Methods, Language Technologies, Speech Technologies, Computer Vision
Evaluation (methodology)
source: COBISS
Points: 11,553.34
A'': 2,010.04
A': 5,612.86
A1/2: 7,016.71
CI10: 1,346
CImax: 82
h10: 17
A1: 39.49
A3: 7.11
Data for the last 5 years (citations for the last 10 years) on October 15, 2025; Data for score A3 calculation refer to period 2020-2024
Data for ARIS tenders ( 04.04.2019 – Programme tender, archive )
Database Linked records Citations Pure citations Average pure citations
WoS  99  559  490  4.95 
Scopus  211  1,464  1,268  6.01 
Organisations (3), Researchers (23)
0501  Institute for Contemporary History
no. Code Name and surname Research area Role Period No. of publications
1.  53681  PhD Ksenija Bogetić Pejović  Linguistics  Researcher  2023 - 2025  92 
2.  57130  Filip Dobranić  Linguistics  Researcher  2022 - 2025  34 
3.  57652  Bojan Evkoski  Linguistics  Researcher  2023 
4.  26294  PhD Darja Fišer  Linguistics  Head  2022 - 2025  431 
5.  56952  PhD Ganna Kryvenko  Linguistics  Researcher  2022 - 2025  66 
6.  50983  PhD Jakob Lenardič  Linguistics  Researcher  2022 - 2025  66 
7.  16350  PhD Andrej Pančur  Historiography  Researcher  2022 - 2025  281 
8.  08411  PhD Jurij Perovšek  Historiography  Retired researcher  2022  861 
9.  38461  PhD Ajda Pretnar Žagar  Computer science and informatics  Researcher  2022 - 2025  56 
10.  56554  Jure Skubic  Sociology  Researcher  2022 - 2025  35 
11.  17106  PhD Mojca Šorn  Historiography  Researcher  2022 - 2025  194 
0581  University of Ljubljana, Faculty of Arts
no. Code Name and surname Research area Role Period No. of publications
1.  55593  David Bordon  Linguistics  Young researcher  2022 - 2025  18 
2.  14681  PhD Vojko Gorjanc  Linguistics  Researcher  2022 - 2025  499 
3.  50983  PhD Jakob Lenardič  Linguistics  Researcher  2022  66 
4.  53255  Kristina Pahor de Maiti Tekavčič  Linguistics  Researcher  2022 - 2025  38 
5.  33783  PhD Damjan Popič  Linguistics  Researcher  2025  131 
6.  56554  Jure Skubic  Sociology  Researcher  2022 - 2023  35 
1539  University of Ljubljana, Faculty of Computer and Information Science
no. Code Name and surname Research area Role Period No. of publications
1.  30062  PhD Ciril Bohak  Computer science and informatics  Researcher  2024 - 2025  206 
2.  56644  Andraž Juvan    Technical associate  2024 - 2025 
3.  16131  PhD Alenka Kavčič  Computer science and informatics  Researcher  2022 - 2025  149 
4.  15677  PhD Matija Marolt  Computer science and informatics  Researcher  2022 - 2025  436 
5.  35071  PhD Matevž Pesek  Computer science and informatics  Researcher  2022 - 2025  192 
6.  59790  Uroš Šmajdek  Computer science and informatics  Young researcher  2024 - 2025 
Abstract
Digitisation of cultural heritage in Slovenia falls below the European average. With the exception of corpus linguistics and literary history, where richly annotated corpora and critical editions are plentiful and internationally renowned, datalization work on DH collections has been highly fragmented and largely incompatible because of the different encoding standards used. This shows that the requirements for supporting state-of-the-art data-driven DH research have not yet been met in Slovenia, which significantly hinders both research in the national context and its international visibility. The research programme addresses three broad research problems: 1) the development and integration of advanced workflows for creating and publishing reliable, interpretable, interoperable and richly annotated complex digital editions; 2) the development and promotion of novel interdisciplinary quantitative and qualitative methods for Slovenian DH; and 3) the development and evaluation of advanced technologies for the processing, enrichment and visualization of historical heterogeneous multilingual and multimodal data. The scientific framework of the proposed research programme is highly interdisciplinary but rooted in corpus and computational linguistics and in machine learning applied to heterogeneous structured and unstructured collections from Slovenian contemporary history, periodical studies, political studies and anthropology that are both mono- and multilingual and contain textual, speech and image data and metadata. Developing and integrating these methods into Slovene DH is crucial because of their potential to contribute to a comprehensive understanding of past and present cultural phenomena in the European context. The programme will also contribute novel technologies to support the digitization of DH research data and cultural heritage.
The programme will perform the complete cycle of DH research activities: capture, organization and storage; enrichment; and analysis, visualization, interpretation and dissemination of the results. While the methods and technologies developed within the programme are not limited to specific data collections, the programme will contribute six major new open-source datasets containing text, speech and images. It will improve methods for digitizing complex historical documents, develop novel methods for transcribing oral history recordings, and engage in the development of internationally novel methods for processing images in historical documents. The programme will adapt language and text enrichment technologies to historical and dialectal language and will develop support for exploring multilingual documents important for investigations of Slovenian history. The programme will also develop advanced mixed methods for network, geospatial and temporal analysis of DH data.
Significance for science
The impact of the proposed digital scholarship research programme will be manifold. First and foremost, it will engage with traditional humanities research in a significant new way, and it will also generate novel research questions, methodological approaches, findings and theoretical paradigms at the interface between language-driven analysis of historical multimodal data and data science. Beyond ensuring the preservation and accessibility of cultural heritage data, it will enable the enrichment of DH data and metadata using state-of-the-art data science methods. The programme will also address the exploitation of the contents of the created digital resources and the adaptation and development of appropriate language technologies to search and retrieve information from Big Data of the Past. Because the members of the research group are already involved in international DH research, especially in DH text and speech processing, we expect that the results will achieve international impact and recognition and be relevant for other languages. The proposed programme will address open and topical problems with long-term goals that have high scientific and technological potential. It will promote critical, transparent and reproducible research in the humanities, covering data, code, workflows, methods and documentation. Successfully addressing the presented challenges requires strong cooperation between humanities and data science experts. By engaging in cross-domain knowledge transfer and promoting an interdisciplinary approach focused on new digital methods and tools for digital humanities research and teaching, the programme will have a lasting disciplinary impact.
Significance for the country
The proposed programme will contribute substantially to the infrastructure for cultural heritage, which is also addressed by the following objectives of the Resolution of the national programme for language policy 2021-2025: digitization, description, preservation and open access of Slovene-language cultural heritage materials. The resources, tools and methods developed in the programme will be woven into humanities and data science curricula, which will develop interdisciplinary profiles of human data analysts that are in great demand in the public sector as well as in industry. The results of the programme will also enable cultural innovation and algorithmic creativity in the field of representation and mediation of the arts and humanities. With open-source historical digital collections, language models, training sets and toolchains that enable automated handling of complex documents, audio archives and images containing non-canonical Slovene, the proposed programme will close the gap in the processing of non-canonical (historical, dialectal, multilingual) language data that is important in today's information society and economy. The urgent need for digitisation of all aspects of society and the economy has been recognized by the newly established national Strategic council for digitization, which shares many objectives of the proposed research programme.