Data compression paradigm based on omitting self-evident information
Code | Science | Field | Subfield
2.07.00 | Engineering sciences and technologies | Computer science and informatics |

Code | Science | Field
1.02 | Natural Sciences | Computer and information sciences
data compression, feature, optimization, restoration algorithms, universal platform
Data for the last 5 years (citations for the last 10 years) as of October 15, 2025; data for the A3 score calculation refer to the period 2020-2024.
Data for ARIS tenders (04.04.2019 – Programme tender, archive)
Database | Linked records | Citations | Pure citations | Average pure citations
WoS | 213 | 2,927 | 2,563 | 12.03
Scopus | 273 | 4,111 | 3,671 | 13.45
Organisations (1), Researchers (13)
0796 University of Maribor, Faculty of Electrical Engineering and Computer Science
no. | Code | Name and surname | Research area | Role | Period | No. of publications
1. | 53590 | PhD Jernej Cukjati | Computer science and informatics | Young researcher | 2022 - 2023 | 6
2. | 53755 | Aljaž Jeromel | Computer science and informatics | Researcher | 2022 - 2025 | 28
3. | 37222 | PhD Štefan Kohek | Computer science and informatics | Researcher | 2022 - 2025 | 134
4. | 16259 | PhD Simon Kolmanič | Computer science and informatics | Researcher | 2022 - 2025 | 211
5. | 21318 | PhD Bogdan Lipuš | Computer science and informatics | Researcher | 2022 - 2025 | 58
6. | 33709 | PhD Niko Lukač | Computer science and informatics | Researcher | 2022 - 2025 | 233
7. | 29243 | PhD Domen Mongus | Computer science and informatics | Researcher | 2022 - 2025 | 297
8. | 32690 | Sašo Pečnik | Computer science and informatics | Researcher | 2022 - 2025 | 24
9. | 15671 | PhD David Podgorelec | Computer science and informatics | Researcher | 2022 - 2025 | 221
10. | 08638 | PhD Krista Rizman Žalik | Computer science and informatics | Researcher | 2022 - 2025 | 192
11. | 18726 | PhD Damjan Strnad | Computer science and informatics | Researcher | 2022 - 2025 | 262
12. | 06671 | PhD Borut Žalik | Computer science and informatics | Head | 2022 - 2025 | 876
13. | 31475 | Denis Žganec | Computer science and informatics | Technical associate | 2022 - 2025 | 19
Abstract
Data compression is one of the traditional disciplines of Computer Science, but one that has made no significant progress in recent decades. It has also failed to keep up with new scientific trends, in which new devices collect ever-increasing amounts of highly heterogeneous data. These data are compressed using either domain-dependent or general-purpose methods. The latter are well-known lossless solutions from 30 years ago (e.g., RAR or ZIP). They achieve generality by handling the data stream at the byte level, ignoring potential higher-level relations in the data. Domain-dependent methods are lossy, near-lossless, or lossless. Lossy methods operate by transforming the data into frequency space, performing quantisation there, and encoding the remaining values losslessly, whereby the lossless part is typically domain-dependent as well. Near-lossless and lossless methods are significantly different and are typically prediction-based. However, the prediction is made from a narrow spatial and/or temporal context, which reduces its efficiency. Most methods are symmetric, meaning that decoding is performed by the same pipeline as encoding, only in reverse order. The disadvantage is that decoding has the same time complexity as encoding, so the encoder and the decoder require similar infrastructure. Finally, each type of data requires a specific solution that is not transferable to other types of data (e.g., audio compression is completely different from the compression of raster images).
In the COMPROMISE project, we aim to develop a new data compression methodology that will be largely domain-independent and asymmetric. By using a unified pipeline of procedures, the methodology will be suitable for lossy, near-lossless, and lossless compression. Domain independence will be achieved by forming feature repertoires in different domains and linking those repertoires to a unified domain-independent taxonomy.
In our case, a feature will be any piece of information with high discriminative or predictive value for human interpretation or machine processing (e.g., computer vision, classification) of a data stream. The obtained repertoire of features will be reduced through a domain-independent iterative optimisation process for as long as the remaining set of features allows the restoration techniques to reconstruct the input data satisfactorily. The compression pipeline will be the same for lossy, lossless, and near-lossless compression, except that the output in the latter two cases will include the residuals, obtained as the difference between the original and the restored data. Data decompression will be much simpler and will consist of decoding the features and residuals, restoring the data from the features, and applying the residuals in lossless or near-lossless mode. This will make the requirements for the decoder substantially lower than those for the encoder. The concept of domain-independent features also allows information about higher-level relations in the data to be preserved in the compressed form, which improves the reusability of the data at different semantic levels. In order to demonstrate the universality and domain independence of the methodology, we will use raster images, digital audio, biomedical signals, and sparse voxel grids in our study. These domains differ in both data dimensionality and dynamism, while addressing two human perceptual systems – vision and hearing. The proposed domain-independent methodology will be implemented in a unified platform, which will be used to demonstrate the efficiency and universality of the COMPROMISE methodology, to validate the key performance indicators, and to verify the scientific hypothesis.
By using the methodology, we expect to achieve better lossless and near-lossless compression ratios than existing domain-dependent methods, which will lay the foundation for a new generation of data compression methods.
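
The asymmetric pipeline described in the abstract (feature selection, iterative reduction, restoration, and residuals) can be illustrated with a toy one-dimensional sketch. This is not the COMPROMISE implementation: it assumes, purely for illustration, that a "feature" is a retained sample, that restoration is linear interpolation, and that the optimisation is a greedy removal loop. The names `restore`, `encode`, `decode`, `max_error`, and `lossless` are hypothetical.

```python
def restore(length, features):
    """Restore `length` samples from sorted (index, value) features.

    Assumption: restoration is linear interpolation, rounded to integers.
    The first feature must sit at index 0 and the last at index length - 1.
    """
    restored = []
    seg = 0  # index of the feature pair currently being interpolated
    for i in range(length):
        while seg + 2 < len(features) and features[seg + 1][0] < i:
            seg += 1
        (i0, v0), (i1, v1) = features[seg], features[seg + 1]
        restored.append(round(v0 + (v1 - v0) * (i - i0) / (i1 - i0)))
    return restored


def encode(signal, max_error=0, lossless=True):
    """Greedy stand-in for the iterative feature-reduction step.

    Starts with every sample as a feature and drops one feature at a time
    for as long as the restoration stays within `max_error` of the input.
    In lossless mode the remaining differences are returned as residuals.
    """
    features = list(enumerate(signal))
    changed = True
    while changed:
        changed = False
        for k in range(1, len(features) - 1):  # the two endpoints are always kept
            candidate = features[:k] + features[k + 1:]
            restored = restore(len(signal), candidate)
            if max(abs(a - b) for a, b in zip(signal, restored)) <= max_error:
                features = candidate
                changed = True
                break
    restored = restore(len(signal), features)
    residuals = [a - b for a, b in zip(signal, restored)] if lossless else None
    return features, residuals


def decode(length, features, residuals=None):
    """Decoder: one restoration pass, plus residuals when they are present."""
    restored = restore(length, features)
    if residuals is None:
        return restored
    return [r + d for r, d in zip(restored, residuals)]


# Usage: lossless round trip on a short piecewise-linear signal.
signal = [0, 2, 4, 6, 8, 7, 6, 5, 4, 10]
features, residuals = encode(signal, max_error=0, lossless=True)
print(features, residuals, decode(len(signal), features, residuals) == signal)
```

The encoder repeatedly re-runs the restoration while searching for a smaller feature set, whereas the decoder runs it once. This mirrors, in miniature, the asymmetry the abstract aims for. A larger `max_error` with `lossless=True` moves information from the features into the residuals without affecting exactness.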