Projects / Programmes source: ARIS

Data compression paradigm based on omitting self-evident information

Research activity

Code Science Field Subfield
2.07.00  Engineering sciences and technologies  Computer science and informatics   

Code Science Field
1.02  Natural Sciences  Computer and information sciences 
Keywords
data compression, feature, optimization, restoration algorithms, universal platform
Evaluation (methodology)
source: COBISS
Points  6,179.85
A''  1,370.3
A'  2,815.42
A1/2  4,385.77
CI10  3,553
CImax  224
h10  28
A1  22.2
A3  30.39
Data for the last 5 years (citations for the last 10 years) as of October 15, 2025; data for the A3 score calculation refer to the period 2020-2024
Data for ARIS tenders (04.04.2019 – Programme tender, archive)
Database  Linked records  Citations  Pure citations  Average pure citations
WoS  213  2,927  2,563  12.03 
Scopus  273  4,111  3,671  13.45 
Organisations (1), Researchers (13)
0796  University of Maribor, Faculty of Electrical Engineering and Computer Science
No.  Code  Name and surname  Research area  Role  Period  No. of publications
1.  53590  PhD Jernej Cukjati  Computer science and informatics  Young researcher  2022 - 2023 
2.  53755  Aljaž Jeromel  Computer science and informatics  Researcher  2022 - 2025  28 
3.  37222  PhD Štefan Kohek  Computer science and informatics  Researcher  2022 - 2025  134 
4.  16259  PhD Simon Kolmanič  Computer science and informatics  Researcher  2022 - 2025  211 
5.  21318  PhD Bogdan Lipuš  Computer science and informatics  Researcher  2022 - 2025  58 
6.  33709  PhD Niko Lukač  Computer science and informatics  Researcher  2022 - 2025  233 
7.  29243  PhD Domen Mongus  Computer science and informatics  Researcher  2022 - 2025  297 
8.  32690  Sašo Pečnik  Computer science and informatics  Researcher  2022 - 2025  24 
9.  15671  PhD David Podgorelec  Computer science and informatics  Researcher  2022 - 2025  221 
10.  08638  PhD Krista Rizman Žalik  Computer science and informatics  Researcher  2022 - 2025  192 
11.  18726  PhD Damjan Strnad  Computer science and informatics  Researcher  2022 - 2025  262 
12.  06671  PhD Borut Žalik  Computer science and informatics  Head  2022 - 2025  876 
13.  31475  Denis Žganec  Computer science and informatics  Technical associate  2022 - 2025  19 
Abstract
Data compression is one of the traditional disciplines of Computer Science, but one that has made no significant progress in recent decades. It has also failed to keep up with new scientific trends, where new devices collect ever-increasing amounts of highly heterogeneous data. These data are compressed using either domain-dependent or general-purpose methods. The latter are well-known lossless solutions from 30 years ago (e.g., RAR or ZIP). They achieve generality by handling the data stream at the level of bytes, ignoring potential higher-level relations in the data. Domain-dependent methods are lossy, near-lossless, or lossless. Lossy methods operate by transforming the data into frequency space, performing the quantisation there, and encoding the remaining values in a lossless manner, whereby the lossless part is typically domain-dependent as well. Near-lossless and lossless methods are significantly different and typically prediction-based. However, the prediction is made from a narrow spatial and/or temporal context, which reduces its efficiency. Most methods are symmetric, which means that decoding is performed by the same pipeline as encoding, only in reverse order. The disadvantage is that the time complexity of decoding is the same as that of encoding, which requires similar infrastructure for both the encoder and the decoder. Finally, each type of data requires a specific solution that is not transferable to other types of data (e.g., audio compression is completely different from compression of raster images).
In the COMPROMISE project, we aim to develop a new data compression methodology which will be largely domain-independent and asymmetric. By using a unified pipeline of procedures, the methodology will be suitable for lossy, near-lossless, and lossless compression. Domain independence will be achieved by forming feature repertoires in different domains and linking those repertoires to a unified domain-independent taxonomy.
In our case, a feature will be any piece of information with high discriminative or predictive value for human interpretation or machine processing (e.g., computer vision, classification) of a data stream. The obtained repertoire of features will be reduced through a domain-independent iterative optimisation process for as long as the remaining features allow the restoration techniques to reconstruct the input data satisfactorily. The compression pipeline will be the same for lossy, lossless, and near-lossless compression, except that the output in the latter two cases will include the residuals, obtained as the difference between the original and the restored data. Data decompression will be much simpler and will consist of decoding the features and residuals, restoring the data from the features, and applying the residuals in lossless or near-lossless mode. This will make the requirements for the decoder substantially lower than those for the encoder. The concept of domain-independent features also allows the information about higher-level relations in the data to be preserved in the compressed form, which improves the reusability of data on different semantic levels. In order to demonstrate the universality and domain independence of the methodology, we will use raster images, digital audio, biomedical signals, and sparse voxel grids in our study. These domains differ in both data dimensionality and dynamism, while addressing two human perceptual systems: vision and hearing. The proposed domain-independent methodology will be implemented with a unified platform, which will be used to demonstrate the efficiency and universality of the COMPROMISE methodology, to validate the key performance indicators, and to verify the scientific hypothesis.
By using the methodology, we expect to achieve better lossless and near-lossless compression ratios than existing domain-dependent methods, which will set the foundation for a new generation of data compression methods.
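The feature/residual pipeline described in the abstract can be illustrated with a toy sketch. This is not the COMPROMISE implementation: it assumes a one-dimensional integer signal, treats retained samples as the "features", uses rounded linear interpolation as the restoration technique, and replaces the iterative optimisation with a simple greedy pruning loop. The names `restore`, `encode`, `decode`, and `max_error` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def restore(features: Dict[int, int], n: int) -> List[int]:
    """Rebuild n samples from retained (index -> value) features.

    Assumed restoration technique: rounded linear interpolation
    between neighbouring retained samples.
    """
    out = [0] * n
    idx = sorted(features)
    for i in idx:
        out[i] = features[i]
    for a, b in zip(idx, idx[1:]):
        va, vb = features[a], features[b]
        for i in range(a + 1, b):
            out[i] = va + round((vb - va) * (i - a) / (b - a))
    return out


def encode(signal: List[int], max_error: int) -> Tuple[Dict[int, int], List[int]]:
    """Greedily drop features while restoration stays within max_error.

    The encoder is the expensive side: every candidate removal
    triggers a full restoration and comparison.
    """
    n = len(signal)
    features = dict(enumerate(signal))  # start with every sample as a feature
    changed = True
    while changed:
        changed = False
        for i in range(1, n - 1):  # the two endpoints are always kept
            if i not in features:
                continue
            trial = {k: v for k, v in features.items() if k != i}
            restored = restore(trial, n)
            if all(abs(s - r) <= max_error for s, r in zip(signal, restored)):
                features = trial
                changed = True
    # Residuals: difference between the original and the restored data.
    residuals = [s - r for s, r in zip(signal, restore(features, n))]
    return features, residuals


def decode(features: Dict[int, int], residuals: List[int], n: int,
           lossless: bool = True) -> List[int]:
    """Cheap decoder: one restoration pass, plus residuals if lossless."""
    restored = restore(features, n)
    if lossless:
        return [r + d for r, d in zip(restored, residuals)]
    return restored


signal = [0, 2, 4, 6, 8, 7, 6, 5, 9]
features, residuals = encode(signal, max_error=1)
exact = decode(features, residuals, len(signal))            # lossless mode
approx = decode(features, residuals, len(signal), False)    # lossy mode
```

In this sketch the lossy mode simply ignores the residuals, while the lossless mode adds them back, so both modes share one pipeline. The asymmetry shows in the cost: `encode` restores the signal once per candidate removal, whereas `decode` restores it once.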