End: 30/06/2026
Funding: Catalan
Status: Ongoing
Adaptive Processing Technologies (ADAPT)
Acronym: UNIDACT
Call ID: ERC-2023-POC
Code: 101158232
UNIDACT is a universal data compression algorithm based on circular context trees, originally developed for binary sources within the scope of the ITUL project, which was funded by an ERC Consolidator Grant. Its initial implementation, tested on a wide range of simulated data, demonstrated an average improvement of at least 82% over commercial algorithms such as Lempel-Ziv (ZIP) and Burrows-Wheeler transform compression (BZIP), as well as over more complex state-of-the-art algorithms such as prediction by partial matching (PPM) and context tree weighting (CTW). The proposed algorithm has linear complexity. The goal of this proof-of-concept project is to extend the current binary implementation to arbitrary alphabets, provide fast encoder and decoder implementations, develop specific applications for compressing satellite observation data and genomic data, develop an indexed version of the compression algorithm that allows access to part of the data without full decompression, and investigate specific software licensing options and opportunities.
Coordinator