WP3: Compression and Indexing Techniques for Repetitive Data

Work Package Info:

Lead beneficiary: UH, Start month: 1, End month: 48

Objectives:

  • O3.1: Share the expertise of UH, INESC-ID, UCHILE, UDC and UDEC on compressing and indexing repetitive
    collections of data in different fields.
  • O3.2: Make new contributions in compression and indexing techniques for repetitive data.
  • O3.3: Identify areas of future collaborative research between the partners in this research task.

Tasks:

  • T3.1: Review of the state of the art in compression and indexing of repetitive data collections in the fields of
    bioinformatics and information retrieval.
  • T3.2: Exploration of new compression techniques and indexes, including: extensions and combinations of
    grammar and Lempel-Ziv (LZ77) compression; relative LZ77 (RLZ) combined with fast direct access; scalable
    construction of LZ77 parsings and of grammars for massive data; and efficient representations of suffix
    trees for repetitive collections.
  • T3.3: Implementation, experimentation, and evaluation of the proposed techniques.

List of deliverables:

This WP has one deliverable: a document comprising a preliminary study and the research carried out during the project on compression and indexing of repetitive data collections in the fields of bioinformatics and information retrieval.

Deliv. Number   Deliverable Name      Lead Beneficiary   Dissemination Level   Due Date (month)   Status
D3.1            Final report of WP3   UH                 Public                48                 Pending