1.

Weighted archetypal analysis of the multi-element graph for query-focused multi-document summarization

We developed a weighted archetypal analysis method and adapted it for query-focused multi-document summarization. The method outperforms the best previously known methods for this kind of problem.

COBISS.SI-ID: 10228052

2.

A combinatorial approach to graphlet counting

We developed a new algorithm for counting graphlet frequencies and orbits in large sparse graphs. The algorithm is applicable to many areas, particularly bioinformatics. Its time complexity is an order of magnitude lower than that of existing algorithms; in practical terms, execution times are a hundred-fold shorter on the graphs we typically encounter in bioinformatics.

COBISS.SI-ID: 10322516
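
To make the notion of graphlet counting concrete, the following is a minimal sketch that counts the two connected 3-node graphlets (triangles and open paths) in an undirected graph. It is a plain illustration of the task, not the algorithm from the paper; the function name and the edge-list input format are assumptions made for this example.

```python
from itertools import combinations


def count_three_node_graphlets(edges):
    """Count induced 3-node graphlets in an undirected simple graph.

    `edges` is an iterable of (u, v) pairs (a hypothetical input format
    chosen for this sketch). Returns a dict with the number of triangles
    and the number of open 3-node paths.
    """
    # Build adjacency sets, ignoring self-loops and duplicate edges.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    closed = 0  # neighbor pairs that are themselves connected
    opened = 0  # neighbor pairs that are not connected
    for center, neighbors in adj.items():
        for a, b in combinations(neighbors, 2):
            if b in adj[a]:
                closed += 1
            else:
                opened += 1

    # Every triangle is seen once from each of its three nodes,
    # while every open path is seen only from its middle node.
    return {"triangle": closed // 3, "path": opened}


if __name__ == "__main__":
    # A triangle 1-2-3 with a pendant node 4 attached to node 3.
    print(count_three_node_graphlets([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

The sketch enumerates pairs of neighbors around each node, so its running time grows with the sum of squared degrees; it is meant only to show what is being counted, and it does not compute orbits or graphlets with more than three nodes.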

3.

Epitope predictions indicate the presence of two distinct types of epitope-antibody-reactivities determined by epitope profiling of intravenous immunoglobulins

We developed computational methods for epitope prediction based on peptide-array data. A total of 75,534 peptides on microarrays were incubated with IVIG, and their reactivities were measured. These data were used to train an ensemble of classifiers, which performed better than the state of the art. In addition, we discovered that there are two types of epitopes, which are recognized by antibodies differently. The first group is presented by MHC complexes, which leads us to believe that T-cells are involved in their recognition in addition to B-cells; for the second group this is not the case.

COBISS.SI-ID: 27278375

4.

Conserved developmental transcriptomes in evolutionarily divergent species

We used RNA-Seq data to compare the developmental transcriptomes of D. discoideum and D. purpureum, which are morphologically very similar even though their genomes are as divergent as those of humans and jawed fish. We found a surprisingly high level of conservation between the two transcriptomes, indicating that the regulatory networks governing expression have been highly conserved and that developmental programs are remarkably well conserved at the transcriptome level across great evolutionary distances. The data analysis performed by FRI included mapping RNA-Seq reads to the two reference genomes, quantifying gene expression, clustering and comparing developmental steps in the two species based on gene expression, identifying differentially expressed genes, and Gene Ontology term enrichment analysis.

COBISS.SI-ID: 9921108

5.

Discovering disease-disease associations by fusing systems-level molecular data

The paper describes a new data fusion approach for finding new disease interactions from a plethora of heterogeneous molecular biology data sources. The approach discovered a multi-level hierarchy of disease classes that significantly overlaps with existing disease classification. Within this hierarchy, we find 14 disease-disease associations currently not present in Disease Ontology and provide evidence for their relationships through comorbidity data and literature curation. Interestingly, even though the number of known human genetic interactions is currently very small, we find they are the most important predictor of a link between diseases. Finally, we show that omitting any one of the included data sources reduces prediction quality, further highlighting the importance of the paradigm shift towards systems-level data fusion.

COBISS.SI-ID: 10253396

P2-0209 — Annual report 2013
