DATALAB specializes in a broad range of data management topics. Current research activities include the following areas.

Core Data Management Technologies

We conduct research on core database technologies, including preference queries, adaptive query processing, similarity query processing, query processing in spatial and spatiotemporal databases, indexing, and distributed query processing. We also investigate the optimization of modern data analytics flows.

Data Mining and Knowledge Discovery

We perform research on the following topics: web mining, graph mining, data mining for data streams and massive datasets/social networks, outlier detection, recommendation systems for social networks, collaborative filtering, model-based recommender systems, encrypted analytics, and causality.

We contribute to the MOA (Massive Online Analysis) tool (http://moa.cms.waikato.ac.nz/).

Graph Management and Mining

Areas of interest and expertise in this field include graph triangulation, community detection, dense subgraph mining, uncertainty, multi-layer graphs, evolving graphs, virus propagation, and graph analytics.

Distributed and Massively Parallel Systems to Support Data Science

Our general interests cover massively parallel and distributed data analytics. Specific topics include data management in MapReduce, Spark, and cloud environments; resource elasticity; resource optimization; data management on GPUs; queries over Web Services; and autonomic data management for wide-area environments.

Informetrics

Our general interests cover bibliometrics, scientometrics, and webometrics. Specific topics include informetric laws (e.g., Lotka and Zipf, but also laws of growth and of ageing or obsolescence), including the modelling of generalised bibliographies; citation theory; linking theory; indicators (definitions and properties); evaluation techniques for scientific output (literature, persons) and for documentary systems (information retrieval), including ranking theory; graph-theoretic and topological analysis of networks (including the Internet, intranets, and citation and collaboration networks); and visualisation and mapping of science (persons, fields, institutes, topics).

Theory

We perform research on the following topics: Data Structures for Main and Secondary Memory (mainly known “hard problems”), P2P Algorithms and Data Structures (design of overlays), Design and Analysis of Algorithms (fundamental problems), Computational Geometry (fundamental operations and applications to other areas), String Algorithms, Dynamic Graph Algorithms, and Complexity (mainly how Physics interacts with Informatics).

Bioinformatics – Biomedical

We investigate advanced topics such as 3D protein structure similarity and motif discovery in strings or graphs derived from EEGs and from biological data, for example weighted sequences that model the uncertainty of nucleobases when a gene is read.

Recent projects (after 2012):

cHiPSet: High-performance modelling and simulation for big data applications. EU (COST). http://chipset-cost.eu/
KnowEscape: Analyzing the dynamics of information and knowledge landscapes. EU (COST). http://knowescape.org/
MOVE: Knowledge discovery from moving objects. EU (COST). http://www.move-cost.info/
Cloud9: A multidisciplinary, holistic approach to internet-scale cloud computing. General Secretariat of Research and Technology (Thales). https://sites.google.com/site/thaliscloud9/project-definition
EICOS: foundations for perSOnalized Cooperative Information Ecosystems. General Secretariat of Research and Technology (Thales). http://web.imis.athena-innovation.gr/projects/eicos/en/project.html
Blogforever. EU (FP7). http://blogforever.eu/
TRACER: Identifying software vulnerabilities and securing legacy systems. General Secretariat of Research and Technology (Cooperation). https://istlab.dmst.aueb.gr/content/projects/p_tracer.html
SFINX. General Secretariat of Research and Technology (Cooperation). http://sphinx.vtrip.net/
Recommendation engine for Web 2.0. General Secretariat of Research and Technology (Bilateral).
Data mining from Web 2.0. General Secretariat of Research and Technology (Bilateral).