My main research interests concern data models for semi-structured data and information retrieval methodologies.
A new line of research I am pursuing concerns data citation. More information is available here: Data Citation.
My main research contributions are within the fields of "Information Retrieval", "Digital Archives and Libraries" and "Data Models".
HEREDITARY aims to significantly transform the way we approach disease detection, prepare treatment response, and explore medical knowledge by building a robust, interoperable, trustworthy and secure framework that integrates multimodal health data (including genetic data) while ensuring compliance with cross-national privacy-preserving policies. The HEREDITARY framework comprises five interconnected layers, from federated data processing and semantic data integration to visual interaction.
By using advanced federated analytics and learning workflows, we aim to identify new risk factors and treatment responses, focusing, as exploratory use cases, on neurodegenerative and gut microbiome-related disorders. HEREDITARY is harmonizing and linking various sources of clinical, genomic, and environmental data on a large scale. This enables clinicians, researchers, and policymakers to understand these diseases better and develop more effective treatment strategies. HEREDITARY adheres to the citizen science paradigm to ensure that patients and the public have a primary role in guiding scientific and medical research while maintaining full control of their data. Our goal is to change the way we approach healthcare by unlocking insights that were previously impossible to obtain.
Role: Project Coordinator
Project No: 101137074
Call: HORIZON-HLTH-2023-TOOL-05
Topic: Tools and technologies for a healthy society
Funding (UNIPD): 1.138.046€
Website: https://hereditary-project.eu/
Amyotrophic Lateral Sclerosis (ALS) and Multiple Sclerosis (MS) are chronic diseases characterized by progressive or alternate impairment of neurological functions (motor, sensory, visual, cognitive). Artificial Intelligence is key to: i) better describing disease mechanisms; ii) stratifying patients according to their phenotype, assessed throughout the evolution of the disease; iii) predicting disease progression in a probabilistic, time-dependent fashion; iv) investigating the role of the environment; v) suggesting interventions that can delay the progression of the disease. BRAINTEASER will integrate large clinical datasets with novel personal and environmental data collected using low-cost sensors and apps.
We lead the "Open Science and FAIR Data" WP. The main goals of the WP are:
Exascale volumes of diverse data from distributed sources are continuously produced. Healthcare data stand out in the volume produced (production is expected to be over 2000 exabytes in 2020), heterogeneity (many media, acquisition methods), included knowledge (e.g. diagnosis) and commercial value. The supervised nature of deep learning models requires large amounts of labeled, annotated data, which prevents models from extracting knowledge and value. ExaMode solves this by allowing easy and fast, weakly supervised knowledge discovery from exascale heterogeneous data, limiting human interaction.
We lead the "Semantic knowledge discovery and visualisation" WP. The main goals of the WP are:
CDC is a Supporting TAlent in ReSearch@University of Padova (STARS Grants) project.
The computational problem targeted by CDC is to automatically generate complete citations for general queries over evolving data sources represented by diverse data models. The aim of this research program is to design the first well-founded model as well as to develop efficient algorithms and a solid citation system for citing data.
This research program is timely because the paradigm shift towards data-intensive science is happening now, and scientific communication must adapt as quickly as possible to the new ways in which science progresses. It is ambitious because it shapes a new field in computer science and tackles, with a uniform approach, a range of computational issues, query languages and data models that have never been treated with a shared vision before.
The broader impact of this research will be on scientists and data centers that curate, process and publish data, on government agencies that direct research investments, and on research performance measures (e.g., the h-index), which will be based on data as well as on text-based contributions.
Role: Principal Investigator
Funding: 130.000€
PREFORMA is a Pre-Commercial Procurement (PCP) project (Contract n. 258191) co-funded by the European Commission under its FP7-ICT Programme.
The main goal of the project is to address the challenge of implementing good quality standardised file formats and to give memory institutions full control of the process of the conformity tests of files to be ingested into archives.
Role: I collaborate in the activities of WP7 (Validation and Testing) and WP8 (Competitive Evaluation and Monitoring of the RTD Work). Leader of Task 7.1 and Task 8.1.
Sistema Informativo Archivistico Regionale, Regional Archival Information System (SIAR) Project.
It is a project that aims to develop a distributed Digital Library System (DLS) for describing, managing, accessing and sharing archival resources.
SIAR is a joint project with the Italian Veneto Region and the "Sopraintendenza Archivistica per il Veneto" (Archival Regional Board of the Ministry of Cultural Heritage).
Role: Participant in the Department unit; I am working on the design and development of the infrastructure of the SIAR system.
It was a STREP project co-financed by the European Commission, whose goal was to pioneer the development of the next generation of adaptive systems providing new forms of multi-dimensional adaptivity. The main challenge it faced was to instigate, increase and enhance engagement with digital humanities collections. To achieve this, it aimed to change the way cultural artifacts are experienced and contributed to by communities.
Role: Within CULTURA I collaborated in the activities about user requirements analysis for developing models and systems able to manage digital archives of illuminated manuscripts of interest for different domains such as history of art, history of science, botany, astronomy and medicine.
It aimed at providing a virtual laboratory for conducting participative research and experimentation to carry out, advance and bring automation into the evaluation and benchmarking of complex multilingual and multimedia information systems, by facilitating management and offering access, curation, preservation, re-use, analysis, visualization, and mining of the collected experimental data.
Role: Participant in the Department unit; I worked on the design and development of the PROMISE evaluation infrastructure.