Aims of the course
The main aim of this course is to provide an overview of the opportunities for exploiting shared data within Research Infrastructures (RIs). Specifically, the course aims to raise awareness of these opportunities and to encourage the adoption of data-driven approaches through the application of advanced statistical techniques.
The focus is primarily on the RISIS European infrastructure, which covers topics related to Science, Technology, and Innovation (STI) and is one of the three infrastructures that initiated the FOSSR project, and on the advanced statistical techniques that FOSSR addresses in its research work packages.
Drawing inspiration from the data available within RISIS, the course will explore aspects of data access, interoperability and processing in depth. From data exploration to data processing, each step will be illustrated by example applications of advanced statistical techniques, particularly Network Science and Bayesian Modelling, which will be covered in both theory and practice.
After completing this course, learners will be aware of the opportunities linked to RIs in the Social Sciences and Humanities (SSH), particularly the RISIS infrastructure, and will have gained in-depth knowledge of techniques developed within the FOSSR research work packages.
Course description
The course will guide participants through the process of engaging with an RI and making reasoned use of its resources for research purposes. Furthermore, it will present the application of advanced statistical techniques to infrastructure data, particularly Network Models and Bayesian Modelling.
From an organizational perspective, the course will be structured in four online training modules, each combining lectures with several interactive activities.
In particular:
Module 1. Research Infrastructures and Open Science
The module will outline the contours of the new paradigm for scientific knowledge production, emphasizing openness, the value of collaboration, and data availability. It will explore the features of data-intensive science, which underpin the creation and strengthening of Research Infrastructures (RI). The characteristics of these sociotechnical platforms will be discussed by analyzing both the Italian and European contexts, with reference to FOSSR’s efforts in developing the Italian Open Cloud for Social Sciences.
Module 2. Accessing and querying interoperable RI data
The module will present examples of accessing and querying data from research infrastructures, highlighting the importance of data interoperability. Specific cases will illustrate the use of the RISIS infrastructure for science, technology and innovation studies. We will present cases in which research questions are addressed differently depending on the nature of the data, analysing the ways of managing datasets and their enrichment with data external to the infrastructure. The content presented will highlight the value of shared database access.
Module 3. Network models applied to RI data
The module aims to illustrate the basic concepts and statistical measures of network science and provide an overview of the main statistical network models. The module will conclude with two applications where networks are analysed using data from research infrastructures. The two applications that will be covered in this module are:
- Application 1: Complex networks and academic project funding;
- Application 2: Research collaborations and research productivity.
Module 4. Causal Bayesian networks and applications to RI data
The module focuses on Bayesian networks as a tool for modelling complex causal relationships. A comparison between causal Bayesian networks and potential outcomes is carried out to highlight how the two approaches can be implemented synergistically. The module will include two applications that employ causal Bayesian networks on research infrastructure data. The two applications covered in this module are:
- Application 1: Research collaborations and research productivity;
- Application 2: Remote working and firm revenues during Covid.
The trainers
The course is curated by Andrea Orazio Spinello.
Andrea Orazio Spinello, researcher at CNR-IRCrES. He focuses on public funding for R&D and the individual organization of scientific work. He is a WP Leader in the PNRR project FOSSR and is among the managers of the EFIL dataset, a node of the European infrastructure RISIS for research and innovation studies.
Emanuela Varinetti, researcher at CNR-IRCrES. She focuses on data collection, documentation, and preparation activities for open access to research databases on Science, Technology, and Innovation. She is currently involved in updating the EFIL dataset, part of the European infrastructure RISIS for research and innovation studies.
Lucio Morettini, researcher at CNR-IRCrES. He focuses on higher education policy, research evaluation and impact assessment, the effects of university design on the job market, the career dynamics of highly skilled workers, and the impact of high human capital on firm innovation.
Antonio Zinilli, researcher. He focuses on Network Science, dynamic processes in knowledge and innovation systems, and Text Mining. He is the coordinator of the IRCRES School in “Data Science: tools and methods for analyzing complex Science, Technology and Innovation (STI) systems”. He is a WP Leader in the PNRR project FOSSR and is among the managers of the EFIL dataset, part of the European infrastructure RISIS for research and innovation studies.
Lorenzo Giammei, researcher at CNR-IRCrES. His studies focus on causal inference, drawing on approaches from both potential outcomes and causal Bayesian networks. He has applied these methods to microeconomic research questions related to firms, the gender gap, and research productivity.
Audience
The course is designed for individuals who are or wish to be involved in creating, capturing, analysing, or generally managing research data within the social science disciplines. The target audience includes, but is not limited to, early-career researchers, researchers aspiring to advance their careers, technicians, data stewards, and data managers.




