Theme: Cloud Computing & Data Infrastructures
Scientific Methodology and Performance Evaluation
(3 ECTS - 18h)
Overview
The course aims to provide the fundamental basis for a sound scientific methodology of experimental evaluation in computer science. The lecture emphasizes the methodological aspects of measurement and the statistics (confidence intervals, linear regression, and significance of the estimates) needed to analyze computer systems, human-computer interaction systems, and even machine learning systems. We first sensitize the audience to reproducibility issues related to empirical and experimental research in computer science, as well as to scientific integrity aspects. We then present tools that help address these issues and give the audience the basics of probability and statistics required to develop sound experiment designs.
The content of the lecture is therefore both theoretical (though without a single theorem proof, only interpretations of major results) and practical, illustrated by many case studies and hands-on homework assignments. The goal is not to provide analysis recipes or techniques that researchers can blindly apply, but rather to help students develop critical thinking and understand some simple (and possibly not-so-simple) tools they can both readily use and explore further later on.
Prerequisites
- Familiarity with the use of a computational notebook (Jupyter, R, org-mode, …)
- Familiarity with the use of Git/GitLab/GitHub
- Having already failed to properly conduct experiments and reach a sound conclusion
All the examples given in this series of lectures use the R language, as it is more appropriate than Python for this kind of work, but every important concept will be introduced and resources will be provided to allow students to quickly level up.
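To give a flavor of these examples, here is a minimal sketch of a recurring question in the course: computing a 95% confidence interval for a measured running time. The data is simulated and purely illustrative, not actual lecture material.

```r
# Hypothetical example: 95% confidence interval on measured execution times.
# The measurements here are simulated; in the lecture they would come from
# a carefully designed experiment.
set.seed(42)
running_times <- rnorm(30, mean = 1.2, sd = 0.1)  # 30 fake measurements (seconds)
t.test(running_times, conf.level = 0.95)$conf.int # 95% CI for the mean
```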
Targeted skills
- Critiquing (and proposing improvements to):
  - the graphical representation of experimental results presented in slides or articles;
  - the experimental protocol and its shortcomings;
  - the quality of the data;
  - the proposed data analysis and the conclusions reached;
  - the research question.
- Working in a fully transparent (journal, version tracking, archiving, …), rigorous, and controlled (experiment design, software environment) way
- Fluency with modern data analysis frameworks such as the Tidyverse (in particular ggplot2 and dplyr), as illustrated by the sketch below
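The following minimal sketch shows the kind of Tidyverse idiom targeted here, using made-up benchmark data: summarize measurements per configuration with dplyr, then plot them with ggplot2.

```r
# Minimal Tidyverse sketch (hypothetical benchmark data).
library(dplyr)
library(ggplot2)

results <- data.frame(
  config = rep(c("A", "B"), each = 20),
  time   = c(rnorm(20, 1.0, 0.1), rnorm(20, 1.3, 0.1))  # fake running times (s)
)

# Per-configuration summary statistics.
results |>
  group_by(config) |>
  summarize(mean_time = mean(time), sd_time = sd(time))

# Distribution of running times per configuration.
ggplot(results, aes(x = config, y = time)) +
  geom_boxplot() +
  labs(x = "Configuration", y = "Running time (s)")
```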
Outline
The content adapts to the students, but the lecture’s webpage (with slides, material, exercises, etc.) for the most recent session is available here.
Here are the topics that are typically covered during the lecture.
- Epistemology of computer science
  - Computer Science is an Experimental Science: randomness is unavoidable whenever human beings are involved, and it can no longer be ignored given the complexity of modern computer systems (networks, CPUs, hardware/software stacks) or in a machine learning context, which relies on observational data and remains empirical.
  - Science is defined by its method, not by its results: Claude Bernard, Karl Popper, Thomas Kuhn, Imre Lakatos, …
  - Credibility crisis, scientific integrity
- Open Science and Reproducible Research
  - Laboratory notebooks
  - Computational documents (Jupyter, RStudio, Org mode)
  - Version control, data management, and archiving
  - Data curation (missing data, outliers, typing issues)
  - Data visualization and hypothesis checking (exploratory data analysis)
  - Communicating results
- Introduction to statistics
  - Random variables, the central limit theorem, confidence intervals, statistical tests
  - Bayesian framework: Bayes’ rule, maximum likelihood vs. posterior sampling, credible intervals
  - ANOVA, linear regression, and extensions (mostly logistic regression); a sketch follows this outline
  - Gaussian processes
  - Observation vs. experiment
  - Correlation and causation: mostly “don’ts”
  - Notions of bias (statistical, experimental, observational/sampling, etc.)
- Metrology: measurement and tracing, precision, practical computer science issues and tools
- Experimental Design
  - Methodology (fishbone diagrams, experiment structure)
  - Quantitative vs. qualitative and observational vs. experimental data and analyses
  - Sequential vs. incremental approaches
  - 2-level factorial designs, screening designs, LHS/MaxiMin designs
  - Active/online learning with bandits (ε-greedy, UCB, Thompson sampling) and extensions (surrogates: GP-UCB, EI); a sketch follows this outline
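As announced in the outline, here is a minimal sketch of the linear-regression style of analysis practiced in the statistics part of the lecture; the data and variable names are simulated and illustrative only.

```r
# Minimal linear regression sketch (simulated data): fit a model, inspect
# the significance of the estimates, and compute confidence intervals.
set.seed(7)
n    <- 50
size <- runif(n, 1, 100)                      # hypothetical input size
time <- 0.5 + 0.02 * size + rnorm(n, 0, 0.1)  # hypothetical running time (s)

fit <- lm(time ~ size)
summary(fit)  # coefficient estimates and their significance
confint(fit)  # 95% confidence intervals for the estimates
```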
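Likewise, a minimal ε-greedy sketch on a hypothetical three-armed Bernoulli bandit illustrates the active/online learning topic; again, this is a toy example rather than the actual course material.

```r
# Minimal epsilon-greedy bandit sketch (hypothetical 3-armed Bernoulli bandit).
set.seed(1)
p      <- c(0.3, 0.5, 0.7)  # true (unknown) success probabilities
eps    <- 0.1               # exploration rate
counts <- rep(0, 3)         # number of pulls per arm
means  <- rep(0, 3)         # empirical mean reward per arm

for (t in 1:1000) {
  # Explore with probability eps, otherwise exploit the best arm so far.
  arm    <- if (runif(1) < eps) sample(3, 1) else which.max(means)
  reward <- rbinom(1, 1, p[arm])
  counts[arm] <- counts[arm] + 1
  means[arm]  <- means[arm] + (reward - means[arm]) / counts[arm]  # running mean
}
counts  # most pulls should concentrate on the best arm (arm 3)
```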
Evaluation
Evaluation process:
- Continuous assessment through homework assignments: 50% (this grade cannot be replaced in the second session)
- Final exam (3 hours): 50% (this grade is replaced by the second session, which may be an oral or a written exam depending on the number of students)