
Algorithmic Fairness Datasets: the Story so Far

Alessandro Fabris, Stefano Messina, Gianmaria Silvello and Gian Antonio Susto
Journal Paper. Data Mining and Knowledge Discovery, Volume 36, Issue 6, 2022.

Abstract

Data-driven algorithms are studied and deployed in diverse domains to support critical decisions, directly impacting people's well-being. As a result, a growing community of researchers has been investigating the equity of existing algorithms and proposing novel ones, advancing the understanding of risks and opportunities of automated decision-making for historically disadvantaged populations. Progress in fair Machine Learning (ML) and equitable algorithm design hinges on data, which can be appropriately used only if adequately documented. Unfortunately, the algorithmic fairness community, as a whole, suffers from a collective data documentation debt caused by a lack of information on specific resources (opacity) and scatteredness of available information (sparsity). In this work, we target this data documentation debt by surveying over two hundred datasets employed in algorithmic fairness research, and producing standardized and searchable documentation for each of them. Moreover, we rigorously identify the three most popular fairness datasets, namely Adult, COMPAS, and German Credit, for which we compile in-depth documentation. This unifying documentation effort supports multiple contributions. Firstly, we summarize the merits and limitations of Adult, COMPAS, and German Credit, adding to and unifying recent scholarship, calling into question their suitability as general-purpose fairness benchmarks. Secondly, we document hundreds of available alternatives, annotating their domain and supported fairness tasks, along with additional properties of interest for fairness practitioners and researchers, including their format, cardinality, and the sensitive attributes they encode. We summarize this information, zooming in on the tasks, domains, and roles of these resources. Finally, we analyze these datasets from the perspective of five important data curation topics: anonymization, consent, inclusivity, labeling of sensitive attributes, and transparency. We discuss different approaches and levels of attention to these topics, making them tangible, and distill them into a set of best practices for the curation of novel resources.
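
A concrete sense of what "standardized and searchable" documentation can look like is given by the minimal sketch below, which models one dataset record as a Python dataclass. The field names are illustrative rather than the survey's actual schema, and the example values for Adult and COMPAS are approximate:

```python
from dataclasses import dataclass

@dataclass
class DatasetRecord:
    """Illustrative documentation record for a fairness dataset.

    Field names are hypothetical; the survey's actual schema may differ.
    """
    name: str
    domain: str                      # e.g. "finance", "criminal justice"
    tasks: list[str]                 # supported fairness tasks
    data_format: str                 # e.g. "tabular", "text", "image"
    cardinality: int                 # number of instances (approximate here)
    sensitive_attributes: list[str]  # e.g. ["sex", "race", "age"]

records = [
    DatasetRecord("Adult", "finance", ["fair classification"],
                  "tabular", 48_842, ["sex", "race", "age"]),
    DatasetRecord("COMPAS", "criminal justice", ["fair classification"],
                  "tabular", 7_214, ["sex", "race", "age"]),
]

# "Searchable": filter records by any documented property.
tabular_with_race = [r for r in records
                     if r.data_format == "tabular"
                     and "race" in r.sensitive_attributes]
```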

Tackling Documentation Debt: A Survey on Algorithmic Fairness Datasets

Alessandro Fabris, Stefano Messina, Gianmaria Silvello and Gian Antonio Susto
Conference Paper. Proceedings of the 2nd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO 2022). In press.

Measuring Gender Stereotype Reinforcement in Information Retrieval Systems

Alessandro Fabris, Alberto Purpura, Gianmaria Silvello and Gian Antonio Susto
Workshop Paper. Proceedings of the 12th Italian Information Retrieval Workshop (IIR 2021), Bari, 2021.

Abstract

Can we measure the tendency of an Information Retrieval (IR) system to reinforce gender stereotypes in its users? In this abstract, we define the construct of Gender Stereotype Reinforcement (GSR) in the context of IR and propose a measure for it based on Word Embeddings. We briefly discuss the validity of our measure and summarize our experiments on different families of IR systems.

Algorithmic Audit of Italian Car Insurance: Evidence of Unfairness in Access and Pricing

Alessandro Fabris, Alan Mishler, Stefano Gottardi, Mattia Carletti, Matteo Daicampi, Gian Antonio Susto and Gianmaria Silvello
Conference Paper. Proceedings of the 4th AAAI/ACM Conference on Artificial Intelligence, Ethics and Society (AIES 2021), Virtual Event, 2021.

Abstract

We conduct an audit of pricing algorithms employed by companies in the Italian car insurance industry, primarily by gathering quotes through a popular comparison website. While acknowledging the complexity of the industry, we find evidence of several problematic practices. We show that birthplace and gender have a direct and sizeable impact on the prices quoted to drivers, despite national and international regulations against their use. Birthplace, in particular, is used quite frequently to the disadvantage of foreign-born drivers and drivers born in certain Italian cities. In extreme cases, a driver born in Laos may be charged 1,000€ more than a driver born in Milan, all else being equal. For a subset of our sample, we collect quotes directly on a company website, where the direct influence of gender and birthplace is confirmed. Finally, we find that drivers with riskier profiles tend to see fewer quotes in the aggregator result pages, substantiating concerns of differential treatment raised in the past by Italian insurance regulators.
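
The audit's core logic is a matched-pair comparison: hold every rating factor fixed, vary a single protected attribute, and attribute the resulting price gap to that attribute. A minimal sketch of that comparison follows; the profile fields and the quote-fetching stub (including its toy pricing rule) are hypothetical stand-ins, not the paper's data collection or findings:

```python
def fetch_quote(profile: dict) -> float:
    """Toy stand-in for collecting a quote from a comparison website.

    In the real audit this is a live data-collection step; the pricing
    rule below is invented purely so the sketch runs end to end.
    """
    base = 500.0
    if profile.get("birthplace") != "Milan":
        base += 100.0  # hypothetical surcharge, NOT a measured value
    return base

def direct_effect(base_profile: dict, attribute: str, a: str, b: str) -> float:
    """Price gap attributable to `attribute`, holding all else equal."""
    return (fetch_quote({**base_profile, attribute: a})
            - fetch_quote({**base_profile, attribute: b}))

profile = {"age": 32, "city": "Padova", "car": "Fiat Panda"}
gap = direct_effect(profile, "birthplace", "Laos", "Milan")
print(f"Quoted price gap by birthplace alone: {gap:.0f} EUR")
```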

Incentives for Item Duplication under Fair Ranking Policies

Giorgio Maria Di Nunzio, Alessandro Fabris, Gianmaria Silvello and Gian Antonio Susto
Workshop Paper. Proceedings of the 2nd International Workshop on Algorithmic Bias in Search and Recommendation (BIAS@ECIR2021), Virtual Event, 2021.

Abstract

Ranking is a fundamental operation in information access systems, to filter information and direct user attention towards items deemed most relevant to them. Due to position bias, items of similar relevance may receive significantly different exposure, raising fairness concerns for item providers and motivating recent research into fair ranking. While the area has progressed dramatically over recent years, no study to date has investigated the potential problem posed by duplicated items. Duplicates and near-duplicates are common in several domains, including marketplaces and document collections available to search engines. In this work, we study the behaviour of different fair ranking policies in the presence of duplicates, quantifying the extra exposure gained by redundant items. We find that fairness-aware ranking policies may conflict with diversity, due to their potential to incentivize duplication more than policies solely focused on relevance. This fact poses a problem for system owners who, as a result of this incentive, may have to deal with increased redundancy, which is at odds with user satisfaction. Finally, we argue that this aspect represents a blind spot in the normative reasoning underlying common fair ranking metrics, as rewarding providers who duplicate their items with increased exposure seems unfair for the remaining providers.
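
To see the incentive concretely, assume a standard position-bias model in which the item at rank k receives exposure 1/log2(k+1). The sketch below compares a relevance-only ranking against a fairness-aware policy that amortizes exposure in proportion to relevance, one common family in the fair ranking literature; the toy numbers and the specific policy choice are illustrative, not the paper's experimental setup:

```python
import math

def exposure_at(rank: int) -> float:
    # Position-bias model: attention decays with rank (DCG-style discount).
    return 1.0 / math.log2(rank + 1)

# Three providers with equally relevant items; provider A adds a
# near-duplicate of slightly lower relevance.
relevance = {"A": 1.0, "B": 1.0, "C": 1.0, "A_dup": 0.9}

# Relevance-only ranking: the near-duplicate sits at the bottom, so it
# earns provider A only the exposure of the last slot.
order = sorted(relevance, key=relevance.get, reverse=True)
static_exp = {item: exposure_at(r + 1) for r, item in enumerate(order)}

# Fairness-aware policy (exposure amortized proportionally to relevance):
# the duplicate claims an almost-full share of the exposure budget,
# regardless of its rank in any single result page.
budget = sum(exposure_at(r + 1) for r in range(len(relevance)))
total_rel = sum(relevance.values())
fair_exp = {item: budget * rel / total_rel for item, rel in relevance.items()}

for name, exp in (("relevance-only", static_exp), ("fairness-aware", fair_exp)):
    print(f"{name}: duplication earns provider A {exp['A_dup']:.3f} extra exposure")
```

In this toy run the near-duplicate earns about 0.43 extra exposure under the relevance-only ranking but about 0.59 under the proportional policy, so duplication pays more under the fairness-aware policy.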

Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms

Alessandro Fabris, Alberto Purpura, Gianmaria Silvello and Gian Antonio Susto
Journal Paper. IP&M 2020 Ph.D. Paper Award. Information Processing and Management (IP&M), Volume 57, Issue 6, 102377, 2020.

Abstract

Search Engines (SE) have been shown to perpetuate well-known gender stereotypes identified in psychology literature and to influence users accordingly. Similar biases were found encoded in Word Embeddings (WEs) learned from large online corpora. In this context, we propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a SE to support gender stereotypes, leveraging gender-related information encoded in WEs. Through the critical lens of construct validity, we validate the proposed measure on synthetic and real collections. Subsequently, we use GSR to compare widely used Information Retrieval ranking algorithms, including lexical, semantic, and neural models. We check if and how ranking algorithms based on WEs inherit the biases of the underlying embeddings. We also consider the most common debiasing approaches for WEs proposed in the literature and test their impact in terms of GSR and common performance measures. To the best of our knowledge, GSR is the first measure specifically tailored for IR that is capable of quantifying representational harms.
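
As a rough illustration of how gender-related information encoded in word embeddings can be turned into a score, the sketch below projects word vectors onto a he-she direction and averages the projections over a document. The paper's actual GSR measure is more involved and is validated for construct validity; this is only a simplified proxy:

```python
import numpy as np

def gender_score(word_vec: np.ndarray, he: np.ndarray, she: np.ndarray) -> float:
    """Signed projection of a word vector onto a gender direction.

    Positive leans towards 'he', negative towards 'she'. A simplified
    proxy, not the paper's exact GSR formulation.
    """
    direction = he - she
    direction = direction / np.linalg.norm(direction)
    return float(word_vec @ direction / np.linalg.norm(word_vec))

def doc_gender_leaning(doc_vecs: list[np.ndarray],
                       he: np.ndarray, she: np.ndarray) -> float:
    """Average gender leaning of the embedded words in a document."""
    return float(np.mean([gender_score(v, he, she) for v in doc_vecs]))

# A ranker that systematically surfaces male-leaning documents for
# stereotypically male queries (and vice versa) would score as
# stereotype-reinforcing under a measure built on such projections.
```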

Gender Bias in Italian Word Embeddings

Davide Biasion, Alessandro Fabris, Gianmaria Silvello and Gian Antonio Susto
Conference Paper. Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020), Bologna, Italy, 2020.

Abstract

In this work we study gender bias in Italian word embeddings (WEs), evaluating whether they encode gender stereotypes studied in social psychology or present in the labor market. We find strong associations with gender in job-related WEs. Weaker gender stereotypes are present in other domains where grammatical gender plays a significant role.

Dynamic Probabilistic Linear Discriminant Analysis for Video Classification

Alessandro Fabris, Mihalis A. Nicolaou, Irene Kotsia and Stefanos Zafeiriou
Conference Paper. Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017), New Orleans, LA, 2017, pp. 2781-2785.

Abstract

Component Analysis (CA) comprises statistical techniques that decompose signals into appropriate latent components, relevant to a task at hand (e.g., clustering, segmentation, classification). Recently, an explosion of research in CA has been witnessed, with several novel probabilistic models proposed (e.g., Probabilistic Principal CA, Probabilistic Linear Discriminant Analysis (PLDA), Probabilistic Canonical Correlation Analysis). PLDA is a popular generative probabilistic CA method that incorporates knowledge regarding class labels and furthermore introduces class-specific and sample-specific latent spaces. While PLDA has been shown to outperform several state-of-the-art methods, it is nevertheless a static model; any feature-level temporal dependencies that arise in the data are ignored. As has been repeatedly shown, appropriate modelling of temporal dynamics is crucial for the analysis of temporal data (e.g., videos). In this light, we propose, to the best of our knowledge, the first probabilistic LDA formulation that models dynamics, the so-called Dynamic-PLDA (DPLDA). DPLDA is a generative model suitable for video classification and is able to jointly model the label information (e.g., face identity, consistent over videos of the same subject), as well as dynamic variations of each individual video. Experiments on video classification tasks such as face and facial expression recognition show the efficacy of the proposed method.
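
For context, static PLDA decomposes each observation as x = m + F h + G w + eps, where the latent h is shared by all samples of a class (e.g., one face identity) and w varies per sample; DPLDA's contribution is to additionally model temporal dependencies across frames. A minimal sketch of sampling from the static model, with all dimensions and parameters chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
D, d_h, d_w = 64, 8, 16          # observation and latent dims (illustrative)

m = rng.normal(size=D)           # global mean
F = rng.normal(size=(D, d_h))    # class-specific loading matrix
G = rng.normal(size=(D, d_w))    # sample-specific loading matrix
sigma = 0.1                      # isotropic observation-noise std

def sample_video(n_frames: int, h: np.ndarray) -> np.ndarray:
    """Draw frames that share a single class latent h.

    Static PLDA treats frames as exchangeable, i.e., it ignores their
    temporal order; modelling that order is exactly DPLDA's target.
    """
    frames = []
    for _ in range(n_frames):
        w = rng.normal(size=d_w)          # per-frame latent
        eps = sigma * rng.normal(size=D)  # observation noise
        frames.append(m + F @ h + G @ w + eps)
    return np.stack(frames)

h_identity = rng.normal(size=d_h)  # e.g., one subject's identity
video = sample_video(10, h_identity)
```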