MIDS Capstone Project Fall 2023

Optimizing Efficiency, Ensuring Equity: Advancing Knowledge Distillation with a Focus on Bias Reduction

Impact

Our knowledge distillation research advances machine learning by addressing two critical challenges simultaneously. First, we reduced model size through knowledge distillation, making deployment more efficient and more accessible in resource-constrained environments. Second, our approach prioritized fairness and mitigated biases in model predictions, contributing to a more equitable and inclusive artificial intelligence ecosystem. By achieving both goals, our project contributes to existing research in machine learning fairness and distillation, fostering a more responsible and accessible future for artificial intelligence.

Problem Statement

Artificial intelligence models are accomplishing more exciting and complex tasks by the day. Yet this performance boost comes at the cost of rapidly growing model sizes. Today's top models comprise billions of parameters, demanding increasingly large compute and storage resources. This renders models inaccessible to many users, especially those on edge devices, which are resource-constrained by definition. The field of neural network compression, dedicated to shrinking model size while maintaining accuracy, has emerged to mitigate this. Delivering models at reduced computational and energy costs makes machine learning more sustainable and cost-effective.

Our research focuses on a particular neural network compression technique known as knowledge distillation, which compresses a large, performant model into a significantly smaller one and has been shown to be advantageous in terms of security and robustness against domain shift. Knowledge distillation employs a student-teacher framework, in which the large, performant model acts as the teacher and a smaller, predefined student model learns to mimic the teacher's outputs. The resulting student model can then be deployed on an edge device thanks to its drastically reduced compute and storage requirements. However, an often overlooked aspect of knowledge distillation is the inadvertent transfer of bias as an inherent byproduct of the technique. While a teacher model can learn a task very well, it can also learn stereotypes. As the student learns from the teacher, these stereotypes can be transmitted and even exaggerated. This research aims to mitigate the bias inflation found in student networks. We aim to establish comprehensive evaluation metrics, including measures of bias, and to integrate debiasing into the knowledge distillation framework for image classification.
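To make the student-teacher mechanics concrete, the sketch below shows a standard distillation loss in the style of Hinton et al. It is a minimal illustration, not this project's exact objective; the temperature and weighting values are assumptions chosen for readability.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 4.0, alpha: float = 0.7):
    """Weighted sum of a softened KL term (student mimics the teacher) and an
    ordinary cross-entropy term (student stays anchored to the labels).
    Temperature and alpha are illustrative values, not this project's settings."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between the softened distributions, scaled by T^2 so the
    # gradient magnitude stays comparable to the hard-label term.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2

    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```

In a typical training loop, the teacher runs in evaluation mode to produce teacher_logits and only the student's parameters are updated against this loss, so the compact student absorbs the teacher's softened predictions while remaining grounded in the true labels.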

Our Solution

This research introduces a framework that combines knowledge distillation with adversarial learning, compressing the model while debiasing it. This work contributes to ongoing research in model compression and fairness in machine learning, offering a comprehensive approach that simultaneously addresses efficiency, performance, and bias concerns. The proposed framework has the potential to advance the deployment of machine learning models in real-world applications where resource constraints and ethical considerations continue to pose challenges to AI advancement.
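As a rough illustration of how such a framework could be wired together, the sketch below attaches a hypothetical adversary head that tries to predict a protected attribute from the student's features during distillation, with a gradient-reversal layer so the student learns to discard that information. It reuses the distillation_loss sketch above; all module names, dimensions, and weights are assumptions rather than the project's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates the gradient on the backward pass,
    so the student is pushed to remove the group information the adversary uses."""

    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output


class DebiasedDistillationLoss(nn.Module):
    """Sketch of one way to fold adversarial debiasing into distillation:
    an adversary head predicts a protected attribute from the student's
    features, and its gradient is reversed into the student.
    Module names, dimensions, and adv_weight are hypothetical."""

    def __init__(self, feature_dim: int, num_groups: int, adv_weight: float = 0.1):
        super().__init__()
        self.adversary = nn.Sequential(
            nn.Linear(feature_dim, 64), nn.ReLU(), nn.Linear(64, num_groups)
        )
        self.adv_weight = adv_weight

    def forward(self, student_features, student_logits, teacher_logits,
                labels, group_labels):
        # Distillation objective (see the distillation_loss sketch above).
        kd_loss = distillation_loss(student_logits, teacher_logits, labels)

        # The adversary predicts the protected attribute from reversed features:
        # it learns to detect bias while the student learns to hide it.
        group_logits = self.adversary(GradReverse.apply(student_features))
        adv_loss = F.cross_entropy(group_logits, group_labels)

        return kd_loss + self.adv_weight * adv_loss
```

The two components pull in opposite directions: the adversary minimizes its cross-entropy over protected groups, while the reversed gradient drives the student toward representations from which those groups cannot be recovered, complementing the distillation objective rather than replacing it.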

Last updated: February 17, 2024