Auditing Model Bias with Balanced Datasets with Mimesis

May 26, 2026


# Introduction

Whether they are well-established classifiers or state-of-the-art large models like large language models (LLMs), building machine learning solutions often entails a risk: algorithms may silently adopt prejudices inherent in the historical dataset they were trained on. But in a high-stakes scenario, or one where data is sensitive, how can we audit whether a model is biased without compromising real-world records?

This hands-on article guides you in training a simple classification model for "loan approval" on biased data. Building on this, we'll use Mimesis, an open-source library that can help generate a perfectly balanced, counterfactual dataset. You can test "fake" users with identical financial backgrounds but different demographic traits, thereby determining whether the model discriminates against certain groups.

# Step-by-Step Guide

Start by installing the Mimesis library (for example, by running pip install mimesis) if you are new to it or are working in a cloud notebook environment like Colab.

Before auditing a model, we actually need to get one! In this example, we'll synthetically generate a dataset of 1,000 bank customers with just two features: gender and income. These features are categorical and numerical, respectively. The data creation will be intentionally manipulated so that the gender attribute unfairly influences the binary outcome: loan approval. Specifically, for labeling the dataset, we'll consider a scenario in which males are always approved, while females are approved only when they have a remarkably high income (above 80,000).

The process to create this clearly biased dataset and train a decision tree classifier on it is shown below:

import pandas as pd
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# 1. Simulating biased historical data (1000 instances)
np.random.seed(42)
n_train = 1000
genders = np.random.choice(['Male', 'Female'], n_train)
incomes = np.random.randint(30000, 120000, n_train)

approvals = []
for gender, income in zip(genders, incomes):
    if gender == 'Male':
        # Historically, males are approved
        approvals.append(1)
    else:
        # Only females with high income are approved
        approvals.append(1 if income > 80000 else 0)

train_df = pd.DataFrame({'Gender': genders, 'Income': incomes, 'Approved': approvals})

# Converting categories to numbers for the machine learning model
train_df['Gender_Code'] = train_df['Gender'].map({'Male': 1, 'Female': 0})

# 2. Training a Decision Tree classifier
model = DecisionTreeClassifier(max_depth=3)
model.fit(train_df[['Gender_Code', 'Income']], train_df['Approved'])
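Before moving on, it can help to confirm that the bias really is baked into the training labels. The sketch below rebuilds a dataset with the same labeling rule and compares approval rates per gender. The helper name `approval_rate_by_gender` and the use of `np.random.default_rng` are choices made for this illustration, not part of the walkthrough above, so the exact numbers will differ from the ones you get with `np.random.seed`.

```python
import numpy as np
import pandas as pd

def approval_rate_by_gender(df: pd.DataFrame) -> pd.Series:
    """Share of approved applications within each gender group."""
    return df.groupby('Gender')['Approved'].mean()

# Rebuild a biased dataset with the same labeling rule:
# males are always approved, females only above an income of 80,000
rng = np.random.default_rng(42)
n = 1000
genders = rng.choice(['Male', 'Female'], n)
incomes = rng.integers(30000, 120000, n)
approved = np.where((genders == 'Male') | (incomes > 80000), 1, 0)

check_df = pd.DataFrame({'Gender': genders, 'Income': incomes, 'Approved': approved})
rates = approval_rate_by_gender(check_df)
print(rates)
```

A large gap between the two rates is expected here by construction. On real historical data, a similar check is only a first hint, since the groups may also differ in income.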

The next step shows Mimesis in action. We'll use this library to generate a small set of test subjects using the Generic class. This will be done by defining three base financial profiles that contain random UUIDs (universally unique identifiers) and a moderate income ranging between 40K and 70K. Notice that these profiles do not include gender information yet:

from mimesis import Generic

generic = Generic('en')

# Generating 3 base financial profiles
base_profiles = []
for _ in range(3):
    profile = {
        'Applicant_ID': generic.cryptographic.uuid(),
        'Income': generic.random.randint(40000, 70000)  # Moderate income
    }
    base_profiles.append(profile)

For example, the three newly created profiles may look something like:

[{'Applicant_ID': '1f1721e1-19af-4bd1-8488-6abf01404ef9', 'Income': 44815},
 {'Applicant_ID': '5c862597-7f55-43f4-9d6e-ac9cc0b9083e', 'Income': 47436},
 {'Applicant_ID': '3479d4cf-0d9b-4f06-9c43-1c3b7e787830', 'Income': 58194}]

Let's finish building our counterfactual set of examples, which constitutes the core of our auditing process! For each of the three base profiles, we'll create two cloned counterfactual instances: one male and the other female. Within each pair of test customers, the applicant ID and income will be exactly identical, so the only difference will be the gender: any difference in how our trained decision tree model treats them is direct evidence of gender bias.

counterfactual_data = []

for profile in base_profiles:
    # Version A: Male counterfactual
    counterfactual_data.append({
        'Applicant_ID': profile['Applicant_ID'],
        'Gender': 'Male',
        'Gender_Code': 1,
        'Income': profile['Income']
    })

    # Version B: Female counterfactual
    counterfactual_data.append({
        'Applicant_ID': profile['Applicant_ID'],
        'Gender': 'Female',
        'Gender_Code': 0,
        'Income': profile['Income']
    })

audit_df = pd.DataFrame(counterfactual_data)

This is what the three pairs of customers may look like:

                           Applicant_ID  Gender  Gender_Code  Income
0  1f1721e1-19af-4bd1-8488-6abf01404ef9    Male            1   44815
1  1f1721e1-19af-4bd1-8488-6abf01404ef9  Female            0   44815
2  5c862597-7f55-43f4-9d6e-ac9cc0b9083e    Male            1   47436
3  5c862597-7f55-43f4-9d6e-ac9cc0b9083e  Female            0   47436
4  3479d4cf-0d9b-4f06-9c43-1c3b7e787830    Male            1   58194
5  3479d4cf-0d9b-4f06-9c43-1c3b7e787830  Female            0   58194

A key point to stress here: we have just used Mimesis to directly build perfectly matched "clones" of loan applicants with identical income but different genders. Because everything except gender is held fixed within each pair, the protected attribute is isolated and its effect on the model can be observed on its own.

Now it's time to probe the model and see what it reveals.
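The cloning idea extends beyond a two-value gender attribute. As a minimal sketch, the hypothetical helper `make_counterfactuals` below (a name introduced here, not something provided by Mimesis) clones each base profile once per value of whichever protected attribute is being audited. The profiles are hand-written stand-ins so the snippet runs without any generator.

```python
from typing import Any

def make_counterfactuals(
    base_profiles: list[dict[str, Any]],
    attribute: str,
    values: list[Any],
) -> list[dict[str, Any]]:
    """Clone each base profile once per value of the protected attribute."""
    rows = []
    for profile in base_profiles:
        for value in values:
            clone = dict(profile)      # copy so the shared fields stay identical
            clone[attribute] = value   # only the protected attribute differs
            rows.append(clone)
    return rows

# Hand-written profiles standing in for generated ones
profiles = [
    {'Applicant_ID': 'a-001', 'Income': 44815},
    {'Applicant_ID': 'a-002', 'Income': 58194},
]
pairs = make_counterfactuals(profiles, 'Gender', ['Male', 'Female'])
for row in pairs:
    print(row)
```

Copying each profile rather than mutating it leaves the base profiles untouched, so the same set can be reused to audit a different attribute afterwards.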

# Asking the model to predict approval for our counterfactuals
audit_df['Predicted_Approval'] = model.predict(audit_df[['Gender_Code', 'Income']])

# Formatting the output for readability (1 = Approved, 0 = Denied)
audit_df['Predicted_Approval'] = audit_df['Predicted_Approval'].map({1: 'Approved', 0: 'Denied'})

print("\n--- Model Audit Results ---")
print(audit_df[['Applicant_ID', 'Gender', 'Income', 'Predicted_Approval']].sort_values('Applicant_ID'))

The decisions produced by our model could not be clearer:

--- Model Audit Results ---
                           Applicant_ID  Gender  Income Predicted_Approval
0  1f1721e1-19af-4bd1-8488-6abf01404ef9    Male   44815           Approved
1  1f1721e1-19af-4bd1-8488-6abf01404ef9  Female   44815             Denied
4  3479d4cf-0d9b-4f06-9c43-1c3b7e787830    Male   58194           Approved
5  3479d4cf-0d9b-4f06-9c43-1c3b7e787830  Female   58194             Denied
2  5c862597-7f55-43f4-9d6e-ac9cc0b9083e    Male   47436           Approved
3  5c862597-7f55-43f4-9d6e-ac9cc0b9083e  Female   47436             Denied

Notice that for the exact same Applicant_ID and Income, the male clones are approved for the loan, while the female clones with the same moderate income are denied in all three pairs. Building every pair from a single Mimesis-generated profile held all other variables constant, thereby isolating and exposing the model's discriminatory decision-making.
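With more than a handful of pairs, reading the table by eye stops being practical. One way to summarize the audit is a flip rate: the fraction of matched groups in which the prediction changes when only the protected attribute changes. The sketch below assumes a results table shaped like the one above. The function name `counterfactual_flip_rate` and the shortened applicant IDs are illustrative.

```python
import pandas as pd

def counterfactual_flip_rate(df: pd.DataFrame, id_col: str, pred_col: str) -> float:
    """Fraction of matched groups whose predictions are not all identical."""
    distinct_predictions = df.groupby(id_col)[pred_col].nunique()
    return float((distinct_predictions > 1).mean())

# Results shaped like the audit table above (IDs shortened for readability)
results = pd.DataFrame({
    'Applicant_ID': ['p1', 'p1', 'p2', 'p2', 'p3', 'p3'],
    'Gender': ['Male', 'Female'] * 3,
    'Predicted_Approval': ['Approved', 'Denied'] * 3,
})

flip_rate = counterfactual_flip_rate(results, 'Applicant_ID', 'Predicted_Approval')
print(f"Counterfactual flip rate: {flip_rate:.0%}")
```

A flip rate of zero means the model treated every pair identically on this test set. Anything above zero points to pairs worth inspecting individually.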

# Wrapping Up

Throughout this hands-on article, we have shown how Mimesis can be used to generate balanced, counterfactual data examples, free of privacy or sensitive-data constraints, that help audit a model's behavior and identify whether it behaves in a biased manner. Next steps to take if your model is biased may include:

- Augmenting your training data with more balanced profiles to correct historical skewness or bias.
- Depending on the model type, using model re-weighting techniques.
- Using open-source toolkits for fairness, for instance AI Fairness 360, which are helpful for bias mitigation in machine learning pipelines.
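To make the re-weighting option more concrete, here is a minimal sketch of one common form of it: each training row receives the weight P(group) × P(label) / P(group, label), so that group and label become statistically independent in the weighted data. The function name `reweighing_weights` and the toy data are assumptions made for illustration. Fairness toolkits ship their own implementations, which may differ in detail.

```python
import pandas as pd

def reweighing_weights(df: pd.DataFrame, group_col: str, label_col: str) -> pd.Series:
    """Weight each row by P(group) * P(label) / P(group, label)."""
    n = len(df)
    p_group = df[group_col].value_counts() / n
    p_label = df[label_col].value_counts() / n
    p_joint = df.groupby([group_col, label_col]).size() / n
    weights = [
        p_group[g] * p_label[y] / p_joint[(g, y)]
        for g, y in zip(df[group_col], df[label_col])
    ]
    return pd.Series(weights, index=df.index)

# Toy data: males are approved far more often than females
toy = pd.DataFrame({
    'Gender':   ['Male'] * 4 + ['Female'] * 4,
    'Approved': [1, 1, 1, 0, 1, 0, 0, 0],
})
toy['weight'] = reweighing_weights(toy, 'Gender', 'Approved')

# Weighted approval rate per gender after re-weighting
weighted_rate = (
    (toy['weight'] * toy['Approved']).groupby(toy['Gender']).sum()
    / toy['weight'].groupby(toy['Gender']).sum()
)
print(weighted_rate)
```

Such weights can typically be passed to an estimator through a `sample_weight` argument at fit time. Note that re-weighting can only rebalance combinations that actually occur in the data: in the dataset built earlier in this article, no male applicant is ever denied, so weights alone would not remove that bias.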

Iván Palomares Carrascosa is a leader, writer, speaker, and adviser in AI, machine learning, deep learning & LLMs. He trains and guides others in harnessing AI in the real world.
