I'm using the code below to compute SPD (Statistical Parity Difference) on the Adult dataset. However, when calling the function get_spd_and_accuracy in a loop, memory consumption grows with every iteration in which a ClassificationMetric is instantiated and statistical_parity_difference() is called, and that memory is not released at the end of the iteration.
from aif360.datasets import AdultDataset, GermanDataset, CompasDataset
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from aif360.datasets import StandardDataset, BinaryLabelDataset
from aif360.metrics import ClassificationMetric
from copy import deepcopy
from sklearn.metrics import accuracy_score
def get_spd_and_accuracy(df, protected, target):
    '''Prepare the data for training and testing'''
    train, test = train_test_split(df, test_size=0.2, shuffle=True)
    X_train = train.drop([protected, target], axis=1).values
    y_train = train[target].values
    y_test = test[target].values
    X_test = test.drop([protected, target], axis=1).values
    '''Train the model and predict the labels for the training and testing data'''
    lmod = LogisticRegression(solver='liblinear', class_weight='balanced')
    lmod.fit(X_train, y_train)
    y_train_pred = lmod.predict(X_train)
    y_test_pred = lmod.predict(X_test)
    '''Prepare the data for the AIF360 metrics'''
    train_transf = StandardDataset(train,
                                   label_name=target,
                                   favorable_classes=[1],
                                   protected_attribute_names=[protected],
                                   categorical_features=[],
                                   features_to_drop=[],
                                   privileged_classes=[[1.0]])
    train_transf_pred = deepcopy(train_transf)
    train_transf_pred.labels = y_train_pred
    un_p = [{protected: 0.0}]
    p = [{protected: 1.0}]
    '''Calculate the Statistical Parity Difference and Accuracy Score'''
    class_metrics = ClassificationMetric(train_transf, train_transf_pred,
                                         unprivileged_groups=un_p, privileged_groups=p)
    print(round(class_metrics.statistical_parity_difference(), 2))
    print(round(accuracy_score(y_test, y_test_pred), 2))

dataset = AdultDataset()
df = dataset.convert_to_dataframe()[0]
target = df.columns[-1]
protected = 'sex'

for i in range(25):
    get_spd_and_accuracy(df, protected, target)
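For reference, here is a minimal sketch of how the per-iteration growth could be tracked from within Python using the standard library's tracemalloc (the loop and reporting below are only illustrative, not part of my original run):

import tracemalloc

tracemalloc.start()
for i in range(25):
    get_spd_and_accuracy(df, protected, target)
    # memory still held by Python allocations after each iteration, plus the peak so far
    current, peak = tracemalloc.get_traced_memory()
    print(f"iteration {i}: current={current / 1e6:.1f} MB, peak={peak / 1e6:.1f} MB")
tracemalloc.stop()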
Any insights or recommendations regarding memory-release strategies in this context would be greatly appreciated. Below are snapshots of the memory increase.
Before executing the code
During the execution of the code
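So far, the only mitigation I can think of is forcing a garbage-collection pass between iterations, roughly as sketched below (purely illustrative; I don't know whether this is the intended way to release the memory held by ClassificationMetric):

import gc

for i in range(25):
    get_spd_and_accuracy(df, protected, target)
    # explicitly ask the collector to reclaim any reference cycles
    # left behind by the ClassificationMetric / StandardDataset objects
    gc.collect()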