Home
eliboss edited this page Mar 29, 2021
① Find all values to compare in a data set
↳ If no values are found, notify the user that there are no valid labels or images.
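A minimal sketch of step ①, assuming the data set is a list of dicts with one dict per sample. The names `SENSITIVE_KEYS` and `find_comparable_values` are hypothetical, and the sketch only covers labeled data (finding values in images would need a separate detector).

```python
# Hypothetical list of label names worth comparing (taken from the labeled-data list in step 2).
SENSITIVE_KEYS = {"ethnicity", "gender", "religion", "income", "age", "education"}


def find_comparable_values(records):
    """Return the sensitive label names that appear in at least one record."""
    found = set()
    for record in records:
        found.update(
            key for key, value in record.items()
            if key in SENSITIVE_KEYS and value is not None
        )
    if not found:
        # Nothing to compare: surface this to the user instead of continuing.
        raise ValueError("No valid labels or images found in the data set.")
    return sorted(found)
```

Raising an exception is just one way to "notify the user"; a UI message or a line in the report would work as well.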
② For each set of values, group by:
• possible gender for images
• skin tone for images
• For labeled data:
  - ethnicity
  - gender
  - religion
  - income
  - age
  - education
③ For each value, scrape //Somewhere// for demographic data. Then, for each group within that value:
↳ Check that the size of the group in the data is roughly proportional to the size of that demographic.
>> Also >> Check that each group has a minimum count.
↳ If a group is underrepresented, suggest adding more data from that group.
↳ If a group is overrepresented, mark it as overrepresented.
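A rough sketch of the checks in step ③, assuming the demographic data has already been fetched as a dict of expected shares per group. The function name, the `tolerance` and `min_count` parameters, and their defaults are all assumptions; "roughly proportional" is read here as "within `tolerance` of the expected share", and a group below the minimum count is treated as underrepresented.

```python
from collections import Counter


def check_representation(values, reference_shares, tolerance=0.05, min_count=30):
    """Compare each group's share of the data with its share of the population."""
    counts = Counter(values)
    total = sum(counts.values())
    report = {}
    for group, expected in reference_shares.items():
        count = counts.get(group, 0)
        observed = count / total if total else 0.0
        if count < min_count or observed < expected - tolerance:
            # Too few samples, or a smaller share than the demographic data suggests.
            report[group] = "underrepresented"
        elif observed > expected + tolerance:
            report[group] = "overrepresented"
        else:
            report[group] = "ok"
    return report
```

For example, `check_representation(["a"] * 80 + ["b"] * 20, {"a": 0.5, "b": 0.5}, min_count=10)` marks `a` as overrepresented and `b` as underrepresented, so the report would suggest adding more data from `b`.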
** Think about how to implement intersectionality **
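One possible starting point for intersectionality, not a settled design: count samples per combination of attributes instead of per single attribute. The function name `intersectional_counts` and the dict-per-sample layout are assumptions.

```python
from collections import Counter


def intersectional_counts(records, attributes):
    """Count samples for every combination of the given attributes."""
    # A missing attribute shows up as None in the combination tuple.
    return Counter(
        tuple(record.get(attribute) for attribute in attributes)
        for record in records
    )
```

The resulting counts could then go through the same proportionality and minimum-count checks as single groups, provided demographic data exists for the combined groups.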
④ For numerical values (ex: skin tone in hex coordinates):
a) Create a range to group values by.
b) Find variance within each group and between groups.
↳ If variance within a group is high, !OR! the median is skewed to one side of the range, this group needs more samples.
↳ If variance within a group is low, !AND! the median is close to the middle of the range, this group of samples is representative of the range.
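A sketch of step ④ for plain numbers, assuming fixed equal-width ranges. The thresholds `max_variance` and `max_median_offset` are hypothetical tuning knobs (the notes do not say what counts as "high" or "skewed"), and treating an empty range as needing more samples is also an assumption. Hex skin-tone values would first need converting to numbers.

```python
from statistics import mean, median, pvariance


def summarize_ranges(values, low, high, n_bins, max_variance, max_median_offset):
    """Group values into equal-width ranges and flag ranges that need more samples."""
    width = (high - low) / n_bins
    bins = [[] for _ in range(n_bins)]
    for value in values:
        if low <= value <= high:
            # Clamp so that value == high lands in the last range.
            index = min(int((value - low) / width), n_bins - 1)
            bins[index].append(value)

    summaries = []
    for i, members in enumerate(bins):
        bounds = (low + i * width, low + (i + 1) * width)
        if not members:
            # Assumption: an empty range always needs more samples.
            summaries.append({"range": bounds, "count": 0, "needs_more_samples": True})
            continue
        center = low + (i + 0.5) * width
        variance = pvariance(members)
        offset = abs(median(members) - center)
        summaries.append({
            "range": bounds,
            "count": len(members),
            "variance": variance,
            "median_offset": offset,
            # High variance OR a skewed median means the range needs more samples.
            "needs_more_samples": variance > max_variance or offset > max_median_offset,
        })

    # Between-group variance: spread of the means of the non-empty ranges.
    group_means = [mean(members) for members in bins if members]
    between = pvariance(group_means) if len(group_means) > 1 else 0.0
    return summaries, between
```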
** Think about making ranges dynamic so that they can adapt to a data set. **
↳ If a data set has a range that does not include the full range possible, include that in the report.
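A small sketch of the coverage check, assuming the full possible range is known up front. `range_coverage` and its return keys are hypothetical names.

```python
def range_coverage(values, possible_low, possible_high):
    """Report how much of the possible range the data set actually spans."""
    if not values:
        raise ValueError("No values to check.")
    observed_low, observed_high = min(values), max(values)
    span = possible_high - possible_low
    # Clip the observed range to the possible range before measuring it.
    covered = min(observed_high, possible_high) - max(observed_low, possible_low)
    return {
        "observed": (observed_low, observed_high),
        "covers_full_range": observed_low <= possible_low and observed_high >= possible_high,
        "covered_fraction": max(covered, 0) / span,
    }
```

For continuous data the endpoints are rarely hit exactly, so the report would probably rely on `covered_fraction` with some cutoff rather than on `covers_full_range`.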