Experimenting with Python tools to extract and observe patterns in major gallery collections, focusing on the ratio of artworks produced by male artists to those produced by female artists.
These data-wrangling exercises could help improve accuracy when applying a classifier such as Naive Bayes, or when plotting graphs with matplotlib.
//
The original dataset is borrowed from the official MoMA GitHub repository and is too large for me to upload, so please find the original at: https://github.com/MuseumofModernArt/collection where it is also updated monthly.
I carried out my data analysis in December 2022, so my findings are subject to change as the dataset is updated after that date.
//
Summary of the dataset taken from the MoMA README file:
The Museum of Modern Art (MoMA) acquired its first artworks in 1929, the year it was established. Today, the Museum’s evolving collection contains almost 200,000 works from around the world spanning the last 150 years. The collection includes an ever-expanding range of visual expression, including painting, sculpture, printmaking, drawing, photography, architecture, design, film, and media and performance art.
MoMA is committed to helping everyone understand, enjoy, and use our collection. The Museum’s website features 98,361 artworks from 27,140 artists. This research dataset contains 140,848 records, representing all of the works that have been accessioned into MoMA’s collection and cataloged in our database. It includes basic metadata for each work, including title, artist, date made, medium, dimensions, and date acquired by the Museum. Some of these records have incomplete information and are noted as “not Curator Approved.”
The Artists dataset contains 15,243 records, representing all the artists who have work in MoMA's collection and have been cataloged in our database. It includes basic metadata for each artist, including name, nationality, gender, birth year, death year, Wiki QID, and Getty ULAN ID.
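Since the Artists dataset records a gender for each artist, a first pass at the male-to-female ratio can be sketched with the standard library alone. This is only a minimal illustration, not the analysis itself: the column name `Gender`, the label spellings, and the helper names `gender_counts` and `male_to_female_ratio` are assumptions, and the sample rows are made up so the snippet runs without downloading anything.

```python
import csv
import io
from collections import Counter


def gender_counts(csv_file, column="Gender"):
    """Count rows per gender label, lower-casing labels and
    grouping blank or missing values under 'unknown'."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        label = (row.get(column) or "").strip().lower() or "unknown"
        counts[label] += 1
    return counts


def male_to_female_ratio(counts):
    """Return male/female as a float, or None if there are no female rows."""
    if counts["female"] == 0:
        return None
    return counts["male"] / counts["female"]


# Made-up sample standing in for the real artists file (assumed layout).
SAMPLE = io.StringIO(
    "DisplayName,Gender\n"
    "Artist A,Male\n"
    "Artist B,Female\n"
    "Artist C,male\n"
    "Artist D,\n"
)

sample_counts = gender_counts(SAMPLE)
sample_ratio = male_to_female_ratio(sample_counts)
print(dict(sample_counts), sample_ratio)
```

To try this on the real data, pass a file opened with `open(path, newline="", encoding="utf-8")` in place of `SAMPLE`, where `path` points at the artists CSV downloaded from the MoMA repository. Labels are lower-cased before counting because spelling may not be consistent across records, and blank values are kept as their own group rather than dropped so that the amount of missing data stays visible.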