The venerable Massachusetts Institute of Technology (MIT) has withdrawn a dataset of 80 million images that led AI systems to produce racist and offensive output. Set up to help users find images, the AI engine ended up generating racist, misogynistic, sexist and offensive labels to sort them.
More than four years after the debacle of Microsoft’s Tay chatbot, MIT has also fallen into the trap of AI turning racist. The institute has announced that it has taken offline a dataset on which poorly trained AI systems relied, with the consequence that racist, misogynistic, sexist and offensive terms were used to label an immense volume of images.
MIT explains: “It has been brought to our attention that the Tiny Images dataset contains some derogatory terms as categories and offensive images. This was a consequence of the automated data collection procedure that relied on nouns from WordNet. We are greatly concerned by this and apologize to those who may have been affected.”
MIT originally created Tiny Images as a dataset of 79.3 million images extracted from Google Images, divided into 75,000 categories, at a tiny resolution of 32×32 pixels. A lighter version of 2.2 million images, with an accompanying data visualization, was also made available on the website of MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL).
“The dataset was created in 2006 and contains 53,464 different nouns, directly copied from Wordnet. Those terms were then used to automatically download images of the corresponding noun from Internet search engines at the time (using the available filters at the time) to collect the 80 million images (at tiny 32×32 resolution; the original high-res versions were never stored),” says MIT.
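The collection procedure MIT describes can be pictured with a minimal sketch. It is an illustration under stated assumptions, not MIT's actual code: `build_dataset`, `fake_search` and the sample word list are hypothetical names invented here, and the real pipeline queried live search engines and stored 32×32 thumbnails.

```python
from typing import Callable, Dict, Iterable, List


def build_dataset(
    nouns: Iterable[str],
    search_images: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Label every search result with the noun that produced it."""
    dataset: Dict[str, List[str]] = {}
    for noun in nouns:
        # No human review happens here: the noun itself becomes the
        # category, so any derogatory noun in the word list becomes a label.
        dataset[noun] = search_images(noun)
    return dataset


def fake_search(query: str) -> List[str]:
    # Hypothetical stand-in for an image search engine.
    return [f"{query}_{i}.jpg" for i in range(3)]


# Hypothetical word list; the placeholder stands for an unvetted entry.
word_list = ["apple", "bridge", "<derogatory term>"]
dataset = build_dataset(word_list, fake_search)
```

Because every entry in the word list is used as both a search query and a category name, an unreviewed list carries its offensive terms straight into the dataset's labels.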
This huge collection used tags to describe what each photo showed, and neural networks were trained on it to automatically associate patterns in the photos with descriptive tags.
Source: MIT