Facebook has made public how its artificial-intelligence system for recognizing image content works. According to the company, the main purpose of this technology is to let people with visual impairments hear a description of what is in a photo, even when the user has not written alternative text for it.
The company also explains that, since 2016, Facebook has had automatic alt text (AAT) technology capable of generating descriptions for photographs. Along with that explanation comes the news that the latest version of the AAT system has just received major image-recognition upgrades.
This is how Facebook knows what is in a photo, even if we don’t give it that data
AAT (automatic alt text) is a Facebook technology that recognizes objects in an image. According to Facebook, it uses a neural network with millions of parameters, trained on millions of examples. The network can tell whether or not there are people in a photo, how many there are, and which objects and animals are present.
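As a rough illustration of the last step of such a pipeline, here is a hypothetical sketch (not Facebook's actual code) of how a list of detected objects could be turned into the kind of description a screen reader would speak:

```python
from collections import Counter

def describe_detections(labels):
    """Turn a list of detected object labels into an alt-text style
    sentence. Hypothetical sketch: Facebook's real AAT pipeline is
    far more sophisticated; the naive pluralization here is just
    for illustration."""
    counts = Counter(labels)
    parts = [f"{n} {label}" + ("s" if n > 1 else "")
             for label, n in counts.most_common()]
    if not parts:
        return "Image may contain: no recognized objects"
    return "Image may contain: " + ", ".join(parts)

print(describe_detections(["person", "person", "dog"]))
# → Image may contain: 2 persons, 1 dog
```

In practice the labels would come from an object-detection model; the point is only that the description is assembled from per-object recognition results, as the article describes.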
AAT can recognize practically any scene, and can even tell how far away each person in the photo is
Facebook's AI system even recognizes environments: it can tell whether a photograph was taken indoors or outdoors, or at a specific landmark. Practically any setting can be identified.
Facebook has improved this technology to the point that it can even detect the relative size of objects. In other words, AAT can report that an image contains five people, one in the center, others at the sides and another behind them, analyzing the position each of them occupies in space.
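The kind of spatial reasoning described above can be sketched with two toy helpers (hypothetical names and thresholds, not Facebook's implementation): one classifies where a detected bounding box sits horizontally, the other rates its relative size by the fraction of the image it covers.

```python
def position_of(box, image_width):
    """Classify a bounding box (x1, y1, x2, y2) as left / center / right
    by where its horizontal midpoint falls. Hypothetical helper."""
    x1, _, x2, _ = box
    mid = (x1 + x2) / 2
    if mid < image_width / 3:
        return "left"
    if mid < 2 * image_width / 3:
        return "center"
    return "right"

def relative_size(box, image_area):
    """Rate a box as small / medium / large by the fraction of the
    image it covers -- a crude stand-in for relative sizing.
    Thresholds are invented for illustration."""
    x1, y1, x2, y2 = box
    frac = (x2 - x1) * (y2 - y1) / image_area
    if frac < 0.05:
        return "small"
    if frac < 0.25:
        return "medium"
    return "large"

# A person whose box is centered in a 1000-px-wide image:
print(position_of((400, 0, 600, 300), 1000))  # → center
```

Combining such per-object positions and sizes is enough to produce statements like "five people, one in the center", in the spirit of the description above.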
In its early development phase, AAT could recognize around 100 fairly simple concepts, such as ‘tree’, ‘mountain’ and ‘outdoors’. Today the neural network recognizes more than 1,200 concepts, partly thanks to data from applications such as Instagram. Facebook acknowledges that its model is trained on public images from both Facebook and Instagram, using their hashtags. Thanks to this gigantic database, it can recognize skin tones, gender, events such as weddings (from the clothes users wear), types of food, and more.
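Training from hashtags is a form of weak supervision: the tags users attach to public photos become noisy training labels. A toy sketch of that idea, with an invented concept vocabulary (AAT's real list has over 1,200 entries):

```python
# Toy concept vocabulary -- invented for illustration only.
CONCEPTS = {"wedding", "pizza", "dog", "outdoor", "mountain", "selfie"}

def weak_labels(hashtags, concepts=CONCEPTS):
    """Keep only hashtags that match a known concept, normalizing
    case and stripping the leading '#'. The survivors serve as
    noisy training labels for the image. Hypothetical sketch."""
    labels = set()
    for tag in hashtags:
        word = tag.lstrip("#").lower()
        if word in concepts:
            labels.add(word)
    return sorted(labels)

print(weak_labels(["#Wedding", "#love", "#dog", "#nofilter"]))
# → ['dog', 'wedding']
```

Tags outside the vocabulary ("#love", "#nofilter") are simply dropped; at the scale of billions of public photos, even this noisy signal is enough to train a large recognition model.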
Although Facebook insists that this neural network is intended to help people with visual disabilities, the amount of data it collects to determine exactly what is in each photograph is still striking.
More information | Facebook