

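BIPIMA: Bipolarity of Multimedia Information for the Semantic Annotation of Images in a Social Media Context (BIPolarité de l'Information Multimédia pour l'Annotation sémantique d'images dans un contexte de médias sociaux)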

Beginning : 2014
Action line : DataSense 1,2,3
Subject : Multimedia data analysis and knowledge management in the context of social media
Supervisors : Céline Hudelot, MAS; Hervé Le Borgne, CEA LIST; Isabelle Bloch, LTCI
Institution : MAS / LIST
PhD student : Sonia AJINA
Scientific output :
July 2018 : the student, in her 3rd year, is suffering from depression and will probably not defend her thesis (Céline Hudelot)
Resources :

Context :
The popularity of social media and of mobile devices has led to an explosion of available unstructured data, specifically of multimedia documents. For instance, the number of images hosted on the popular photo-sharing website Flickr was estimated at 8 billion in 2013. Those collections hold a wealth of information and knowledge, provided that efficient tools are available for their analysis and semantic interpretation.

By definition, those multimedia documents consist of various "mono-media", each of which is a separate information source. For example, an image found on Flickr comes with metadata about the shooting conditions, pixels that carry its visual content, and several tags. Those tags are an important source of information, and many recent research works have shown that combining them with visual information significantly improves the performance of image retrieval and interpretation systems. Tags are only one type of social signal among the many present on social media. Comments and the social networks themselves are further exploitable sources of information for semantic image retrieval.

This "social" information is, however, imperfect and only partly relevant for the semantic interpretation of images. The reasons are that social annotations are freely worded, that they depend on the motivations of the users, which are most of the time unknown, and that they are not tied to any specific problem. It is thus important to take those imperfections into account in order to improve the semantic interpretation of image content.

Scientific challenge :
The BIPIMA project builds on previous work on the semantic interpretation of images (FRIDOM project) and proposes a richer and more explicit model of the imperfections of multimedia information. It also proposes new frameworks for reasoning about this imperfect multimodal information.

More specifically, the project will exploit the bipolar nature of information on social media, in other words the fact that a piece of information has a positive side (what is guaranteed to be possible, for example what is proven) and a negative side (what is impossible or forbidden), and that the two can be easily distinguished. For example, the tags that actually describe the content of an image would be the positive information, while the presence of the tag "Tour Eiffel" would rule out some other interpretations, which would be the negative information.
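
The positive/negative distinction above can be illustrated with a small sketch. It is not the project's model: the class `BipolarInfo`, the function `interpret`, the table `TAG_KNOWLEDGE` and all labels in it are hypothetical names and values chosen for illustration. The sketch assumes that each tag contributes a set of labels it supports (positive side) and a set of labels it rules out (negative side), that both sides accumulate by set union when several tags are combined, and that the result is consistent when no label is both supported and ruled out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BipolarInfo:
    """Hypothetical container: labels supported vs. labels ruled out."""
    positive: frozenset = frozenset()  # what is guaranteed to be possible
    negative: frozenset = frozenset()  # what is impossible or forbidden

    def combine(self, other):
        # Assumption: support accumulates, and so do exclusions.
        return BipolarInfo(self.positive | other.positive,
                           self.negative | other.negative)

    def is_consistent(self):
        # A label must not be both supported and ruled out.
        return not (self.positive & self.negative)


# Hypothetical background knowledge attached to tags (illustrative only).
TAG_KNOWLEDGE = {
    "Tour Eiffel": BipolarInfo(
        positive=frozenset({"Paris", "monument"}),
        negative=frozenset({"London", "New York"}),
    ),
    "night": BipolarInfo(
        positive=frozenset({"night scene"}),
        negative=frozenset({"daylight scene"}),
    ),
}


def interpret(tags, candidates):
    """Combine the bipolar information of all tags, then classify each
    candidate label as supported, excluded or unknown."""
    info = BipolarInfo()
    for tag in tags:
        # Tags without background knowledge contribute nothing.
        info = info.combine(TAG_KNOWLEDGE.get(tag, BipolarInfo()))
    status = {}
    for label in candidates:
        if label in info.negative:
            status[label] = "excluded"
        elif label in info.positive:
            status[label] = "supported"
        else:
            status[label] = "unknown"
    return info, status


info, status = interpret(["Tour Eiffel", "night"], ["Paris", "London", "beach"])
print(status, info.is_consistent())
```

Keeping the two sides in separate sets, rather than merging them into a single score, is what lets a candidate label stay "unknown" when the tags neither support nor exclude it.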

Prospects :

The PhD lies at the intersection of artificial intelligence (knowledge representation and reasoning), computer vision and social data mining. The main objective of this project, which is also its main challenge, is to study and characterize the various information sources present on social media and to examine their specific features in order to best exploit them for the annotation and semantic interpretation of images. The strong social dimension of the data concerned calls for new paradigms, on the one hand because of the heavy "noise" affecting the data in this interpretation context, and on the other hand because their social dimension, related to a form of collective intelligence, offers new prospects.

During her PhD, Sonia Ajina will address the following tasks :

  • Study the different information sources available on social media for the annotation and multimodal interpretation of images, and propose techniques for characterizing their polarity in order to improve semantic interpretation.
  • Study the various approaches that allow the bipolar nature of multimedia information to be modelled explicitly.
  • Propose a formal framework for exploiting and reasoning about the bipolar nature of multimedia information in order to improve decision-making.
  • Study the behaviour of the proposed models and of the framework in a big data context, specifically when applied to large corpora of multimedia data.