Temporary Teaching and Research Associate (ATER)
I am a final-year PhD student supervised by Michel Crucianu and Marin Ferecatu. My thesis is on compositional visual reasoning. Given an image-question pair, our objective is to have a neural network model answer the question by following a reasoning chain defined by a program. We assess the model's reasoning ability in a Visual Question Answering (VQA) setup.
Compositional VQA breaks complex questions down into simpler, modular sub-problems. These sub-problems correspond to reasoning skills such as object and attribute detection, relation detection, logical operations, counting, and comparison. Each sub-problem is assigned to a dedicated module. Because the answer must be produced through these explicit steps, the approach discourages shortcuts and requires an explicit understanding of the problem. It also promotes transparency and explainability.
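To make the idea of a program-defined reasoning chain concrete, here is a minimal toy sketch in Python. It is only an illustration under strong assumptions: the image is replaced by a hand-written symbolic scene, the modules are plain functions rather than neural networks, and every name in it (`SCENE`, `MODULES`, `PROGRAM`, `execute`) is hypothetical and not taken from my actual models.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical symbolic scene standing in for an image.
SCENE: List[Dict[str, str]] = [
    {"shape": "cube", "color": "red"},
    {"shape": "sphere", "color": "red"},
    {"shape": "cube", "color": "blue"},
]

# One small function per reasoning skill (toy stand-ins for neural modules).
MODULES = {
    "filter": lambda objs, key, value: [o for o in objs if o[key] == value],
    "count": lambda objs: len(objs),
    "greater_than": lambda a, b: a > b,
}

# A step is (module name, indices of earlier steps used as inputs, extra arguments).
Step = Tuple[str, Tuple[int, ...], Tuple[Any, ...]]

# Program for: "Are there more red objects than blue cubes?"
PROGRAM: List[Step] = [
    ("scene", (), ()),                    # 0: all objects
    ("filter", (0,), ("color", "red")),   # 1: red objects
    ("count", (1,), ()),                  # 2: number of red objects
    ("filter", (0,), ("color", "blue")),  # 3: blue objects
    ("filter", (3,), ("shape", "cube")),  # 4: blue cubes
    ("count", (4,), ()),                  # 5: number of blue cubes
    ("greater_than", (2, 5), ()),         # 6: compare the two counts
]


def execute(program: List[Step], scene: List[Dict[str, str]]) -> List[Any]:
    """Run each step in order and return every intermediate output."""
    trace: List[Any] = []
    for name, inputs, args in program:
        if name == "scene":
            out: Any = list(scene)
        else:
            operands = [trace[i] for i in inputs]
            out = MODULES[name](*operands, *args)
        trace.append(out)
    return trace


trace = execute(PROGRAM, SCENE)
print(trace[-1])  # final answer to the question
```

Keeping the whole `trace` rather than only the final answer is what makes the chain inspectable: each intermediate output can be checked against the sub-problem its module was supposed to solve.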