This project takes place in the context of large language models (LLMs) and conversational systems (e.g. ChatGPT, WebGPT), which have experienced tremendous practical progress in the last few months. The GUIDANCE project aims to conduct research on General Purpose Dialogue-assisted Digital Information Access, specifically how to enable users to access digital information, with the goal of overcoming several limitations of current LLMs:
- LLMs were not designed with information access in mind, whether at the level of pre-training tasks or fine-tuning ones;
- LLMs have limited generalization abilities to new domains and/or languages;
- The veracity and truthfulness of the output are questionable;
- LLMs that are potentially state of the art are not open access, and their scientific methodology and evaluation are barely described in the scientific literature.
From a community-building perspective, the GUIDANCE project aims to federate the French Information Retrieval (IR) community by bringing together experts in the field to advance the development of Dialogue-based Information Access (DbIA) models leveraging LLMs.
GUIDANCE is backed by partners belonging to the ARIA association and gathers 18 researchers from 6 IR and NLP-related groups within 4 research laboratories. The partners furthermore commit to producing open-access annotated resources, at both the national and international levels. These resources will be used to evaluate and develop models for DbIA, and will constitute a valuable basis for releasing open-access DbIA systems.
From a research perspective, GUIDANCE addresses four challenges associated with this project:
- How to design new LLMs or re-use LLMs for DbIA;
- How to leverage Retrieval-Enhanced Machine Learning (ReML) techniques to improve the accuracy and efficiency of information retrieval systems;
- How to adapt LLMs and develop new architectures (for DbIA models) to deal with low-resource settings and domain adaptation, with special attention paid to low/medium-resource languages (e.g. Occitan, French);
- How to design DbIA models that can ensure the veracity and explainability of retrieved and synthesized information, while preserving the user’s subjectivity.
The project started in October 2023 and is expected to end in September 2027.
The project offers internships, which are listed on this page.