DistFactAssessLM

Language models (LMs) encode extensive factual knowledge within their parameters. In the literature, factual knowledge assessment often relies on cloze sentences (e.g., “The capital of France is ____”), which can lead to erroneous conclusions due to the complexity of natural language (off-topic continuations, the existence of many correct answers, and the many ways of expressing them). In this work, we introduce a new interpretable knowledge assessment method that mitigates these issues by leveraging distractors: incorrect but plausible alternatives to the correct answer, which we use to build an automatic multiple-choice questionnaire (MCQ) for LMs. We propose several strategies for retrieving distractors and determine the most effective one experimentally. We evaluate our method against existing approaches, demonstrating solid alignment with human judgment and stronger robustness to verbalization artifacts.
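
To make the distractor idea concrete, here is a minimal sketch (not the paper's implementation) of MCQ-style scoring with a causal LM from Hugging Face `transformers`: the fact counts as known if the LM assigns a higher log-likelihood to the correct answer than to every distractor. The model choice, prompt, and distractor set are placeholders.

```python
# Minimal sketch of distractor-based knowledge assessment (illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_log_prob(prompt: str, answer: str) -> float:
    """Sum of log-probabilities the LM assigns to `answer` given `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    total = 0.0
    # Score only the answer tokens; each is predicted from the previous position.
    for pos in range(prompt_ids.shape[1], full_ids.shape[1]):
        total += log_probs[0, pos - 1, full_ids[0, pos]].item()
    return total

prompt = "The capital of France is"
correct = " Paris"
distractors = [" Lyon", " Marseille", " Brussels"]  # plausible but incorrect

# Note: summed log-probs favor shorter answers; length-normalizing is an option.
scores = {c: sequence_log_prob(prompt, c) for c in [correct] + distractors}
knows_fact = max(scores, key=scores.get) == correct
print(scores, "-> known" if knows_fact else "-> not known")
```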

With this repository, users can:

  • Measure a large language model's knowledge of a specific fact
  • Reproduce the experiments of our paper
  • Compare distractor retrieval strategies (a toy retrieval sketch follows this list)
  • Evaluate our knowledge measure and baselines with respect to their alignment with human judgment and robustness to verbalization artifacts
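
The repository implements the retrieval strategies compared in the paper; the snippet below is only a hypothetical illustration of one natural strategy: sampling type-consistent distractors from objects that share the same relation in a knowledge base, so they are plausible alternatives to the correct answer. The toy KB contents and the function name are invented for the example.

```python
# One illustrative distractor-retrieval strategy (hypothetical sketch).
import random

KB = {  # toy (subject, relation) -> object facts
    ("France", "capital"): "Paris",
    ("Italy", "capital"): "Rome",
    ("Spain", "capital"): "Madrid",
    ("Germany", "capital"): "Berlin",
}

def same_relation_distractors(subject: str, relation: str, k: int = 3, seed: int = 0):
    """Return up to k objects of the same relation, excluding the correct answer."""
    answer = KB[(subject, relation)]
    candidates = [o for (s, r), o in KB.items() if r == relation and o != answer]
    random.Random(seed).shuffle(candidates)
    return candidates[:k]

print(same_relation_distractors("France", "capital"))  # e.g. ['Rome', 'Berlin', 'Madrid']
```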

Source code is available on GitHub under the GPL-2.0 license.