Language models (LMs) encode extensive factual knowledge within their parameters. In the literature, factual knowledge is often assessed with cloze sentences (e.g., “The capital of France is ____”), which can lead to erroneous conclusions because of the complexity of natural language: out-of-subject continuations, the existence of multiple correct answers, and the many ways of expressing each of them. In this work, we introduce a new interpretable knowledge assessment method that mitigates these issues by leveraging distractors, i.e., incorrect but plausible alternatives to the correct answer, which we use to create an “automatic multiple-choice questionnaire (MCQ) for LMs”. We propose several strategies for retrieving distractors and determine the most effective one through experimentation. We evaluate our method against existing approaches and show that it aligns well with human judgment and is more robust to verbalization artifacts.
Readers of this publication will be able to:
- Measure how well a large language model knows a specific fact
- Reproduce the experiments of our paper
- Compare distractor retrieval strategies
- Evaluate our knowledge measure and baselines with respect to their alignment with human judgment and robustness to verbalization errors
Source code is available on GitHub under the GPLv2 licence.
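
To make the distractor idea from the abstract concrete, here is a minimal Python sketch of one possible multiple-choice style score. It is an illustration under stated assumptions, not the measure defined in the paper or the API of the released code: the function name `mcq_knowledge_score`, the `log_prob` callable, and the toy probabilities are all hypothetical. The sketch assumes that an LM can assign a log-probability to each candidate completion of a prompt, and it reports the probability of the correct answer renormalized over the correct answer and its distractors.

```python
import math
from typing import Callable, Sequence


def mcq_knowledge_score(
    prompt: str,
    answer: str,
    distractors: Sequence[str],
    log_prob: Callable[[str, str], float],
) -> float:
    """Probability of the correct answer, renormalized over answer + distractors.

    `log_prob(prompt, completion)` is a hypothetical scorer that returns the
    LM log-probability of `completion` given `prompt`.
    """
    options = [answer, *distractors]
    scores = [log_prob(prompt, option) for option in options]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    return weights[0] / sum(weights)


# Toy scorer standing in for a real LM (values are invented for illustration).
TOY_LOG_PROBS = {
    "Paris": math.log(0.6),
    "Lyon": math.log(0.1),
    "Berlin": math.log(0.1),
}


def toy_log_prob(prompt: str, completion: str) -> float:
    return TOY_LOG_PROBS[completion]


score = mcq_knowledge_score(
    "The capital of France is", "Paris", ["Lyon", "Berlin"], toy_log_prob
)
print(round(score, 2))  # 0.75
```

Because only the listed options compete for probability mass, off-topic continuations and alternative phrasings of the answer do not lower the score in this sketch. The score does depend strongly on which distractors are chosen, which is why the retrieval strategy matters.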