Research Agenda


I study large pretrained language models (LLMs) like ChatGPT – specifically, how they learn and represent meaning in language. Deep neural networks in general, and LLMs in particular, are usually seen as “black boxes”: we do not (fully) understand what goes on inside them, only how they are trained and how well they perform on the benchmarks we create for them (e.g., SuperGLUE, LAMA, and BIG-bench). These benchmarks can be valuable tools for evaluating progress in natural language processing, but they are characteristically agnostic to how LLMs achieve the performance they do, making them imperfect measures of progress toward general language understanding. For example, current LLMs already achieve superhuman performance on many popular language benchmarks. As state-of-the-art LLMs continue to match or exceed human-level performance on such benchmarks, it is natural to ask whether they truly understand language at the level of fluent human speakers, or whether “black box” evaluation approaches (where only inputs and outputs are considered, and internal mechanisms are ignored) are simply insufficient to measure progress toward human-level linguistic competence.

My goal is to develop research approaches that can help us answer questions like these. In doing so, I believe it is necessary to study LLMs’ internal representations of linguistic structure, and how these representations are used as they perform natural language tasks. I am currently researching the structure of LLMs’ representations of lexical semantics via competence-based analysis of language models (CALM), studying what elements of meaning are acquired and used by LLMs, multimodal vision-and-language models, and other varieties of foundation models. A better understanding of what these models actually learn (and what they do not) will be essential for predicting and explaining their current limitations, determining where they may (or may not) be safely applied, and developing the next generation of more human-like, generalizable foundation models.
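
To make the contrast with black-box evaluation concrete, the sketch below shows one simple way of looking inside a model rather than only at its outputs: extracting contextual hidden states for the same word in different contexts and comparing them. This is only an illustrative probe, not the CALM methodology itself; the choice of model (bert-base-uncased via the Hugging Face transformers library), the probed word, and the example sentences are arbitrary assumptions made for demonstration.

```python
# Illustrative sketch (not CALM itself): compare a model's internal
# representations of one word across contexts to see whether the
# representation separates word senses. Model and examples are arbitrary.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def word_embedding(sentence: str, word: str) -> torch.Tensor:
    """Mean final-layer hidden state over the subword tokens of `word`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # Locate the word's subword span within the tokenized sentence.
    for i in range(len(ids) - len(word_ids) + 1):
        if ids[i : i + len(word_ids)] == word_ids:
            return hidden[i : i + len(word_ids)].mean(dim=0)
    raise ValueError(f"{word!r} not found in tokenized sentence")

# Same surface form, two senses: does the model separate them internally?
bank_river = word_embedding("They sat on the bank of the river.", "bank")
bank_money1 = word_embedding("She deposited cash at the bank.", "bank")
bank_money2 = word_embedding("The bank approved the loan.", "bank")

cos = torch.nn.functional.cosine_similarity
print("river vs. money sense:", cos(bank_river, bank_money1, dim=0).item())
print("money vs. money sense:", cos(bank_money1, bank_money2, dim=0).item())
```

If the two senses of “bank” are markedly less similar to each other than the two money-sense occurrences are, that is internal evidence of a learned sense distinction – exactly the kind of evidence that input-output benchmark scores cannot provide.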