On Semantic Cognition, Inductive Generalization, and Language Models

Kanishka Misra, Purdue University

Abstract

Our ability to understand language and perform reasoning crucially relies on a robust system of semantic cognition (G. L. Murphy, 2002; Rogers & McClelland, 2004; Rips et al., 2012; Lake & Murphy, 2021): processes that allow us to learn, update, and produce inferences about everyday concepts (e.g., cat, chair), properties (e.g., has fur, can be sat on), categories (e.g., mammals, furniture), and relations (e.g., is-a, taller-than). Meanwhile, recent progress in the field of natural language processing (NLP) has led to the development of language models (LMs): sophisticated neural networks that are trained to predict words in context (Devlin et al., 2019; Radford et al., 2019; Brown et al., 2020), and as a result build representations that encode the knowledge present in the statistics of their training environment. These models have achieved impressive levels of performance on a range of tasks that require sophisticated semantic knowledge (e.g. question answering and natural language inference), often even reaching human parity. To what extent do LMs capture the nuances of human conceptual knowledge and reasoning? Centering around this broad question, this dissertation uses core ideas in human semantic cognition as guiding principles and lays down the groundwork to establish effective evaluation and improvement of conceptual understanding in LMs. In particular, I build on prior work that focuses on characterizing what semantic knowledge is made available in the behavior and representations of LMs, and extend it by additionally proposing tests that focus on functional consequences of acquiring basic semantic knowledge.I primarily focus on inductive generalization(Hayes & Heit, 2018)—the unique ability of humans to rely on acquired conceptual knowledge to project or generalize novel information— as a context within which we can analyze LMs’ encoding of conceptual knowledge. I do this, since the literature surrounding inductive generalization contains a variety of empirical regularities that map to specific conceptual abstractions and shed light on how humans store, organize and use conceptual knowledge. Before explicitly analyzing LMs for these empirical regularities, I test them on two other contexts, which also feature the role of inductive generalization. First I test the extent to which LMs demonstrate typicality effects— a robust finding in human categorization literature where certain members of a category are considered to be more central to the category than are others. Specifically, I test the behavior 19 different LMs on two contexts where typicality effects modulate human behavior: 1) verification of sentences expressing taxonomic category membership, and 2) projecting novel properties from individual category members to the entire category. In both tests, LMs achieved positive but modest correlations with human typicality ratings, suggesting that they can to a non-trivial extent capture subtle differences between category members. Next, I propose a new benchmark to test the robustness of LMs in attributing properties to everyday concepts, and in making inductive leaps to endow properties to novel concepts. On testing 31 different LMs for these capacities, I find that while they can correctly attribute properties to everyday concepts and even predict the properties of novel concepts in simple settings, they struggle to do so robustly. Combined with the analyses of typicality effects, these results suggest that the ability of LMs to demonstrate impressive conceptual knowledge and reasoning behavior can be explained by their sensitivities to shallow predictive cues.

Degree

Ph.D.

Advisors

Rayz, Purdue University.

Subject Area

Language

Off-Campus Purdue Users:
To access this dissertation, please log in to our
proxy server
.

Share

COinS