Similarity Judgment Within and Across Categories: A Comprehensive Model Comparison

Bibliographic Details
Main Authors: Bhatia, S. (Author), Richie, R. (Author)
Format: Article
Language: English
Published: John Wiley and Sons Inc 2021
Description
Summary: Similarity is one of the most important relations humans perceive, arguably subserving category learning and categorization, generalization and discrimination, judgment and decision making, and other cognitive functions. Researchers have proposed a wide range of representations and metrics that could be at play in similarity judgment, yet have not comprehensively compared the power of these representations and metrics for predicting similarity within and across different semantic categories. We performed such a comparison by pairing nine prominent vector semantic representations with seven established similarity metrics that could operate on these representations, as well as supervised methods for dimensional weighting in the similarity function. This approach yields a factorial model structure with 126 distinct representation-metric pairs, which we tested on a novel dataset of similarity judgments between pairs of cohyponymic words in eight categories. We found that cosine similarity and Pearson correlation were the overall best performing unweighted similarity functions, and that word vectors derived from free association norms often outperformed word vectors derived from text (including those specialized for similarity). Importantly, models that used human similarity judgments to learn category-specific weights on dimensions yielded substantially better predictions than all unweighted approaches across all types of similarity functions and representations, although dimension weights did not generalize well across semantic categories, suggesting strong category context effects in similarity judgment. We discuss implications of these results for cognitive modeling and natural language processing, as well as for theories of the representations and metrics involved in similarity. © 2021 Cognitive Science Society LLC
ISSN: 0364-0213
DOI: 10.1111/cogs.13030
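
Illustration: The summary names cosine similarity and Pearson correlation as the best unweighted similarity functions and reports that per-dimension weights improve predictions. The sketch below shows, in plain Python, what those two functions compute on word vectors and how a weight vector might enter a cosine. The function names, the toy vectors, and the weight values are assumptions made for illustration; this record does not describe how the article fits its weights, so no fitting step is shown.

```python
import math

def cosine(u, v, w=None):
    """Cosine similarity of u and v.

    w is an optional list of non-negative per-dimension weights
    (hypothetical here; uniform weights give the ordinary cosine).
    """
    if w is None:
        w = [1.0] * len(u)
    dot = sum(wi * a * b for wi, a, b in zip(w, u, v))
    norm_u = math.sqrt(sum(wi * a * a for wi, a in zip(w, u)))
    norm_v = math.sqrt(sum(wi * b * b for wi, b in zip(w, v)))
    return dot / (norm_u * norm_v)

def pearson(u, v):
    """Pearson correlation, computed as the cosine of mean-centered vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])

# Toy 3-dimensional "word vectors" (made up, not taken from the study).
word_a = [1.0, 0.0, 1.0]
word_b = [1.0, 1.0, 0.0]

unweighted = cosine(word_a, word_b)                  # all dimensions count equally
weighted = cosine(word_a, word_b, [1.0, 0.0, 0.0])   # only dimension 0 counts
correlation = pearson(word_a, word_b)
```

With uniform weights the two toy words share only one of their active dimensions, so the cosine is moderate; putting all the weight on the shared dimension makes them look identical. This is a rough picture of why weights tuned to one category need not carry over to another.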