Personality, gender, and age in the language of social media: the open-vocabulary approach.

We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a com...

Bibliographic Details
Main Authors: H Andrew Schwartz, Johannes C Eichstaedt, Margaret L Kern, Lukasz Dziurzynski, Stephanie M Ramones, Megha Agrawal, Achal Shah, Michal Kosinski, David Stillwell, Martin E P Seligman, Lyle H Ungar
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24086296/?tool=EBI
id doaj-e84c8e2b54974bc982d72a411ad35e41
record_format Article
spelling doaj-e84c8e2b54974bc982d72a411ad35e41 | 2021-03-03T20:19:51Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2013-01-01 | 8 | 9 | e73791 | 10.1371/journal.pone.0073791 | Personality, gender, and age in the language of social media: the open-vocabulary approach. | H Andrew Schwartz, Johannes C Eichstaedt, Margaret L Kern, Lukasz Dziurzynski, Stephanie M Ramones, Megha Agrawal, Achal Shah, Michal Kosinski, David Stillwell, Martin E P Seligman, Lyle H Ungar | We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes people, finding connections that are not captured with traditional closed-vocabulary word-category analyses. Our analyses shed new light on psychosocial processes yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase 'sick of' and the word 'depressed'), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive 'my' when mentioning their 'wife' or 'girlfriend' more often than females use 'my' with 'husband' or 'boyfriend'). To date, this represents the largest study, by an order of magnitude, of language and personality. | https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24086296/?tool=EBI
collection DOAJ
language English
format Article
sources DOAJ
author H Andrew Schwartz
Johannes C Eichstaedt
Margaret L Kern
Lukasz Dziurzynski
Stephanie M Ramones
Megha Agrawal
Achal Shah
Michal Kosinski
David Stillwell
Martin E P Seligman
Lyle H Ungar
author_sort H Andrew Schwartz
title Personality, gender, and age in the language of social media: the open-vocabulary approach.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2013-01-01
description We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes people, finding connections that are not captured with traditional closed-vocabulary word-category analyses. Our analyses shed new light on psychosocial processes yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase 'sick of' and the word 'depressed'), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive 'my' when mentioning their 'wife' or 'girlfriend' more often than females use 'my' with 'husband' or 'boyfriend'). To date, this represents the largest study, by an order of magnitude, of language and personality.
url https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24086296/?tool=EBI