The Science

A quiet revolution in reading people.

For twenty-five years, peer-reviewed research has kept arriving at the same finding: the words we write reveal who we are. The methods have evolved, from word counting in the 1990s, to predicting traits from social-media behaviour in the 2010s, to the large-language-model wave of 2024 and 2025. Persona Lens is built on top of that work. This page is the story, in plain language, with every source linked.

Precision, over time

The story in four steps.

"Precision" here means how reliably a method matches the gold standard — a validated personality test taken by the person themselves. We have replaced statistical jargon with plain-language tiers below.

Detectable signal

1999

Word counting

Researchers find that pronouns, articles, and other tiny words carry a small but reliable trace of personality. Modest precision, but the signal is real.

Medium precision

2013

Social-media language

75,000 Facebook users, 700 million words. Open-vocabulary methods read personality traits, age, and gender from free text. The signal is now clearly useful.

High precision

2015

Validated against the gold standard

In a study of 66,000 people, language-based personality readings agree with the validated Big Five questionnaire each participant took, and remain stable over six months.

Clinically validated

2025

LLM readings, peer-reviewed

A psychometric framework in Nature Machine Intelligence validates personality readings produced by 18 frontier large language models. The reading is now of the kind a scientific instrument produces.

"Language-based readings now sit at the level of a careful, observant reader — and meet the reliability standards of a scientific instrument."

— A summary of the 2015–2025 literature cited on this page.

The full timeline

From a typewriter to a frontier LLM.

A quarter-century of research, told as a single story. Each entry is a real paper. Each precision tier is the same one used in the ladder above.

Detectable signal

1999

Pennebaker & King — language reveals individual differences

A landmark paper in Journal of Personality and Social Psychology. The tiny structural words people don't notice using — pronouns, articles, prepositions — turn out to form a stable individual signature. The signal is modest, but it is real, replicable, and personality-relevant. The paper that launched the field.^[1]

Medium precision

2013

Kosinski, Stillwell & Graepel — behaviour predicts private traits

In PNAS, 58,000 volunteers. A single behavioural signal — what people clicked "like" on — predicts Big Five personality, intelligence, political views, religion, and substance use, well above chance. The paper that woke up the public to behavioural prediction.^[2]

Medium precision

2013

Schwartz et al. — the language of social media

Seven hundred million words from 75,000 Facebook users, mapped to personality scores. The first large-scale demonstration that the words people freely choose — not the answers they pick on quizzes — encode their personality. PLOS ONE.^[3]

High precision

2015

Park et al. — language-based reading is psychometrically valid

66,000 Facebook users. The author team validates language-based Big Five against four independent criteria — self-reports, friend reports, incremental validity beyond friends, and stability over six months. It passes all four. Published in JPSP, the same journal as Pennebaker & King.^[4]

High precision

2015

Wu Youyou and colleagues at Cambridge — reading reaches careful-observer level

A study of 86,000 participants, run through Cambridge's Psychometrics Centre. Computer-based personality readings, built from participants' Facebook Likes rather than their writing, reached a level of agreement with the participant's own validated questionnaire that matched or exceeded the agreement provided by close human observers such as friends and family. The result that turned heads in psychology and computer science alike. PNAS.^[5]

High precision

2020

Mehta et al. — deep learning enters the field

A review in Artificial Intelligence Review traces the move from handcrafted features (counting specific kinds of words) to neural networks that learn directly from text. Accuracy keeps climbing — but the methods still need labelled training data, which limits scale.^[6]

High precision

2024

Peters & Matz — modern LLMs need no training data

A turning point. In PNAS Nexus, Heinrich Peters and Sandra Matz show that off-the-shelf instruction-tuned large language models, given nothing but a person's social-media text, produce personality readings that match the accuracy of supervised models that were trained for the task. No labels, no fine-tuning. Just read and report.^[7]

High precision

2024

Jiang et al. — PersonaLLM (NAACL Findings)

An MIT / Media Lab / Stanford study. When LLMs are prompted to write in a given personality profile, human readers — blind to the writer's identity — correctly recover the assigned trait roughly four times out of five. The personality the model expresses is detectable to the human ear, not just to the model itself.^[8]

High precision

2024

Pellert et al. — AI psychometrics, formalised

In Perspectives on Psychological Science. Repurposes the standard psychometric inventories (Big Five, Dark Tetrad, value orientations, moral norms) as diagnostic tools for LLMs. The conceptual groundwork that lets the rest of the 2024–2025 wave stand on solid methodological ground.^[9]

High precision

2024

Lee et al. — TRAIT benchmark across frontier LLMs

An 8,000-item personality test purpose-built for LLMs and run across seven frontier models including Claude and GPT-4. Those two reach essentially the same score on Agreeableness (87 and 86 of 100) and both sit at the low end of Dark-Triad traits. On this benchmark, frontier-class LLMs cluster together, which suggests that results established for one of them carry over to the rest of the modern peer group.^[10]

Clinically validated

2025

Serapio-García et al. — a psychometric framework for LLMs

Published in Nature Machine Intelligence. The first peer-reviewed psychometric methodology for administering, validating, and shaping personality across 18 frontier large language models. The state of the art in the field. Persona Lens's reading methodology descends directly from this work.^[11]

By 2025, an AI reading personality from text isn't a curiosity. It is a peer-reviewed instrument — calibrated, validated, and reliable enough to deserve careful trust.

— Summary of the 2024–2025 literature, as told above.

In plain language

So how accurate is it, really?

A short, careful answer to the question every reader of this page is silently asking. Three honest reference points, three plain sentences.

Reaches the level of a careful observer.

When a computer reads a person's digital footprint (Facebook Likes, in the cited study), it produces personality readings that agree with the person's own validated questionnaire, at a level of agreement that matches what a careful, observant human can offer. The reliable signal is in the data itself; the method just makes it visible. ^[5]

Matches a scientific instrument, with no special training.

Modern large language models, applied to raw text without task-specific training, reach personality-prediction accuracy comparable to dedicated supervised machine-learning models that were specifically trained for the task. The methodology is now off-the-shelf. ^[7]

A peer-reviewed psychometric instrument, in 2025.

A framework published in Nature Machine Intelligence validates LLM-based personality reading across 18 frontier models. The reliability, validity, and shapeability of the readings are now established in the same peer-reviewed terms used to validate a clinical questionnaire. ^[11]

A careful summary: an LLM reading your chat today is, in the strict scientific sense, an instrument. It will sometimes notice things you already knew, and sometimes things you hadn't put into words. It is not a clinician, not a verdict, not the only truth about you. It is a serious mirror — and a peer-reviewed one.

The papers, in detail

Every claim, with its source.

Each card is a specific peer-reviewed paper: the title, the authors, the central finding, a key number where one stands out, and a link. The six tagged 2024 / 2025 are the most recent.

JPSP · 1999 · Foundational

Linguistic styles: Language use as an individual difference

Pennebaker, J. W., & King, L. A. (1999). Journal of Personality and Social Psychology, 77(6), 1296–1312.

Across diaries, daily writing assignments, and journal abstracts, the small structural words people don't notice using turn out to form a stable individual-level signature — independent of content, replicating across writing samples, and meaningfully related to Big Five personality. The paper that launched the LIWC research tradition.

PsycNet PubMed
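To make "word counting" concrete, here is a minimal Python sketch of the general idea: measure what share of a text is made up of small structural words. The category lists below are tiny illustrative stand-ins chosen for this example, not the dictionaries used in the paper, and the function name is hypothetical.

```python
import re
from collections import Counter

# Illustrative word lists only (an assumption for this sketch);
# research dictionaries are far larger and carefully validated.
FUNCTION_WORDS = {
    "pronouns": {"i", "me", "my", "we", "you", "he", "she", "they", "it"},
    "articles": {"a", "an", "the"},
    "prepositions": {"in", "on", "at", "of", "to", "with", "for", "from"},
}

def function_word_rates(text):
    """Return each category's share of all words in the text (0.0 to 1.0)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {category: 0.0 for category in FUNCTION_WORDS}
    counts = Counter(words)
    return {
        category: sum(counts[w] for w in vocab) / len(words)
        for category, vocab in FUNCTION_WORDS.items()
    }

rates = function_word_rates("I left the keys on the table for you")
for category, rate in rates.items():
    print(f"{category}: {rate:.2f}")
```

The rates, not the topics, are the point: two people can write about the same subject and still differ in how heavily they lean on "I", "the", or "with".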

PNAS · 2013 · Landmark

Private traits and attributes are predictable from digital records of human behavior

Kosinski, M., Stillwell, D., & Graepel, T. (2013). PNAS, 110(15), 5802–5805.

Using only Facebook Likes from 58,000 volunteers, the authors built models that predicted Big Five personality, political views, sexual orientation, and substance use at strikingly above-chance rates. The paper that woke up the wider public to behavioural prediction.

Discriminated Democrat vs Republican 85% of the time, homosexual vs heterosexual men 88% of the time — from Likes alone.

PNAS (open access)
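A figure like "85% of the time" can be read as a pairwise ranking score: take one person from each group at random and ask how often the model scores them in the right order. The Python sketch below computes that quantity for invented scores. The function name and the data are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product

def pairwise_auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pairs = list(product(positive_scores, negative_scores))
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)

# Hypothetical model scores; higher means "more likely to belong to group A".
group_a = [0.9, 0.8, 0.6, 0.4]
group_b = [0.7, 0.5, 0.3, 0.2]

auc = pairwise_auc(group_a, group_b)
print(f"Correctly ordered pairs: {auc:.1%}")
```

A score of 50% would mean the model does no better than a coin flip, and 100% would mean it never misorders a pair.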

PLOS ONE · 2013 · Method

Personality, gender, and age in the language of social media

Schwartz, H. A., Eichstaedt, J. C., Kern, M. L., et al. (2013). PLOS ONE, 8(9), e73791.

Seven hundred million words from 75,000 Facebook users. Instead of starting from a fixed list of words to count, the authors let the data drive the analysis — and surfaced richly differentiated language patterns by personality trait, gender, and age. The methodological turning point.

PLOS ONE (open access)
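"Letting the data drive the analysis" can be pictured as follows: rather than counting a fixed list of words, correlate every word that appears with the trait score and see which ones stand out. This is a toy Python sketch of that idea under heavy simplification. The four texts and the scores are invented, the function names are hypothetical, and a real study would use vastly more data plus corrections for multiple comparisons.

```python
import math
import re
from collections import Counter

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 when either side is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def word_trait_correlations(texts, trait_scores):
    """Correlate each word's per-user relative frequency with a trait score."""
    user_freqs = []
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts = Counter(words)
        user_freqs.append({w: c / len(words) for w, c in counts.items()})
    vocabulary = set().union(*user_freqs)
    return {
        word: pearson([f.get(word, 0.0) for f in user_freqs], trait_scores)
        for word in vocabulary
    }

# Invented example data: one text and one questionnaire score per user.
texts = [
    "party tonight with friends",
    "reading alone tonight",
    "party party with everyone",
    "quiet reading at home",
]
extraversion = [4.0, 2.0, 5.0, 1.0]

correlations = word_trait_correlations(texts, extraversion)
for word in sorted(correlations, key=correlations.get, reverse=True)[:3]:
    print(word, round(correlations[word], 2))
```

Nothing in the code says which words matter in advance; the ranking falls out of the data.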

JPSP · 2015 · Validation

Automatic personality assessment through social media language

Park, G., Schwartz, H. A., Eichstaedt, J. C., et al. (2015). Journal of Personality and Social Psychology, 108(6), 934–952.

Using 66,732 Facebook users' written language, the authors validated language-based Big Five against four independent criteria: agreement with self-reports, agreement with friend reports, incremental validity beyond those friend reports, and stability over six months. Language-based personality scores passed all four.

Language-based Big Five scores were stable over 6 months, agreed with both self- and friend-reports, and added information beyond what friends could provide.

PubMed PsycNet

PNAS · 2015 · Cambridge

Computer-based personality judgments from digital footprints

Wu Youyou, Kosinski, M., & Stillwell, D. (2015). PNAS, 112(4), 1036–1040. Cambridge Psychometrics Centre.

A study of 86,000 participants, run through Cambridge's Psychometrics Centre, comparing computer-based personality readings built from Facebook Likes against the participant's own validated Big Five questionnaire. The readings matched or exceeded the level of agreement provided by close human observers of the same participant, establishing that the signal in everyday digital footprints is rich enough to support careful inference.

PNAS (open access)

PNAS Nexus · 2024 · LLM zero-shot

Large language models can infer psychological dispositions of social media users

Peters, H., & Matz, S. C. (2024). PNAS Nexus, 3(6), pgae231.

Tests off-the-shelf large language models on personality inference from real social-media posts, with no task-specific training. The headline: modern LLMs reach the same accuracy ballpark as supervised models that were trained for the task. The methodology Persona Lens is built on.

PNAS Nexus (open access) arXiv
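"No task-specific training" means the whole method fits in a prompt: hand the model the text, ask for trait ratings, and read the answer back. The Python sketch below shows only that shape. The prompt wording, the 1-to-5 scale, the function names, and the sample reply are all assumptions made for illustration. It is not the prompt used in the paper, nor Persona Lens's implementation, and it deliberately stops short of calling any real LLM client.

```python
import json

TRAITS = ["Openness", "Conscientiousness", "Extraversion",
          "Agreeableness", "Neuroticism"]

def build_zero_shot_prompt(posts):
    """Assemble an instruction-only prompt: no labelled examples, no fine-tuning."""
    joined = "\n".join(f"- {post}" for post in posts)
    return (
        "Rate the author of the posts below on each Big Five trait "
        "from 1 (low) to 5 (high). Reply with a JSON object whose keys are "
        f"{', '.join(TRAITS)}.\n\nPosts:\n{joined}"
    )

def parse_reading(reply):
    """Turn the model's JSON reply into trait scores; reject malformed replies."""
    data = json.loads(reply)
    reading = {trait: float(data[trait]) for trait in TRAITS}  # KeyError if a trait is missing
    for trait, score in reading.items():
        if not 1.0 <= score <= 5.0:
            raise ValueError(f"{trait} score out of range: {score}")
    return reading

prompt = build_zero_shot_prompt(["Just booked another solo hiking trip."])

# A made-up reply standing in for whatever an LLM would return.
fake_reply = ('{"Openness": 4, "Conscientiousness": 3, "Extraversion": 2, '
              '"Agreeableness": 4, "Neuroticism": 2}')
reading = parse_reading(fake_reply)
print(reading)
```

Validating the reply before using it matters in practice, since a language model can return text that is not the structure it was asked for.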

NAACL Findings · 2024 · Expression

PersonaLLM: Investigating the ability of large language models to express personality traits

Jiang, H., Zhang, X., Cao, X., Breazeal, C., Roy, D., & Kabbara, J. (2024). Findings of NAACL 2024.

MIT / MIT Media Lab / Stanford. When LLMs are prompted to write under a Big Five profile, the personality they express in prose is detectable to blind human raters roughly four times out of five. The expression is real and recoverable — not a self-report artefact.

ACL Anthology arXiv

Perspectives on Psych. Sci. · 2024 · Framework

AI psychometrics: Assessing the psychological profiles of large language models through psychometric inventories

Pellert, M., Lechner, C. M., Wagner, C., Rammstedt, B., & Strohmaier, M. (2024). Perspectives on Psychological Science, 19(5).

Repurposes the standard psychometric inventories — Big Five, Dark Tetrad, values, morals — as diagnostic tools for LLMs. The conceptual groundwork the rest of the 2024–2025 wave builds on.

SAGE PMC (open access)

arXiv 2406.14703 · 2024 · Cross-model

Do LLMs have distinct and consistent personality? TRAIT: A personality testset designed for LLMs with psychometrics

Lee, S., et al. (2024). arXiv preprint 2406.14703.

An 8,000-item personality test for LLMs, run across seven frontier models. Claude and GPT-4 score essentially the same on Agreeableness (87 and 86 of 100) and both sit at the low end of Dark-Triad traits. The frontier-LLM class behaves more like a peer group than a list of competitors.

arXiv

Nature Machine Intelligence · 2025 · State of the art

A psychometric framework for evaluating and shaping personality traits in large language models

Serapio-García, G., Safdari, M., Crepy, C., Sun, L., Fitz, S., Romero, P., Abdulhai, M., Faust, A., & Matarić, M. (2025). Nature Machine Intelligence.

The most rigorous paper in the field. A psychometrically validated methodology for administering, validating, and shaping personality across 18 frontier LLMs. Reliability, validity, and stability are now established in the same terms used to validate a clinical questionnaire.

Nature MI PMC (open access)

ICLR · 2024 · Benchmark

Who is ChatGPT? Benchmarking LLMs' psychological portrayal using PsychoBench

Huang, J., Wang, W., Lam, M. H., Li, E. J., Jiao, W., & Lyu, M. R. (2024). ICLR 2024.

A unified benchmark for LLM psychological portrayal across thirteen clinical-psychology scales, covering personality, interpersonal relationships, motivation, and emotional ability. The benchmark that defined LLM psychometrics as a discipline.

arXiv OpenReview

What this means for Persona Lens

How the research enters the app.

Persona Lens is not academic research. It is a consumer iOS app. But every design choice we made is downstream of the literature on this page.

1. Behavioural signal, not self-report.

Persona Lens reads the words you actually sent, not the answers you'd give on a quiz. This follows Park et al. (2015): language-based readings are stable, valid, and add information beyond what friends' reports can provide. ^[4]

2. A modern, instruction-tuned frontier LLM.

The methodology Persona Lens uses, having an LLM read a chat and infer personality directly, is well-supported in the literature for the frontier-LLM class as a whole. ^[7, 10, 11] Frontier models (Claude Opus and GPT-4 included) cluster tightly on these benchmarks: a difference of 1 point on a 100-point Agreeableness scale in Lee et al. (2024). We implement on Anthropic's Claude Opus for engineering reasons: long context, data-handling commitments, and default tone. The underlying science applies to the model class, not the vendor.

3. The Big Five, named as themselves.

Persona Lens reports Big Five scores in its Self Lens explicitly, alongside the lens-specific modules. The Big Five (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) is the dominant trait framework in personality psychology and the framework every major paper on this page uses.

4. Insight, not diagnosis.

The research consistently positions language-based personality as an estimate with measurable error: useful, but not clinical. Persona Lens carries that posture forward. The App Store description, the in-app onboarding, the privacy policy and the terms of use all say the same thing, in the same words: AI insights, not diagnosis.

What the research does NOT say

Honest limits.

If we hid these we'd be selling a product the literature doesn't actually support. So we're saying them out loud, on the page anyone can read.

"High precision" is high — not absolute.

The strongest published results show very strong agreement with the gold standard, but never perfect agreement. Persona Lens is a useful mirror, not an X-ray. Treat the reading as one careful perspective on you, not the only true one.

Foundational data is English-Western-skewed.

The studies that established the field used English-speaking, predominantly North American social-media users. Persona Lens launches in four languages (English, French, Spanish, Russian), and the Big Five replicates well across them. But empirical confidence is highest in English and decays slightly elsewhere until more localised research catches up.

A single chat is a slice, not a life.

Park et al. (2015) showed six-month stability of language-based readings when they were built from many social-media updates. One conversation is much narrower. Persona Lens reports the chat's signature, not a fixed verdict on the person. If you re-run the same lens on a different chat in three months, expect the reading to evolve.

The model has its own personality too.

Frontier LLMs — Claude and GPT-4 alike — measure as high-Agreeableness, low-Dark-Triad on the same instruments used here (Lee et al., 2024). That can subtly colour the readings they produce. Persona Lens prompts for quoted lines alongside every claim — so you can check the evidence, not just trust the impression.
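Asking for quoted lines only helps if the quotes are real. One simple safeguard is to check that every quote offered as evidence actually appears in the chat. The Python sketch below is a hypothetical illustration of that check, not Persona Lens's code; the data layout (a list of claim and quote pairs) and the function names are assumptions.

```python
def normalise(text):
    """Lower-case and collapse whitespace so trivial differences don't matter."""
    return " ".join(text.lower().split())

def unsupported_claims(claims, chat_text):
    """Return the claims whose quoted evidence is empty or absent from the chat."""
    haystack = normalise(chat_text)
    return [
        c["claim"]
        for c in claims
        if not normalise(c["quote"]) or normalise(c["quote"]) not in haystack
    ]

# Invented example: one quote really is in the chat, one is not.
chat = "Sure, I'll handle it.\nNo worries,   happy to help!"
claims = [
    {"claim": "Accommodating tone", "quote": "No worries, happy to help!"},
    {"claim": "Avoids conflict", "quote": "I'd rather not argue"},
]

missing = unsupported_claims(claims, chat)
print(missing)
```

A claim that fails the check is not necessarily wrong, but it is no longer backed by the reader's own words and deserves less weight.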

Personality ≠ pathology.

None of the research cited here was conducted on clinical samples or designed to detect mental health conditions. Persona Lens is not a diagnostic tool. The disclaimer "AI insights, not diagnosis" is not legalese; it is the most accurate possible summary of what the underlying science supports.

References

Every research paper cited on this page.

1. Pennebaker, J. W., & King, L. A. (1999). Linguistic styles: Language use as an individual difference. Journal of Personality and Social Psychology, 77(6), 1296–1312. PsycNet · PubMed
2. Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. PNAS, 110(15), 5802–5805. PNAS (open access)
3. Schwartz, H. A., Eichstaedt, J. C., Kern, M. L., Dziurzynski, L., Ramones, S. M., Agrawal, M., Shah, A., Kosinski, M., Stillwell, D., Seligman, M. E. P., & Ungar, L. H. (2013). Personality, gender, and age in the language of social media. PLOS ONE, 8(9), e73791. PLOS ONE (open access)
4. Park, G., Schwartz, H. A., Eichstaedt, J. C., Kern, M. L., Kosinski, M., Stillwell, D. J., Ungar, L. H., & Seligman, M. E. P. (2015). Automatic personality assessment through social media language. Journal of Personality and Social Psychology, 108(6), 934–952. PubMed · PsycNet
5. Wu Youyou, Kosinski, M., & Stillwell, D. (2015). Computer-based personality judgments from digital footprints. PNAS, 112(4), 1036–1040. (Cambridge Psychometrics Centre.) PNAS (open access)
6. Mehta, Y., Majumder, N., Gelbukh, A., & Cambria, E. (2020). Recent trends in deep learning based personality detection. Artificial Intelligence Review, 53, 2313–2339. Springer
7. Peters, H., & Matz, S. C. (2024). Large language models can infer psychological dispositions of social media users. PNAS Nexus, 3(6), pgae231. PNAS Nexus (open access) · arXiv
8. Jiang, H., Zhang, X., Cao, X., Breazeal, C., Roy, D., & Kabbara, J. (2024). PersonaLLM: Investigating the ability of large language models to express personality traits. Findings of NAACL 2024. ACL Anthology · arXiv
9. Pellert, M., Lechner, C. M., Wagner, C., Rammstedt, B., & Strohmaier, M. (2024). AI psychometrics: Assessing the psychological profiles of large language models through psychometric inventories. Perspectives on Psychological Science, 19(5). SAGE · PMC (open access)
10. Lee, S., et al. (2024). Do LLMs have distinct and consistent personality? TRAIT: A personality testset designed for LLMs with psychometrics. arXiv preprint 2406.14703. arXiv
11. Serapio-García, G., Safdari, M., Crepy, C., Sun, L., Fitz, S., Romero, P., Abdulhai, M., Faust, A., & Matarić, M. (2025). A psychometric framework for evaluating and shaping personality traits in large language models. Nature Machine Intelligence. Nature MI · PMC (open access)
12. Huang, J., Wang, W., Lam, M. H., Li, E. J., Jiao, W., & Lyu, M. R. (2024). Who is ChatGPT? Benchmarking LLMs' psychological portrayal using PsychoBench. ICLR 2024. arXiv · OpenReview

Built on the research. Run on your own words.

Try one lens free. See what twenty-five years of careful science says about a single chat of yours.

Download on the App Store