NLP-Driven Insights into Queer Sentiment During Legal Milestones

word cloud by me

Summary:

  • Leveraging advanced natural language processing techniques, this project investigates long-term emotional expression among openly queer users on Twitter (now X.com) from 2012 to 2017. The analysis spans two landmark U.S. Supreme Court rulings on same-sex marriage (2013, 2015), revealing how sentiment patterns evolved in response to major sociopolitical events.

  • Key findings: LGBTQ+ users expressed more positive emotions than non-LGBTQ+ users, suggesting emotional resilience and the supportive nature of digital communities during major political events.

Team & Collaboration:

  • Kyle Ratner, Ph.D., UC Santa Barbara

Challenge

  • Identifying relevant users: I needed a rough but reasonable way to detect Twitter users who were openly LGBTQ+ or discussing their identity (e.g., through emojis such as 🏳️‍🌈, hashtags, or self-descriptions).
  • Establishing user-centered relevance: The study addresses a research gap in understanding how queer emotional expression evolves over time in response to sociopolitical events — a need often overlooked in sentiment and visibility research.
  • Data scale and accessibility: Managing millions of tweets required building a custom web scraper and designing efficient data pipelines.
  • Computational constraints: Running BERT-based sentiment analysis on such a large dataset demanded substantial compute resources and optimization strategies.

Solutions to the challenge

  • User detection pipeline: Wrote custom Python scripts to extract user profile metadata and identify openly LGBTQ+ users based on self-descriptions, emojis, and hashtags.
  • Web scraping infrastructure: Customized the open-source Twint package to build a scalable Twitter scraper, distributed across multiple lab machines for parallel data collection.
  • High-performance sentiment analysis: Leveraged UCSB’s High Performance Computing (HPC) cluster to run BERT-based sentiment models efficiently, overcoming the limitations of personal hardware.
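
One simple way to spread collection across several machines is to assign each username to a machine with a stable hash, so every machine can work out its own share without coordination. The sketch below illustrates that idea only. The function names and the hashing scheme are assumptions for illustration, not the project's actual setup, and no Twint calls are shown.

```python
import hashlib


def shard_for(username: str, n_machines: int) -> int:
    """Map a username to a machine index in [0, n_machines).

    Uses SHA-256 rather than the built-in hash(), which is salted per
    process and would give different answers on different machines.
    """
    digest = hashlib.sha256(username.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_machines


def partition_users(usernames, n_machines):
    """Split usernames into n_machines lists, one per scraping machine."""
    shards = [[] for _ in range(n_machines)]
    for name in usernames:
        shards[shard_for(name, n_machines)].append(name)
    return shards
```

Because the assignment depends only on the username, a machine that crashes can be restarted and will pick up exactly the same set of accounts.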

Goal

At its core, the goal was to understand the affective experiences of LGBTQ+ users through their digital expressions — seeing online sentiment as one potential signal of long-term well-being in the face of systemic change.

Why it matters

  • Understand how government rulings shape public discourse and online emotions.
  • Explore whether structural events like court rulings contribute to community resilience or distress among stigmatized groups.
  • Assess whether sentiment on social media can serve as a proxy for public well-being.

Research Process

Stage 1: Identifying the Problem

  • Reviewed prior literature showing that queer individuals report higher rates of depression, anxiety, and reduced emotional well-being, often linked to structural stigma and social exclusion.
  • Noted that major legal rulings—such as those affirming same-sex marriage—can shift public norms, reduce perceived stigma, and potentially promote better mental health outcomes.
  • Identified a gap in existing research: while emotional spikes around rulings are well-documented, few studies examine long-term emotional trajectories among stigmatized communities.
  • Hypothesized that following landmark rulings, openly queer users would display more positive sentiment in the years after the ruling compared to the years before—reflecting increased visibility, solidarity, and resilience.
  • Defined the central research question:

    How does queer emotional expression shift online in response to major government rulings?

Stage 2: Study Design

  • User focus: Targeted everyday openly LGBTQ+ users with fewer than 1,000 followers to minimize bias from influencers or institutional accounts.
  • User categorization: Identified openly LGBTQ+ users through profile cues such as emojis, hashtags, and bio descriptions.
  • Pilot phase:
    • Tested keyword filters and categorization logic on small data subsets.
    • Refined classification to ensure clear differentiation between user types and ruling periods.
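
The profile-cue categorization can be sketched as a small rule-based matcher over bio text. The cue words below are hypothetical placeholders, since the actual keyword, emoji, and hashtag lists are not given here. A heuristic this simple will produce both false positives and false negatives, which is why the pilot phase matters.

```python
import re

# Hypothetical cue list for illustration; not the project's real filter.
CUE_WORDS = ("lgbt", "lgbtq", "queer", "gay", "lesbian", "bisexual", "transgender")

# Match a cue word, optionally written as a hashtag, as a whole token only,
# so that "gay" does not match inside a longer word.
CUE_PATTERN = re.compile(
    r"(?<!\w)#?(?:" + "|".join(CUE_WORDS) + r")(?!\w)", re.IGNORECASE
)

# U+1F308 (rainbow). It is also the final code point of the rainbow-flag
# sequence, so this single check covers both emojis.
RAINBOW = "\U0001F308"


def find_profile_cues(bio: str) -> list[str]:
    """Return the cues found in a profile bio (empty list if none)."""
    cues = [RAINBOW] if RAINBOW in bio else []
    cues += [m.group(0).lower() for m in CUE_PATTERN.finditer(bio)]
    return cues


def is_openly_lgbtq(bio: str) -> bool:
    """Rough heuristic: any single cue marks the profile as openly LGBTQ+."""
    return bool(find_profile_cues(bio))
```

Returning the matched cues rather than only a boolean makes it easier to audit the filter by hand on a small subset of profiles.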

Data Collection Strategy:

  1. Scraped tweets referencing key rulings using hashtags like #SCOTUS and #LoveWins.
  2. Compiled a list of users who engaged with these rulings.
  3. Scraped all tweets from these users between 2012–2017, forming a longitudinal dataset.
  4. Processed tweets using BERT to generate a sentiment score for each tweet.
  5. Calculated average sentiment scores per user per year (2012–2017).
  6. Categorized users’ sexual orientation based on profile cues (e.g., 🌈 emojis, hashtags, bios) and tweet content (e.g., “coming out”).
  7. Included only non-influencer accounts (<1,000 followers) to preserve everyday emotional expression.
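
Steps 5 and 7 above, together with the requirement that a user tweeted in every year, can be sketched as follows. Plain Python is used to keep the example dependency-free, although the project itself used Pandas. The input layout (tuples of username, year, and sentiment, plus a follower-count dictionary) is an assumption for illustration.

```python
from collections import defaultdict
from statistics import mean

YEARS = range(2012, 2018)   # 2012-2017 inclusive
MAX_FOLLOWERS = 1000        # non-influencer cutoff from the study design


def yearly_user_means(tweets, followers):
    """Average sentiment per user per year, after applying both filters.

    tweets:    iterable of (username, year, sentiment), sentiment in [-1, 1]
    followers: dict mapping username -> follower count
    Returns {username: {year: mean sentiment}}.
    """
    scores = defaultdict(lambda: defaultdict(list))
    for user, year, sentiment in tweets:
        if year in YEARS:
            scores[user][year].append(sentiment)

    result = {}
    for user, by_year in scores.items():
        count = followers.get(user)
        if count is None or count >= MAX_FOLLOWERS:
            continue  # unknown follower count, or influencer/institutional account
        if any(year not in by_year for year in YEARS):
            continue  # user did not tweet in every year of the window
        result[user] = {year: mean(by_year[year]) for year in YEARS}
    return result
```

Averaging within user and year first keeps very prolific accounts from dominating any later group-level comparison.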

Dataset composition overview
Schematic Design

Stage 3: Data Collection

Programming Language: Python

  • Libraries & Tools:
    • Twint + Twitter API for data scraping
    • Pandas, NumPy for data cleaning
    • BERT (developed by Google) for sentiment scoring
  • Dataset:
    • Collected ~12 million tweets from 2,950 users (1,520 of them openly LGBTQ+).
    • Users identified through hashtags like #SCOTUS and #LoveWins.
    • Inclusion criteria: Tweeted at least once per year (2012–2017), with U.S. location specified.
  • Category Identification:
    • Users self-identified as openly LGBTQ+ via emojis (e.g., 🏳️‍🌈), profile text, or hashtags.
    • Others classified as non-openly LGBTQ+ as a comparison group.
  • Final Dataset for Analysis: Focused on 2013 and 2015, the two key ruling years.

Stage 4: Analysis

  • Sentiment Scoring:
    • BERT model fine-tuned for sentiment classification.
    • Tokenized tweets → context-aware embeddings → sentiment scores (−1 = negative; +1 = positive).
  • Statistical Modeling:
    • Linear Mixed Models in R.
    • Random intercepts for “username” to account for repeated observations from the same user (between-user differences in baseline sentiment).
    • Fixed effects: Month, Category (LGBTQ+ vs. Non-LGBTQ+), and their interaction.
  • Analytic Focus:
    • Temporal trends in positivity.
    • Comparison across ruling months (especially June, July).
    • Evaluation of within-group sentiment variability.
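
The model described above can be written out as follows. This is a reconstruction from the listed terms, not the exact specification that was fitted. It assumes Month enters as a categorical predictor with one reference month r, and that each observation i belongs to user j. In lme4 formula notation this would correspond to something like `sentiment ~ month * category + (1 | username)`, where the column names are hypothetical.

```latex
\begin{aligned}
y_{ij} &= \beta_0
        + \beta_1\,\mathrm{LGBTQ}_j
        + \sum_{m \neq r} \gamma_m\,\mathrm{Month}_{m,ij}
        + \sum_{m \neq r} \delta_m\,\bigl(\mathrm{Month}_{m,ij} \times \mathrm{LGBTQ}_j\bigr)
        + u_j + \varepsilon_{ij}, \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Here y is the sentiment score, β₁ is the group difference in the reference month, each γ is a month effect for the comparison group, each δ is the additional month-specific shift for openly LGBTQ+ users, and u is the user-level random intercept.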

Stage 5: Results

2013 Ruling

  • Tweets from openly LGBTQ+ users were more positive overall (β = 0.1651, p < .001).
  • Significant monthly variation, with peaks around June–August (the ruling period).
  • However, from July to October, LGBTQ+ sentiment declined, showing less positivity than non-LGBTQ+ users.
  • Interpretation: The initial uplift from the ruling might have been tempered by backlash, social debate, or continuing discrimination.

2015 Ruling

  • Openly LGBTQ+ users again showed higher baseline positivity (β = 0.156, p < .001).
  • Sentiment rose in June (ruling month) but dropped sharply from July through December (βs between −0.04 and −0.02, p < .001).
  • Possible Explanation: Post-ruling backlash or lingering interpersonal discrimination may have dampened long-term emotional benefits.

Additionally, seasonal trends were observed, along with a sentiment dip in 2016 (possibly linked to Trump’s election).

Interactive Visualization of Sentiment Trends (2012–2017)

Explore monthly sentiment patterns among LGBTQ+ and non-LGBTQ+ users before and after key Supreme Court rulings. Hover over the plot to view average sentiment scores and sentiment shifts.

Stage 6: Interpretation

  • Unexpected Finding: Tweets from LGBTQ+ users were more positive overall, contradicting assumptions about persistent negative affect due to marginalization.
  • Possible Explanation: Reflects community resilience and positive self-expression patterns online.
  • Psychological Context: Aligns with research showing minority group resilience (e.g., Crocker et al., 1994).
  • Temporal Dynamics: Institutional recognition (rulings) may shift perceived norms rather than emotional tone—positivity spikes may be short-lived.
  • Social Implications: Legal wins may not immediately alleviate interpersonal discrimination or structural stigma.

Reflection

  • The study highlights how digital platforms serve as emotional ecosystems—spaces for both coping and celebration.
  • Offers evidence that social media can amplify positive affect within stigmatized communities following structural change.
  • Future steps include:
    • Comparing emotional trends across specific LGBTQ+ subgroups (e.g., gay vs. lesbian users).
    • Exploring non-verbal communication (emoji use, visual media).
    • Linking online sentiment to offline well-being indicators (e.g., community health metrics, survey data).

Programming & Tools Summary

Stage | Programming Language | Key Libraries / Methods
Data Collection | Python | Twint, Twitter API
Data Cleaning | Python | Pandas, NumPy, regex
Sentiment Analysis | Python | BERT
Statistical Analysis & Visualization | R & Python | statsmodels, lme4, scipy
Lee Qianqian Cui
Ph.D. Researcher in Social Psychology