Data

Datasets, scrapers, pipelines: the boring infrastructure that matters

Published

ICWSM 2024

sentibank

How Software 1.0 understood feelings

Sixty years of human emotion research, rescued from digital archaeology. Fifteen dictionaries. One toolkit. How Software 1.0 understood feelings before neural networks took over.

Responsible Data Pipeline

Making compliance automatic

Published

AIES 2025

PETLP Framework

Social media AI research is legally haunted. GDPR, copyright, platform terms — every pipeline step faces different rules. We mapped the maze and built the guide.

In Development

RedditHarbor

Reddit research has too many rules to track manually. We are building a scraper that follows them all automatically. Privacy filters, audit logs, and ethical defaults built in.