OpenSermon
With Tucker · open-sermon.com
Churches are a major part of American public life. Roughly 45 million Americans attend a place of worship each week, and about 73% attend at least once a year. Pastors rank among the most trusted figures in public life, ahead of journalists and many other professionals, especially among churchgoers. That makes sermons a consequential form of public communication in the United States.
However, no large-scale resource exists for systematically studying sermon content. With over 350,000 congregations nationwide, identifying, collecting, and organizing sermon media has historically been prohibitively difficult. That has seriously limited previous research and made it hard for congregation members to compare their church with others in any systematic way. Recent advances in AI now make large-scale data collection and analysis possible.
To respond to this opportunity, my friend Tucker and I are building OpenSermon with two aims. First, we are building a systematic database of American church sermons: identifying congregations, locating published sermon media, transcribing the content, and organizing it into a unified corpus that is open to researchers and the public. Second, we are developing an online platform that makes these materials easy for anyone to explore. Check out our beta platform here: open-sermon.com.




We're excited about the potential to expand these visualizations as the archive grows. We also recognize that the current dashboard is designed more for “power users,” and we're actively thinking about ways to make the display simpler and more accessible. Ideally, those two goals will grow together.
Beyond the research contribution, developing OpenSermon has been a major learning experience in web development and applied AI. We use traditional web crawlers, agentic scraping, and Scrapling to collect sermon media from congregations across the U.S. We then transcribe the media with a distilled version of Whisper and analyze the sermons using dedicated prompts developed for DeepSeek. We built the web platform with Claude Code, using TSX and Tailwind, and we host everything on AWS in the hope that usage grows quickly.
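For readers curious how those stages fit together, here is a minimal Python sketch of the collect, transcribe, and analyze flow described above. It is an illustration under stated assumptions, not our production code: the `SermonRecord` schema, the `run_pipeline` function, and the example URLs are all hypothetical, and the transcription and analysis steps are passed in as plain functions rather than calling Whisper or DeepSeek directly.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional, Tuple


@dataclass
class SermonRecord:
    """One sermon as it moves through the pipeline (hypothetical schema)."""
    congregation: str
    media_url: str
    transcript: Optional[str] = None
    analysis: Dict = field(default_factory=dict)


def run_pipeline(
    media: Iterable[Tuple[str, str]],
    transcribe: Callable[[str], str],
    analyze: Callable[[str], Dict],
) -> List[SermonRecord]:
    """Turn (congregation, media_url) pairs into transcribed, analyzed records."""
    corpus: List[SermonRecord] = []
    seen_urls = set()
    for congregation, url in media:
        # Assumption: different collection methods can surface the same
        # media file, so each URL is processed only once.
        if url in seen_urls:
            continue
        seen_urls.add(url)
        record = SermonRecord(congregation, url)
        record.transcript = transcribe(url)            # stand-in for speech-to-text
        record.analysis = analyze(record.transcript)   # stand-in for the LLM prompts
        corpus.append(record)
    return corpus


if __name__ == "__main__":
    demo = run_pipeline(
        [("First Church", "https://example.org/a.mp3")],
        transcribe=lambda url: "text of " + url,
        analyze=lambda text: {"words": len(text.split())},
    )
    print(demo[0])
```

Passing the transcription and analysis steps in as functions keeps the sketch independent of any particular model, so either stage could be swapped without touching the loop.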
If you're interested in the project or have ideas for improvements, please reach out! We are actively seeking grant funding to expand our impact.