Methodology

How we collect and process church data, and how we match each church to its website.

Data Source: IRS Exempt Organizations

Our primary data source is the IRS Exempt Organizations Business Master File (EO BMF). This publicly available dataset lists organizations that the IRS recognizes as tax-exempt, including those exempt under Section 501(c)(3) of the Internal Revenue Code.

We filter this dataset to identify religious organizations using NTEE codes and classification codes that indicate churches, synagogues, mosques, and other religious congregations.
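As a rough illustration of this filtering step, the sketch below keeps rows whose NTEE code or foundation code suggests a religious organization. The column names (`NTEE_CD`, `FOUNDATION`), the NTEE prefix `X`, and the foundation code `10` are assumptions about the EO BMF layout that should be checked against the current IRS documentation. The sample rows are made up.

```python
import csv
import io

# Assumed column names from the EO BMF CSV extract; verify against the IRS layout.
NTEE_COLUMN = "NTEE_CD"
FOUNDATION_COLUMN = "FOUNDATION"

# Assumptions: NTEE major group "X" covers religion-related organizations,
# and foundation code "10" marks churches.
RELIGION_NTEE_PREFIX = "X"
CHURCH_FOUNDATION_CODE = "10"


def is_religious_org(row: dict) -> bool:
    """Return True if either code on a BMF-style row suggests a religious organization."""
    ntee = (row.get(NTEE_COLUMN) or "").strip().upper()
    foundation = (row.get(FOUNDATION_COLUMN) or "").strip()
    return ntee.startswith(RELIGION_NTEE_PREFIX) or foundation == CHURCH_FOUNDATION_CODE


def filter_religious_orgs(csv_text: str) -> list[dict]:
    """Parse BMF-style CSV text and keep only the rows that pass the filter."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if is_religious_org(row)]


# Made-up sample rows, for illustration only.
SAMPLE = """EIN,NAME,CITY,STATE,FOUNDATION,NTEE_CD
000000001,FIRST EXAMPLE CHURCH,SPRINGFIELD,IL,10,X20
000000002,EXAMPLE ANIMAL SHELTER,SPRINGFIELD,IL,15,D20
000000003,EXAMPLE FELLOWSHIP,AUSTIN,TX,10,
"""

religious = filter_religious_orgs(SAMPLE)
```

Checking two independent codes helps because many BMF rows have no NTEE code at all, as in the third sample row.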

Website Discovery

For each church in our database, we use Google's Custom Search API to find potential website matches. We construct search queries using the organization's name, city, and state to find relevant results.
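A minimal sketch of the query construction, assuming the request goes to the Custom Search JSON API endpoint with its `key`, `cx`, `q`, and `num` parameters. The helper names and the exact query format (title-cased name and city, upper-cased state) are hypothetical choices for illustration. The code only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Endpoint of Google's Custom Search JSON API.
SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_query(name: str, city: str, state: str) -> str:
    """Combine name, city, and state into one search string (hypothetical format)."""
    parts = [name.strip().title(), city.strip().title(), state.strip().upper()]
    return " ".join(part for part in parts if part)


def build_search_url(name: str, city: str, state: str,
                     api_key: str, engine_id: str) -> str:
    """Return the full request URL for one organization."""
    params = {
        "key": api_key,    # API key for the Google Cloud project
        "cx": engine_id,   # ID of the Programmable Search Engine
        "q": build_query(name, city, state),
        "num": 5,          # number of results to request (assumed value)
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Placeholder credentials; real values come from the Google Cloud console.
url = build_search_url("FIRST EXAMPLE CHURCH", "SPRINGFIELD", "il",
                       api_key="YOUR_API_KEY", engine_id="YOUR_ENGINE_ID")
```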

Each search result is then evaluated to determine whether it is the church's official website. We look at domain patterns, content relevance, and other signals before accepting a match.

Matching Algorithm

Our matching algorithm considers several factors:

  • Domain name similarity to church name
  • Geographic relevance (mentions of city/state)
  • Content indicators of religious organizations
  • Search result ranking and relevance

We prioritize precision over recall, meaning we'd rather miss a website than incorrectly match one. This keeps false matches rare, at the cost of leaving some churches without a listed website.

Data Updates

The IRS updates the Exempt Organizations data monthly. We periodically refresh our database to incorporate new organizations and remove those that have lost their exempt status.
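The refresh can be thought of as a comparison between two monthly snapshots keyed by EIN. The sketch below is one possible approach, not a description of the actual pipeline. The function name and the `NAME` field are assumptions, the records are made up, and absence from the newer snapshot is treated only as a proxy for lost exempt status.

```python
def diff_snapshots(previous: dict[str, dict], current: dict[str, dict]):
    """Compare two snapshots keyed by EIN.

    Returns three sorted lists of EINs: added, removed, and renamed.
    """
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    renamed = sorted(
        ein for ein in current.keys() & previous.keys()
        if current[ein]["NAME"] != previous[ein]["NAME"]
    )
    return added, removed, renamed


# Made-up records, for illustration only.
last_month = {
    "000000001": {"NAME": "FIRST EXAMPLE CHURCH"},
    "000000002": {"NAME": "OLD NAME CHAPEL"},
    "000000003": {"NAME": "EXAMPLE FELLOWSHIP"},
}
this_month = {
    "000000002": {"NAME": "NEW NAME CHAPEL"},
    "000000003": {"NAME": "EXAMPLE FELLOWSHIP"},
    "000000004": {"NAME": "SECOND EXAMPLE CHURCH"},
}

added, removed, renamed = diff_snapshots(last_month, this_month)
```

Renamed organizations are worth tracking separately, since a name change can invalidate a website match that was made against the old name.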

Website matching is an ongoing process. We keep searching for websites for churches that don't yet have a match, and we re-verify that existing matches remain valid.

Limitations

This dataset has some known limitations:

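  • Only includes churches listed in the IRS data as 501(c)(3) organizations; churches are not required to apply for recognition of exempt status, so some never appear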
  • Small churches may not have websites to match
  • Some churches share websites or use social media instead
  • Organization names from IRS filings may be outdated