Each row in this dataset represents a single non-profit organization (NPO), identified by its Employer Identification Number (EIN).
Each row also contains the National Taxonomy of Exempt Entities (NTEE) code the IRS assigned to that NPO (if any) and the official Essential/Non-Essential status associated with that NTEE code.
Each row of this dataset represents a single Ohio-based non-profit organization (NPO), identified by its Employer Identification Number (EIN), together with a hand-coded determination of its 'essential' status.
This determination is guided by the official IRS definition and based strictly on the mission statement and activities language that the NPO supplied in its 2019 tax form.
This CSV file contains the topic distribution for each EIN, as estimated by six parallel Latent Dirichlet Allocation (LDA) topic models.
Each row represents one topic and its topic score for an Ohio NPO (identified by Employer Identification Number), as generated by one model run.
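Because the topic scores from a single model sum to at most 1.0 (100%) for an organization, the sum of topic scores across all rows associated with an EIN will not exceed 6.0 (6 models x 100%).
Topic scores below 0.01 (1%) are not included, so the observed total for an EIN can fall slightly short of that maximum.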
Each topic from the models is further labeled as Essential/Non-Essential by a subject matter expert, Dr. Michael Jones, guided by the official IRS definition.
The topic models are trained on the unstructured text of the mission statements and activities language taken from the 2019 tax forms of Ohio non-profit organizations.
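The score ceiling described above can be checked when loading the file. The following is a minimal sketch only: the column names (ein, model, topic, topic_score) and the sample rows are invented for illustration and are assumptions, not the actual header or contents of the CSV file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical header and made-up rows; the real file's column names may differ.
SAMPLE_CSV = """ein,model,topic,topic_score
123456789,1,4,0.62
123456789,1,7,0.37
123456789,2,2,0.99
987654321,1,4,0.015
987654321,3,9,0.95
"""

N_MODELS = 6       # six parallel LDA models, per the description
MIN_SCORE = 0.01   # scores below 1% are not included in the file


def total_scores_by_ein(csv_text):
    """Sum topic scores over all rows (all model runs) for each EIN."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        score = float(row["topic_score"])
        if score < MIN_SCORE:
            # Such rows should already have been dropped from the file.
            raise ValueError(f"score below threshold for EIN {row['ein']}")
        totals[row["ein"]] += score
    return dict(totals)


def eins_over_ceiling(totals, tolerance=1e-6):
    """Return EINs whose summed score exceeds the 6.0 ceiling."""
    return [ein for ein, total in totals.items()
            if total > N_MODELS + tolerance]


totals = total_scores_by_ein(SAMPLE_CSV)
violations = eins_over_ceiling(totals)
```

With the real file, the sample text would be replaced by the file's contents and the column names adjusted to match its header. An empty violations list indicates that every EIN respects the 6.0 maximum.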