Each row of this dataset describes a single Ohio-based non-profit organization (NPO), identified by its Employer Identification Number (EIN), together with a hand-coded determination of its 'essential' status.
This determination of essential status is guided by the official IRS definition and based strictly on the NPO's own mission statement and activities language as supplied in its 2019 tax form.
This CSV file contains the topic distribution of each EIN as uncovered using six parallel Latent Dirichlet Allocation (LDA) Topic Models.
Each row gives one topic and its topic score for an Ohio NPO (identified by Employer Identification Number), as generated by one model run.
Because six models were run, the topic scores across all rows associated with a single EIN sum to at most 6.0 (6 models x 100%).
Topic scores below 0.01 (1%) are not included.
Each topic from the models is further identified as Essential or Non-Essential by subject matter expert Dr. Michael Jones, guided by the official IRS definition.
The topic models are generated from the unstructured text of the mission statement and activities language taken from the 2019 tax forms of Ohio non-profit organizations.
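As an illustration of the row structure described above, the following minimal sketch sums topic scores per EIN and checks them against the 6.0 ceiling. The column names (`ein`, `model`, `topic`, `score`, `essential`) and the sample rows are hypothetical and are not taken from the actual CSV; adjust them to the real header before use.

```python
import csv
import io
from collections import defaultdict

# Hypothetical header and rows for illustration only; the real file may differ.
SAMPLE_CSV = """ein,model,topic,score,essential
111111111,1,food_assistance,0.62,Essential
111111111,1,youth_sports,0.37,Non-Essential
111111111,2,health_services,0.98,Essential
222222222,1,arts_education,0.995,Non-Essential
"""

N_MODELS = 6      # six parallel LDA models, per the description
MIN_SCORE = 0.01  # scores below 1% are omitted from the file


def total_score_by_ein(csv_text):
    """Sum the topic scores of all rows that share an EIN."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        score = float(row["score"])
        # Rows below the 1% cutoff should not appear in the file at all.
        if score < MIN_SCORE:
            raise ValueError(f"unexpected score below cutoff: {row}")
        totals[row["ein"]] += score
    return dict(totals)


totals = total_score_by_ein(SAMPLE_CSV)
for ein, total in totals.items():
    # Each model contributes at most 1.0 (100%) per EIN.
    assert total <= N_MODELS
    print(ein, round(total, 3))
```

Because scores under 1% are dropped, the per-EIN total will typically fall slightly below the number of models that produced rows for that organization.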
All models and corresponding network visualizations are generated from documents in the CORD-19 dataset as of July 14, 2020. All annotations in red were added by the research team.
Note: These topic models are included here as an additional reference and to provide links to interactive versions on the Digital Scholarship Center’s machine learning platform for further exploration.
All models and corresponding network visualizations are generated from virus-related documents in the CORD-19 dataset as of July 14, 2020. All annotations in red were added by the research team.
Note: Certain Non-Coronaviridae topic models appear in the text of this article; they are included here only as an additional reference and to provide links to interactive versions on the Digital Scholarship Center’s machine learning platform for further exploration.
All models and corresponding network visualizations are generated from virus-related documents in the CORD-19 dataset as of July 2020. All annotations in red were added by the research team.
Note: Coronavirus topic models appear in the text of this article; they are included here only as an additional reference and to provide links to interactive versions on the Digital Scholarship Center’s machine learning platform for further exploration.
These centrality measurements were generated with NetworkX, a Python package for network analysis. The specific algorithms used for this paper are Betweenness Centrality and Degree Centrality (where Degree Centrality considers individual topics).
Complete Centrality Data for this research can be found at https://scholar.uc.edu/show/6t053h21x
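For readers unfamiliar with these measures, the sketch below computes both on a toy graph using NetworkX's `betweenness_centrality` and `degree_centrality` functions. The three-node graph and its topic labels are invented for illustration and are not drawn from the research data; the settings used for the paper (weights, normalization) are not stated here, so the library defaults are assumed.

```python
import networkx as nx

# Hypothetical topic network: topic_b links the other two topics.
G = nx.Graph()
G.add_edges_from([
    ("topic_a", "topic_b"),
    ("topic_b", "topic_c"),
])

# Betweenness: share of shortest paths between other nodes that pass
# through a node (normalized by default).
betweenness = nx.betweenness_centrality(G)

# Degree: fraction of the other nodes a node is directly connected to.
degree = nx.degree_centrality(G)

for node in G.nodes:
    print(node, betweenness[node], degree[node])
```

In this toy case `topic_b` scores highest on both measures, since it is the only bridge between the other two topics and is connected to every other node.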
This list contains the titles and publication years of 599 articles that contain the term 'bone', drawn from two archaeology journals, Ancient Mesoamerica and Latin American Antiquity. The articles named in this list were used as the dataset to generate LDA topic models for related research.