Index Catalog // Scholar@UC

51. Vocabulary Count Feature Vectors for Medical School Classifier

Type:: Dataset
Abstract:: Classifier algorithms use the features of each item in a dataset (collectively, its feature vector) to determine the class to which that item belongs. In this classifier approach, each item represents one document containing a single applicant's application essay combined with unstructured language describing that applicant's relevant activities. For privacy, the full text of each document is not provided. Instead, each document is represented only by its features. The feature vector for this classifier is based on the term frequency of each identified term. For example, Doc_A contains 0 occurrences of terms identified as family medicine vocabulary and 10 occurrences of terms from the non-family-medicine vocabulary.
Creator:: Boylan, Andrew and McCabe, Erin E.
Submitter:: Erin E. McCabe
Date Uploaded:: 05/14/2021
Date Modified:: 05/14/2021
License:: Open Data Commons Public Domain Dedication and License (PDDL)
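
As a rough illustration of the term-frequency feature vector described in record 51, the sketch below counts how often terms from two vocabularies occur in one document. The vocabularies, the tokenizer, and the sample sentence are hypothetical; the dataset's actual term lists and preprocessing are not given in this listing.

```python
import re
from collections import Counter

# Hypothetical vocabularies (assumption): the real term lists are not shown here.
FAMILY_MEDICINE_TERMS = {"rural", "community", "primary"}
NON_FAMILY_MEDICINE_TERMS = {"surgery", "research", "laboratory"}


def feature_vector(text):
    """Return [family-medicine count, non-family-medicine count] for one document."""
    # Lowercase and split into alphabetic tokens, then count each token.
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    fm_count = sum(tokens[term] for term in FAMILY_MEDICINE_TERMS)
    non_fm_count = sum(tokens[term] for term in NON_FAMILY_MEDICINE_TERMS)
    return [fm_count, non_fm_count]


# Made-up document text, for illustration only.
doc_a = "My research in the laboratory led to more research on surgery."
print(feature_vector(doc_a))
```

A classifier would then be trained on these two-element vectors rather than on the essay text itself, which is what allows the documents to stay private.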

52. Vocabulary Comparison of Medical School Applications

Type:: Dataset
Abstract:: W2V (word2vec) takes terms from a large corpus of text and maps them into a vector space based on word associations in the dataset. These word associations take into account each word's immediate context (its ten neighboring words). After modeling the large-scale unstructured text, the platform generates a visualization of this vector space, which supports analysis such as detecting synonymous or near-synonymous words and highlighting related words. At the heart of this project is W2V's ability to identify key words that were more frequent in, and more unique to, each group, using results from two different W2V models, one for each group's application texts. We coded these key terms into categories, then analyzed those categories for overarching themes.
Creator:: McCabe, Erin E.
Submitter:: Erin E. McCabe
Date Uploaded:: 05/14/2021
Date Modified:: 05/14/2021
License:: Open Data Commons Public Domain Dedication and License (PDDL)
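
The "immediate context" mentioned in record 52 can be pictured as a sliding window over the text. The sketch below only builds the (word, neighbor) pairs that a word-embedding model would learn from; it does not train a model. The function name and the reading of "ten neighboring words" as five on each side are assumptions, since the listing does not say how the window was configured.

```python
def context_pairs(tokens, window=5):
    """Pair each token with its neighbors up to `window` positions away.

    window=5 gives at most ten neighbors per word (five on each side);
    this interpretation is an assumption, not taken from the dataset.
    """
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own neighbor
                pairs.append((target, tokens[j]))
    return pairs


# Made-up sentence, for illustration only.
sentence = "i hope to serve patients in a rural clinic".split()
for pair in context_pairs(sentence, window=1)[:4]:
    print(pair)
```

Words that repeatedly share neighbors end up close together in the resulting vector space, which is what makes near-synonyms detectable.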

53. IRS Classification of Ohio Non-Profit Organizations

Type:: Dataset
Abstract:: Each row in this dataset represents a single non-profit organization (NPO), identified by its Employer Identification Number (EIN). Each row contains the National Taxonomy of Exempt Entities (NTEE) code assigned to the NPO by the IRS (if any) and the official Essential/Non-Essential status connected to that NTEE code.
Creator:: Jones, Michael and McCabe, Erin E.
Submitter:: Erin E. McCabe
Date Uploaded:: 05/06/2021
Date Modified:: 05/07/2021
Date Created:: 2020-11-01
License:: Open Data Commons Public Domain Dedication and License (PDDL)

54. Hand Coded Essential Classification of Ohio Non-Profit Organizations

Type:: Dataset
Abstract:: Each row of this dataset represents a single Ohio-based non-profit organization (NPO), identified by its Employer Identification Number (EIN), and a hand-coded determination of its 'essential' status. This determination is guided by the official IRS definition and based strictly on the NPO's own mission statement and activities language supplied in its 2019 tax form.
Creator:: Jones, Michael and McCabe, Erin E.
Submitter:: Erin E. McCabe
Date Uploaded:: 05/06/2021
Date Modified:: 05/06/2021
License:: Open Data Commons Public Domain Dedication and License (PDDL)

55. Topic Model Results of Ohio Non-Profit Organizations' Mission Language

Type:: Dataset
Abstract:: This CSV file contains the topic distribution of each EIN as uncovered using six parallel Latent Dirichlet Allocation (LDA) topic models. Each row gives a topic and topic score associated with an Ohio NPO (identified by Employer Identification Number), generated from one model run. The topic scores across all rows associated with a single EIN therefore sum to at most 6.0 (6 models x 100%). Topic scores below .01 (1%) are not included. Each topic from the models is further identified as Essential/Non-Essential by a subject matter expert, Dr. Michael Jones, guided by the official IRS definition. The topic models are generated from unstructured text: the mission statement and activities language taken from the 2019 tax forms of Ohio non-profit organizations.
Creator:: McCabe, Erin E.
Submitter:: Erin E. McCabe
Date Uploaded:: 05/06/2021
Date Modified:: 09/20/2021
Date Created:: 2020-10-03
License:: Open Data Commons Public Domain Dedication and License (PDDL)
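
The 6.0 bound stated in record 55 can be checked by summing topic scores per EIN. The sketch below assumes hypothetical column names (`ein`, `model`, `topic`, `score`) and uses made-up rows; the real CSV header and values are not shown in this listing.

```python
import csv
import io
from collections import defaultdict

# Made-up rows with assumed column names, for illustration only.
SAMPLE_CSV = """ein,model,topic,score
000000001,1,3,0.62
000000001,1,7,0.38
000000001,2,1,0.95
000000002,1,2,0.99
000000002,2,5,0.97
"""

MAX_TOTAL = 6.0  # 6 models x 100%


def total_scores(csv_text):
    """Sum the topic scores of every row belonging to the same EIN."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["ein"]] += float(row["score"])
    return dict(totals)


def eins_over_limit(totals, limit=MAX_TOTAL):
    """Return the EINs whose summed score exceeds the limit (expected: none)."""
    return [ein for ein, total in totals.items() if total > limit]


totals = total_scores(SAMPLE_CSV)
print(totals)
print(eins_over_limit(totals))
```

Because scores below .01 are dropped from the file, a real EIN's total will usually fall somewhat short of the number of models that produced rows for it.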

56. APPENDIX D: Topic Network Maps for a Random 10k Documents & the Complete CORD-19 Dataset

Type:: Document
Abstract:: All models and corresponding network visualizations are generated from documents in the CORD-19 dataset as of July 14, 2020. All annotations in red were added by the research team. Note: These topic models are included here as additional reference and to provide links to interactive versions on the Digital Scholarship Center’s machine learning platform for further exploration.
Creator:: McCabe, Erin E.
Submitter:: Erin E. McCabe
Date Uploaded:: 02/26/2021
Date Modified:: 02/26/2021
Date Created:: 2020
License:: CC0 1.0 Universal

57. APPENDIX C: Non-Coronaviridae Topic Network Maps [HIV, ZIKA, H1N1, EBOLA]

Type:: Document
Abstract:: All models and corresponding network visualizations are generated from virus-related documents in the CORD-19 dataset as of July 14, 2020. All annotations in red were added by the research team. Note: Certain Non-Coronaviridae topic models appear in the text of this article and are included here only as additional reference and to provide links to interactive versions on the Digital Scholarship Center’s machine learning platform for further exploration.
Creator:: McCabe, Erin E.
Submitter:: Erin E. McCabe
Date Uploaded:: 02/26/2021
Date Modified:: 02/26/2021
Date Created:: 2020
License:: CC0 1.0 Universal

58. APPENDIX B: Coronaviridae Topic Network Maps [SARS, MERS, COVID-19]

Type:: Document
Abstract:: All models and corresponding network visualizations are generated from virus-related documents in the CORD-19 dataset as of July 2020. All annotations in red were added by the research team. Note: Coronavirus topic models appear in the text of this article and are included here only as additional reference and to provide links to interactive versions on the Digital Scholarship Center’s machine learning platform for further exploration.
Creator:: McCabe, Erin E.
Submitter:: Erin E. McCabe
Date Uploaded:: 02/26/2021
Date Modified:: 02/26/2021
License:: CC0 1.0 Universal

59. APPENDIX A: Group Betweenness Centrality for Topic - Coronaviridae [SARS, MERS, COVID-19]

Type:: Document
Abstract:: These centrality measurements were generated with NetworkX, a Python package for network analysis. The specific algorithms used for this paper are Betweenness Centrality (where Degree Centrality considers individual topics). Complete centrality data for this research can be found at https://scholar.uc.edu/show/6t053h21x
Creator:: McCabe, Erin E.
Submitter:: Erin E. McCabe
Date Uploaded:: 02/26/2021
Date Modified:: 02/26/2021
License:: CC0 1.0 Universal
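
To give a sense of what a betweenness centrality score measures in record 59, the sketch below computes node-level betweenness on a tiny topic graph in plain Python, using Brandes' algorithm for unweighted graphs. It is a stand-in for illustration, not the NetworkX code used for the appendix, and it does not compute the group variant named in the title. The topic names and links are made up.

```python
from collections import deque

# Hypothetical topic network (assumption): an undirected adjacency list.
TOPIC_LINKS = {
    "symptoms": ["transmission"],
    "transmission": ["symptoms", "vaccines"],
    "vaccines": ["transmission"],
}


def betweenness(adj):
    """Unnormalized betweenness centrality for an undirected, unweighted graph."""
    scores = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search, counting shortest paths from the source.
        order = []
        preds = {v: [] for v in adj}
        paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: s / 2 for v, s in scores.items()}


print(betweenness(TOPIC_LINKS))
```

In this toy graph only "transmission" lies on a shortest path between two other topics, so it is the only node with a non-zero score.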

60. README - Examining the Substance of Bone through a Meta-Analysis of Academic Texts

Type:: Document
Abstract:: README for the dataset available through the collection Examining the Substance of Bone through a Meta-Analysis of Academic Texts.
Creator:: McCabe, Erin E.
Submitter:: Erin E. McCabe
Date Uploaded:: 11/10/2020
Date Modified:: 11/10/2020
License:: Attribution 4.0 International
