Search Constraints
Filtering by:
Subject
Natural Language Processing
41 - 49 of 49
Search Results
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following properties:
  DocumentCount = 5,000
  Corpus = (one of) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc]
  SearchTerm[s] = (one of) Earth / Environmental / Climate / Pollution / Random 5k documents of a specific corpus
  Coherence was scored across every combination of:
    TopicCount: 10-40
    Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
    Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
  The columns in this file include:
    Validation_Set: Which search term this scoring pertains to
    Topics: Number of topics in the model
    Alpha: Hyperparameter alpha selection from the 6 options above
    Beta: Hyperparameter beta selection from the 6 options above
    Coherence: The topic coherence score for the given model-row
    Perplexity: The perplexity score for the given model-row
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following properties:
  DocumentCount = 5,000
  Corpus = (one of) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc]
  SearchTerm[s] = (one of) Earth / Environmental / Climate / Pollution / Random 5k documents of a specific corpus
  Coherence was scored across every combination of:
    TopicCount: 10-40
    Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
    Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
  The columns in this file include:
    Validation_Set: Which search term this scoring pertains to
    Topics: Number of topics in the model
    Alpha: Hyperparameter alpha selection from the 6 options above
    Beta: Hyperparameter beta selection from the 6 options above
    Coherence: The topic coherence score for the given model-row
    Perplexity: The perplexity score for the given model-row
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following properties:
  DocumentCount = 5,000
  Corpus = (one of) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc]
  SearchTerm[s] = (one of) Earth / Environmental / Climate / Pollution / Random 5k documents of a specific corpus
  Coherence was scored across every combination of:
    TopicCount: 10-40
    Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
    Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
  The columns in this file include:
    Validation_Set: Which search term this scoring pertains to
    Topics: Number of topics in the model
    Alpha: Hyperparameter alpha selection from the 6 options above
    Beta: Hyperparameter beta selection from the 6 options above
    Coherence: The topic coherence score for the given model-row
    Perplexity: The perplexity score for the given model-row
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following properties:
  DocumentCount = 5,000
  Corpus = (one of) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc]
  SearchTerm[s] = (one of) Earth / Environmental / Climate / Pollution / Random 5k documents of a specific corpus
  Coherence was scored across every combination of:
    TopicCount: 10-40
    Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
    Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
  The columns in this file include:
    Validation_Set: Which search term this scoring pertains to
    Topics: Number of topics in the model
    Alpha: Hyperparameter alpha selection from the 6 options above
    Beta: Hyperparameter beta selection from the 6 options above
    Coherence: The topic coherence score for the given model-row
    Perplexity: The perplexity score for the given model-row
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following properties:
  DocumentCount = 5,000
  Corpus = (one of) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc] / Chicago Novel Corpus [nvl] / Newspaper Corpus [nws]
  SearchTerm[s] = (one of) Earth / Environmental / Climate / Pollution / Random 5k documents of a specific corpus
  Coherence was scored across every combination of:
    TopicCount: 10-40
    Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
    Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
  The columns in this file include:
    Validation_Set: Which search term this scoring pertains to
    Topics: Number of topics in the model
    Alpha: Hyperparameter alpha selection from the 6 options above
    Beta: Hyperparameter beta selection from the 6 options above
    Coherence: The topic coherence score for the given model-row
    Perplexity: The perplexity score for the given model-row
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 11/11/2022
- Date Created:
- 2022
- License:
- Open Data Commons Attribution License (ODC-By)
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following properties:
  DocumentCount = 5,000
  Corpus = (one of) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc]
  SearchTerm[s] = (one of) Earth / Environmental / Climate / Pollution / Random 5k documents of a specific corpus
  Coherence was scored across every combination of:
    TopicCount: 10-40
    Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
    Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
  The columns in this file include:
    Validation_Set: Which search term this scoring pertains to
    Topics: Number of topics in the model
    Alpha: Hyperparameter alpha selection from the 6 options above
    Beta: Hyperparameter beta selection from the 6 options above
    Coherence: The topic coherence score for the given model-row
    Perplexity: The perplexity score for the given model-row
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
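The hyperparameter grid and column layout described in the records above can be sketched in Python. This is only an illustration under stated assumptions: it assumes "TopicCount: 10-40" means every integer from 10 to 40 inclusive, and that a higher Coherence value indicates a better model. The function names and the sample rows are hypothetical; the sample values are made up and are not taken from the dataset.

```python
import csv
import io
from itertools import product

# Grid values as listed in the dataset descriptions.
ALPHAS = ["0.01", "0.31", "0.61", "0.91", "symmetric", "asymmetric"]
BETAS = ["0.01", "0.31", "0.61", "0.91", "automatic", "symmetric"]
# Assumption: "10-40" means every integer from 10 to 40 inclusive.
TOPIC_COUNTS = range(10, 41)


def parameter_grid():
    """Return every (topics, alpha, beta) combination in the grid."""
    return list(product(TOPIC_COUNTS, ALPHAS, BETAS))


def best_row(csv_text, validation_set):
    """Return the row with the highest Coherence for one Validation_Set.

    Returns None when no row matches the requested Validation_Set.
    """
    rows = [
        row for row in csv.DictReader(io.StringIO(csv_text))
        if row["Validation_Set"] == validation_set
    ]
    if not rows:
        return None
    return max(rows, key=lambda row: float(row["Coherence"]))


# Made-up rows for illustration only; these are not values from the dataset.
SAMPLE_CSV = """Validation_Set,Topics,Alpha,Beta,Coherence,Perplexity
Climate,10,0.01,0.31,0.41,-8.2
Climate,20,symmetric,automatic,0.52,-8.9
Pollution,10,0.61,0.01,0.47,-7.5
"""

if __name__ == "__main__":
    print(len(parameter_grid()))            # number of models per validation set
    print(best_row(SAMPLE_CSV, "Climate"))  # highest-coherence Climate row
```

Under the inclusive-range assumption, the grid holds 31 x 6 x 6 = 1,116 combinations per validation set, which gives a rough expectation for the row count of each CSV file.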
- Type:
- Dataset
- Description/Abstract:
- The data sets were derived from coronavirus-related scientific literature in the CORD-19 dataset, released by the Allen Institute for Artificial Intelligence, as of July 14, 2020, using the Elasticsearch engine hosted by the Digital Scholarship Center (DSC). By indexing the full text and the metadata of the article corpus, the research team generated one full-corpus model and seven models corresponding to key viral outbreaks of the past several decades: coronaviruses (SARS-CoV, MERS-CoV, and SARS-CoV-2) and non-coronaviruses (HIV, Zika, H1N1, and Ebola). Each targeted subset consists of articles with two or more occurrences of virus-specific keywords drawn from conventions established by the World Health Organization.
- Creator/Author:
- Koshoffer, Amy; Wu, Danny; Latessa, Jenny; Kannayyagar, Suraj; Luken, Sally; McCabe, Erin; Edgerton, Ezra; Washington, Dorcas; Lee, James; Powers, Margaret; and Hagedorn, Philip
- Submitter:
- Amy Koshoffer
- Date Uploaded:
- 10/30/2020
- Date Modified:
- 11/05/2020
- Date Created:
- 2020-07
- License:
- CC0 1.0 Universal
- Type:
- Dataset
- Description/Abstract:
- The data sets were derived from coronavirus-related scientific literature in the CORD-19 dataset, released by the Allen Institute for Artificial Intelligence, as of July 14, 2020, using the Elasticsearch engine hosted by the Digital Scholarship Center (DSC). By indexing the full text and the metadata of the article corpus, the research team generated one full-corpus model and seven models corresponding to key viral outbreaks of the past several decades: coronaviruses (SARS-CoV, MERS-CoV, and SARS-CoV-2) and non-coronaviruses (HIV, Zika, H1N1, and Ebola). Each targeted subset consists of articles with two or more occurrences of virus-specific keywords drawn from conventions established by the World Health Organization.
- Creator/Author:
- Koshoffer, Amy; Hagedorn, Philip; Latessa, Jenny; Lee, James; Powers, Margaret; Luken, Sally; McCabe, Erin; Wu, Danny; Washington, Dorcas; Kannayyagar, Suraj; and Edgerton, Ezra
- Submitter:
- Amy Koshoffer
- Date Uploaded:
- 10/29/2020
- Date Modified:
- 11/05/2020
- Date Created:
- 2020-07
- License:
- CC0 1.0 Universal
- Type:
- Dataset
- Description/Abstract:
- The data sets were derived from coronavirus-related scientific literature in the CORD-19 dataset, released by the Allen Institute for Artificial Intelligence, as of July 14, 2020, using the Elasticsearch engine hosted by the Digital Scholarship Center (DSC). By indexing the full text and the metadata of the article corpus, the research team generated one full-corpus model and seven models corresponding to key viral outbreaks of the past several decades: coronaviruses (SARS-CoV, MERS-CoV, and SARS-CoV-2) and non-coronaviruses (HIV, Zika, H1N1, and Ebola). Each targeted subset consists of articles with two or more occurrences of virus-specific keywords drawn from conventions established by the World Health Organization.
- Creator/Author:
- Koshoffer, Amy; Wu, Danny; Latessa, Jenny; Lee, James; Luken, Sally; McCabe, Erin; Edgerton, Ezra; Washington, Dorcas; Kannayyagar, Suraj; and Powers, Margaret
- Submitter:
- Amy Koshoffer
- Date Uploaded:
- 10/29/2020
- Date Modified:
- 11/05/2020
- Date Created:
- 2020-07
- License:
- CC0 1.0 Universal
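The subset rule in the CORD-19 records above (articles with two or more occurrences of virus-specific keywords) can be illustrated with a small Python sketch. This is not the research team's method, which used an Elasticsearch index; it is a plain in-memory stand-in. It assumes occurrences are summed across all keywords for a virus and matched case-insensitively as whole words. The keyword list, article records, and function names are hypothetical.

```python
import re


def count_keyword_occurrences(text, keywords):
    """Count case-insensitive whole-word occurrences of any keyword in text."""
    total = 0
    for keyword in keywords:
        # Lookarounds keep "Zika" from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        total += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return total


def select_subset(articles, keywords, min_occurrences=2):
    """Keep articles whose text contains at least min_occurrences keyword hits."""
    return [
        article for article in articles
        if count_keyword_occurrences(article["text"], keywords) >= min_occurrences
    ]


# Hypothetical keyword list and toy articles, for illustration only.
ZIKA_KEYWORDS = ["Zika", "ZIKV"]
ARTICLES = [
    {"id": "a1", "text": "Zika virus (ZIKV) spread rapidly in 2015."},
    {"id": "a2", "text": "A single mention of Zika in passing."},
    {"id": "a3", "text": "This article covers influenza only."},
]

if __name__ == "__main__":
    selected = select_subset(ARTICLES, ZIKA_KEYWORDS)
    print([article["id"] for article in selected])
```

Requiring at least two occurrences, rather than one, filters out articles that mention a virus only in passing.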