Search Results
-
- Type:
- Dataset
- Description/Abstract:
- CSV files containing the coherence scoring pertaining to datasets of:
  - DocumentCount = 5,000
  - Corpus = (one from) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc] / News [nws]
  - SearchTerm[s] = (one from) Earth / Environmental / Climate / Pollution / Random 5k documents of a specific corpus
  Coherence was scored across every combination of:
  - TopicCount: 10-40
  - Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
  - Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
  The columns in this file include:
  - Validation_Set: Which search term this scoring pertains to
  - Topics: Number of topics in the model
  - Alpha: Hyperparameter alpha selection from the 6 options above
  - Beta: Hyperparameter beta selection from the 6 options above
  - Coherence: The topic coherence score for the given model-row
  - Perplexity: The perplexity score for the given model-row
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 11/12/2022
- Date Modified:
- 11/12/2022
- Date Created:
- 2022
- License:
- Open Data Commons Attribution License (ODC-By)
-
- Type:
- Dataset
- Description/Abstract:
- CSV files containing the coherence scoring pertaining to datasets of:
  - DocumentCount = 5,000
  - Corpus = (one from) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc] / News [nws]
  - SearchTerm[s] = (one from) Earth / Environmental / Climate / Pollution / Random 5k documents of a specific corpus
  Coherence was scored across every combination of:
  - TopicCount: 10-40
  - Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
  - Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
  The columns in this file include:
  - Validation_Set: Which search term this scoring pertains to
  - Topics: Number of topics in the model
  - Alpha: Hyperparameter alpha selection from the 6 options above
  - Beta: Hyperparameter beta selection from the 6 options above
  - Coherence: The topic coherence score for the given model-row
  - Perplexity: The perplexity score for the given model-row
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 11/12/2022
- Date Modified:
- 11/12/2022
- Date Created:
- 2022
- License:
- Open Data Commons Attribution License (ODC-By)
-
- Type:
- Dataset
- Description/Abstract:
- CSV files containing the coherence scoring pertaining to datasets of:
  - DocumentCount = 5,000
  - Corpus = (one from) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc] / News [nws]
  - SearchTerm[s] = (one from) Earth / Environmental / Climate / Pollution / Random 5k documents of a specific corpus
  Coherence was scored across every combination of:
  - TopicCount: 10-40
  - Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
  - Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
  The columns in this file include:
  - Validation_Set: Which search term this scoring pertains to
  - Topics: Number of topics in the model
  - Alpha: Hyperparameter alpha selection from the 6 options above
  - Beta: Hyperparameter beta selection from the 6 options above
  - Coherence: The topic coherence score for the given model-row
  - Perplexity: The perplexity score for the given model-row
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 11/12/2022
- Date Modified:
- 11/12/2022
- Date Created:
- 2022
- License:
- Open Data Commons Attribution License (ODC-By)
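
The column layout described for the three CSV datasets above lends itself to a simple lookup of the best-scoring model per search term. The following is a minimal sketch using only the Python standard library. The column names are taken from the dataset descriptions; the sample rows, the `SAMPLE_CSV` constant, and the `best_by_validation_set` helper are hypothetical and exist only for illustration, and the real files may differ in details such as delimiter or header casing.

```python
import csv
import io

# Hypothetical sample rows: the column names come from the dataset
# descriptions, but every value below is invented for illustration.
SAMPLE_CSV = """Validation_Set,Topics,Alpha,Beta,Coherence,Perplexity
climate,10,0.01,0.01,0.41,-7.9
climate,20,symmetric,0.31,0.47,-8.1
earth,10,asymmetric,symmetric,0.39,-8.3
earth,30,0.61,0.91,0.36,-8.6
"""


def best_by_validation_set(csv_text):
    """Return, for each Validation_Set, the row with the highest Coherence."""
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Alpha and Beta stay as strings because they mix numbers and labels.
        row["Topics"] = int(row["Topics"])
        row["Coherence"] = float(row["Coherence"])
        row["Perplexity"] = float(row["Perplexity"])
        key = row["Validation_Set"]
        if key not in best or row["Coherence"] > best[key]["Coherence"]:
            best[key] = row
    return best


if __name__ == "__main__":
    for term, row in best_by_validation_set(SAMPLE_CSV).items():
        print(term, row["Topics"], row["Alpha"], row["Beta"], row["Coherence"])
```

To use it on a downloaded file, the text passed to the helper would come from reading that file instead of the sample string.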
-
- Type:
- Image
- Description/Abstract:
- Box-and-Whisker visualization of coherence scores for three corpus types: Caselaw (cas), Pubmed Abstracts (pma), Pubmed Central (pmc). This figure is for models matching search-term "climate". Visualizations for other search terms and additional interactive elements are available at the related URL below.
  Coherence was scored across every combination of:
  - TopicCount: 10-40
  - Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
  - Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/12/2022
- Date Modified:
- 08/12/2022
- Date Created:
- 2022
- License:
- Public Domain Mark 1.0
-
- Type:
- Image
- Description/Abstract:
- Box-and-Whisker visualization of topic coherence scores for three corpus types: Caselaw (cas), Pubmed Abstracts (pma), Pubmed Central (pmc). This figure is for models matching search-term "climate". Visualizations for other search terms and additional interactive elements are available at the related URL below.
  Coherence was scored across every combination of:
  - TopicCount: 10-40
  - Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
  - Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/12/2022
- Date Modified:
- 08/12/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
-
- Type:
- Image
- Description/Abstract:
- Heat map visualization of median coherence scores for three corpora: Caselaw (cas), Pubmed Abstracts (pma), Pubmed Central (pmc). Medians are taken across all search-term based models ("climate", "earth", "environmental", "pollution"). The median is found from 1,116 total coherence scores.
  Coherence was scored across every combination of:
  - TopicCount: 10-40
  - Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
  - Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Public Domain Mark 1.0
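
The hyperparameter grid listed in these records can be enumerated directly. The sketch below assumes that "TopicCount: 10-40" means every integer from 10 to 40 inclusive, which the records do not state explicitly. Under that reading the grid has 31 × 6 × 6 = 1,116 combinations, which matches the 1,116 coherence scores mentioned in the heat map description; whether that is how the figure was actually derived is an assumption.

```python
from itertools import product

# Assumption: "TopicCount: 10-40" covers every integer from 10 to 40 inclusive.
topic_counts = range(10, 41)

# The six alpha and six beta options, copied from the record descriptions.
alphas = ["0.01", "0.31", "0.61", "0.91", "symmetric", "asymmetric"]
betas = ["0.01", "0.31", "0.61", "0.91", "automatic", "symmetric"]

# Every (topic count, alpha, beta) combination in the grid.
grid = list(product(topic_counts, alphas, betas))

print(len(grid))  # 1116
```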