Dataset
Coherence_Evaluations - pmc_environmental.csv (Open Access)
CSV files containing topic coherence scores for datasets defined by:
DocumentCount = 5,000
Corpus = (one of) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc]
SearchTerm[s] = (one of) Earth / Environmental / Climate / Pollution / Random (5,000 randomly selected documents from the given corpus)
Coherence was scored across every combination of:
TopicCount: 10-40
Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
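The size of the scoring grid described above can be sketched as follows. This is only an illustration: it assumes that TopicCount covers every integer from 10 to 40 inclusive, which the description does not state explicitly, and the variable names are hypothetical.

```python
from itertools import product

# Assumption: topic counts run over every integer from 10 to 40 inclusive.
topic_counts = range(10, 41)

# Hyperparameter options as listed in the dataset description.
alphas = [0.01, 0.31, 0.61, 0.91, "symmetric", "asymmetric"]
betas = [0.01, 0.31, 0.61, 0.91, "automatic", "symmetric"]

# Every (TopicCount, Alpha, Beta) combination that would be scored.
grid = list(product(topic_counts, alphas, betas))

print(len(grid))  # 31 * 6 * 6 = 1116 under the assumption above
```

Under that assumption, each validation set would have 1,116 scored models; a different step size for TopicCount would change this count.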
The columns in this file include:
Validation_Set: Which search term this scoring pertains to
Topics: Number of topics in the model
Alpha: Hyperparameter alpha selection from the 6 options above
Beta: Hyperparameter beta selection from the 6 options above
Coherence: The topic coherence score for the given model-row
Perplexity: The perplexity score for the given model-row
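A minimal sketch of reading a file with these columns and picking the row with the highest coherence is shown below. It assumes the CSV header uses exactly the column names listed above, which has not been checked against the deposited file. The sample rows are synthetic placeholders, not values from the dataset, and the helper names load_rows and best_by_coherence are hypothetical.

```python
import csv
import io

# Synthetic placeholder rows for illustration only; NOT values from the dataset.
SAMPLE = """Validation_Set,Topics,Alpha,Beta,Coherence,Perplexity
Environmental,10,0.01,0.01,0.41,-8.2
Environmental,10,symmetric,automatic,0.47,-8.0
Environmental,11,asymmetric,symmetric,0.44,-8.1
"""

def load_rows(handle):
    """Read rows, converting the numeric columns.

    Alpha and Beta stay as strings because they mix numbers and keywords.
    """
    rows = []
    for row in csv.DictReader(handle):
        row["Topics"] = int(row["Topics"])
        row["Coherence"] = float(row["Coherence"])
        row["Perplexity"] = float(row["Perplexity"])
        rows.append(row)
    return rows

def best_by_coherence(rows):
    """Return the row with the highest coherence score."""
    return max(rows, key=lambda r: r["Coherence"])

rows = load_rows(io.StringIO(SAMPLE))
best = best_by_coherence(rows)
print(best["Topics"], best["Alpha"], best["Beta"], best["Coherence"])
```

To run this against the deposited file instead of the sample, the handle could be replaced with open("eval_pmc_environmental.csv", newline=""), provided the header matches the assumed names.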
Digital Object Identifier (DOI)
Identifier: doi:10.7945/6594-eh39
Link: https://doi.org/10.7945/6594-eh39
Items
Title | Date Uploaded | Visibility |
---|---|---|
eval_pmc_environmental.csv | 2022-11-04 | Open Access |
Permanent link to this page: https://scholar.uc.edu/show/j098zc69m