Dataset

 

Coherence_Evaluations - pmc_environmental.csv (Open Access, Deposited)

Date Uploaded: 11/04/2022
Date Modified: 11/04/2022

This file is one of a set of CSV files containing topic coherence scores for datasets defined by:
DocumentCount = 5,000
Corpus = (one of) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc]
SearchTerm[s] = (one of) Earth / Environmental / Climate / Pollution / Random (5,000 documents drawn at random from the given corpus)

Coherence was scored across every combination of:
TopicCount: 10-40
Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
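The grid described above can be enumerated with a short Python sketch. It assumes that TopicCount covers every integer from 10 through 40 inclusive, which the listing does not state explicitly; the names TOPIC_COUNTS, ALPHAS, BETAS, and parameter_grid are illustrative only and do not come from the dataset or its authors.

```python
from itertools import product

# Assumption: topic counts are every integer from 10 through 40, inclusive.
TOPIC_COUNTS = range(10, 41)
# Option lists copied from the description above.
ALPHAS = [0.01, 0.31, 0.61, 0.91, "symmetric", "asymmetric"]
BETAS = [0.01, 0.31, 0.61, 0.91, "automatic", "symmetric"]


def parameter_grid():
    """Yield one (topics, alpha, beta) tuple per model configuration."""
    yield from product(TOPIC_COUNTS, ALPHAS, BETAS)


grid = list(parameter_grid())
# Under the step-of-1 assumption this is 31 * 6 * 6 combinations.
print(len(grid))
```

If the topic counts were sampled with a different step, only TOPIC_COUNTS would need to change.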

The columns in this file include:
Validation_Set: Which search term this scoring pertains to
Topics: Number of topics in the model
Alpha: Hyperparameter alpha selection from the 6 options above
Beta: Hyperparameter beta selection from the 6 options above
Coherence: The topic coherence score for the model in this row
Perplexity: The perplexity score for the model in this row
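As a rough illustration of how these columns might be read, the sketch below parses rows with Python's standard csv module and picks the row with the highest coherence. It assumes the header row uses exactly the column names listed above. The rows in SAMPLE are hypothetical placeholders, not values from the dataset, and load_rows and best_by_coherence are illustrative helper names.

```python
import csv
import io

# Hypothetical sample rows; the numbers are placeholders, not dataset values.
SAMPLE = """Validation_Set,Topics,Alpha,Beta,Coherence,Perplexity
Environmental,10,0.01,0.01,0.40,-7.5
Environmental,10,symmetric,automatic,0.45,-7.9
Environmental,11,0.31,symmetric,0.42,-7.7
"""


def load_rows(handle):
    """Parse rows, converting the numeric columns and leaving the rest as text."""
    rows = []
    for raw in csv.DictReader(handle):
        rows.append({
            "Validation_Set": raw["Validation_Set"],
            "Topics": int(raw["Topics"]),
            # Alpha and Beta stay as text: each may be a number or a keyword.
            "Alpha": raw["Alpha"],
            "Beta": raw["Beta"],
            "Coherence": float(raw["Coherence"]),
            "Perplexity": float(raw["Perplexity"]),
        })
    return rows


def best_by_coherence(rows):
    """Return the row with the highest coherence score."""
    return max(rows, key=lambda r: r["Coherence"])


rows = load_rows(io.StringIO(SAMPLE))
best = best_by_coherence(rows)
print(best["Topics"], best["Alpha"], best["Beta"])
```

To run this against the downloaded file, the io.StringIO(SAMPLE) argument could be replaced with open("Coherence_Evaluations - pmc_environmental.csv", newline="").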

Digital Object Identifier (DOI)

Identifier: doi:10.7945/6594-eh39
Link: https://doi.org/10.7945/6594-eh39

Permanent link to this page: https://scholar.uc.edu/show/j098zc69m