Dataset

 

Coherence_Evaluations - nws_pollution.csv
Open Access, Deposited

Date Uploaded: 11/03/2022
Date Modified: 11/17/2022

CSV files containing coherence scores for topic models fitted to datasets with the following properties:
DocumentCount = 5,000
Corpus = (one of) Federal Caselaw [cas] / Pubmed-Abstracts [pma] / Pubmed-Central [pmc] / News [nws]
SearchTerm[s] = (one of) Earth / Environmental / Climate / Pollution / Random (5k random documents from a specific corpus)

Coherence was scored across every combination of:
TopicCount: 10-40
Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]
Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]
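
As a rough illustration of the grid described above, the sketch below enumerates every combination of the three settings. It assumes that "10-40" means every integer topic count from 10 to 40 inclusive, which the description does not state explicitly; the names TOPIC_COUNTS, ALPHAS, BETAS and parameter_grid are hypothetical and chosen only for this example.

```python
from itertools import product

# Assumption: "10-40" is read as every integer from 10 to 40 inclusive.
TOPIC_COUNTS = range(10, 41)
# Option lists copied from the description above.
ALPHAS = [0.01, 0.31, 0.61, 0.91, "symmetric", "asymmetric"]
BETAS = [0.01, 0.31, 0.61, 0.91, "automatic", "symmetric"]


def parameter_grid():
    """Return an iterator over every (topics, alpha, beta) combination."""
    return product(TOPIC_COUNTS, ALPHAS, BETAS)


grid = list(parameter_grid())
# 31 topic counts * 6 alphas * 6 betas, under the step-of-1 assumption
print(len(grid))
```

If the topic counts were sampled at a coarser step, the number of combinations (and therefore of rows per file) would be correspondingly smaller.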

The columns in this file include:
Validation_Set: The search term that the scores in this row pertain to
Topics: Number of topics in the model
Alpha: Hyperparameter alpha selection from the 6 options above
Beta: Hyperparameter beta selection from the 6 options above
Coherence: The topic coherence score for the model in the given row
Perplexity: The perplexity score for the model in the given row
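
The column layout above could be read with Python's standard csv module, for example to find the row with the highest coherence. This is a minimal sketch: it assumes the file's header row uses exactly the column names listed above and that a higher coherence score is better. The sample rows are invented placeholders, not values from the dataset, and load_rows and best_by_coherence are hypothetical helper names.

```python
import csv
import io


def load_rows(handle):
    """Read all rows from an open CSV handle as dictionaries keyed by column name."""
    return list(csv.DictReader(handle))


def best_by_coherence(rows):
    """Return the row with the highest Coherence value (assumes higher is better)."""
    return max(rows, key=lambda row: float(row["Coherence"]))


# Invented placeholder rows for illustration only; not taken from the dataset.
SAMPLE = """Validation_Set,Topics,Alpha,Beta,Coherence,Perplexity
Pollution,10,0.01,0.01,0.41,-8.2
Pollution,12,symmetric,0.31,0.47,-8.5
Pollution,14,asymmetric,automatic,0.44,-8.9
"""

rows = load_rows(io.StringIO(SAMPLE))
best = best_by_coherence(rows)
print(best["Topics"], best["Alpha"], best["Beta"])
```

To use the real file, replace the io.StringIO(SAMPLE) handle with open("Coherence_Evaluations - nws_pollution.csv", newline=""). Alpha and Beta are kept as strings because the columns mix numeric and named options.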

Time Period
  • 21st century

Digital Object Identifier (DOI)

Identifier: doi:10.7945/yevs-9z66
Link: https://doi.org/10.7945/yevs-9z66

Permanent link to this page: https://scholar.uc.edu/show/jq085m40n