Search Results
-
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following parameters. DocumentCount: 5,000. Corpus (one of): Federal Caselaw [cas], Pubmed-Abstracts [pma], Pubmed-Central [pmc]. SearchTerm (one of): Earth, Environmental, Climate, Pollution, or Random 5k documents of a specific corpus. Coherence was scored across every combination of TopicCount: 10-40; Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]; Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]. The columns in this file include Validation_Set (which search term this scoring pertains to), Topics (number of topics in the model), Alpha (hyperparameter alpha selection from the 6 options above), Beta (hyperparameter beta selection from the 6 options above), Coherence (the topic coherence score for the given model row), and Perplexity (the perplexity score for the given model row).
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
-
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following parameters. DocumentCount: 5,000. Corpus (one of): Federal Caselaw [cas], Pubmed-Abstracts [pma], Pubmed-Central [pmc]. SearchTerm (one of): Earth, Environmental, Climate, Pollution, or Random 5k documents of a specific corpus. Coherence was scored across every combination of TopicCount: 10-40; Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]; Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]. The columns in this file include Validation_Set (which search term this scoring pertains to), Topics (number of topics in the model), Alpha (hyperparameter alpha selection from the 6 options above), Beta (hyperparameter beta selection from the 6 options above), Coherence (the topic coherence score for the given model row), and Perplexity (the perplexity score for the given model row).
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
-
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following parameters. DocumentCount: 5,000. Corpus (one of): Federal Caselaw [cas], Pubmed-Abstracts [pma], Pubmed-Central [pmc]. SearchTerm (one of): Earth, Environmental, Climate, Pollution, or Random 5k documents of a specific corpus. Coherence was scored across every combination of TopicCount: 10-40; Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]; Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]. The columns in this file include Validation_Set (which search term this scoring pertains to), Topics (number of topics in the model), Alpha (hyperparameter alpha selection from the 6 options above), Beta (hyperparameter beta selection from the 6 options above), Coherence (the topic coherence score for the given model row), and Perplexity (the perplexity score for the given model row).
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
-
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following parameters. DocumentCount: 5,000. Corpus (one of): Federal Caselaw [cas], Pubmed-Abstracts [pma], Pubmed-Central [pmc]. SearchTerm (one of): Earth, Environmental, Climate, Pollution, or Random 5k documents of a specific corpus. Coherence was scored across every combination of TopicCount: 10-40; Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]; Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]. The columns in this file include Validation_Set (which search term this scoring pertains to), Topics (number of topics in the model), Alpha (hyperparameter alpha selection from the 6 options above), Beta (hyperparameter beta selection from the 6 options above), Coherence (the topic coherence score for the given model row), and Perplexity (the perplexity score for the given model row).
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
-
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following parameters. DocumentCount: 5,000. Corpus (one of): Federal Caselaw [cas], Pubmed-Abstracts [pma], Pubmed-Central [pmc], Chicago Novel Corpus [nvl], Newspaper Corpus [nws]. SearchTerm (one of): Earth, Environmental, Climate, Pollution, or Random 5k documents of a specific corpus. Coherence was scored across every combination of TopicCount: 10-40; Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]; Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]. The columns in this file include Validation_Set (which search term this scoring pertains to), Topics (number of topics in the model), Alpha (hyperparameter alpha selection from the 6 options above), Beta (hyperparameter beta selection from the 6 options above), Coherence (the topic coherence score for the given model row), and Perplexity (the perplexity score for the given model row).
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 11/11/2022
- Date Created:
- 2022
- License:
- Open Data Commons Attribution License (ODC-By)
-
- Type:
- Dataset
- Description/Abstract:
- CSV files containing topic coherence scores for datasets with the following parameters. DocumentCount: 5,000. Corpus (one of): Federal Caselaw [cas], Pubmed-Abstracts [pma], Pubmed-Central [pmc]. SearchTerm (one of): Earth, Environmental, Climate, Pollution, or Random 5k documents of a specific corpus. Coherence was scored across every combination of TopicCount: 10-40; Hyperparameter-Alpha: [0.01, 0.31, 0.61, 0.91, symmetric, asymmetric]; Hyperparameter-Beta: [0.01, 0.31, 0.61, 0.91, automatic, symmetric]. The columns in this file include Validation_Set (which search term this scoring pertains to), Topics (number of topics in the model), Alpha (hyperparameter alpha selection from the 6 options above), Beta (hyperparameter beta selection from the 6 options above), Coherence (the topic coherence score for the given model row), and Perplexity (the perplexity score for the given model row).
- Creator/Author:
- McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 08/10/2022
- Date Modified:
- 08/10/2022
- Date Created:
- 2022
- License:
- Open Data Commons Public Domain Dedication and License (PDDL)
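
For readers working with the coherence CSVs listed above, the following is a minimal Python sketch of how the described hyperparameter grid could be enumerated and how the best-scoring model row per search term might be picked out. It is not the depositor's code. It assumes that "TopicCount: 10-40" means every integer from 10 to 40 inclusive, and that the CSV header names match the column names given in the descriptions exactly. The SAMPLE rows are invented placeholders for illustration only, not values from the datasets.

```python
import csv
import io
from itertools import product

# Assumption: "TopicCount: 10-40" covers every integer from 10 to 40 inclusive.
TOPIC_COUNTS = range(10, 41)
# Option labels copied from the dataset descriptions.
ALPHAS = ["0.01", "0.31", "0.61", "0.91", "symmetric", "asymmetric"]
BETAS = ["0.01", "0.31", "0.61", "0.91", "automatic", "symmetric"]


def grid():
    """List every (topics, alpha, beta) combination in the described grid."""
    return list(product(TOPIC_COUNTS, ALPHAS, BETAS))


def best_by_coherence(csv_text):
    """Return, for each Validation_Set, the row with the highest Coherence."""
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row["Validation_Set"]
        score = float(row["Coherence"])
        if key not in best or score > float(best[key]["Coherence"]):
            best[key] = row
    return best


# Hypothetical placeholder rows; the numbers are NOT from the real files.
SAMPLE = """Validation_Set,Topics,Alpha,Beta,Coherence,Perplexity
Climate,10,0.01,0.01,0.41,-8.2
Climate,20,symmetric,automatic,0.47,-8.9
Random,10,0.31,0.61,0.35,-7.5
"""

if __name__ == "__main__":
    print(len(grid()), "model configurations per validation set")
    for term, row in best_by_coherence(SAMPLE).items():
        print(term, row["Topics"], row["Alpha"], row["Beta"], row["Coherence"])
```

To use this on a downloaded file, read the file's text and pass it to best_by_coherence in place of SAMPLE. If the real header names differ from those in the descriptions, the dictionary keys would need adjusting.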
-
- Type:
- Generic Work
- Description/Abstract:
- This is a collection of items that I used in my Samvera History Presentation.
- Creator/Author:
- Scherz, Thomas
- Submitter:
- Thomas Scherz
- Date Uploaded:
- 07/12/2022
- Date Modified:
- 07/12/2022
- License:
- Open Data Commons Attribution License (ODC-By)
-
- Type:
- Dataset
- Description/Abstract:
- This file contains poet metadata for a dataset of poems collected from Poetry Foundation. The metadata relates to a poetry corpus available on Kaggle (https://www.kaggle.com/datasets/ultrajack/modern-renaissance-poetry). Sean Ayres collected publication metadata and generated other relevant metadata fields based on his subject matter expertise. The original Kaggle poetry dataset contains approximately X lines of poetry. The metadata for this dataset references X poets.
- Creator/Author:
- Ayres, Sean and McCabe, Erin E.
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 07/11/2022
- Date Modified:
- 07/11/2022
- Date Created:
- 2022
- License:
- Open Data Commons Attribution License (ODC-By)
-
- Type:
- Dataset
- Description/Abstract:
- This file contains book metadata for a Project Gutenberg poetry dataset. The metadata relates to a poetry corpus collected by Allison Parrish and available on GitHub (https://github.com/aparrish/gutenberg-poetry-corpus). Sean Ayres collected the publication metadata and generated other relevant metadata fields based on his subject matter expertise. The poetry dataset contains approximately three million lines of poetry. The metadata for the dataset references X hundreds of books.
- Creator/Author:
- Ayres, Sean; McCabe, Erin E., and Parrish, Allison
- Submitter:
- Erin E. McCabe
- Date Uploaded:
- 07/11/2022
- Date Modified:
- 07/11/2022
- Date Created:
- 2022
- License:
- Open Data Commons Attribution License (ODC-By)
-
- Type:
- Dataset
- Description/Abstract:
- Data and code used for the analysis of the effects of weather on the fall migration of eastern monarch butterflies.
- Creator/Author:
- Matter, Stephen F.; Guerra, Patrick; Parlin, Adam; Rich, Jeremy, and Taylor, Orley R.
- Submitter:
- Stephen F. Matter
- Date Uploaded:
- 07/03/2022
- Date Modified:
- 11/28/2022
- Date Created:
- 2022
- License:
- Attribution-NonCommercial-NoDerivs 4.0 International
