The files in this work represent the presentations and workshop content from the 5th UC Data Day held 2020-10-23.
The theme was “World Changing Data: How Digital Data Will Change Our Future”.
The keynote speaker was Glenn Ricart of US Ignite, who presented "Smart Runs on Data".
Interactive Panel featuring:
Michael Dunaway (moderator)
Whitney Gaskins (Asst Dean, CEAS - Incl Excellence & Comm Engagmnt)
Zvi Biener (Assoc Professor, A&S Philosophy)
Prashant Khare (Asst Professor, CEAS - Aerospace Eng & Eng Mechanics)
Sam Anand (Professor, CEAS - Mechanical Eng)
Achala Vagal (Professor Clinical - GEO, COM Radiology Neuroradiology)
Power Sessions:
George Turner - Indiana University - High-Performance Computing at UC
Erin McCabe - University of Cincinnati - Text Mining, Natural Language Processing & AI
link to slides - https://bit.ly/dataday_slides
link to code - https://bit.ly/dataday_code
Videos of the day can be found on the UC Libraries STRC1 YouTube channel - https://www.youtube.com/c/STRC1/videos
The data sets were derived from coronavirus-related scientific literature in the CORD-19 dataset, released by the Allen Institute for AI, as of July 14, 2020, using the Elasticsearch engine hosted by the Digital Scholarship Center (DSC). By indexing the full text and the metadata of the article corpus, the research team generated a full-corpus model and 7 different models corresponding to key viral outbreaks of the past several decades: coronaviruses (SARS-CoV, MERS-CoV, and SARS-CoV-2) and non-coronaviruses (HIV, Zika, H1N1, and Ebola). Each targeted subset consists of articles containing two or more occurrences of virus-specific keywords drawn from conventions established by the World Health Organization.
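The "two or more occurrences" rule for building the targeted subsets can be illustrated with a minimal Python sketch. It is not the research team's code: the real subsets were produced with Elasticsearch, and the keyword lists below (VIRUS_KEYWORDS) and the function names are hypothetical placeholders, not the WHO-derived lists actually used.

```python
import re

# Hypothetical keyword lists, for illustration only. The actual lists used
# to build the subsets are not reproduced in this description.
VIRUS_KEYWORDS = {
    "SARS-CoV-2": ["sars-cov-2", "covid-19"],
    "Ebola": ["ebola"],
    "Zika": ["zika"],
}


def count_occurrences(text, keywords):
    """Count case-insensitive whole-term occurrences of any keyword in text."""
    total = 0
    for keyword in keywords:
        # Lookarounds keep a keyword from matching inside a longer word.
        pattern = re.compile(
            r"(?<!\w)" + re.escape(keyword) + r"(?!\w)", re.IGNORECASE
        )
        total += len(pattern.findall(text))
    return total


def assign_subsets(text, min_hits=2):
    """Return the sorted virus subsets whose keywords occur at least min_hits times."""
    return sorted(
        virus
        for virus, keywords in VIRUS_KEYWORDS.items()
        if count_occurrences(text, keywords) >= min_hits
    )


if __name__ == "__main__":
    sample = "COVID-19 is caused by SARS-CoV-2."
    print(assign_subsets(sample))
```

An article can fall into more than one subset under this rule, and an article with a single passing mention of a virus is excluded from that virus's subset.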