Algorithms in Bioinformatics

Enhanced COVID-19 data for improved prediction of survival

Supplementary files

This page contains data and code used in our paper: Wenhuan Zeng, Anupam Gautam and Daniel H. Huson (2021) On the Application of Advanced Machine Learning Methods to Analyze Enhanced, Multimodal Data from Persons Infected with COVID-19, Computation 9(4).

Data:

Initial COVID-19 dataset (183 cases):

Initial.xlsx

Country-wise base and weighted average polarity scores:

Polarity.csv

Weather over the past 14 days for each subject:

WeatherInfoTotal.csv
Weather, numerical values: WeatherEmbeddingfile.xlsx
Enhanced COVID-19 dataset: EnhancedCOVID_19Dataset.csv
Data downloaded from the WHO website: CSV_as_at_09_April_2020-Full_database.xlsx
Processed WHO publication data: WHOJournalFormatted.txt
Country-wise mortality: detail.xlsx

Code:

Python script for parsing WHO database:

PublicationListProcessing.py
Usage: PublicationListProcessing.py -i InputFileName -o OutputFileName -t NumberOfThreads
Input file: CSV_as_at_09_April_2020-Full_database.xlsx (downloaded from the WHO database)
Output file: Text file with DOIs and institute names
Required Python modules:  BeautifulSoup, pandas, requests, multiprocessing, argparse
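
For readers who want to wrap or reimplement the command line above, the following is a minimal sketch of how the three documented options could be parsed with argparse. It is not taken from PublicationListProcessing.py: the function names (build_publication_parser, parse_publication_cli), the attribute names (input_file, output_file, threads), and the default of one thread are assumptions made for illustration.

```python
import argparse


def build_publication_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented -i / -o / -t options.

    Only the short flags come from the usage line; the dest names and
    the default thread count are hypothetical.
    """
    parser = argparse.ArgumentParser(prog="PublicationListProcessing.py")
    parser.add_argument("-i", dest="input_file", required=True,
                        help="WHO database spreadsheet (.xlsx)")
    parser.add_argument("-o", dest="output_file", required=True,
                        help="text file to write DOIs and institute names to")
    parser.add_argument("-t", dest="threads", type=int, default=1,
                        help="number of worker threads (assumed default: 1)")
    return parser


def parse_publication_cli(argv):
    """Parse argv and reject thread counts below 1."""
    args = build_publication_parser().parse_args(argv)
    if args.threads < 1:
        raise ValueError("NumberOfThreads must be at least 1")
    return args
```

Called with the arguments from the usage line, for example parse_publication_cli(["-i", "CSV_as_at_09_April_2020-Full_database.xlsx", "-o", "WHOJournalFormatted.txt", "-t", "4"]), the sketch returns a namespace whose threads attribute is the integer 4.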

Python script for crawling climate information from Weather Underground:

CrawlWeather.py

Usage: CrawlWeather.py --input InputFileName --browser BrowserDriverName --output OutputFileName
Input file: Initial.xlsx
Output file: CSV file with sample ID and climate information, 15 records per sample
Required Python modules: BeautifulSoup, pandas, NumPy, selenium, argparse
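
The command line above and the 15 records per sample can be illustrated with a short sketch. It is not taken from CrawlWeather.py. It assumes that the 15 records are one reference date plus the 14 days before it, which is one plausible reading of the description on this page; the function names (build_weather_parser, record_dates) and the choice of reference date are hypothetical.

```python
import argparse
from datetime import date, timedelta


def build_weather_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented --input/--browser/--output options."""
    parser = argparse.ArgumentParser(prog="CrawlWeather.py")
    parser.add_argument("--input", required=True,
                        help="spreadsheet with the initial cases, e.g. Initial.xlsx")
    parser.add_argument("--browser", required=True,
                        help="browser driver to use with selenium")
    parser.add_argument("--output", required=True,
                        help="CSV file to write the climate records to")
    return parser


def record_dates(reference: date, days_back: int = 14) -> list:
    """Return the dates from `days_back` days before `reference` up to
    `reference` itself, oldest first (assumed layout: 14 + 1 = 15 records)."""
    return [reference - timedelta(days=offset)
            for offset in range(days_back, -1, -1)]
```

Under this assumption, record_dates(date(2020, 3, 15)) yields 15 dates running from 1 March 2020 to 15 March 2020.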