With so many R packages released each year, this article looks at the packages used in health care data science research in Botswana from 2018 to 2022.
This post does not explain functions or code; it only describes the packages used.
My home country, Botswana, is a landlocked country in Southern Africa. It is home to dangerous wildlife such as lions and elephants, to swamp life, and to the Basarwa people (affectionately known as “Bushmen”). Little is known about her use of R programming for data science research, particularly in health care.
Health care research in Botswana aims to identify, evaluate and improve general health conditions. Data collected for descriptive analysis from different regions of Botswana help researchers understand the current state of treatment distribution and health institution management, with the goal of improving health care services.
The following R packages were used in most health care data science research:
1. forecast
2. oce
3. ggplot2
4. SNPRelate
5. inctools
6. APE
7. adephylo
8. igraph
1. forecast
The forecast package was developed for automatic time series forecasting. It is part of a forecasting bundle which contains the fma, Mcomp and expsmooth packages developed by Rob Hyndman.
The forecast package contains functions for:
- Univariate forecasting
- Automatic forecasting using exponential smoothing
- ARIMA models
- Theta method
- Cubic splines
- Other common forecasting methods
2. oce
The oce package is used for reading data captured by oceanographic instruments. Designed from the start for real-world applications, oce supports a broad range of practical work.
Even though Botswana has no oceans or seas, the package makes it easy to handle the details of calculations, discipline-specific file formats, and plots.
Generic functions take care of general operations such as subsetting and plotting data, while specialized functions address more specific tasks such as hydrographic analysis and ADCP coordinate transformations. According to Dan E. Kelley, it’s easy to document work done with oce because its functions automatically update processing logs stored within its data objects.
3. ggplot2
The best known of the packages in the list is ggplot2. It is used for making plots and annotations for data visualisation. The types of plots that can be built with ggplot2 range from dendrograms and network graphs to histograms. With ggplot2, the quality of a graphic can be improved simply by changing fonts, sizes and images, which makes the data more attractive to read.
4. SNPRelate
SNPRelate is used in genomic exploration for principal component analysis (PCA) and relatedness analysis using identity-by-descent measures.
It was developed for multi-core symmetric multiprocessing computer architectures. The SNPRelate package provides computation for single-nucleotide polymorphism (SNP) data in genome-wide association studies.
Unfortunately, like some other packages, the SNPRelate documentation is no longer on CRAN, but archived copies can be found through recommended links.
5. inctools
The inctools package is used for estimating incidence from biomarker data in cross-sectional surveys and for calibrating tests for recent infection.
Originally developed to measure HIV incidence in a given population, it provides state-of-the-art functionality to support many aspects of population-level incidence surveillance. The motivation for the package came from the challenges associated with estimating population-level HIV incidence.
6. APE
APE, which stands for Analyses of Phylogenetics and Evolution, is used in molecular evolution and phylogenetics. The APE package uses phylogenetic and genealogical trees as input when making statistical analyses.
The APE package has functions for working with phylogenetic trees, as well as for phylogenetic and evolutionary analyses such as population genetics and comparative methods.
APE takes advantage of R's numerous functions for statistics and graphics, and also provides a flexible framework for developing and implementing further statistical methods for the analysis of evolutionary processes.
7. adephylo
The adephylo package is designed for the analysis of comparative evolutionary data. Phylogenetic comparative methods aim to account for, or remove, the effects of phylogenetic signal in the analysis of biological traits.
8. igraph
The igraph package provides tools for plotting network graphs. It can handle huge graphs with millions of vertices and edges, and it is also suitable for grid computing. It contains routines for:
- Creating, manipulating and visualising networks.
- Calculating various structural properties.
- Importing from and exporting to various file formats.
Available from both GNU R (GNU stands for “GNU’s Not Unix!”) and Python, it supports rapid development and fast prototyping.