Software and Toolkit

This page gives links to my software and projects I am working on, and describes the tools I use for data analysis and modelling.

Personal

Github
My personal GitHub page.
ggfan
GitHub homepage of the R package I wrote for plotting probability distributions. Based on fanplot.

Software

This section describes the tools I use for research and data analysis.

R

R is a free and open source statistical programming language. Some of the packages I use most often are listed below.

Tidyverse
A suite of R libraries for the tidy analysis of data. Tidy data is data stored as a table in which every column represents one variable and every row represents one (and only one) observation. Storing data in this way makes data processing and visualisation simpler, and allows processing and analysis tasks to be written in an easily understandable way. Each step in the analysis can be represented by a function, generally named after the verb that describes the task it performs (e.g. "filter", "summarise"). The complete analysis is then just a chain (or pipeline) of such functions.
rmarkdown
Markdown is a simple syntax for writing documents in plain text. R Markdown allows one to interweave markdown text with R code chunks.
knitr
Knitr runs the code chunks in documents written in R Markdown or LaTeX and inserts their outputs, or references to the relevant figures.
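
The pipeline idea described under Tidyverse is not specific to R. As a rough illustration only, here is the same pattern in plain Python rather than R: each row is one observation, each key is one variable, and each step is a small function named after a verb. The data values and the helper names (filter_rows, summarise, pipe) are invented for this sketch and do not belong to any library.

```python
from functools import reduce
from statistics import mean

# Tidy data: one dict per observation, one key per variable (made-up values).
observations = [
    {"site": "A", "year": 2015, "count": 12},
    {"site": "A", "year": 2016, "count": 15},
    {"site": "B", "year": 2015, "count": 7},
    {"site": "B", "year": 2016, "count": 9},
]


def filter_rows(predicate):
    """Return a step that keeps only the rows satisfying predicate."""
    return lambda rows: [row for row in rows if predicate(row)]


def summarise(group_key, value_key, func):
    """Return a step that applies func to value_key within each group."""
    def step(rows):
        groups = {}
        for row in rows:
            groups.setdefault(row[group_key], []).append(row[value_key])
        return [{group_key: g, value_key: func(v)} for g, v in groups.items()]
    return step


def pipe(data, *steps):
    """Pass data through each step in turn, left to right."""
    return reduce(lambda acc, step: step(acc), steps, data)


# The whole analysis reads as a chain of verbs applied to the table.
result = pipe(
    observations,
    filter_rows(lambda row: row["year"] >= 2016),
    summarise("site", "count", mean),
)
print(result)
```

In R the same chain would normally be written with the Tidyverse verbs and the pipe operator; the point here is only that each step takes a table and returns a table.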

Python

jupyter
Interactive notebooks that combine Python code, its output and explanatory text in a single document.
sympy
Symbolic maths package for Python. Allows for symbolic manipulation of mathematical expressions, including differentiation and integration.
pymc3
Bayesian sampling package for Python, built on Theano.
pelican
Static website generator with which this blog is constructed.
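
As a small illustration of the symbolic differentiation and integration mentioned under sympy, the sketch below differentiates and integrates an arbitrary example expression, then checks that differentiating the antiderivative recovers the original. The expression itself is just a made-up example.

```python
import sympy as sp

x = sp.symbols("x")
expr = x**2 * sp.sin(x)  # arbitrary example expression

# Symbolic derivative and antiderivative with respect to x.
derivative = sp.diff(expr, x)
antiderivative = sp.integrate(expr, x)

# Differentiating the antiderivative should give back the original expression,
# so this difference should simplify to zero.
check = sp.simplify(sp.diff(antiderivative, x) - expr)

print(derivative)
print(antiderivative)
print(check)
```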

Other

ubuntu
Linux OS distribution.
pandoc
Converts between markdown, .docx, HTML, LaTeX and various other formats.
sublime
Text editor in which I am writing this!
stan
Probabilistic programming language for Bayesian inference, with sampling by Hamiltonian Monte Carlo.
mingw
GNU compilers and Unix-style command line tools for Windows.
git
Decentralised version control system.

I use the University of Southampton's supercomputer, iridis, to do heavy computational lifting. Mostly this involves embarrassingly parallel tasks such as running simulations over experimental design points, or running variants of a family of statistical models for purposes of model comparison.
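
To show what "embarrassingly parallel" means here, the sketch below runs a toy simulation independently at each point of a small experimental design, using Python's standard library on a single machine. It is not how jobs are submitted on iridis: the simulate function, the design values and the worker count are all hypothetical stand-ins, and on a cluster the same structure would more likely be expressed through the scheduler.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import product
import random


def simulate(point):
    """Toy stand-in for one simulation run at a single design point."""
    rate, size, seed = point
    rng = random.Random(seed)  # per-task seed keeps each run reproducible
    successes = sum(rng.random() < rate for _ in range(size))
    return {"rate": rate, "size": size, "proportion": successes / size}


def run_design(points, workers=2):
    """Run every design point independently across worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, points))


# Hypothetical full-factorial design: every combination of rate and size,
# with a distinct seed attached to each point.
rates = [0.1, 0.5, 0.9]
sizes = [100, 1000]
design_points = [
    (rate, size, seed)
    for seed, (rate, size) in enumerate(product(rates, sizes))
]

if __name__ == "__main__":
    for row in run_design(design_points):
        print(row)
```

Because no run depends on any other, the points can be handed to workers in any order and the results simply collected at the end, which is what makes this kind of task easy to scale.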