Software and Toolkit

This page gives links to my software and projects I am working on, and describes the tools I use for data analysis and modelling.

Personal

  • GitHub
    My personal GitHub page.
  • ggfan
    GitHub homepage of the R package I wrote for plotting probability distributions. Based on fanplot.

Software

This section describes the tools I use for research and data analysis.

R

R is a free and open source statistical programming language. Some of the packages I use most often are listed below.

  • CRAN
    The Comprehensive R Archive Network. The main repository for R and its packages.
  • Tidyverse
    Suite of R libraries for the tidy analysis of data. Tidy data is data stored as a table in which every column represents one variable and every row represents one (and only one) observation. Storing data in this way allows data processing, visualisation and analysis tasks to be written in an easily understandable way. Each step in the analysis can be represented by a function, generally named after the verb that describes the task it performs (e.g. "filter", "summarise"). The complete analysis is then just a chain (or pipeline) of such functions.
  • rmarkdown
    Markdown is a simple syntax for writing documents in plain text. R Markdown lets you interleave Markdown text with chunks of R code.
  • knitr
    knitr runs the code chunks in documents written in R Markdown or LaTeX and inserts their output, or a reference to the figures they produce.
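
The pipeline idea described under Tidyverse above can be sketched outside R as well. The block below is a minimal illustration in plain Python, not tidyverse code: the data, the helper names (filter_rows, summarise_by, arrange, pipeline) and the column names are all hypothetical and chosen only to mirror the "verbs chained together" pattern.

```python
from functools import reduce
from statistics import mean

# Tidy data: one dict per observation (row), one key per variable (column).
# The values are made up for illustration.
observations = [
    {"site": "A", "year": 2015, "count": 12},
    {"site": "A", "year": 2016, "count": 15},
    {"site": "A", "year": 2017, "count": 18},
    {"site": "B", "year": 2015, "count": 7},
    {"site": "B", "year": 2016, "count": 9},
    {"site": "B", "year": 2017, "count": 11},
]


def filter_rows(predicate):
    """Verb: keep only the rows for which predicate(row) is true."""
    return lambda rows: [row for row in rows if predicate(row)]


def summarise_by(key, column, func, name):
    """Verb: collapse each group of rows sharing `key` into one summary row."""
    def step(rows):
        groups = {}
        for row in rows:
            groups.setdefault(row[key], []).append(row[column])
        return [{key: group, name: func(values)}
                for group, values in groups.items()]
    return step


def arrange(column, descending=False):
    """Verb: sort rows by the values in `column`."""
    return lambda rows: sorted(rows, key=lambda row: row[column],
                               reverse=descending)


def pipeline(data, *steps):
    """Pass the data through each step in turn, left to right."""
    return reduce(lambda acc, step: step(acc), steps, data)


# The whole analysis reads as a chain of verbs.
result = pipeline(
    observations,
    filter_rows(lambda row: row["year"] >= 2016),
    summarise_by("site", "count", mean, "mean_count"),
    arrange("mean_count", descending=True),
)
print(result)
```

Because every verb takes a table and returns a table, steps can be added, removed or reordered without touching the others, which is the property that makes tidy pipelines easy to read.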


Python

  • jupyter
    Python notebooks
  • sympy
    Symbolic maths package for Python. Allows for symbolic manipulation of mathematical expressions, including differentiation and integration.
  • pymc3
    Bayesian sampling package for Python, built on Theano.
  • pelican
    Static website generator used to build this blog.
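
As a small illustration of the sympy entry above, the sketch below differentiates and integrates an expression symbolically. The expression itself is an arbitrary example, and the block assumes sympy is installed.

```python
import sympy as sp

x = sp.symbols("x")
expr = x**2 * sp.sin(x)  # arbitrary example expression

# Symbolic differentiation and (indefinite) integration with respect to x.
derivative = sp.diff(expr, x)
antiderivative = sp.integrate(expr, x)

# Sanity check: differentiating the antiderivative recovers the original.
residual = sp.simplify(sp.diff(antiderivative, x) - expr)

print("d/dx:", derivative)
print("integral:", antiderivative)
print("residual:", residual)
```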


Other

  • ubuntu
    Linux OS distribution.
  • pandoc
    Converts between markdown, .docx, html, latex and various other formats.
  • sublime
    Text editor in which I am writing this!
  • stan
    Bayesian sampling.
  • mingw
    Linux-style command-line tools for Windows.
  • git
    Decentralised version control system.

I use the University of Southampton's supercomputer, Iridis, for heavy computational work. Mostly this involves embarrassingly parallel tasks, such as running simulations over experimental design points, or fitting variants of a family of statistical models for the purposes of model comparison.
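
"Embarrassingly parallel" means that each task can run without communicating with any other, so the work splits cleanly across cores or cluster jobs. The sketch below shows the pattern on a single machine using Python's standard library. It is an illustration only, not the setup used on the cluster: the toy simulate function, the (mu, sigma, seed) design points and the helper names are all hypothetical.

```python
import itertools
import random
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def simulate(point):
    """Run one toy simulation at a single design point.

    `point` is a (mu, sigma, seed) tuple. The result is the sample mean of
    1,000 Gaussian draws, standing in for a real simulation output.
    """
    mu, sigma, seed = point
    rng = random.Random(seed)  # per-task generator keeps runs reproducible
    draws = [rng.gauss(mu, sigma) for _ in range(1000)]
    return sum(draws) / len(draws)


def make_design(mus, sigmas):
    """Full-factorial design: every combination of mu and sigma, each seeded."""
    combos = itertools.product(mus, sigmas)
    return [(mu, sigma, seed) for seed, (mu, sigma) in enumerate(combos)]


def run_all(points, executor_cls=ProcessPoolExecutor, max_workers=4):
    """Evaluate every design point independently, preserving input order."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(simulate, points))


if __name__ == "__main__":
    design = make_design(mus=[0.0, 1.0, 2.0], sigmas=[0.5, 1.0])
    for point, result in zip(design, run_all(design)):
        print(point, round(result, 3))
```

Since no task depends on another, the same design list could equally be split into separate batch jobs, one per point or per block of points. Passing executor_cls=ThreadPoolExecutor swaps in threads, which is convenient for quick checks.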