Software and Toolkit
This page gives links to my software and to projects I am working on, and describes the tools I use for data analysis and modelling.
Personal
- Github
My personal github page.
- ggfan
Github homepage of the R package I authored for the plotting of probability distributions. Based on fanplot.
Software
This section describes the tools I use for research and data analysis.
R
R is a free and open source statistical programming language. Some of the packages I use most often are listed below.
- Tidyverse
A suite of R packages for the tidy analysis of data. Tidy data is data stored as a table in which every column represents one variable and every row represents one (and only one) observation. Storing data in this way makes data processing and visualisation straightforward: each step in an analysis can be represented by a function, generally named after the verb describing the task it performs (e.g. "filter", "summarise"), and the complete analysis is then just a chain (or pipeline) of such functions (see the sketch after this list).
- rmarkdown
Markdown is a simple syntax for writing documents in plain text. R Markdown allows one to interweave markdown documents with R code chunks (a small example also follows this list).
- knitr
knitr runs the code chunks in documents written in R Markdown or LaTeX and inserts their output, or references to the relevant figures, into the rendered document.
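To give a flavour of this pipeline style, here is a minimal sketch using dplyr; the data set and variable names are invented purely for illustration.

```r
library(dplyr)

# A small tidy data set: one row per observation (invented for this example)
measurements <- data.frame(
  region = c("north", "north", "south", "south"),
  year   = c(2019, 2020, 2019, 2020),
  value  = c(2.1, 2.4, 3.0, 3.2)
)

# Each verb performs one step of the analysis; the pipe chains the steps together
measurements %>%
  filter(year == 2020) %>%              # keep only the 2020 observations
  group_by(region) %>%                  # one group per region
  summarise(mean_value = mean(value))   # collapse each group to its mean
```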
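And a minimal sketch of an R Markdown document, showing markdown prose interleaved with an R code chunk that knitr executes, placing the resulting figure in the output; the chunk label and contents are just placeholders.

````markdown
# A minimal report

Some explanatory prose, written in markdown.

```{r speed-plot, echo=TRUE}
# knitr runs this chunk and inserts the plot into the rendered document
plot(cars$speed, cars$dist)
```
````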
Python
- jupyter
Python notebooks.
- sympy
Symbolic maths package for Python. Allows for symbolic manipulation of mathematical expressions, including differentiation and integration.
- pymc3
Bayesian sampling package for Python, utilising Theano.
- pelican
Static website generator with which this blog is constructed.
Other
- ubuntu
Linux OS distribution.
- pandoc
Converts between markdown, .doc, html, latex and various other formats.
- sublime
Text editor in which I am writing this!
- stan
Probabilistic programming language for Bayesian sampling.
- mingw
Linux command line tools for Windows.
- git
Decentralised version control system.
I use the University of Southampton's supercomputer, iridis, to do the heavy computational lifting. Mostly this involves embarrassingly parallel tasks such as running simulations over experimental design points, or running variants of a family of statistical models for the purposes of model comparison.
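As a rough sketch of what such an embarrassingly parallel task can look like in R, here is a toy example; the design grid and the run_simulation() function are invented placeholders rather than my actual simulation code.

```r
library(parallel)

# Hypothetical grid of experimental design points
design <- expand.grid(
  sample_size = c(100, 500, 1000),
  effect_size = c(0.1, 0.5)
)

# Hypothetical simulation at a single design point
run_simulation <- function(i) {
  point <- design[i, ]
  draws <- rnorm(point$sample_size, mean = point$effect_size)
  data.frame(point, estimate = mean(draws))
}

# Each design point is independent of the others, so the rows can be farmed
# out to separate cores (mclapply forks, so it needs a Unix-like OS); on the
# cluster the same idea scales to separate jobs.
results <- mclapply(seq_len(nrow(design)), run_simulation, mc.cores = 4)
results <- do.call(rbind, results)
```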