Setting up Python for Data Analysis

By Anand Bisen

Note to self: I find myself performing these same or similar steps several times a year, and I always miss a step or two, which sends me back and forth resolving dependencies. So here they are. This run was performed on Fedora 20.

Install supporting packages system-wide (as root)

$ sudo yum groupinstall -y "Development Tools"
$ sudo yum install -y blas blas-devel atlas atlas-devel gcc-c++   \
	dvipng freetype-devel cairo-devel PyQt4-devel PyQt4 latex2html  \
	pygtk2-devel czmq czmq-devel tkinter lcms lcms-devel            \
	libjpeg-turbo libjpeg-turbo-devel gdal npm
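A quick way to confirm nothing was skipped is to ask rpm about each package afterwards. This is a minimal sketch and not part of the original steps: `missing_rpms` is a hypothetical helper name, and it assumes an RPM-based system where `rpm -q` exits non-zero for a package that is not installed.

```shell
# missing_rpms (hypothetical helper): print every package name that
# `rpm -q` does not report as installed.
missing_rpms() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# A few of the names from the yum command above; extend the list as needed.
missing_rpms blas-devel atlas-devel freetype-devel cairo-devel czmq-devel
```

No output means every package queried is present; any name printed still needs installing.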

Set up a virtual environment for data analysis

$ # As a regular user
$ mkdir -p ~/virtualenvs
$ virtualenv --system-site-packages ~/virtualenvs/pydata

$ # Activate the virtual environment
$ source ~/virtualenvs/pydata/bin/activate

$ # Install the gems (magical modules that make things happen)
$ pip install nose ipython jinja2 pyzmq tornado numpy scipy   \
  pandas scikit-learn patsy statsmodels matplotlib vincent    \
  pystan pandasql Pillow folium
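Before moving on, it helps to confirm that everything actually imports inside the activated environment. The following is a minimal sketch, assuming `python` on the PATH is the virtualenv's interpreter. `check_imports` is a made-up helper name, and some import names differ from the pip names: scikit-learn imports as `sklearn`, pyzmq as `zmq`, and ipython as `IPython`.

```shell
# check_imports (hypothetical helper): try to import each module with the
# active interpreter and report whether the import succeeded.
check_imports() {
    for mod in "$@"; do
        if python -c "import $mod" >/dev/null 2>&1; then
            echo "ok $mod"
        else
            echo "MISSING $mod"
        fi
    done
}

# Import names for the packages installed above.
check_imports nose IPython jinja2 zmq tornado numpy scipy pandas sklearn \
    patsy statsmodels matplotlib vincent pystan pandasql folium
```

Anything reported as MISSING usually points to a build dependency that was skipped in the system package step.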

Install some handy Node.js modules as a regular user

$ npm install d3 optimist shapefile queue-async d3-geo-projection

Install some cool utilities from GitHub

$ mkdir -p ~/opt
$ cd ~/opt

$ # Install topojson - handy with geo mapping
$ git clone https://github.com/mbostock/topojson.git