Data Science Docker Environment

Problem

Bootstrapping a new analyst on a Python + R data-science stack — Jupyter, scikit-learn, pandas, RStudio kernel, common viz libs — eats a day every time. The exact versions that worked last quarter rarely work today.

Approach

  • One Dockerfile that pins Python + R + library versions.
  • A docker-compose file that wires up a notebook server with a sane volume mount, port mapping, and an opinionated default working directory.
  • A small Makefile of common operations (build, start, shell, kill) so day-zero is one command.
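
As a rough illustration of the first bullet, a pinned Dockerfile might look like the sketch below. The base image tag, the package list, every version number, and the /work directory are placeholder assumptions chosen for the example, not the versions or paths this project actually uses. IRkernel is left unpinned here for brevity.

```dockerfile
# Sketch only: the image tag and all versions below are illustrative assumptions.
# The base image tag fixes the R version.
FROM rocker/r-ver:4.3.2

# Python, venv support, and libzmq (needed by the R Jupyter kernel).
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
       python3 python3-venv libzmq3-dev \
    && rm -rf /var/lib/apt/lists/*

# Use a virtualenv so pip never touches system-managed packages.
RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:${PATH}"

# Exact pins keep a rebuild next quarter identical to today's build.
RUN pip install --no-cache-dir \
    jupyterlab==4.0.9 \
    pandas==2.1.4 \
    scikit-learn==1.3.2 \
    matplotlib==3.8.2

# Register an R kernel with Jupyter, system-wide.
RUN R -e "install.packages('IRkernel'); IRkernel::installspec(user = FALSE)"

# Hypothetical default working directory, intended as the volume mount target.
WORKDIR /work
EXPOSE 8888
CMD ["jupyter", "lab", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]
```

Pinning with == rather than version ranges is what keeps last quarter's working set reproducible. The trade-off is that upgrades become a deliberate edit to this file instead of something that happens silently on rebuild.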

The image is used as a teaching environment and as the base for several internal analytics setups.