Theoretical knowledge is easy to forget unless you put it into practice by getting your hands dirty on real projects. To do that, you first need to prepare your working environment by installing the required packages.
- Install python
- Install anaconda
- Install jupyter
Install python
Which version of python to install is a common source of confusion. Python 3 broke backward compatibility with python 2, and because of that many legacy applications still prefer python 2.7. Since python 3 is the current version and we are not maintaining a legacy application, I recommend installing the latest python 3 release to play with. Also, 2.7 is being retired and jupyter is switching to python 3 only.
To install python, follow https://www.python.org/downloads/
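Once python is installed, a quick check like the one below can confirm which interpreter you are actually running. This is only a minimal sketch: the helper name is_python3 is made up for illustration and is not part of any library.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple is Python 3 or newer."""
    # The first element of the version tuple is the major version.
    return version_info[0] >= 3


# Print the full version string of the running interpreter, e.g. "3.x.y".
print("Running Python", sys.version.split()[0])
print("Python 3 or newer:", is_python3())
```

If the second line prints False, your shell is still picking up a python 2 interpreter.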
Install anaconda
Anaconda is an open source data science platform for managing environments and packages. It comes with python, conda and a set of pre-installed data science packages. The power of anaconda actually comes from conda, a package and environment manager geared toward the data science space. Anaconda is a fairly large download, so there is an alternative distribution called miniconda that comes with just conda and python. If you prefer miniconda, you need to use conda to install the packages you want.
The package installer for python is called pip. Conda differs from pip in that it is not limited to python packages: it can install packages from any software stack, so you can install non-python packages as well. Be aware that not all python libraries are available from conda, so you may need to use pip alongside conda to install the required packages.
Sample conda usage:
conda install package_name
conda install numpy scipy pandas
conda install numpy=1.10
conda remove package_name
conda update package_name
conda update --all => to update all packages in an environment
conda list => to list installed packages
conda search '*beautifulsoup*' => to search for packages by pattern (the quotes keep the shell from expanding the *)
To install anaconda: https://www.anaconda.com/download/
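After installing packages with conda or pip, it can be handy to check from python which ones are importable in the current environment. The sketch below uses only the standard library; the function name missing_packages and the package list are illustrative assumptions. Note that it checks import names, which can differ from the name you install (for example, the beautifulsoup4 package is imported as bs4).

```python
import importlib.util


def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list of packages we expect to have installed.
wanted = ["numpy", "scipy", "pandas"]
print("Not installed:", missing_packages(wanted))
```

Anything reported as missing can then be installed with conda install, or with pip if conda does not provide it.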
Install jupyter
Jupyter is a web application that serves notebooks (JSON files with the .ipynb extension). A notebook consists of code and markdown cells and is an essential tool for working with data. It can be used at any step of a data project, such as data cleaning, exploration, analysis, visualization and machine learning. And because a notebook is a shareable document that github renders automatically, it is easy to collaborate on it with other people.
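To make the "JSON file made of code and markdown cells" idea concrete, here is a rough sketch that builds a tiny notebook structure by hand and inspects it with the standard json module. The field names follow version 4 of the notebook format as I understand it; the cell contents and the helper count_cell_types are invented for illustration. In practice you would let Jupyter create these files rather than writing them yourself.

```python
import json

# A hand-written, minimal notebook: one markdown cell and one code cell.
notebook = {
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Data cleaning notes"],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # the cell has not been run yet
            "metadata": {},
            "outputs": [],
            "source": ["print('hello')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}


def count_cell_types(nb):
    """Count how many cells of each type a notebook dictionary contains."""
    counts = {}
    for cell in nb["cells"]:
        counts[cell["cell_type"]] = counts.get(cell["cell_type"], 0) + 1
    return counts


# Serialize to JSON text (what a .ipynb file holds) and read it back.
text = json.dumps(notebook, indent=1)
roundtrip = json.loads(text)
print(count_cell_types(roundtrip))
```

Because a notebook is plain JSON, it can be diffed, versioned and rendered by other tools without running Jupyter.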
Jupyter comes with anaconda by default. To install it yourself, run one of:
conda install jupyter notebook
pip install jupyter notebook
To start the notebook server (preferably from the directory that contains the notebooks you want to work on), run:
jupyter notebook
Notebooks can be converted to other formats such as HTML, Markdown and slideshows. nbconvert comes with Jupyter by default; simply run:
jupyter nbconvert --to html notebook.ipynb
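If you need to convert notebooks from a script, one option is to call the same jupyter nbconvert command through the standard subprocess module. The following is a sketch under that assumption: nbconvert_command and convert are hypothetical helper names, and convert only works if jupyter is installed and the notebook file exists, so it is defined here but not called.

```python
import subprocess


def nbconvert_command(notebook_path, to="html"):
    """Build the argument list for: jupyter nbconvert --to <format> <notebook>."""
    return ["jupyter", "nbconvert", "--to", to, notebook_path]


def convert(notebook_path, to="html"):
    """Run nbconvert; raises CalledProcessError if the command fails."""
    subprocess.run(nbconvert_command(notebook_path, to), check=True)


# Show the command that would be executed for the example above.
print(" ".join(nbconvert_command("notebook.ipynb")))
```

Passing to="markdown" or to="slides" builds the command for the other formats mentioned above.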
Useful shortcuts when working with a notebook:
ctrl + enter => execute a cell
esc + a => insert a cell above
esc + b => insert a new cell below
esc + m => convert a cell to a markdown cell
esc + y => convert a cell to a code cell
esc + l => show / hide line numbers
esc + dd => delete a cell
Check out http://nbviewer.jupyter.org/ to render your notebooks from any location.