In the past, programming has followed a general cycle: write, compile, check, execute. Although this cycle has proved reliable, modern programming workflows have different requirements. I have always been a proponent of text editors over IDEs, partly because I started out coding in text editors, which kept me from constantly leaning on autocomplete. Over time, however, I realized that IDEs save a lot of time; it is more about efficiency than practice. Either way, after writing the code, the process of executing it remained the same:
1. Open the terminal
2. Compile and run
When I started working in Data Science and Machine Learning, this process proved too cumbersome. For example, to visualize a dataset in Python, I would have to save the graph to disk, close the program, and then open the image to view it. I soon realized that this did not align with my philosophy of efficiency.
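A minimal sketch of that save-close-view cycle, assuming matplotlib (in a notebook, `%matplotlib inline` renders the same figure directly below the cell instead):

```python
# Plain-script workflow: the figure must be written to disk and opened
# separately -- the extra steps a notebook makes unnecessary.
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt

xs = list(range(10))
ys = [x * x for x in xs]

fig, ax = plt.subplots()
ax.plot(xs, ys, marker="o")
ax.set_xlabel("x")
ax.set_ylabel("x squared")
fig.savefig("plot.png")  # save locally, then open the file to view it
```

In a Jupyter cell the `fig.savefig` line disappears entirely; the plot appears inline the moment the cell runs.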
Figure 1: Conventional programming techniques employed text editors such as Notepad++ or IDEs such as PyCharm
When it comes to Data Science, real-time interaction with the data is paramount: it lets Data Scientists work dynamically with data and numbers. One such tool is Jupyter Notebook, an open-source web app for interactive coding. The web app supports Python, Julia, and R, among many other languages.
Figure 2: Jupyter is an open source web based tool for interactive Data Science and Machine Learning
Each Jupyter Notebook is backed by a kernel, the process that analyzes and runs your code; for Python, that kernel is IPython. Jupyter supports multiple kernels, so a custom kernel can be created to match one's requirements.
During the initial stages of my foray into Data Science, I realized that I needed separate environments for different tasks. The availability of multiple kernels was a boon. I could set up TensorFlow and all its dependencies in one kernel and data visualization in another.
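Setting up such per-task kernels is straightforward. A hedged sketch, assuming conda and the `ipykernel` package (the environment and display names here are illustrative):

```shell
# Create an isolated environment for TensorFlow work
conda create -n tf-env python=3 tensorflow ipykernel
conda activate tf-env
# Register the environment as a selectable Jupyter kernel
python -m ipykernel install --user --name tf-env --display-name "TensorFlow"
# Verify that Jupyter can see the new kernel
jupyter kernelspec list
```

A second environment for visualization can be registered the same way; each then shows up in the notebook's Kernel menu.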
Figure 3: IPython forms the backbone of a Python Notebook
One of the biggest drawbacks of using Jupyter Notebooks for Python development or Data Science is that they are quite resource intensive. However, this drawback can be overcome with Azure Notebooks: Jupyter Notebooks hosted in the cloud on Azure virtual machines.
Figure 4: Azure Notebooks are Jupyter Notebooks that are hosted on Azure
Access to Azure Notebooks is completely free, and you do not even need an Azure account. Beyond the cost benefit, Azure Notebooks come pre-installed with a variety of packages that aid in Data Science, and additional packages can be installed through the Jupyter front-end interface using magic commands. The pre-installed packages include the entire Anaconda stack as well as numerous Microsoft-specific modules such as CNTK and Azure ARM.
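For example, installing an extra package from inside a notebook takes a single cell (the package name here is only an illustration):

```python
# In a notebook cell -- the %pip magic installs into the environment of the
# running kernel, which is safer than a plain `!pip` shell escape that may
# target a different Python installation.
%pip install plotly
```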
Figure 5: Azure notebooks come with a wide range of packages pre-installed
Another major benefit of Azure Notebooks is that the tool is not exclusive to Python: it also supports R and F#. Microsoft has also loaded Azure Notebooks with a plethora of study and practice material in the form of notebooks that let users approach Data Science and programming interactively. I found myself constantly going through the sample notebooks to find new and interesting ways to use Jupyter and Azure.
Jupyter Notebooks can also be imported from fellow Azure Notebooks users or from GitHub, making it fast to pull in resources. You can even access the terminal of the machine hosting your notebook.
Figure 6: “Libraries” can be easily shared via GitHub or within Azure Notebooks
Although Azure Notebooks works independently of Microsoft Azure, the two services are closely intertwined: Azure Notebooks is the service that launches when you visualize data or make Python-specific code changes in Azure Machine Learning Studio. Veterans of interactive Data Science will feel right at home on Azure Notebooks.
What do you use for Data Science? We would love to hear about the new technologies or tools that you use. Let us know in the comments section.