How to create interactive visualizations in Jupyter Notebook with Plotly and ipywidgets

Editor's note: Zalando data engineer Alex Martinelli explains how to create interactive visual content in Jupyter Notebook with Plotly and ipywidgets.

This article describes how to create interactive content in Jupyter Notebook. By content I mainly mean visual content, but as we will soon see, that covers not only the usual charts but also interactive interfaces and animations that help with exploring data.

I will first briefly introduce data visualization and define more precisely what interactivity means in this article. Then I will give a quick overview of the tools involved (Plotly and ipywidgets), along with some general advice on the Jupyter ecosystem. Finally, I will show some concrete examples, most of them from my personal projects, and describe what the interactive content added to each project. The examples are meant to demonstrate what these tools can do within the powerful framework of Jupyter. I hope they encourage you to try these techniques in your own projects and to spread the word.

Introduction

Data visualization is one of the core skills of a good data scientist (or of anyone whose role involves working with data). On the one hand, it lets you (or other members of the team) better understand the nature of a data set; on the other hand, it lets you convey the right message to external audiences, both technical and non-technical.

When you search for "data visualization" on the Internet, D3.js is one of the most frequently recommended libraries. Some people argue that the more personalized and customized your approach needs to be, the more it pays to learn this tool in depth. Conversely, if your goal is fast, "standard" visualization, it may be more appropriate to use the packages available in your programming language of choice.

In the Python ecosystem, you inevitably start with the basic building blocks of Matplotlib and may then move on to higher-level alternatives (for example, Seaborn or Bokeh). In addition, more and more people use Jupyter notebooks, especially for data science charts. I believe the main reason is that Jupyter combines different media in one simple solution: code, text, and visualization. It not only makes the whole workflow more convenient and more fun, but also greatly simplifies sharing results, both for teaching and for collaboration.

Personally, I find that the combination of Matplotlib and Seaborn in Jupyter best meets my visualization needs, and that it works better than Pandas's built-in plotting functions. When animation is needed, the simple animation framework that Matplotlib has provided since version 1.1 is a good compromise between ease of use and the look of the results.

However, there always comes a point where more is needed. That does not mean moving to a completely new tool, but adding new tools as the need arises. In my opinion, interactivity falls into two categories:

Interactive charts: display real-time information about the data point or area currently being browsed, and highlight or hide specific content.

Widgets for content interaction: these extend beyond charts to other media and content, and generally require more complex UI elements to collect additional input.

So far, I have found that Plotly covers the first requirement well. For the second, I specifically looked for solutions that can be easily embedded in Jupyter, so that I can turn any notebook into an interactive dashboard. In the end I found ipywidgets.

Tools

Before introducing these tools in detail, I would like to share the Python and Jupyter stack that I personally use:

Anaconda (Python distribution)

Use virtualenv to manage multiple isolated Python environments (or use Anaconda's own environment management solution)

Use nb_conda_kernels to manage multiple environments/kernels for Jupyter notebooks

Use Jupyter extensions to customize Jupyter

Plotly

Plotly makes it extremely convenient to create and share interactive charts. It even provides seamless web hosting for charts: hosting public content is free, while private content is paid. If needed, Plotly can also be used offline, rendering interactive charts directly in the notebook or exporting them to (interactive) HTML files.

If this is not enough, Plotly has two more attractive features: plot_mpl and cufflinks.

plot_mpl converts pure matplotlib figures directly into interactive Plotly figures (the conversion is not perfect yet, but it is improving rapidly).

Cufflinks binds Plotly to Pandas dataframes automatically. With this binding, a single call produces a high-quality visualization based on the content of the dataframe, for example parsing dates and displaying them in an appropriate format, inferring colors, and managing labels.

Getting started with Plotly is usually straightforward. Just run:

pip install plotly

pip install cufflinks

Then choose online or offline mode for the notebook. After that, just remember that .iplot() is the magic call for displaying content in Jupyter.
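A minimal sketch of the offline workflow, assuming a recent Plotly release in which plotly.offline.init_notebook_mode and plotly.offline.iplot are available. The data values and the helper names build_figure and show_in_notebook are made up for illustration. A Plotly figure can be described as a plain dictionary with "data" and "layout" keys, which is what the first helper builds:

```python
def build_figure(days, hours_slept):
    """Describe a line chart as a plain dict in Plotly's figure format."""
    trace = {
        "type": "scatter",
        "mode": "lines+markers",
        "x": list(days),
        "y": list(hours_slept),
        "name": "hours slept",
    }
    layout = {
        "title": {"text": "Sleep per day (made-up data)"},
        "xaxis": {"title": {"text": "day"}},
        "yaxis": {"title": {"text": "hours"}},
    }
    return {"data": [trace], "layout": layout}


def show_in_notebook(figure):
    """Render the figure inline; meant to be called from a Jupyter cell."""
    # Imported here so that build_figure can be used without Plotly installed.
    from plotly.offline import init_notebook_mode, iplot

    init_notebook_mode()  # inject plotly.js into the notebook for offline use
    iplot(figure)


figure = build_figure(["Mon", "Tue", "Wed"], [7.5, 6.0, 8.2])
```

In a notebook cell, show_in_notebook(figure) would then display the interactive chart.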

ipywidgets

ipywidgets makes it easy to create interactive interfaces in the notebook. Again, it strikes a balance between flexibility and ease of use.

After installing ipywidgets by following its installation steps, you can use it right away. It provides components for interactive interfaces, such as sliders, checkboxes, buttons, and text boxes.

Getting started with ipywidgets is easy: you only need to understand the interact function. It automatically creates widgets based on the arguments passed in. For example, a Boolean argument generates a checkbox, and a list generates a drop-down menu.
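The type inference described above might look like this in practice. The function summarize and its sample values are hypothetical; interact is the real ipywidgets entry point, which maps a Boolean value to a checkbox and a list to a drop-down menu:

```python
# Made-up sample data, for illustration only.
SAMPLE = {"sleep_hours": [7.5, 6.0, 8.2], "steps": [9000, 12000, 4000]}


def summarize(column, show_mean=True):
    """Return a one-line summary (mean or max) of the chosen column."""
    values = SAMPLE[column]
    stat = sum(values) / len(values) if show_mean else max(values)
    label = "mean" if show_mean else "max"
    return f"{column}: {label} = {stat:.1f}"


def build_ui():
    """Create the widgets; meant to be called from a Jupyter cell."""
    # Imported here so that summarize can be used without ipywidgets installed.
    from ipywidgets import interact

    # list -> drop-down menu, bool -> checkbox
    return interact(summarize, column=list(SAMPLE), show_mean=True)
```

Calling build_ui() in a notebook cell would show a drop-down and a checkbox, and the summary would be recomputed whenever either one changes.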

These mechanisms are quite straightforward, and the official documentation describes them well, so let us move straight on to the next section.

Gallery

To demonstrate what I have explained so far, and to shamelessly promote myself, I will give a few examples from my personal projects.

All of my projects can be found on GitHub (5agado), and some of the simple projects are self-contained, reproducible notebooks.

Example 1: Fitbit dashboard

This project may be of particular interest to practitioners of the quantified self, and to anyone who wants a simple dashboard for exploring personal data and gaining insights without building a complete dashboard application.

My Fitbit sleep data is an example of data that calls for visualizing different statistics (sleep efficiency, sleep score, time to first fall asleep, and so on) at different scales (day, week, year, and so on). Seaborn's factorplot (renamed catplot in Seaborn 0.9) can visualize a set of target statistics at once.

This works in simple cases but is less effective in complex ones. For example, visualizing weekday data by month:

The result in the picture above is poor: the information is cluttered, and the data is hard to read for people like me who struggle to tell colors apart. Plotly, in contrast, offers a good solution for this situation. We can create Plotly charts in three different ways:

Pass the matplotlib figure directly to the iplot_mpl method

Create a chart from scratch using Plotly syntax

Use cufflinks to plot directly from a Pandas dataframe

In this case, the last method is the most convenient and the most accurate.
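A rough sketch of the cufflinks route, assuming a cufflinks version that is compatible with the installed Plotly and Pandas. The column names date and efficiency and all values are made up, and the helper show_chart is hypothetical; cf.go_offline() and DataFrame.iplot() are the cufflinks calls. This is not the code of the original project:

```python
import pandas as pd

# Made-up sleep records standing in for the Fitbit data.
records = pd.DataFrame({
    "date": pd.to_datetime(["2017-01-02", "2017-01-03", "2017-01-09",
                            "2017-02-06", "2017-02-07", "2017-02-13"]),
    "efficiency": [91, 88, 93, 85, 90, 87],
})
records["month"] = records["date"].dt.month
records["weekday"] = records["date"].dt.day_name()

# One row per month, one column per weekday, mean efficiency in each cell.
table = records.pivot_table(
    index="month", columns="weekday", values="efficiency", aggfunc="mean"
)


def show_chart(frame):
    """Draw an interactive bar chart; meant to be called from a Jupyter cell."""
    # Imported here so that the data preparation works without cufflinks.
    import cufflinks as cf

    cf.go_offline()  # use Plotly's offline mode for DataFrame.iplot
    frame.iplot(kind="bar", title="Sleep efficiency by month and weekday",
                xTitle="month", yTitle="mean efficiency")
```

In a notebook, show_chart(table) would draw one group of bars per month, and hovering over a bar shows its exact value.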

Example 2: Nutritional database

Sometimes we need a simple interface for some structured content. I had this need, for example, while exploring nutrition data, specifically the USDA National Nutrient Database, which is rich in data and relatively complex. Even so, it is not difficult to analyze with Python and Pandas.

Some might argue that this really belongs to the realm of databases and that SQL tools would be more appropriate. That may be true, but I still want to show how easily ipywidgets handles the task inside Jupyter. Compared with a separate database system, this can be a better interim solution, especially during the data exploration phase.

As the figure above shows, ipywidgets is very convenient to use: just pass the interact function the original Python function together with the parameters that should be automatically mapped to widgets:

food: an empty string, which is converted into a text box

nutrient: a list, which is converted into a drop-down menu
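The mapping above can be sketched as follows. The tiny table, its column names, and the function lookup are assumptions for illustration and do not reflect the actual layout of the USDA database or the code of the original project:

```python
import pandas as pd

# Made-up rows standing in for the USDA data; column names are assumptions.
NUTRIENTS = pd.DataFrame({
    "food": ["Apple, raw", "Apple juice", "Banana, raw", "Milk, whole"],
    "protein_g": [0.3, 0.1, 1.1, 3.2],
    "sugar_g": [10.4, 9.6, 12.2, 5.1],
})


def lookup(food="", nutrient="protein_g"):
    """Return foods whose name contains `food`, sorted by the chosen nutrient."""
    mask = NUTRIENTS["food"].str.contains(food, case=False, regex=False)
    matches = NUTRIENTS.loc[mask, ["food", nutrient]]
    return matches.sort_values(nutrient, ascending=False)


def build_ui():
    """Create the widgets; meant to be called from a Jupyter cell."""
    # Imported here so that lookup can be used without ipywidgets installed.
    from ipywidgets import interact

    # str -> text box, list -> drop-down menu
    return interact(lookup, food="", nutrient=["protein_g", "sugar_g"])
```

With build_ui() in a notebook cell, typing into the text box filters the rows, and the drop-down selects which nutrient column is shown and sorted.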

Example 3: Animation

As mentioned earlier, with good tools at hand, interactive content is useful in many scenarios, animation among them.

Passing a (min, max, step) tuple to the interact function automatically generates the corresponding slider. Such a simple integer slider helps when viewing animations at different resolutions or levels of complexity, for example when simulating Conway's Game of Life:
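A minimal sketch of a slider-driven Game of Life, assuming NumPy and, for display, Matplotlib. The grid wraps around at the edges, which is one possible convention rather than necessarily the one used in the original project, and the names life_step, run, and build_ui are hypothetical:

```python
import numpy as np


def life_step(grid):
    """Advance a 2-D array of 0/1 cells by one generation (edges wrap around)."""
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # A cell is alive next turn if it has exactly 3 live neighbours,
    # or if it is alive now and has exactly 2 live neighbours.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(int)


def run(grid, generations):
    """Apply life_step the given number of times and return the final grid."""
    for _ in range(generations):
        grid = life_step(grid)
    return grid


# A "blinker": three cells in a row that flip between horizontal and vertical.
blinker = np.zeros((5, 5), dtype=int)
blinker[2, 1:4] = 1


def build_ui():
    """Create the slider; meant to be called from a Jupyter cell."""
    import matplotlib.pyplot as plt
    from ipywidgets import interact

    def draw(generations=0):
        plt.imshow(run(blinker, generations), cmap="binary")
        plt.show()

    # (min, max, step) tuple -> integer slider
    return interact(draw, generations=(0, 20, 1))
```

Dragging the slider created by build_ui() recomputes the grid from the start for the selected number of generations, which is simple but only practical for small grids.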

We can also customize the interface to browse data and related analysis results quickly. For example, I once visualized the output of the intermediate layers of a CNN trained on a cats-and-dogs dataset:

There are many visualization frameworks and tools, especially for machine learning tasks, but sometimes a quick and dirty solution like the one above provides all the information and functionality actually needed while saving a lot of time.

Conclusion

The examples above use only the basic features of these tools. For ipywidgets, for instance, I only used the interact function and its type inference. ipywidgets also supports custom widgets and widget composition; see the official documentation for details.

I hope this article has demonstrated how simple and customizable these tools are. I would also welcome feedback on data visualization tools, especially frameworks that cover 3D modeling and animation.

Finally, I recommend dunovank/jupyter-themes, a GitHub repository that provides many Jupyter themes and visualization styles.