Python Visualization is Possible with this

in python •  4 years ago 


Visualization is an important skill for a data scientist. A good visualization communicates the insights found in an analysis clearly, and it is also a good way to understand a dataset better. Our brains are wired to pick out patterns and trends from visual data far more easily than from reading raw details.
In this article, I will cover visualization from the basics using Python. Below are the steps we will follow:
Step 1: Importing data
Step 2: Basic visualization using Matplotlib
Step 3: More advanced visualizations, still using Matplotlib
Step 4: Building quick visualizations for data analysis using Seaborn
Step 5: Building interactive charts

By the end of this journey, you will be equipped with what you need to build a visualization. We will not cover every chart that can be built, but you will learn the concepts behind building one, so it should be easy to build charts that are not covered in this article.
The scripts and the data used in this article can also be found in the git repository here. All data used in this article can be found in the "Data" folder within the mentioned git repository and the scripts are available in the folders 'Day23, Day 24, and Day25'.
Importing Data
The first step is to read the required datasets. We can use pandas to read the data. Below is a simple command that can be used to read data from a CSV file.
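As a minimal sketch of reading a CSV with pandas: the column names and values below are made up for illustration, and an in-memory string stands in for a file so the snippet runs on its own. With a real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
# With a real file: df = pd.read_csv("path/to/your_file.csv")
csv_text = """order_date,customer,amount
2021-01-01,A,120.0
2021-01-01,B,80.0
2021-01-02,A,50.0
"""

# parse_dates converts the named column to datetime while reading
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["order_date"])

print(df.head())    # first rows of the dataset
print(df.dtypes)    # column types after parsing
```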
After reading the dataset, it is important to transform it into a shape that suits the visualization we want to build. For example, suppose we have sales details at the customer level and want a chart showing the day-wise sales trend. We would first group the data and aggregate it at the day level, and then plot it as a trend chart.
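The day-level aggregation described above could look roughly like this. The `sales` table, its column names, and its values are hypothetical; the point is only the group-then-aggregate step.

```python
import pandas as pd

# Hypothetical customer-level sales records (illustrative values only)
sales = pd.DataFrame({
    "order_date": pd.to_datetime([
        "2021-01-01 09:30", "2021-01-01 14:10", "2021-01-02 11:00",
        "2021-01-03 08:45", "2021-01-03 17:20",
    ]),
    "customer": ["A", "B", "A", "C", "B"],
    "amount": [120.0, 80.0, 50.0, 30.0, 70.0],
})

# Drop the time of day, then sum the amounts for each calendar day
daily_sales = (
    sales.groupby(sales["order_date"].dt.normalize())["amount"]
    .sum()
    .reset_index()
)

print(daily_sales)
```

The resulting `daily_sales` frame has one row per day, which is the shape a trend chart expects.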
Basic Visualization using Matplotlib
Let us start with some basic visualization. It is better to use the code 'fig,ax=plt.subplots()', where 'plt.subplots()' is a function that returns a tuple containing a figure object and an axes object, which are assigned to the variables 'fig' and 'ax' respectively. You can draw a chart without it, but holding on to the figure lets you change it later, for example to resize the chart depending on how it looks or to save it to a file. Similarly, the 'ax' variable can be used to add labels to the axes. Below is a simple example where I have passed the data as an array and plotted it directly.
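A minimal sketch of the two charts discussed in this section: the first plots an array directly, the second adds axis labels and a title. The values, label text, figure size, and output file name are placeholders of my own choosing.

```python
import matplotlib.pyplot as plt

values = [1, 4, 9, 16, 25]  # illustrative data

# First chart: pass the array straight to the axes object
fig, ax = plt.subplots()
ax.plot(values)

# Second chart: same data, with labels and a title set through the axes
fig2, ax2 = plt.subplots(figsize=(8, 4))  # the figure controls the size
ax2.plot(values)
ax2.set_xlabel("Index")
ax2.set_ylabel("Value")
ax2.set_title("A simple line chart")

# The figure object is also what saves the chart to disk
fig2.savefig("basic_chart.png", dpi=100)
# plt.show()  # uncomment when running as a script with a display
```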
In the code above, the required libraries are imported first. The 'plt.subplots()' function then generates the figure and axes objects, and the data is passed directly as an array to the axes object to draw the chart. In the second chart, the axes variable 'ax' is also given labels for the x-axis and y-axis, and a title.
Trend Charts
Now, let's start using some real data and learn how to build interesting charts and customize them to make them more intuitive. As explained, in most real-life use cases the data needs some transformation before it can be charted. Below is an example where I have used the Netflix data, transformed to give the number of movies and TV shows per year. I have then used the 'plt.subplots()' function and added a few additional details to make the chart more intuitive and self-explanatory.
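The sketch below follows the same steps on a tiny made-up table rather than the real Netflix file. The column names `type` and `release_year` and the category values are assumptions about the dataset's layout, and the rows are invented for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the Netflix titles data
titles = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie", "TV Show", "TV Show", "Movie"],
    "release_year": [2018, 2018, 2018, 2019, 2019, 2020, 2020],
})

# Count titles per year and type, one column per type
counts = titles.groupby(["release_year", "type"]).size().unstack(fill_value=0)

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(counts.index, counts["Movie"], marker="o", label="Movies")
ax.plot(counts.index, counts["TV Show"], marker="o", label="TV shows")

# Extra details that make the chart self-explanatory
ax.set_xticks(list(counts.index))  # show whole years only
ax.set_xlabel("Release year")
ax.set_ylabel("Number of titles")
ax.set_title("Movies and TV shows by year")
ax.legend()
ax.grid(alpha=0.3)
```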
There are a few more customizations that can be applied to the above chart, such as creating a dual axis. In the above case there isn't much difference between the number of movies and TV shows, so the data appears fine on a single axis. If there were a huge difference between them, the smaller series would be flattened and the chart would not be very clear. In those cases we can use a dual axis, so that the attribute with smaller values gets its own scale alongside the other one.
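One way to build such a dual axis in Matplotlib is `Axes.twinx()`, which creates a second y-axis sharing the same x-axis. The numbers below are invented purely to show two series on very different scales.

```python
import matplotlib.pyplot as plt

# Illustrative values only: one series is far larger than the other
years = [2017, 2018, 2019, 2020]
movies = [400, 600, 750, 800]
tv_shows = [12, 20, 35, 41]

fig, ax_left = plt.subplots(figsize=(10, 5))
ax_left.plot(years, movies, color="tab:blue", marker="o")
ax_left.set_xlabel("Year")
ax_left.set_ylabel("Movies", color="tab:blue")
ax_left.set_xticks(years)

# Second y-axis on the right, sharing the x-axis with the first
ax_right = ax_left.twinx()
ax_right.plot(years, tv_shows, color="tab:orange", marker="s")
ax_right.set_ylabel("TV shows", color="tab:orange")

ax_left.set_title("Movies vs TV shows on separate scales")
```

Colouring each axis label to match its line helps readers tell which scale belongs to which series.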
Scatter Plots
We can also use a scatter plot to bring out any relationship between the variables we are plotting. This plot helps reveal the correlation between variables, that is, what happens to one attribute when the other increases or decreases.
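A short sketch of a scatter plot on synthetic data. The variable names and the relationship between them are invented for illustration: `sales` is generated to rise with `ad_spend` plus random noise.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data with a built-in positive relationship (illustrative only)
rng = np.random.default_rng(42)
ad_spend = rng.uniform(10, 100, size=50)
sales = 3.0 * ad_spend + rng.normal(0, 20, size=50)

fig, ax = plt.subplots(figsize=(7, 5))
ax.scatter(ad_spend, sales, alpha=0.7)
ax.set_xlabel("Ad spend")
ax.set_ylabel("Sales")
ax.set_title("Sales vs ad spend")
```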
More advanced visualizations, still using Matplotlib
Once you are comfortable with the simple trend charts we have covered so far, you are ready to move on to slightly more advanced charts and to features that let you customize your visualizations further.
Bar Charts
Bar charts help us compare multiple values at the same time by plotting them side by side. There are different kinds of bar charts:
Vertical Bar Chart
Horizontal Bar Chart
Stacked Bar Chart

Below is an example of a bar chart with a number of customizations added to the plot. They are:
Axis labels and title are added
Font size has been provided
Figure size is provided as well (default chart would look much smaller and cluttered)
A function is used to generate and add values to the top of each of the bars to help the viewers get the actual details
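The customizations listed above can be sketched as follows. The helper name `add_value_labels`, the categories, and the counts are hypothetical; the labels are placed with `Axes.annotate`, which is one of several ways to do it.

```python
import matplotlib.pyplot as plt


def add_value_labels(ax, bars):
    """Write each bar's height just above the bar (hypothetical helper)."""
    for bar in bars:
        height = bar.get_height()
        ax.annotate(
            f"{height:.0f}",
            xy=(bar.get_x() + bar.get_width() / 2, height),
            xytext=(0, 3),                 # 3 points above the bar
            textcoords="offset points",
            ha="center", va="bottom", fontsize=12,
        )


# Illustrative values only
genres = ["Drama", "Comedy", "Action", "Documentary"]
counts = [120, 95, 60, 150]

fig, ax = plt.subplots(figsize=(10, 6))    # larger than the default size
bars = ax.bar(genres, counts, color="steelblue")
ax.set_xlabel("Genre", fontsize=12)
ax.set_ylabel("Number of titles", fontsize=12)
ax.set_title("Titles by genre", fontsize=14)
add_value_labels(ax, bars)
```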

Horizontal and Stacked Bar Chart
Vertical bar charts are the most common, but horizontal bar charts are useful when the data labels are long and difficult to print below a vertical bar. In a stacked bar chart, the bars are stacked on top of one another within a category. Below is an example of implementing horizontal and stacked bar charts. The code also includes customization of the chart colors.
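A rough sketch of both chart types side by side, using made-up categories and values. `Axes.barh` draws horizontal bars, and the `bottom` argument of `Axes.bar` is what stacks one series on top of another.

```python
import matplotlib.pyplot as plt

# Illustrative values only
countries = ["United States", "United Kingdom", "South Korea"]
movies = [30, 45, 25]
tv_shows = [20, 15, 35]
totals = [m + t for m, t in zip(movies, tv_shows)]

fig, (ax_h, ax_s) = plt.subplots(1, 2, figsize=(12, 5))

# Horizontal bars: long labels sit comfortably on the y-axis
ax_h.barh(countries, totals, color="teal")
ax_h.set_xlabel("Number of titles")
ax_h.set_title("Horizontal bar chart")

# Stacked bars: the second series starts where the first one ends
ax_s.bar(countries, movies, color="#4c72b0", label="Movies")
ax_s.bar(countries, tv_shows, bottom=movies, color="#dd8452", label="TV shows")
ax_s.set_ylabel("Number of titles")
ax_s.set_title("Stacked bar chart")
ax_s.legend()

fig.tight_layout()
```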
Pie and Donut Chart
Pie charts are useful for showing the proportion of different categories in the data. A pie chart can easily be turned into a donut chart by covering its center with a circle and re-aligning the text and values to suit the donut shape. Below is a simple example where I have implemented a pie chart and then modified it into a donut chart.
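The pie-to-donut approach described above can be sketched like this. The categories, sizes, circle radius, and `pctdistance` value are illustrative choices, not values taken from the original script.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Circle

# Illustrative values only
labels = ["Movies", "TV shows", "Documentaries"]
sizes = [55, 30, 15]

fig, (ax_pie, ax_donut) = plt.subplots(1, 2, figsize=(10, 5))

# Plain pie chart
ax_pie.pie(sizes, labels=labels, autopct="%1.1f%%", startangle=90)
ax_pie.set_title("Pie chart")

# Donut: same pie, percentages pushed outward, center covered by a circle
wedges, texts, autotexts = ax_donut.pie(
    sizes, labels=labels, autopct="%1.1f%%", startangle=90, pctdistance=0.8
)
ax_donut.add_patch(Circle((0, 0), 0.6, color="white"))
ax_donut.set_title("Donut chart")
```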
Why is it important to learn Matplotlib?
Matplotlib is a very important visualization library in Python because many other Python visualization libraries depend on it. Some of the benefits of learning Matplotlib are:
It is easy to learn
It is efficient
It allows a lot of customization, which makes it possible to build almost any kind of visual
Libraries like Seaborn are built on top of Matplotlib

I have covered only the most essential visualizations in Matplotlib, but by practicing these charts you will have acquired the knowledge needed to build many more. Matplotlib supports a large number of visualizations; here is the link to the gallery of all supported charts.
Building quick visualizations for data analysis using Seaborn
We have covered a variety of visualizations using the Matplotlib library. You may have noticed that although Matplotlib offers a high degree of customization, it involves a lot of coding. That can be time-consuming, especially during exploratory analysis, when you want to make a few quick plots to understand the data better and make decisions faster. That is exactly what the Seaborn library offers. Here are some benefits of using Seaborn:
Its default themes are attractive
Simple and quick to build visualizations especially for data analysis
Its declarative API allows us to just focus on the key elements of the charts

There are a few downsides too: it offers less customization than Matplotlib, and it could lead to memory issues, especially when working with large datasets. Still, the benefits outweigh the disadvantages.
Visualizations with just one line of code
Below are some simple visualizations that are implemented with just a single line of code each using the Seaborn library.
These visualizations are each created with just a single line of code, and they look quite presentable as well. The Seaborn library is widely used in the data analysis phase because charts can be built quickly, with little or no effort needed to make them presentable. Visualization is key in data analysis because it helps bring out patterns in the data, and the Seaborn library is well suited to that purpose.
Heatmaps
Heatmaps are another interesting visualization, widely used on time-series data to bring out seasonality and other patterns in the dataset. However, to build a heatmap we first need to transform the data into a specific format that supports heatmap plotting. Below is sample code that transforms the data to suit the heatmap plot and then uses the Seaborn library to build the heatmap.
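A sketch of that transformation on synthetic monthly data: the long table is pivoted into a year-by-month grid, which is the matrix shape `sns.heatmap` expects. The `sales` values follow an invented formula purely so that a seasonal pattern is visible.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Synthetic monthly series (illustrative values only)
dates = pd.date_range("2019-01-01", "2020-12-01", freq="MS")
sales = pd.DataFrame({"date": dates, "sales": 100 + 10 * dates.month})

# Split the date into the two dimensions of the heatmap
sales["year"] = sales["date"].dt.year
sales["month"] = sales["date"].dt.month

# Rows = years, columns = months, cells = total sales
pivot = sales.pivot_table(index="year", columns="month",
                          values="sales", aggfunc="sum")

fig, ax = plt.subplots(figsize=(12, 3))
sns.heatmap(pivot, annot=True, fmt=".0f", cmap="YlGnBu", ax=ax)
ax.set_title("Monthly sales by year")
```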
Pair Plot - my favorite functionality of Seaborn
I consider the pair plot one of the best features of the Seaborn library. It compares each attribute in the dataset with every other attribute visually, again in a single line of code. Below is sample code to build pair plots. A pair plot might not be feasible when the dataset has a large number of columns. In those cases, pair plots can be used to analyze the relationships among a specific set of attributes only.
Building interactive charts


While working on data science projects, there is sometimes a requirement to share a visualization with the business teams. Dashboarding tools are widely used for this purpose, but suppose you notice an interesting pattern during data analysis and would like to share it with a business user. If it is shared as an image, there is not much the business user can do with it. If it is shared as an interactive chart, the business user can look into the granular details by zooming in or out, or use other functionality to interact with the chart. Below is an example where we create an HTML file as output; it includes the visualization, can be shared with any other user, and can simply be opened in a web browser.
If you are keen to learn more about visualization using Python, please check out my playlist below. It includes three videos, with a total tutorial length of just over one hour.
