Python Pandas Tutorial

in utopian-io •  7 years ago 

Repository

https://github.com/pandas-dev/pandas

What Will I Learn?

  • Python Pandas - DataFrame
  • Python Pandas - Panel

Requirements


The standard Python distribution does not come bundled with the pandas module. A lightweight option is to install pandas using the popular Python package installer, pip.

                                pip install pandas

If you install the Anaconda Python distribution, pandas is installed by default. The following distributions include the SciPy stack:

Anaconda (from https://www.continuum.io) is a free Python distribution for the SciPy stack. It is available for Windows, Linux and Mac.

Canopy (https://www.enthought.com/products/canopy/) is available as a free as well as a commercial distribution with the full SciPy stack for Windows, Linux and Mac.

Python (x,y) is a free Python distribution with the SciPy stack and the Spyder IDE for Windows. (Downloadable from http://python-xy.github.io/)
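Whichever installation route is used, a quick check like the following sketch can confirm that pandas is importable. The variable name `version` is only for illustration.

```python
import pandas as pd

# Read the installed pandas version string; an ImportError above
# would mean the installation did not succeed.
version = pd.__version__
print("pandas version:", version)
```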

Difficulty

  • Intermediate

Tutorial

A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.

####Features of DataFrame

  • Columns can potentially be of different types
  • Size is mutable
  • Labeled axes (rows and columns)
  • Arithmetic operations can be performed on rows and columns
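The features listed above can be sketched in a few lines. This is only an illustration: the student names, ages and scores are made-up values, and the column names `Score`, `Passed` and `AgePlusOne` are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical student records; all values are invented for illustration.
df = pd.DataFrame(
    {"Name": ["Alex", "Bob", "Clarke"], "Age": [10, 12, 13], "Score": [7.5, 8.0, 6.5]},
    index=["s1", "s2", "s3"],  # labeled row axis
)

# Columns can hold different types (text, integers, floats).
print(df.dtypes)

# Size is mutable: a new column can be added to the existing object.
df["Passed"] = df["Score"] >= 7.0

# Both axes carry labels.
print(list(df.index), list(df.columns))

# Arithmetic on a column, and a sum across each row.
df["AgePlusOne"] = df["Age"] + 1
row_totals = df[["Age", "Score"]].sum(axis=1)
print(row_totals)
```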

####Structure

Let us assume that we are creating a data frame with student data.

[Figure: example student data laid out in rows and columns]

####Create DataFrame

A pandas DataFrame can be created using various inputs like

  • Lists
  • dict
  • Series
  • Numpy ndarrays
  • Another DataFrame
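Lists, dicts and Series are covered in the sections that follow. For the last two inputs in the list, a minimal sketch might look like this; the array contents and the column names `a` and `b` are arbitrary choices for the example.

```python
import numpy as np
import pandas as pd

# From a 2D NumPy ndarray: each inner row becomes a DataFrame row.
arr = np.array([[1, 2], [3, 4], [5, 6]])
df_from_array = pd.DataFrame(arr, columns=["a", "b"])

# From another DataFrame: copy=True makes the new object independent.
df_copy = pd.DataFrame(df_from_array, copy=True)
df_copy.loc[0, "a"] = 100  # does not affect df_from_array

print(df_from_array)
print(df_copy)
```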

####Create an Empty DataFrame

The most basic DataFrame that can be created is an empty DataFrame.

    # import the pandas library under the alias pd
    import pandas as pd
    df = pd.DataFrame()
    print(df)

Output

    Empty DataFrame
    Columns: []
    Index: []

Create a DataFrame from Lists

The DataFrame can be created using a single list or a list of lists.

Example 1

    import pandas as pd
    data = [1, 2, 3, 4, 5]
    df = pd.DataFrame(data)
    print(df)

Output

![screenshotAtUploadCC_1527224341368.png](https://cdn.steemitimages.com/DQmNntq1cNrcVi78gMA9x4GAxend48uSLB9nxiKBKsMVU22/screenshotAtUploadCC_1527224341368.png)

Example 2

    import pandas as pd
    data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
    df = pd.DataFrame(data, columns=['Name', 'Age'])
    print(df)

Output

         Name  Age
    0    Alex   10
    1     Bob   12
    2  Clarke   13

Create a DataFrame from Dict of ndarrays / Lists

All the ndarrays must be of the same length. If an index is passed, then the length of the index should equal the length of the arrays.

If no index is passed, then by default the index will be range(n), where n is the array length.

Example

    import pandas as pd
    data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}
    df = pd.DataFrame(data)
    print(df)

Output

        Name  Age
    0    Tom   28
    1   Jack   34
    2  Steve   29
    3  Ricky   42
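The index rule described in this section can be sketched as follows. The labels `rank1` to `rank4` are arbitrary names picked for the illustration, and the last part assumes that a mismatched index length is reported as a `ValueError`.

```python
import pandas as pd

data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}

# No index passed: the index defaults to 0..n-1.
default_df = pd.DataFrame(data)

# Explicit index: its length must match the length of the lists.
labeled_df = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])
print(labeled_df)

# An index of the wrong length is rejected.
mismatch_raised = False
try:
    pd.DataFrame(data, index=['rank1', 'rank2'])
except ValueError as err:
    mismatch_raised = True
    print("ValueError:", err)
```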

Create a DataFrame from Dict of Series

A dictionary of Series can be passed to form a DataFrame. The resulting index is the union of all the Series indexes passed.

Example

    import pandas as pd

    d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
         'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

    df = pd.DataFrame(d)
    print(df)

Output

       one  two
    a  1.0    1
    b  2.0    2
    c  3.0    3
    d  NaN    4

Column Addition

We will understand this by adding a new column to an existing data frame.

Example

    import pandas as pd

    d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
         'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

    df = pd.DataFrame(d)

    # Add a new column to an existing DataFrame by assigning a Series to a new column label
    print("Adding a new column by passing as Series:")
    df['three'] = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
    print(df)

    # Add a new column computed from existing columns
    print("Adding a new column using the existing columns in DataFrame:")
    df['four'] = df['one'] + df['three']
    print(df)

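Output

    Adding a new column by passing as Series:
       one  two  three
    a  1.0    1   10.0
    b  2.0    2   20.0
    c  3.0    3   30.0
    d  NaN    4    NaN
    Adding a new column using the existing columns in DataFrame:
       one  two  three  four
    a  1.0    1   10.0  11.0
    b  2.0    2   20.0  22.0
    c  3.0    3   30.0  33.0
    d  NaN    4    NaN   NaN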

Column Deletion

Columns can be deleted with del or removed with pop; let us take an example to understand how.

Example

    # Using a DataFrame like the previous one, delete columns with del and pop
    import pandas as pd

    d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
         'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
         'three': pd.Series([10, 20, 30], index=['a', 'b', 'c'])}

    df = pd.DataFrame(d)
    print("Our dataframe is:")
    print(df)

    # using the del statement
    print("Deleting the first column using DEL function:")
    del df['one']
    print(df)

    # using the pop method
    print("Deleting another column using POP function:")
    df.pop('two')
    print(df)

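Output

    Our dataframe is:
       one  two  three
    a  1.0    1   10.0
    b  2.0    2   20.0
    c  3.0    3   30.0
    d  NaN    4    NaN
    Deleting the first column using DEL function:
       two  three
    a    1   10.0
    b    2   20.0
    c    3   30.0
    d    4    NaN
    Deleting another column using POP function:
       three
    a   10.0
    b   20.0
    c   30.0
    d    NaN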
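One detail the example does not show is that pop hands back the removed column. The sketch below illustrates that, and also uses `DataFrame.drop`, which is not part of the tutorial above but is a common non-destructive alternative. The data values are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({'one': [1, 2, 3], 'two': [4, 5, 6], 'three': [7, 8, 9]})

# pop removes the column from df and returns it as a Series.
popped = df.pop('two')

# drop returns a new DataFrame and leaves df itself unchanged.
smaller = df.drop(columns=['three'])

print(popped)
print(df.columns.tolist())
print(smaller.columns.tolist())
```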

Row Selection, Addition, and Deletion

We will now understand row selection, addition and deletion through examples. Let us begin with the concept of selection.

Selection by Label

Rows can be selected by passing a row label to the loc indexer.

    import pandas as pd

    d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
         'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

    df = pd.DataFrame(d)
    print(df.loc['b'])

Output

    one    2.0
    two    2.0
    Name: b, dtype: float64
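The section heading also mentions row addition and deletion, which the example above does not reach. A minimal sketch of those operations, together with selection by position, might look like this. It assumes a recent pandas release, where rows are added with `pd.concat`; the data values are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({'one': [1, 2, 3], 'two': [10, 20, 30]}, index=['a', 'b', 'c'])

# Selection by integer position with iloc (second row).
by_position = df.iloc[1]

# Selecting several rows with a slice (rows 'a' and 'b').
sliced = df[0:2]

# Addition: concatenate a one-row DataFrame onto the existing one.
new_row = pd.DataFrame({'one': [4], 'two': [40]}, index=['d'])
extended = pd.concat([df, new_row])

# Deletion: drop a row by its label; a new DataFrame is returned.
trimmed = extended.drop('a')

print(extended)
print(trimmed)
```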

Python Pandas - Panel

A panel is a 3D container of data. The term panel data is derived from econometrics and is partially responsible for the name pandas − pan(el)-da(ta)-s. Note that Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so the examples in this section run only on older versions.

The names for the 3 axes are intended to give some semantic meaning to describing operations involving panel data. They are

  • items − axis 0, each item corresponds to a DataFrame contained inside.

  • major_axis − axis 1, it is the index (rows) of each of the DataFrames.

  • minor_axis − axis 2, it is the columns of each of the DataFrames.
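The three axes can be pictured with a plain 3D NumPy array, which runs on any installation. This is only an analogy for the axis layout, not the Panel API itself.

```python
import numpy as np

# 2 items, each a 4 x 5 table (4 rows on the major axis, 5 columns on the minor axis).
data = np.arange(2 * 4 * 5).reshape(2, 4, 5)

first_item = data[0]          # axis 0: one whole 4 x 5 table
first_major = data[:, 0, :]   # axis 1: first row of every table
first_minor = data[:, :, 0]   # axis 2: first column of every table

print(first_item.shape, first_major.shape, first_minor.shape)
```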

pandas.Panel()

A Panel can be created using the following constructor −

                                        pandas.Panel(data, items, major_axis, minor_axis, dtype, copy)

Create Panel

A Panel can be created using multiple ways like −

  • From ndarrays
  • From dict of DataFrames

From 3D ndarray

    # creating a panel from a 3D ndarray
    import pandas as pd
    import numpy as np

    data = np.random.rand(2, 4, 5)
    p = pd.Panel(data)
    print(p)

Output

    <class 'pandas.core.panel.Panel'>
    Dimensions: 2 (items) x 4 (major_axis) x 5 (minor_axis)
    Items axis: 0 to 1
    Major_axis axis: 0 to 3
    Minor_axis axis: 0 to 4

From dict of DataFrame Objects

    # creating a panel from a dict of DataFrames
    import pandas as pd
    import numpy as np

    data = {'Item1': pd.DataFrame(np.random.randn(4, 3)),
            'Item2': pd.DataFrame(np.random.randn(4, 2))}
    p = pd.Panel(data)
    print(p)

Output

    <class 'pandas.core.panel.Panel'>
    Dimensions: 2 (items) x 4 (major_axis) x 3 (minor_axis)
    Items axis: Item1 to Item2
    Major_axis axis: 0 to 3
    Minor_axis axis: 0 to 2

Create an Empty Panel

    # creating an empty panel
    import pandas as pd
    p = pd.Panel()
    print(p)

Output

    <class 'pandas.core.panel.Panel'>
    Dimensions: 0 (items) x 0 (major_axis) x 0 (minor_axis)
    Items axis: None
    Major_axis axis: None
    Minor_axis axis: None

Selecting the Data from Panel

Select the data from the panel using −

  • Items
  • Major_axis
  • Minor_axis

Using Items

    # creating a panel from a dict of DataFrames and selecting one item
    import pandas as pd
    import numpy as np

    data = {'Item1': pd.DataFrame(np.random.randn(4, 3)),
            'Item2': pd.DataFrame(np.random.randn(4, 2))}
    p = pd.Panel(data)
    print(p['Item1'])

output

The result is the 4 x 3 DataFrame stored under 'Item1'; its values are random, so they differ on every run.
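Since Panel is no longer available in recent pandas releases (it was removed in 0.25), a comparable selection can be sketched with a DataFrame that carries a two-level index. This is a substitute technique, not the Panel API; the level names `item` and `major` are assumptions chosen to mirror the axis names above.

```python
import numpy as np
import pandas as pd

frames = {
    'Item1': pd.DataFrame(np.random.randn(4, 3)),
    'Item2': pd.DataFrame(np.random.randn(4, 2)),
}

# Stack the DataFrames; the dict keys become the outer index level.
# Item2 has no column 2, so that column is filled with NaN for its rows.
stacked = pd.concat(frames, names=['item', 'major'])

item1 = stacked.loc['Item1']              # like selecting by item
major0 = stacked.xs(0, level='major')     # like selecting by major_axis
minor0 = stacked[0]                       # like selecting by minor_axis

print(item1)
print(major0)
print(minor0)
```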

Proof of Work Done

https://gist.github.com/faizaraja/9c2c9cbb97e74e7d37e830d5056cfadf
