Data Types and Structures in Pandas

 In Pandas, there are three primary data structures: Series, DataFrame, and Index.

  1. Series:

A Series is a one-dimensional labeled array that can hold values of any data type, such as integers, strings, floating-point numbers, or arbitrary Python objects. It is comparable to a single column in a spreadsheet. A Series is created with the pd.Series() constructor.

Example:

python
import pandas as pd

# Build a Series from a plain Python list; pandas assigns the labels 0..4
data = [10, 20, 30, 40, 50]
s = pd.Series(data)
print(s)

Output:

text
0    10
1    20
2    30
3    40
4    50
dtype: int64
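
To make the "any data type" point concrete, here is a minimal sketch showing how pandas picks a dtype for a Series based on its contents. The variable names (ints, floats, mixed) are illustrative only and are not part of the original example.

```python
import pandas as pd

# Illustrative Series; the names below are assumptions for this sketch
ints = pd.Series([10, 20, 30])       # whole numbers -> an integer dtype
floats = pd.Series([1.5, 2.5, 3.5])  # decimals -> a floating-point dtype
mixed = pd.Series([1, "two", 3.0])   # mixed Python values -> generic object dtype

# Each Series reports the dtype pandas inferred for its values
print(ints.dtype)
print(floats.dtype)
print(mixed.dtype)
```

A Series that mixes unrelated types falls back to the generic object dtype, which is flexible but usually slower than a purely numeric column.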

  2. DataFrame:

A DataFrame is a two-dimensional labeled table in which each column can hold a different data type, such as integers, strings, floating-point numbers, or Python objects. It is comparable to a spreadsheet or an SQL table. A DataFrame is created with the pd.DataFrame() constructor.

Example:

python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values
data = {'Name': ['John', 'Mike', 'Sarah', 'Jasmine'], 'Age': [25, 30, 27, 29], 'Gender': ['Male', 'Male', 'Female', 'Female']}
df = pd.DataFrame(data)
print(df)

Output:

text
      Name  Age  Gender
0     John   25    Male
1     Mike   30    Male
2    Sarah   27  Female
3  Jasmine   29  Female
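
As a short sketch of how the two structures relate, the snippet below rebuilds the same table, inspects the per-column dtypes, and pulls out a single column, which comes back as a Series. The variable name ages is an assumption made for this illustration.

```python
import pandas as pd

data = {
    'Name': ['John', 'Mike', 'Sarah', 'Jasmine'],
    'Age': [25, 30, 27, 29],
    'Gender': ['Male', 'Male', 'Female', 'Female'],
}
df = pd.DataFrame(data)

# Each column keeps its own dtype, so text and numbers can sit side by side
print(df.dtypes)

# Selecting one column returns a Series that shares the DataFrame's row labels
ages = df['Age']
print(type(ages).__name__)
print(ages.mean())  # average of 25, 30, 27, 29 -> 27.75
```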

  3. Index:

An Index is an immutable, array-like object that labels the rows and columns of a DataFrame (and the entries of a Series). When no index is supplied, pandas assigns a default integer index that runs from 0 to n-1, where n is the number of rows.

Example:

python
import pandas as pd 
data = {'Name': ['John', 'Mike', 'Sarah', 'Jasmine'], 'Age': [25, 30, 27, 29], 'Gender': ['Male', 'Male', 'Female', 'Female']} 
df = pd.DataFrame(data, index=['row1', 'row2', 'row3', 'row4']) 
print(df)

Output:

text
         Name  Age  Gender
row1     John   25    Male
row2     Mike   30    Male
row3    Sarah   27  Female
row4  Jasmine   29  Female

In the above example, we have assigned custom row labels using the index parameter of the pd.DataFrame() function.
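
The default index and the immutability of an Index can be checked directly. The following is a minimal sketch; the names default_df, labeled_df, and immutable are hypothetical and chosen only for this illustration.

```python
import pandas as pd

ages = [25, 30, 27, 29]

# Without an index argument, pandas supplies a default 0..n-1 integer index
default_df = pd.DataFrame({'Age': ages})
print(default_df.index)

# Custom row labels allow label-based lookup with .loc
labeled_df = pd.DataFrame({'Age': ages}, index=['row1', 'row2', 'row3', 'row4'])
print(labeled_df.loc['row2', 'Age'])  # 30

# An Index cannot be modified element by element
immutable = False
try:
    labeled_df.index[0] = 'first'
except TypeError:
    immutable = True
print(immutable)

# Replacing the whole index at once is allowed
labeled_df.index = ['a', 'b', 'c', 'd']
print(list(labeled_df.index))
```

Because individual labels cannot be changed in place, relabeling is done by assigning a complete new index, as in the last step.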
