Data Types and Structures in Pandas
In Pandas, there are three primary data structures: Series, DataFrame, and Index.
- Series:
A Series is a one-dimensional labeled array that can hold any data type, such as integers, strings, floating-point numbers, or Python objects. It is similar to a single column in a spreadsheet. A Series is created with the pd.Series() constructor.
Example:
import pandas as pd
data = [10, 20, 30, 40, 50]
s = pd.Series(data)
print(s)
Output:
0    10
1    20
2    30
3    40
4    50
dtype: int64
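As a small sketch of the points above, the following shows a Series with custom labels and a Series holding mixed Python objects. The variable names and the labels "a", "b", "c" are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# A Series with hypothetical string labels instead of the default 0..n-1
prices = pd.Series([10, 20, 30], index=["a", "b", "c"])

# A Series holding mixed Python objects uses the generic "object" dtype
mixed = pd.Series([1, "two", 3.0])

print(prices["b"])   # look up a value by its label
print(mixed.dtype)   # object
```

Label-based lookup is what distinguishes a Series from a plain Python list: each value can be reached through its label as well as its position.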
- DataFrame:
A DataFrame is a two-dimensional labeled table whose columns can hold different data types, such as integers, strings, floating-point numbers, or Python objects. It is similar to a spreadsheet or an SQL table. A DataFrame is created with the pd.DataFrame() constructor.
Example:
import pandas as pd
data = {'Name': ['John', 'Mike', 'Sarah', 'Jasmine'],
        'Age': [25, 30, 27, 29],
        'Gender': ['Male', 'Male', 'Female', 'Female']}
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age  Gender
0     John   25    Male
1     Mike   30    Male
2    Sarah   27  Female
3  Jasmine   29  Female
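To connect the two structures, here is a minimal sketch showing that selecting a single column of a DataFrame yields a Series. It reuses the data from the example above; the variable name ages is an illustrative assumption.

```python
import pandas as pd

data = {'Name': ['John', 'Mike', 'Sarah', 'Jasmine'],
        'Age': [25, 30, 27, 29],
        'Gender': ['Male', 'Male', 'Female', 'Female']}
df = pd.DataFrame(data)

# Selecting one column returns a Series
ages = df['Age']

print(type(ages).__name__)  # Series
print(df.shape)             # (4, 3): four rows, three columns
print(ages.mean())          # 27.75
```

In other words, a DataFrame can be thought of as a collection of Series that share the same row labels.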
- Index:
An Index is an immutable array-like object that labels the rows and columns of a DataFrame (and the entries of a Series). If no index is given, the row index defaults to the integers 0 through n-1, where n is the number of rows.
Example:
import pandas as pd
data = {'Name': ['John', 'Mike', 'Sarah', 'Jasmine'],
        'Age': [25, 30, 27, 29],
        'Gender': ['Male', 'Male', 'Female', 'Female']}
df = pd.DataFrame(data, index=['row1', 'row2', 'row3', 'row4'])
print(df)
Output:
         Name  Age  Gender
row1     John   25    Male
row2     Mike   30    Male
row3    Sarah   27  Female
row4  Jasmine   29  Female
In the above example, we assigned custom row labels using the index parameter of pd.DataFrame().
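The following sketch illustrates the default index, label-based access, and what immutability means in practice. The small two-row frames and the replacement labels 'first' and 'second' are illustrative assumptions.

```python
import pandas as pd

labeled = pd.DataFrame({'Age': [25, 30]}, index=['row1', 'row2'])
default = pd.DataFrame({'Age': [25, 30]})

# Without an explicit index, pandas creates a RangeIndex (0 .. n-1)
print(type(default.index).__name__)  # RangeIndex

# Row and column labels can be used together with .loc
print(labeled.loc['row2', 'Age'])    # 30

# Individual index entries cannot be modified in place
try:
    labeled.index[0] = 'first'
except TypeError as err:
    print("Index is immutable:", err)

# The index as a whole can still be replaced with a new one
labeled.index = ['first', 'second']
print(list(labeled.index))           # ['first', 'second']
```

Immutability applies to the Index object itself: to relabel rows, assign a new index rather than editing entries of the existing one.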