Pandas cheatsheet
1. Creating a Pandas Series
import pandas as pd
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s)
2. Creating a Pandas DataFrame
data = {'Name': ['John', 'Anna', 'Peter', 'Linda', 'Tom'],
        'Age': [28, 24, 35, 32, 28],
        'City': ['New York', 'Paris', 'Berlin', 'London', 'New York']}
df = pd.DataFrame(data, index = ["a9", "b5", "d2", "r1", "x6"])
print(df)
3. Reading and Writing to CSV
- 
    
Reading/writing CSV:
df = pd.read_csv('data.csv') df.to_csv('output.csv', index=False) 
4. Accessing Rows and Columns (Indexing and Slicing)
- 
    
Accessing rows:
df.iloc[3] # Access by position df.loc['b5'] # Access by index label - 
    
Multiple indices:
df.loc[["b5", "d2"]] df.iloc[[2, 3]] - 
    
Slicing rows:
df.iloc[1:4:2] # Slicing from row 1 to 4, with steps of 2 - 
    
Accessing columns:
df['Name'] # Single column -> Series df[['Name', 'Age']] # Multiple columns -> DF 
5. statistics
- 
    
Statistical info of columns:
df['Age'].mean() df['Age'].median() df['Age'].sum() df['Age'].min() df['Age'].max() df['Age'].var() df['Age'].std() df['Age'].count() # Count non-null values 
6. Filtering Data
- 
    
Single condition: Filter rows based on a single condition.
df[df['Age'] > 30] - 
    
Filter with multiple conditions: You can combine multiple conditions using
&(and) or|(or), and parentheses().# Example: Filter rows where Age > 30 and City is 'New York' filtered_df = df[(df['Age'] > 30) & (df['City'] == 'New York')] # Example: Filter rows where Age < 30 or City is 'Paris' filtered_df = df[(df['Age'] < 30) | (df['City'] == 'Paris')] 
7. Grouping and Looping Over Groups
- 
    
Grouping by ‘City’ and Looping Over Groups:
grouped = df.groupby('City') for city, group in grouped: print(f"\nCity: {city}") print(group) 
8. Common Aggregation Functions
- 
    
Aggregation after
groupby:grouped['Age'].mean() grouped['Age'].sum() grouped['Age'].count() grouped['Age'].max() grouped['Age'].min() grouped['Age'].median() grouped['Age'].std() grouped['Age'].var() 
9. Column Operations: Adding, Multiplying, etc.
- 
    
Arithmetic operations on columns:
# Adding a new column by combining other columns df['Salary'] = [50000, 60000, 75000, 65000, 55000] df['Bonus'] = df['Salary'] * 0.10 # Adding, subtracting, multiplying columns df['Total Compensation'] = df['Salary'] + df['Bonus'] df['Years Until Retirement'] = 65 - df['Age'] 
10. Sorting Data
- 
    
Sort by a single column:
df.sort_values(by='Age') - 
    
Sort by multiple columns:
df.sort_values(by=['Age', 'Salary'], ascending=[True, False])