PDA Assignments
  • Python For Data Analytics
    • 1.Python
      • 1.Python Documents
        • 1.Data Types
        • 2.Variables In Python
        • 3.Operators In Python
        • 4.User Input In Python
        • 5.TypeCasting In Python
        • 6.Strings In Python
        • 7.Conditional Statements In Python
        • 8.Branching using Conditional Statements and Loops in Python
        • 9.Lists In Python
        • 10.Sets In Python
        • 11.Tuples In Python
        • 12.Dictionary In Python
        • 13.Functions In Python
        • 14.File Handling In Python
        • 15.Numerical Computing with Python and Numpy
      • 2.Python Assignments
        • Data Type & Variables
        • Operators Assignment
        • User Input & Type Casting
        • Functions- Basic Assignments
        • String Assignments
          • String CheatSheet
        • Conditional Statements Assignments
        • Loops Assignments
        • List Assignments
          • List Cheatsheet
        • Set Assignments
          • Sets Cheatsheet
        • Dictionary Assignments
          • Dictionary Cheatsheet
        • Function Assignments
        • Functions used in Python
      • 3.Python Projects
        • Employee Management System
        • Hamming distance
        • Webscraping With Python
          • Introduction To Web Scraping
          • Importing Necessary Libraries
          • Basic Introduction To HTML
          • Introduction To BeautifulSoup
          • Flipkart Web Scraping
            • Scraping Step By Step
        • Retail Sales Analysis
        • Guess the Word Game
        • Data Collection Through APIs
        • To-Do List Manager
        • Atm-functionalities(nested if)
        • Distribution of Cards(List & Nested for)
        • Guess the Number Game
      • 4.Python + SQL Projects
        • Bookstore Management System
    • 2.Data Analytics
      • 1.Pandas
        • 1.Pandas Documents
          • 1.Introduction To Pandas
          • Reading and Loading Different Data
          • 2.Indexing and Slicing In Pandas
          • 3.Joining In Pandas
          • 4.Missing Values In Pandas
          • 5.Outliers In Pandas
          • 6.Aggregating Data
          • 7.DateTime In Pandas
          • 8.Validation In Pandas
          • 9.Fetching Data From SQL
          • 10. Automation In Pandas
          • 11.Matplotlib - Data Visualization
          • 12. Seaborn - Data Visualization
          • 13. Required Files
        • 3.Pandas Projects
          • Retail Sales Analysis
            • Retail Sales Step By Step
          • IMDB - Dataset Analysis - Basic
        • 2. Pandas Assignments
          • 1. Reading and Loading the Data
          • 2. Data frame Functions and Properties
          • 3. Series - Basic Operations
          • 4. Filtering in Pandas
          • 5. Advance Filtering
          • 6. Aggregate Functions & Groupby
          • 7. Pivot Tables
          • 8. Datetime
          • 9. String Functions
Powered by GitBook
On this page
  1. Python For Data Analytics
  2. 2.Data Analytics
  3. 1.Pandas
  4. 2. Pandas Assignments

6. Aggregate Functions & Groupby

Previous5. Advance FilteringNext7. Pivot Tables

Last updated 3 months ago

  1. Find the total revenue generated from all transactions.

Solution
print(df['Total_Amount'].sum())

# Output
401978248.89447826
  1. Find the average value of the orders.

Solution
print(df['Total_Amount'].mean())

# Output
1367.6869831155632
  1. Find the highest sale value.

Solution
print(df['Total_Amount'].max())

# Output
4999.625796
  1. Find the lowest sale value.

Solution
print(df['Total_Amount'].max())

# Output
10.00374959
  1. Find the average rating given by customers.

Solution
print(df['Ratings'].mean())

# Output
3.16230083256496
  1. Find the total number of products sold.

Solution
print(df['Total_Purchases'].sum())

# Output
1575323
  1. Find the city and country with highest frequencies.

Solution
print(df['City'].mode())

# Output
Chicago

print(df['Country'].mode())

# output
USA
  1. Count the number of transactions per country wise.

Solution
ndf=df.groupby('Country').agg(
    total_transactions=('Transaction_ID','count')
)
ndf.reset_index()

# Output
        Country	total_transactions
0	Australia 44170
1	Canada	44110
2	Germany	51433
3	UK	61398
4	USA	92800
  1. Find the total sales city wise.

Solution
ndf=df.groupby('City').agg(
    total_sale=('Total_Amount','sum')
)
ndf.reset_index()

# Output
	City	total_sale
0	Adelaide	2.972902e+06
1	Albuquerque	1.213217e+06
2	Albury-Wodonga	3.026290e+06
3	Arlington	1.169384e+06
4	Atlanta	1.229512e+06
...	...	...
125	Wichita	1.188115e+06
126	Windsor	2.954510e+06
127	Winnipeg	3.160659e+06
128	Wollongong	3.101860e+06
129	Wuppertal	3.043289e+06
130 rows × 2 columns
  1. Find out average sales for every product category.

Solution
ndf=df.groupby('Product_Category').agg(
    avg_revenue=('Total_Amount','mean')
)
ndf=ndf.reset_index()
ndf

# Output
Product_Category	avg_revenue
0	Books	1367.069440
1	Clothing	1368.820004
2	Electronics	1369.736547
3	Grocery	1366.061942
4	Home Decor	1366.480831
  1. Create a new dataset which contains the total sale value of each brand.

Solution
ndf=df.groupby('Product_Brand').agg(
    total_sale=('Total_Amount','sum')
)
ndf=ndf.reset_index()
ndf

# Output
Product_Brand	total_sale
0	Adidas	2.423476e+07
1	Apple	2.396481e+07
2	Bed Bath & Beyond	2.427537e+07
3	BlueStar	2.992519e+06
4	Coca-Cola	2.446867e+07
5	HarperCollins	2.433139e+07
6	Home Depot	2.415726e+07
7	IKEA	2.390750e+07
8	Mitsubhisi	9.008944e+06
9	Nestle	2.407735e+07
10	Nike	2.435496e+07
11	Penguin Books	2.417092e+07
12	Pepsi	4.042013e+07
13	Random House	2.422442e+07
14	Samsung	2.469212e+07
15	Sony	2.446911e+07
16	Whirepool	9.884276e+06
17	Zara	2.434375e+07
  1. Find out the highest average transaction value as per payment method.

Solution
ndf=df.groupby('Payment_Method').agg(
    avg_trnasaction_value=('Total_Amount','mean')
)
ndf=ndf.reset_index()
ndf.max()

# Output
Payment_Method                PayPal
avg_trnasaction_value    1369.456358
dtype: object
  1. Find the total number of transactions for each shipping method.

Solution
ndf=df.groupby('Shipping_Method').agg(
    Shipping_method_count=('Transaction_ID','count')
)
ndf.reset_index()


# Output
    Shipping_Method	Shipping_method_count
0	Express	        99600
1	Same-Day	101541
2	Standard	92770
  1. Find out the top sale from each country.

Solution
ndf=df.groupby('Country').agg(
    top_sale=('Total_Amount','max')
)
ndf.reset_index()

# Output
        Country	        top_sale
0	Australia	4997.714637
1	Canada	4999.171428
2	Germany	4998.603558
3	UK	4999.625796
4	USA	4999.340097
  1. Find out total number of transactions for each product category in every country.

Solution
ndf=df.groupby(['Country', 'Product_Category']).agg(
    total_transactions=('Transaction_ID','count')
)
ndf

# Output
		        total_transactions
Country	Product_Category	
Australia	Books	8165
Clothing	8055
Electronics	11038
Grocery	8743
Home Decor	8169
Canada	Books	7965
Clothing	8175
Electronics	11334
Grocery	8676
Home Decor	7960
Germany	Books	9581
Clothing	9549
Electronics	12575
Grocery	10162
Home Decor	9566
UK	Books	11455
Clothing	11643
Electronics	14786
Grocery	12048
Home Decor	11466
USA	Books	16033
Clothing	15860
Electronics	19632
Grocery	25497
Home Decor	15778
  1. Find out the total number of transactions for each payment method for every country.

Solution
ndf=df.groupby(['Country', 'Payment_Method']).agg(
     new=('Transaction_ID','count')
)
ndf

# Output
		          new
Country	Payment_Method	
Australia	Cash	10733
Credit Card	13879
Debit Card	11283
PayPal	8275
Canada	Cash	10742
Credit Card	13972
Debit Card	11134
PayPal	8262
Germany	Cash	12439
Credit Card	15690
Debit Card	13137
PayPal	10167
UK	Cash	14934
Credit Card	18026
Debit Card	15848
PayPal	12590
USA	Cash	23079
Credit Card	26214
Debit Card	23342
PayPal	20165
Payment_Method

75MB
retail_data.csv