7. Pivot Tables
Last updated
Last updated
Find the total sales amount for each year.
import pandas as pd
df=pd.read_csv('retail_data.csv')
ndf = df.pivot_table(values="Total_Amount", index="Year", aggfunc="sum").reset_index()
print(ndf)
# Output
Year Total_Amount
0 2023 3.358360e+08
1 2024 6.614229e+07
How many transactions were made for each unique total purchase count, categorized by customer segments?
ndf=df.pivot_table(index='Total_Purchases',values='Customer_Segment',aggfunc='count')
ndf.reset_index()
# Output
Total_Purchases Customer_Segment
0 1 31050
1 2 31084
2 3 31070
3 4 30795
4 5 31112
5 6 27767
6 7 27696
7 8 27966
8 9 27757
9 10 27614
Find the average purchase amount for each product category.
ndf=df.pivot_table(index="Product_Category", values='Amount',aggfunc='mean')
ndf.reset_index()
# Output
Product_Category Amount
0 Books 255.072599
1 Clothing 254.907741
2 Electronics 255.408202
3 Grocery 255.350918
4 Home Decor 254.904483
Find the number of transactions for each country.
ndf=df.pivot_table(index='Country',values='Transaction_ID',aggfunc='count')
ndf.reset_index()
# Output
Country Transaction_ID
Australia 44170
Canada 44110
Germany 51433
UK 61398
USA 92800
Find the maximum payment for each payment method.
ndf=df.pivot_table(index='Payment_Method',values='Total_Amount',aggfunc='max')
ndf.reset_index()
# Output
Payment_Method Total_Amount
0 Cash 4999.340097
1 Credit Card 4999.171428
2 Debit Card 4998.306569
3 PayPal 4999.625796
Find the minimum purchase amount of each product.
ndf=df.pivot_table(index='products',values='Amount',aggfunc='min')
ndf.reset_index()
# Output
products Amount
0 4K TV 11.092281
1 A-line dress 10.617622
2 Acer Iconia Tab 10.106953
3 Acer Swift 10.325613
4 Action 11.265383
... ... ...
313 Wrap dress 10.155775
314 Wrench 10.367558
315 Xiaomi Mi 11.782073
316 iPad 11.002920
317 iPhone 10.458969
318 rows × 2 columns
Find the total purchases made in each month across years.
ndf=df.pivot_table(index='Month',columns='Year',values='Total_Purchases',aggfunc='sum')
ndf
# Output
Year 2023 2024
Month
April 196563 19656
August 160573 12014
December 98493 196
February 1420 94283
January 80903 113377
July 151209 10570
June 95443 327
March 99971 297
May 139731 7993
November 95188 253
October 99394 290
September 96945 234
Find the most expensive purchase amount for each product type.
ndf=df.pivot_table(values="Amount", index="Product_Type", aggfunc="max")
ndf.reset_index()
# Output
Product_Type Amount
0 Bathroom 499.968673
1 Bedding 499.963011
2 BlueStar AC 499.863784
3 Children's 499.916128
4 Chocolate 499.989331
5 Coffee 499.894725
6 Decorations 499.975680
7 Dress 499.886695
8 Fiction 499.993884
9 Fridge 499.780098
10 Furniture 499.966831
11 Headphones 499.863222
12 Jacket 499.934010
13 Jeans 499.728422
14 Juice 499.971297
15 Kitchen 499.996681
16 Laptop 499.989082
17 Lighting 499.971256
18 Literature 499.975499
19 Mitsubishi 1.5 Ton 3 Star Split AC 499.971872
20 Non-Fiction 499.997024
21 Shirt 499.988382
22 Shoes 499.976353
23 Shorts 499.905745
24 Smartphone 499.965071
25 Snacks 499.397907
26 Soft Drink 499.948541
27 T-shirt 499.872190
28 Tablet 499.967846
29 Television 499.995511
30 Thriller 499.973765
31 Tools 499.966177
32 Water 499.997911
Find the average revenue from each city and segment.
ndf = df.pivot_table(values="Total_Amount", index="City", columns="Customer_Segment", aggfunc="mean")
print(ndf)
# Output
Customer_Segment New Premium Regular
City
Adelaide 1369.313135 1341.971086 1375.674019
Albuquerque 1422.536802 1283.270134 1422.539740
Albury-Wodonga 1338.948198 1355.163718 1400.024683
Arlington 1461.008394 1351.804828 1374.576625
Atlanta 1366.889174 1336.929332 1519.515216
... ... ... ...
Wichita 1350.645689 1455.371135 1412.965164
Windsor 1391.934619 1365.263345 1393.146160
Winnipeg 1378.766203 1274.070543 1352.827470
Wollongong 1472.783467 1392.782376 1357.131103
Wuppertal 1366.524067 1372.380386 1384.728925
[130 rows x 3 columns]
Find the minimum revenue for each product brand and order status.
ndf=df.pivot_table(index='Product_Brand',columns='Order_Status',values='Total_Amount',aggfunc='min')
ndf
# Output
Order_Status Delivered Pending Processing Shipped
Product_Brand
Adidas 10.484693 11.558308 13.318187 13.267327
Apple 11.696255 11.288107 10.928036 13.992273
Bed Bath & Beyond 10.362214 11.892097 11.035773 11.383778
BlueStar 10.312626 NaN NaN NaN
Coca-Cola 10.382533 13.914545 11.927405 11.392996
HarperCollins 10.304855 12.472113 11.333472 10.288169
Home Depot 10.198830 10.573064 12.494869 10.068154
IKEA 11.044818 13.457616 12.371498 12.115942
Mitsubhisi 10.800385 110.733110 15.309442 34.552923
Nestle 10.761919 12.512199 10.056353 10.947367
Nike 10.648666 12.162504 10.578870 10.601896
Penguin Books 10.226839 10.572502 10.707279 10.133500
Pepsi 10.092966 10.231967 10.304530 10.003750
Random House 10.934673 13.717490 10.172078 10.510516
Samsung 10.566244 12.491820 10.523771 11.536797
Sony 10.063269 11.832873 11.509879 10.831884
Whirepool 10.011336 NaN NaN NaN
Zara 11.548863 10.295241 11.218513 10.783579