
2. Data frame Functions and Properties

Last updated 7 months ago

Very Easy assignments

  1. Load the dataset and display it.

Solution
import pandas as pd
df = pd.read_csv('retail_data.csv')
df

# Output


Transaction_ID	Customer_ID	Name	Email	Phone	Address	City	State	Zipcode	Country	...	Total_Amount	Product_Category	Product_Brand	Product_Type	Feedback	Shipping_Method	Payment_Method	Order_Status	Ratings	products
0	8691788	37249	Michelle Harrington	Ebony39@gmail.com	1414786801	3959 Amanda Burgs	Dortmund	Berlin	77985	Germany	...	324.086270	Clothing	Nike	Shorts	Excellent	Same-Day	Debit Card	Shipped	5	Cycling shorts
1	2174773	69749	Kelsey Hill	Mark36@gmail.com	6852899987	82072 Dawn Centers	Nottingham	England	99071	UK	...	806.707815	Electronics	Samsung	Tablet	Excellent	Standard	Credit Card	Processing	4	Lenovo Tab
2	6679610	30192	Scott Jensen	Shane85@gmail.com	8362160449	4133 Young Canyon	Geelong	New South Wales	75929	Australia	...	1063.432799	Books	Penguin Books	Children's	Average	Same-Day	Credit Card	Processing	2	Sports equipment
3	7232460	62101	Joseph Miller	Mary34@gmail.com	2776751724	8148 Thomas Creek Suite 100	Edmonton	Ontario	88420	Canada	...	2466.854021	Home Decor	Home Depot	Tools	Excellent	Standard	PayPal	Processing	4	Utility knife
4	4983775	27901	Debra Coleman	Charles30@gmail.com	9098267635	5813 Lori Ports Suite 269	Bristol	England	48704	UK	...	248.553049	Grocery	Nestle	Chocolate	Bad	Standard	Cash	Shipped	1	Chocolate cookies
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
293906	4246475	12104	Meagan Ellis	Courtney60@gmail.com	7466353743	389 Todd Path Apt. 159	Townsville	New South Wales	4567	Australia	...	973.962984	Books	Penguin Books	Fiction	Bad	Same-Day	Cash	Processing	1	Historical fiction
293907	1197603	69772	Mathew Beck	Jennifer71@gmail.com	5754304957	52809 Mark Forges	Hanover	Berlin	16852	Germany	...	285.137301	Electronics	Apple	Laptop	Excellent	Same-Day	Cash	Processing	5	LG Gram
293908	7743242	28449	Daniel Lee	Christopher100@gmail.com	9382530370	407 Aaron Crossing Suite 495	Brighton	England	88038	UK	...	182.105285	Clothing	Adidas	Jacket	Average	Express	Cash	Shipped	2	Parka
293909	9301950	45477	Patrick Wilson	Rebecca65@gmail.com	9373222023	3204 Baird Port	Halifax	Ontario	67608	Canada	...	120.834784	Home Decor	IKEA	Furniture	Good	Standard	Cash	Shipped	4	TV stand
293910	2882826	53626	Dustin Merritt	William14@gmail.com	9518926645	143 Amanda Crescent	Tucson	West Virginia	25242	USA	...	2382.233417	Home Decor	Home Depot	Decorations	Average	Same-Day	Cash	Shipped	2	Clocks
293911 rows × 30 columns
  1. Check the number of rows and columns in the dataset.

Solution
df.shape

# Output
(293911, 30)
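
As a small illustration of what shape returns, here is a sketch on a made-up three-row DataFrame (the name toy and its values are hypothetical, not taken from retail_data.csv). It shows that shape is a (rows, columns) tuple and how it relates to len(), size and ndim.

```python
import pandas as pd

# Hypothetical toy data standing in for retail_data.csv
toy = pd.DataFrame({"Name": ["A", "B", "C"], "Ratings": [5, 4, 2]})

n_rows, n_cols = toy.shape   # shape is a (rows, columns) tuple
print(n_rows, n_cols)        # 3 2
print(len(toy))              # number of rows: 3
print(toy.size)              # total number of cells (rows * columns): 6
print(toy.ndim)              # number of dimensions of a DataFrame: 2
```

Note that shape is an attribute, not a method, so it is written without parentheses.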
  1. Check the index details of the dataset.

Solution
df.index

# Output

RangeIndex(start=0, stop=293911, step=1)
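
The RangeIndex above is the default index pandas assigns when none is specified. The following sketch, on a made-up DataFrame (toy and its values are hypothetical), shows how the index changes when a column is promoted to the index and how the default can be restored.

```python
import pandas as pd

# Hypothetical toy data; the column names mirror the retail dataset
toy = pd.DataFrame({"Customer_ID": [37249, 69749, 30192], "Ratings": [5, 4, 2]})

print(toy.index)                      # RangeIndex(start=0, stop=3, step=1)

# set_index returns a new DataFrame whose index is the chosen column
by_customer = toy.set_index("Customer_ID")
print(by_customer.index.tolist())     # [37249, 69749, 30192]

# reset_index moves the index back into a column and restores RangeIndex
restored = by_customer.reset_index()
print(isinstance(restored.index, pd.RangeIndex))   # True
```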
  1. Display all the columns of this dataset.

Solution
df.columns

# Output

Index(['Transaction_ID', 'Customer_ID', 'Name', 'Email', 'Phone', 'Address',
       'City', 'State', 'Zipcode', 'Country', 'Age', 'Gender', 'Income',
       'Customer_Segment', 'Date', 'Year', 'Month', 'Time', 'Total_Purchases',
       'Amount', 'Total_Amount', 'Product_Category', 'Product_Brand',
       'Product_Type', 'Feedback', 'Shipping_Method', 'Payment_Method',
       'Order_Status', 'Ratings', 'products'],
      dtype='object')
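
In the listing above, 'products' is the only column name written in lowercase. As an optional sketch (the empty DataFrame toy is hypothetical and only borrows three of the column names), the column labels can be turned into a plain list, checked for membership, and renamed for consistency.

```python
import pandas as pd

# Hypothetical empty frame that reuses three column names from the dataset
toy = pd.DataFrame(columns=["Transaction_ID", "Product_Type", "products"])

print(toy.columns.tolist())        # column labels as a plain Python list
print("Ratings" in toy.columns)    # False: membership test on the labels

# rename returns a new DataFrame; the original is left unchanged
renamed = toy.rename(columns={"products": "Products"})
print(renamed.columns.tolist())
```

Whether to rename the column is a style choice; the assignments here keep the original names.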

  1. Check the data type of each column.

Solution
df.dtypes

# Output

Transaction_ID        int64
Customer_ID           int64
Name                 object
Email                object
Phone                 int64
Address              object
City                 object
State                object
Zipcode               int64
Country              object
Age                   int64
Gender               object
Income               object
Customer_Segment     object
Date                 object
Year                  int64
Month                object
Time                 object
Total_Purchases       int64
Amount              float64
Total_Amount        float64
Product_Category     object
Product_Brand        object
Product_Type         object
Feedback             object
...
Payment_Method       object
Order_Status         object
Ratings               int64
products             object
dtype: object
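
Two things stand out in these dtypes: Zipcode is read as int64 and Date as object. The first output on this page shows a Zipcode of 4567, which may mean that a leading zero was dropped when the column was parsed as a number. The sketch below uses a tiny in-memory CSV with made-up values (the names, zip codes, dates and the month/day/year format are all assumptions) to show how dtype= keeps codes as text and how pd.to_datetime converts a text column to dates.

```python
import io
import pandas as pd

# Made-up CSV text; values and date format are assumptions for illustration
csv_text = (
    "Name,Zipcode,Date\n"
    "Customer A,04567,09/18/2023\n"
    "Customer B,25242,12/31/2023\n"
)

# Default parsing: Zipcode becomes an integer and loses its leading zero
default_df = pd.read_csv(io.StringIO(csv_text))
print(default_df["Zipcode"].tolist())    # [4567, 25242]

# Reading Zipcode as text preserves the code exactly as written
typed_df = pd.read_csv(io.StringIO(csv_text), dtype={"Zipcode": str})
print(typed_df["Zipcode"].tolist())      # ['04567', '25242']

# Convert the text dates to a datetime column
typed_df["Date"] = pd.to_datetime(typed_df["Date"], format="%m/%d/%Y")
print(typed_df.dtypes)
```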
  1. Create a sample dataset of 10 rows from the existing dataset.

Solution
df.sample(10)

# Output: 10 randomly selected rows; the selection differs on each run.
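
Because sample() picks rows at random, two runs usually return different rows. A minimal sketch on a made-up 100-row DataFrame (toy is hypothetical) shows how random_state makes the sample repeatable and how frac selects a fraction of the rows instead of a fixed count.

```python
import pandas as pd

# Hypothetical toy data: 100 rows numbered 0..99
toy = pd.DataFrame({"value": range(100)})

# The same random_state gives the same rows every time
a = toy.sample(10, random_state=42)
b = toy.sample(10, random_state=42)
print(a.equals(b))     # True

# frac=0.1 selects 10% of the rows instead of a fixed number
tenth = toy.sample(frac=0.1, random_state=0)
print(len(tenth))      # 10
```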
  1. Display the statistical summary of the dataset.

Solution
df.describe()

# Output

Transaction_ID	Customer_ID	Phone	Zipcode	Age	Year	Total_Purchases	Amount	Total_Amount	Ratings
count	2.939110e+05	293911.000000	2.939110e+05	293911.000000	293911.000000	293911.000000	293911.000000	293911.000000	293911.000000	293911.000000
mean	5.493726e+06	55013.400523	5.500607e+09	50288.383830	35.465767	2023.165125	5.359864	255.153307	1367.686983	3.162301
std	2.596086e+06	26009.435811	2.596111e+09	28976.614021	15.017749	0.371294	2.868440	141.388614	1128.895164	1.320762
min	1.000007e+06	10000.000000	1.000049e+09	501.000000	18.000000	2023.000000	1.000000	10.000219	10.003750	1.000000
25%	3.245886e+06	32470.000000	3.253497e+09	25408.000000	22.000000	2023.000000	3.000000	132.839683	438.852849	2.000000
50%	5.495879e+06	55027.000000	5.504466e+09	50586.000000	32.000000	2023.000000	5.000000	255.463226	1041.164351	3.000000
75%	7.738197e+06	77514.000000	7.749761e+09	75252.000000	46.000000	2023.000000	8.000000	377.638576	2028.954272	4.000000
max	9.999995e+06	99999.000000	9.999996e+09	99949.000000	70.000000	2024.000000	10.000000	499.997911	4999.625796	5.000000
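
By default describe() summarises every numeric column, including identifier-like columns such as Transaction_ID, Phone and Zipcode, where a mean or standard deviation has little meaning. One possible approach, sketched here on made-up data (toy and its values are hypothetical), is to select the measurement columns before calling describe().

```python
import pandas as pd

# Hypothetical toy data with one ID column and two measurement columns
toy = pd.DataFrame({
    "Customer_ID": [10001, 10002, 10003, 10004],
    "Age": [18, 32, 46, 70],
    "Amount": [10.0, 255.5, 377.6, 500.0],
})

# Summarise only the columns where statistics are meaningful
summary = toy[["Age", "Amount"]].describe()
print(summary)

# Individual statistics can be read with .loc[row_label, column]
print(summary.loc["mean", "Age"])   # 41.5
print(summary.loc["50%", "Age"])    # 39.0 (the median)
```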
  1. Display summary information for all the columns of the retail dataset.

Solution
df.info()

# Output

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 293911 entries, 0 to 293910
Data columns (total 30 columns):
 #   Column            Non-Null Count   Dtype  
---  ------            --------------   -----  
 0   Transaction_ID    293911 non-null  int64  
 1   Customer_ID       293911 non-null  int64  
 2   Name              293911 non-null  object 
 3   Email             293911 non-null  object 
 4   Phone             293911 non-null  int64  
 5   Address           293911 non-null  object 
 6   City              293911 non-null  object 
 7   State             293911 non-null  object 
 8   Zipcode           293911 non-null  int64  
 9   Country           293911 non-null  object 
 10  Age               293911 non-null  int64  
 11  Gender            293911 non-null  object 
 12  Income            293911 non-null  object 
 13  Customer_Segment  293911 non-null  object 
 14  Date              293911 non-null  object 
 15  Year              293911 non-null  int64  
 16  Month             293911 non-null  object 
 17  Time              293911 non-null  object 
 18  Total_Purchases   293911 non-null  int64  
 19  Amount            293911 non-null  float64
...
 28  Ratings           293911 non-null  int64  
 29  products          293911 non-null  object 
dtypes: float64(2), int64(8), object(20)
memory usage: 67.3+ MB
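
The "+" in "67.3+ MB" indicates a lower bound: for object columns, info() by default counts only the references, not the Python strings they point to. The sketch below, on a made-up two-row DataFrame (toy is hypothetical, and the Name column is cast to object to mirror the dtypes shown above), compares the shallow and deep measurements.

```python
import pandas as pd

# Hypothetical toy data; Name is forced to object dtype as in the output above
toy = pd.DataFrame({"Name": ["Customer A", "Customer B"], "Ratings": [5, 4]})
toy["Name"] = toy["Name"].astype(object)

# Shallow: object columns are counted as references only
shallow = toy.memory_usage(deep=False).sum()

# Deep: the string objects themselves are measured as well
deep = toy.memory_usage(deep=True).sum()
print(shallow, deep)

# info() reports the deep figure (without the "+") when asked
toy.info(memory_usage="deep")
```

On a file the size of retail_data.csv the deep measurement can take noticeably longer, which is presumably why it is not the default.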
  1. Display the top 5 rows of the dataset.

Solution
df.head()

# Output
    
        Transaction_ID	Customer_ID	Name	Email	Phone	Address	City	State	Zipcode	Country	...	Total_Amount	Product_Category	Product_Brand	Product_Type	Feedback	Shipping_Method	Payment_Method	Order_Status	Ratings	products
0	8691788	37249	Michelle Harrington	Ebony39@gmail.com	1414786801	3959 Amanda Burgs	Dortmund	Berlin	77985	Germany	...	324.086270	Clothing	Nike	Shorts	Excellent	Same-Day	Debit Card	Shipped	5	Cycling shorts
1	2174773	69749	Kelsey Hill	Mark36@gmail.com	6852899987	82072 Dawn Centers	Nottingham	England	99071	UK	...	806.707815	Electronics	Samsung	Tablet	Excellent	Standard	Credit Card	Processing	4	Lenovo Tab
2	6679610	30192	Scott Jensen	Shane85@gmail.com	8362160449	4133 Young Canyon	Geelong	New South Wales	75929	Australia	...	1063.432799	Books	Penguin Books	Children's	Average	Same-Day	Credit Card	Processing	2	Sports equipment
3	7232460	62101	Joseph Miller	Mary34@gmail.com	2776751724	8148 Thomas Creek Suite 100	Edmonton	Ontario	88420	Canada	...	2466.854021	Home Decor	Home Depot	Tools	Excellent	Standard	PayPal	Processing	4	Utility knife
4	4983775	27901	Debra Coleman	Charles30@gmail.com	9098267635	5813 Lori Ports Suite 269	Bristol	England	48704	UK	...	248.553049	Grocery	Nestle	Chocolate	Bad	Standard	Cash	Shipped	1	Chocolate cookies
5 rows × 30 columns	
  1. Display the top 12 rows of the dataset.

Solution
df.head(12)

# Output
    
    Transaction_ID	Customer_ID	Name	Email	Phone	Address	City	State	Zipcode	Country	...	Total_Amount	Product_Category	Product_Brand	Product_Type	Feedback	Shipping_Method	Payment_Method	Order_Status	Ratings	products
0	8691788	37249	Michelle Harrington	Ebony39@gmail.com	1414786801	3959 Amanda Burgs	Dortmund	Berlin	77985	Germany	...	324.086270	Clothing	Nike	Shorts	Excellent	Same-Day	Debit Card	Shipped	5	Cycling shorts
1	2174773	69749	Kelsey Hill	Mark36@gmail.com	6852899987	82072 Dawn Centers	Nottingham	England	99071	UK	...	806.707815	Electronics	Samsung	Tablet	Excellent	Standard	Credit Card	Processing	4	Lenovo Tab
2	6679610	30192	Scott Jensen	Shane85@gmail.com	8362160449	4133 Young Canyon	Geelong	New South Wales	75929	Australia	...	1063.432799	Books	Penguin Books	Children's	Average	Same-Day	Credit Card	Processing	2	Sports equipment
3	7232460	62101	Joseph Miller	Mary34@gmail.com	2776751724	8148 Thomas Creek Suite 100	Edmonton	Ontario	88420	Canada	...	2466.854021	Home Decor	Home Depot	Tools	Excellent	Standard	PayPal	Processing	4	Utility knife
4	4983775	27901	Debra Coleman	Charles30@gmail.com	9098267635	5813 Lori Ports Suite 269	Bristol	England	48704	UK	...	248.553049	Grocery	Nestle	Chocolate	Bad	Standard	Cash	Shipped	1	Chocolate cookies
5	6095326	41289	Ryan Johnson	Haley12@gmail.com	3292677006	532 Ashley Crest Suite 014	Brisbane	New South Wales	74430	Australia	...	1185.167224	Electronics	Apple	Tablet	Good	Express	PayPal	Pending	4	Lenovo Tab
6	5434096	97285	Erin Lewis	Arthur76@gmail.com	1578355423	600 Brian Prairie Suite 497	Kitchener	Ontario	47545	Canada	...	630.115295	Electronics	Samsung	Television	Bad	Standard	Cash	Processing	1	QLED TV
7	2344675	26603	Angela Fields	Tanya94@gmail.com	3668096144	237 Young Curve	Munich	Berlin	86862	Germany	...	46.588070	Clothing	Zara	Shirt	Bad	Same-Day	Cash	Processing	1	Dress shirt
8	4155845	80175	Diane Clark	Martin39@gmail.com	6219779557	8823 Mariah Heights Apt. 263	Wollongong	New South Wales	39820	Australia	...	2630.714413	Grocery	Nestle	Chocolate	Bad	Same-Day	Cash	Delivered	1	Dark chocolate
9	4926148	31878	Lori Bell	Jessica33@gmail.com	6004895059	6225 William Lodge	Cologne	Berlin	64317	Germany	...	3976.112295	Home Decor	Home Depot	Decorations	Excellent	Standard	Cash	Delivered	4	Candles
10	8493213	19136	Jonathan Eaton	Mark38@gmail.com	2996714102	9772 Sosa Coves	Portsmouth	England	59280	UK	...	363.927479	Home Decor	Home Depot	Tools	Average	Standard	Credit Card	Shipped	2	Screwdriver set
11	1609659	66883	Brianna Oconnor	David47@gmail.com	9398168800	95471 Jerry Hollow Suite 034	Portsmouth	England	91253	UK	...	364.830567	Books	Random House	Non-Fiction	Average	Standard	Credit Card	Pending	2	Science
12 rows × 30 columns
  1. Display the bottom 5 rows of the dataset.

Solution
df.tail()

# Output

        
        Transaction_ID	Customer_ID	Name	Email	Phone	Address	City	State	Zipcode	Country	...	Total_Amount	Product_Category	Product_Brand	Product_Type	Feedback	Shipping_Method	Payment_Method	Order_Status	Ratings	products
293906	4246475	12104	Meagan Ellis	Courtney60@gmail.com	7466353743	389 Todd Path Apt. 159	Townsville	New South Wales	4567	Australia	...	973.962984	Books	Penguin Books	Fiction	Bad	Same-Day	Cash	Processing	1	Historical fiction
293907	1197603	69772	Mathew Beck	Jennifer71@gmail.com	5754304957	52809 Mark Forges	Hanover	Berlin	16852	Germany	...	285.137301	Electronics	Apple	Laptop	Excellent	Same-Day	Cash	Processing	5	LG Gram
293908	7743242	28449	Daniel Lee	Christopher100@gmail.com	9382530370	407 Aaron Crossing Suite 495	Brighton	England	88038	UK	...	182.105285	Clothing	Adidas	Jacket	Average	Express	Cash	Shipped	2	Parka
293909	9301950	45477	Patrick Wilson	Rebecca65@gmail.com	9373222023	3204 Baird Port	Halifax	Ontario	67608	Canada	...	120.834784	Home Decor	IKEA	Furniture	Good	Standard	Cash	Shipped	4	TV stand
293910	2882826	53626	Dustin Merritt	William14@gmail.com	9518926645	143 Amanda Crescent	Tucson	West Virginia	25242	USA	...	2382.233417	Home Decor	Home Depot	Decorations	Average	Same-Day	Cash	Shipped	2	Clocks
5 rows × 30 columns
  1. Display the bottom 8 rows of the dataset.

Solution
df.tail(8)

# Output
    Transaction_ID	Customer_ID	Name	Email	Phone	Address	City	State	Zipcode	Country	...	Total_Amount	Product_Category	Product_Brand	Product_Type	Feedback	Shipping_Method	Payment_Method	Order_Status	Ratings	products
293903	8961631	79479	Jason Welch	Jason36@gmail.com	6279294104	764 Garcia Flat	Hamilton	Ontario	61218	Canada	...	2659.976987	Home Decor	Home Depot	Tools	Excellent	Express	Cash	Pending	5	Level
293904	2844206	18799	Angel Hood	Joseph24@gmail.com	2825444712	7593 Joseph Trace Suite 382	Cairns	New South Wales	39837	Australia	...	2384.717299	Electronics	Apple	Tablet	Average	Same-Day	Cash	Pending	2	Amazon Fire Tablet
293905	4833982	94117	Kara Hart	Tammy37@gmail.com	7108672468	872 Robinson Harbors Apt. 328	Charlotte	Missouri	65301	USA	...	2362.120301	Clothing	Nike	Shorts	Excellent	Standard	Cash	Delivered	4	Chino shorts
293906	4246475	12104	Meagan Ellis	Courtney60@gmail.com	7466353743	389 Todd Path Apt. 159	Townsville	New South Wales	4567	Australia	...	973.962984	Books	Penguin Books	Fiction	Bad	Same-Day	Cash	Processing	1	Historical fiction
293907	1197603	69772	Mathew Beck	Jennifer71@gmail.com	5754304957	52809 Mark Forges	Hanover	Berlin	16852	Germany	...	285.137301	Electronics	Apple	Laptop	Excellent	Same-Day	Cash	Processing	5	LG Gram
293908	7743242	28449	Daniel Lee	Christopher100@gmail.com	9382530370	407 Aaron Crossing Suite 495	Brighton	England	88038	UK	...	182.105285	Clothing	Adidas	Jacket	Average	Express	Cash	Shipped	2	Parka
293909	9301950	45477	Patrick Wilson	Rebecca65@gmail.com	9373222023	3204 Baird Port	Halifax	Ontario	67608	Canada	...	120.834784	Home Decor	IKEA	Furniture	Good	Standard	Cash	Shipped	4	TV stand
293910	2882826	53626	Dustin Merritt	William14@gmail.com	9518926645	143 Amanda Crescent	Tucson	West Virginia	25242	USA	...	2382.233417	Home Decor	Home Depot	Decorations	Average	Same-Day	Cash	Shipped	2	Clocks
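
head() and tail() default to 5 rows and accept any count n. They also accept a negative n, which excludes rows instead of selecting them. A short sketch on a made-up 10-row DataFrame (toy is hypothetical):

```python
import pandas as pd

# Hypothetical toy data: 10 rows numbered 0..9
toy = pd.DataFrame({"n": range(10)})

print(toy.head(3)["n"].tolist())    # first 3 rows: [0, 1, 2]
print(toy.tail(3)["n"].tolist())    # last 3 rows: [7, 8, 9]

# Negative n: everything except the last (head) or first (tail) 3 rows
print(toy.head(-3)["n"].tolist())   # [0, 1, 2, 3, 4, 5, 6]
print(toy.tail(-3)["n"].tolist())   # [3, 4, 5, 6, 7, 8, 9]
```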

Easy Assignments

  1. Check the number of non-null values and the data type of each column in the dataset.

Solution
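import pandas as pd
df = pd.read_csv('sample_iowa_liquor_sales.csv')  # the Easy assignments use the Iowa liquor sales sample
df.info()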

# Output

#   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   Invoice/Item Number    919 non-null    object 
 1   Date                   919 non-null    object 
 2   Store Number           919 non-null    int64  
 3   Store Name             919 non-null    object 
 4   Address                919 non-null    object 
 5   City                   919 non-null    object 
 6   Zip Code               919 non-null    object 
 7   Store Location         919 non-null    object 
 8   County Number          919 non-null    int64  
 9   County                 919 non-null    object 
 10  Category               919 non-null    int64  
 11  Category Name          919 non-null    object 
 12  Vendor Number          919 non-null    int64  
 13  Vendor Name            919 non-null    object 
 14  Item Number            919 non-null    int64  
 15  Item Description       919 non-null    object 
 16  Pack                   919 non-null    int64  
 17  Bottle Volume (ml)     919 non-null    int64  
 18  State Bottle Cost      919 non-null    float64
 19  State Bottle Retail    919 non-null    float64
...
 22  Volume Sold (Liters)   919 non-null    float64
 23  Volume Sold (Gallons)  919 non-null    float64
  1. Display the first few rows of the dataset.

Solution
df.head()

# Output


    Invoice/Item Number	Date	Store Number	Store Name	Address	City	Zip Code	Store Location	County Number	County	...	Item Number	Item Description	Pack	Bottle Volume (ml)	State Bottle Cost	State Bottle Retail	Bottles Sold	Sale (Dollars)	Volume Sold (Liters)	Volume Sold (Gallons)
0	S28865700001	11-09-2015	2538	Hy-Vee Food Store #3 / Waterloo	1422 FLAMMANG DR	WATERLOO	50702	1422 FLAMMANG DR\nWATERLOO 50702\n(42.459938, ...	7	Black Hawk	...	238	Forbidden Secret Coffee Pack	6	1500	11.62	17.43	6	104.58	9.0	2.38
1	S29339300091	11/30/2015	2662	Hy-Vee Wine & Spirits / Muscatine	522 MULBERRY, SUITE A	MUSCATINE	52761	522 MULBERRY, SUITE A\nMUSCATINE 52761\n	70	Muscatine	...	173	Laphroaig w/ Whiskey Stones	12	750	19.58	29.37	4	117.48	3.0	0.79
2	S28866900001	11-11-2015	3650	Spirits, Stogies and Stuff	118 South Main St.	HOLSTEIN	51025	118 South Main St.\nHOLSTEIN 51025\n(42.490073...	47	Ida	...	238	Forbidden Secret Coffee Pack	6	1500	11.62	17.43	1	17.43	1.5	0.40
3	S29134300126	11/18/2015	3723	J D Spirits Liquor	1023 9TH ST	ONAWA	51040	1023 9TH ST\nONAWA 51040\n(42.025841, -96.095845)	67	Monona	...	258	Rumchata "GoChatas"	1	6000	99.00	148.50	1	148.50	6.0	1.59
4	S29282800048	11/23/2015	2642	Hy-Vee Wine and Spirits / Pella	512 E OSKALOOSA	PELLA	50219	512 E OSKALOOSA\nPELLA 50219\n(41.397023, -92....	63	Marion	...	238	Forbidden Secret Coffee Pack	6	1500	11.62	17.43	6	104.58	9.0	2.38
5 rows × 24 columns
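
The Date column in this output mixes two separators ("11-09-2015" and "11/30/2015") and is stored as object. The sketch below shows one possible way to bring such values to a single datetime column. It assumes that both spellings mean month-day-year, which the dataset itself does not confirm, so the result should be checked against known dates before relying on it.

```python
import pandas as pd

# Four Date strings in the two styles seen in the output above
dates = pd.Series(["11-09-2015", "11/30/2015", "11-11-2015", "11/18/2015"])

# Assumption: both separators denote month-day-year.
# Step 1: use one separator everywhere.
normalized = dates.str.replace("-", "/", regex=False)

# Step 2: parse with an explicit format so pandas does not have to guess
parsed = pd.to_datetime(normalized, format="%m/%d/%Y")
print(parsed.dt.day.tolist())    # [9, 30, 11, 18]
```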
  1. Present the first 9 rows of the dataset.

Solution
df.head(9)


# Output


Invoice/Item Number	Date	Store Number	Store Name	Address	City	Zip Code	Store Location	County Number	County	...	Item Number	Item Description	Pack	Bottle Volume (ml)	State Bottle Cost	State Bottle Retail	Bottles Sold	Sale (Dollars)	Volume Sold (Liters)	Volume Sold (Gallons)
0	S28865700001	11-09-2015	2538	Hy-Vee Food Store #3 / Waterloo	1422 FLAMMANG DR	WATERLOO	50702	1422 FLAMMANG DR\nWATERLOO 50702\n(42.459938, ...	7	Black Hawk	...	238	Forbidden Secret Coffee Pack	6	1500	11.62	17.43	6	104.58	9.0	2.38
1	S29339300091	11/30/2015	2662	Hy-Vee Wine & Spirits / Muscatine	522 MULBERRY, SUITE A	MUSCATINE	52761	522 MULBERRY, SUITE A\nMUSCATINE 52761\n	70	Muscatine	...	173	Laphroaig w/ Whiskey Stones	12	750	19.58	29.37	4	117.48	3.0	0.79
2	S28866900001	11-11-2015	3650	Spirits, Stogies and Stuff	118 South Main St.	HOLSTEIN	51025	118 South Main St.\nHOLSTEIN 51025\n(42.490073...	47	Ida	...	238	Forbidden Secret Coffee Pack	6	1500	11.62	17.43	1	17.43	1.5	0.40
3	S29134300126	11/18/2015	3723	J D Spirits Liquor	1023 9TH ST	ONAWA	51040	1023 9TH ST\nONAWA 51040\n(42.025841, -96.095845)	67	Monona	...	258	Rumchata "GoChatas"	1	6000	99.00	148.50	1	148.50	6.0	1.59
4	S29282800048	11/23/2015	2642	Hy-Vee Wine and Spirits / Pella	512 E OSKALOOSA	PELLA	50219	512 E OSKALOOSA\nPELLA 50219\n(41.397023, -92....	63	Marion	...	238	Forbidden Secret Coffee Pack	6	1500	11.62	17.43	6	104.58	9.0	2.38
5	S28867000001	11-04-2015	3842	Bancroft Liquor Store	107 N PORTLAND ST PO BX 222	BANCROFT	50517	107 N PORTLAND ST PO BX 222\nBANCROFT 50517\n(...	55	Kossuth	...	238	Forbidden Secret Coffee Pack	6	1500	11.62	17.43	3	52.29	4.5	1.19
6	S28865800001	11-09-2015	2539	Hy-Vee Food Store / iowa Falls	HIGHWAY 65 SOUTH	IOWA FALLS	50126	HIGHWAY 65 SOUTH\nIOWA FALLS 50126\n	42	Hardin	...	238	Forbidden Secret Coffee Pack	6	1500	11.62	17.43	6	104.58	9.0	2.38
7	S28867100001	11-09-2015	4604	Pit Stop Liquors / Newton	1324, 1st AVE E	NEWTON	50208	1324, 1st AVE E\nNEWTON 50208\n(41.699173, -93...	50	Jasper	...	238	Forbidden Secret Coffee Pack	6	1500	11.62	17.43	2	34.86	3.0	0.79
8	S29191200001	11/19/2015	2248	Ingersoll Liquor and Beverage	3500 INGERSOLL AVE	DES MOINES	50312	3500 INGERSOLL AVE\nDES MOINES 50312\n(41.5863...	77	Polk	...	173	Laphroaig w/ Whiskey Stones	12	750	19.58	29.37	36	1057.32	27.0	7.13
9 rows × 24 columns
  1. Reveal the bottom rows of the dataset.

Solution
df.tail()

# Output


Invoice/Item Number	Date	Store Number	Store Name	Address	City	Zip Code	Store Location	County Number	County	...	Item Number	Item Description	Pack	Bottle Volume (ml)	State Bottle Cost	State Bottle Retail	Bottles Sold	Sale (Dollars)	Volume Sold (Liters)	Volume Sold (Gallons)
914	S26164400020	06/16/2015	3944	Sam's Club 4973 / Dubuque	4400 ASBURY RD	DUBUQUE	52002	4400 ASBURY RD\nDUBUQUE 52002\n(42.515282, -90...	31	Dubuque	...	41705	Uv Red (cherry) Vodka	12	1000	7.50	11.25	12	135.00	12.0	3.17
915	S19675100022	06/25/2014	4008	Sioux Valley Spirits	116 E MAIN ST	ANTHON	51004	116 E MAIN ST\nANTHON 51004\n(42.388268, -95.8...	97	Woodbury	...	28890	Tanqueray Rangpur Gin	12	750	12.50	18.74	2	37.48	1.5	0.40
916	S12278000057	05/22/2013	3385	Sam's Club 8162 / Cedar Rapids	2605 BLAIRS FERRY RD NE	CEDAR RAPIDS	52402	2605 BLAIRS FERRY RD NE\nCEDAR RAPIDS 52402\n(...	57	Linn	...	34029	Absolut Citron (lemon Vodka)	12	1000	15.00	22.49	60	1349.40	60.0	15.85
917	S05694100030	05/23/2012	2487	Anamosa Family Foods	402 EAST MAIN	ANAMOSA	52205	402 EAST MAIN\nANAMOSA 52205\n(42.108289, -91....	53	Jones	...	11774	Black Velvet	24	375	3.07	4.60	48	220.80	18.0	4.76
918	S23309100006	01-06-2015	4559	Osage Payless Foods	633, CHASE ST	OSAGE	50461	633, CHASE ST\nOSAGE 50461\n(43.285134, -92.81...	66	Mitchell	...	35926	Five O'clock PET Vodka	12	750	3.37	5.06	12	60.72	9.0	2.38
5 rows × 24 columns
  1. Display the last 25 rows of the liquor dataset.

Solution
df.tail(25)

# Output


Invoice/Item Number	Date	Store Number	Store Name	Address	City	Zip Code	Store Location	County Number	County	...	Item Number	Item Description	Pack	Bottle Volume (ml)	State Bottle Cost	State Bottle Retail	Bottles Sold	Sale (Dollars)	Volume Sold (Liters)	Volume Sold (Gallons)
894	S25976200016	06-02-2015	4326	Stratford Food Center	829 SHAKESPEARE AVE	STRATFORD	50249	829 SHAKESPEARE AVE\nSTRATFORD 50249\n(42.2713...	40	Hamilton	...	22157	Wild Turkey 101	12	1000	16.16	24.24	2	48.48	2.00	0.53
895	S12095200005	05-08-2013	3013	Keith's Foods	207 E LOCUST ST	BLOOMFIELD	52537	207 E LOCUST ST\nBLOOMFIELD 52537\n(40.752691,...	26	Davis	...	36908	Mccormick Vodka Pet	6	1750	7.46	11.19	6	67.14	10.50	2.77
896	S24129700005	02/19/2015	3712	Monte Spirits	109 N 4TH ST	MONTEZUMA	50171	109 N 4TH ST\nMONTEZUMA 50171\n(41.585429, -92...	79	Poweshiek	...	11788	Black Velvet	6	1750	9.70	14.93	6	89.58	10.50	2.77
897	S12946600041	06/24/2013	2835	CVS Pharmacy #8538 / Cedar Falls	2302 WEST FIRST ST	CEDAR FALLS	50613	2302 WEST FIRST ST\nCEDAR FALLS 50613\n(42.539...	7	Black Hawk	...	34116	Absolut Mandrin	12	750	11.00	16.49	3	49.47	2.25	0.59
898	S26299700151	06/18/2015	2515	Hy-Vee Food Store #1 / Mason City	2400 4TH ST SW	MASON CITY	50401	2400 4TH ST SW\nMASON CITY 50401\n(43.148446, ...	17	Cerro Gordo	...	82836	Dekuyper Raspberry Pucker	12	750	6.30	9.45	2	18.90	1.50	0.40
899	S04680100053	03/21/2012	3825	Shop N Save #2 / E 14th	1372 E 14TH ST	DES MOINES	50316	1372 E 14TH ST\nDES MOINES 50316\n(41.604893, ...	77	Polk	...	67266	Yukon Jack Canadian Liqueur	12	750	8.54	12.81	4	51.24	3.00	0.79
900	S15444500015	10/30/2013	4251	Aj's Liquor / Ames	4518 MORTENSON RD STE 109	AMES	50014	4518 MORTENSON RD STE 109\nAMES 50014\n	85	Story	...	19068	Jim Beam	6	1750	18.42	28.14	2	56.28	3.50	0.92
901	S15833500007	11/20/2013	3986	Siouxland Beverage	1203 5 ST	SIOUX CITY	51101	1203 5 ST\nSIOUX CITY 51101\n(42.495322, -96.3...	97	Woodbury	...	11936	Canadian Ltd Whisky Convenience Pack	12	750	3.84	6.01	12	72.12	9.00	2.38
902	S22927800089	12/15/2014	3562	Wal-Mart 0797 / W Burlington	324 WEST AGENCY RD	WEST BURLINGTON	52655	324 WEST AGENCY RD\nWEST BURLINGTON 52655\n(40...	29	Des Moines	...	33256	Seagrams Lime Twisted Gin	12	750	6.49	9.74	12	116.88	9.00	2.38
903	S21552900002	10-02-2014	4647	B and B EAST / Waterloo	1615 BISHOP AVE	WATERLOO	50707	1615 BISHOP AVE\nWATERLOO 50707\n(42.49807, -9...	7	Black Hawk	...	35926	Five O'clock PET Vodka	12	750	3.37	5.06	12	60.72	9.00	2.38
904	S16534000021	12/27/2013	4148	Fareway Stores #479 / Independence	1400 3RD AVE SE	INDEPENDENCE	50644	1400 3RD AVE SE\nINDEPENDENCE 50644\n(42.45563...	10	Buchanan	...	89916	Tortilla Gold Tequila	12	750	6.27	9.40	2	18.80	1.50	0.40
905	S04572600041	03/15/2012	4280	Slagle's Grocery / Le Claire	1301 EAGLE RIDGE RD	LE CLAIRE	52753	1301 EAGLE RIDGE RD\nLE CLAIRE 52753\n(41.5873...	82	Scott	...	87510	1800 Silver Tequila	12	750	14.50	21.74	1	21.74	0.75	0.20
906	S11456900038	04-04-2013	2190	Central City Liquor, Inc.	1460 2ND AVE	DES MOINES	50314	1460 2ND AVE\nDES MOINES 50314\n(41.60566, -93...	77	Polk	...	5350	Johnnie Walker Green	6	750	36.75	55.12	2	110.24	1.50	0.40
907	S04354200031	03-01-2012	3719	Wal-Mart 0581 / Marshalltown	2802 S CENTER ST	MARSHALLTOWN	50158	2802 S CENTER ST\nMARSHALLTOWN 50158\n(42.0129...	64	Marshall	...	37348	Phillips Vodka	6	1750	7.31	10.97	6	65.82	10.50	2.77
908	S19584700001	06/17/2014	4266	Wal-Mart 1683 / Shenandoah	705 S FREMONT	SHENANDOAH	51601	705 S FREMONT\nSHENANDOAH 51601\n(40.760655, -...	73	Page	...	11788	Black Velvet	6	1750	10.45	15.67	12	188.04	21.00	5.55
909	S16906900003	01/20/2014	3612	B and C Liquor / Maquoketa	509 E PLATT	MAQUOKETA	52060	509 E PLATT\nMAQUOKETA 52060\n(42.069219, -90....	49	Jackson	...	16518	Ancient Age Bourbon	6	1750	11.80	17.70	3	53.10	5.25	1.39
910	S18142000006	03/31/2014	3868	Wal-Mart 3630 / Marion	5491 BUSINESS HWY 151	MARION	52302	5491 BUSINESS HWY 151\nMARION 52302\n	57	Linn	...	68049	Bailey's Vanilla Cinnamon	12	750	13.00	19.50	12	234.00	9.00	2.38
911	S19772000013	06/26/2014	4948	Wheatland Day Break	102 W HWY 30	WHEATLAND	52777	102 W HWY 30\nWHEATLAND 52777\n	23	Clinton	...	43336	Captain Morgan Original Spiced	12	750	8.75	13.12	6	78.72	4.50	1.19
912	S22284900002	11-10-2014	3830	Wal-Mart 1435 / Creston	806 LAUREL ST	CRESTON	50801	806 LAUREL ST\nCRESTON 50801\n(41.047716, -94....	88	Union	...	42718	Malibu Coconut Rum	6	1750	16.49	24.74	6	148.44	10.50	2.77
913	S17966000051	03/19/2014	3495	Great Pastimes	228 N MAIN ST	MONTICELLO	52310	228 N MAIN ST\nMONTICELLO 52310\n(42.240132, -...	53	Jones	...	43137	Bacardi Limon	12	1000	10.24	15.35	5	76.75	5.00	1.32
914	S26164400020	06/16/2015	3944	Sam's Club 4973 / Dubuque	4400 ASBURY RD	DUBUQUE	52002	4400 ASBURY RD\nDUBUQUE 52002\n(42.515282, -90...	31	Dubuque	...	41705	Uv Red (cherry) Vodka	12	1000	7.50	11.25	12	135.00	12.00	3.17
915	S19675100022	06/25/2014	4008	Sioux Valley Spirits	116 E MAIN ST	ANTHON	51004	116 E MAIN ST\nANTHON 51004\n(42.388268, -95.8...	97	Woodbury	...	28890	Tanqueray Rangpur Gin	12	750	12.50	18.74	2	37.48	1.50	0.40
916	S12278000057	05/22/2013	3385	Sam's Club 8162 / Cedar Rapids	2605 BLAIRS FERRY RD NE	CEDAR RAPIDS	52402	2605 BLAIRS FERRY RD NE\nCEDAR RAPIDS 52402\n(...	57	Linn	...	34029	Absolut Citron (lemon Vodka)	12	1000	15.00	22.49	60	1349.40	60.00	15.85
917	S05694100030	05/23/2012	2487	Anamosa Family Foods	402 EAST MAIN	ANAMOSA	52205	402 EAST MAIN\nANAMOSA 52205\n(42.108289, -91....	53	Jones	...	11774	Black Velvet	24	375	3.07	4.60	48	220.80	18.00	4.76
918	S23309100006	01-06-2015	4559	Osage Payless Foods	633, CHASE ST	OSAGE	50461	633, CHASE ST\nOSAGE 50461\n(43.285134, -92.81...	66	Mitchell	...	35926	Five O'clock PET Vodka	12	750	3.37	5.06	12	60.72	9.00	2.38
25 rows × 24 columns
  1. Show a detailed summary of the dataset, highlighting key metrics such as mean, median, standard deviation, minimum, maximum, and count for each numerical column, as well as unique counts for categorical variables.

Solution
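df.describe()  # numeric columns only; the 50% row is the median. df.describe(include='all') also reports unique counts for text columns.

# Output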

Store Number	County Number	Category	Vendor Number	Item Number	Pack	Bottle Volume (ml)	State Bottle Cost	State Bottle Retail	Bottles Sold	Sale (Dollars)	Volume Sold (Liters)	Volume Sold (Gallons)
count	919.000000	919.000000	9.190000e+02	919.000000	919.000000	919.000000	919.000000	919.000000	919.000000	919.000000	919.000000	919.000000	919.000000
mean	3510.945593	55.944505	1.055505e+06	273.323177	45061.603917	12.150163	986.751904	10.207998	15.341012	9.672470	141.239565	9.118020	2.408781
std	861.839393	26.879060	9.464502e+04	160.723409	48146.068812	12.938571	617.957560	9.340595	14.006247	21.921488	589.045174	26.681196	7.048362
min	2106.000000	3.000000	1.011100e+06	35.000000	173.000000	1.000000	100.000000	0.000000	0.000000	1.000000	0.000000	0.200000	0.050000
25%	2614.000000	31.000000	1.022100e+06	115.000000	27934.500000	6.000000	750.000000	5.775000	8.670000	3.000000	33.180000	2.000000	0.530000
50%	3666.000000	57.000000	1.032080e+06	260.000000	40682.000000	12.000000	750.000000	7.990000	12.290000	6.000000	69.930000	6.000000	1.590000
75%	4201.500000	77.000000	1.062310e+06	389.000000	57152.500000	12.000000	1000.000000	11.800000	17.700000	12.000000	137.160000	10.500000	2.770000
max	5181.000000	99.000000	1.701100e+06	971.000000	988063.000000	336.000000	6000.000000	99.950000	149.920000	480.000000	15840.000000	588.000000	155.330000
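
The output above covers only the numeric columns. To also get unique counts for text columns, as the task asks, describe() accepts include='all'. A minimal sketch on made-up data (toy and its values are hypothetical; only the column names come from the liquor dataset):

```python
import pandas as pd

# Hypothetical toy data with one text column and one numeric column
toy = pd.DataFrame({
    "City": ["WATERLOO", "DES MOINES", "WATERLOO"],
    "Bottles Sold": [6, 36, 2],
})

# include="all" adds unique, top and freq rows for non-numeric columns
full = toy.describe(include="all")
print(full)

print(full.loc["unique", "City"])        # number of distinct cities: 2
print(full.loc["top", "City"])           # most frequent city: WATERLOO
print(full.loc["50%", "Bottles Sold"])   # median: 6.0
```

Cells that do not apply to a column, such as the mean of City, are filled with NaN.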
  1. Display the number of rows and columns in the dataset separately.

Solution
r,c=df.shape
print(f'rows:{r}',f'columns:{c}')


# Output
rows:919 columns:24
  1. Generate a sample dataset containing 30 rows based on the current dataset.

Solution

df.sample(30)


# Output: 30 randomly selected rows; the selection differs on each run.
Dataset files:
retail_data.csv (75MB)
sample_iowa_liquor_sales.csv (242KB)