PDA Assignments
  • Python For Data Analytics
    • 1.Python
      • 1.Python Documents
        • 1.Data Types
        • 2.Variables In Python
        • 3.Operators In Python
        • 4.User Input In Python
        • 5.TypeCasting In Python
        • 6.Strings In Python
        • 7.Conditional Statements In Python
        • 8.Branching using Conditional Statements and Loops in Python
        • 9.Lists In Python
        • 10.Sets In Python
        • 11.Tuples In Python
        • 12.Dictionary In Python
        • 13.Functions In Python
        • 14.File Handling In Python
        • 15.Numerical Computing with Python and Numpy
      • 2.Python Assignments
        • Data Type & Variables
        • Operators Assignment
        • User Input & Type Casting
        • Functions- Basic Assignments
        • String Assignments
          • String CheatSheet
        • Conditional Statements Assignments
        • Loops Assignments
        • List Assignments
          • List Cheatsheet
        • Set Assignments
          • Sets Cheatsheet
        • Dictionary Assignments
          • Dictionary Cheatsheet
        • Function Assignments
        • Functions used in Python
      • 3.Python Projects
        • Employee Management System
        • Hamming distance
        • Webscraping With Python
          • Introduction To Web Scraping
          • Importing Necessary Libraries
          • Basic Introduction To HTML
          • Introduction To BeautifulSoup
          • Flipkart Web Scraping
            • Scraping Step By Step
        • Retail Sales Analysis
        • Guess the Word Game
        • Data Collection Through APIs
        • To-Do List Manager
        • Atm-functionalities(nested if)
        • Distribution of Cards(List & Nested for)
        • Guess the Number Game
      • 4.Python + SQL Projects
        • Bookstore Management System
    • 2.Data Analytics
      • 1.Pandas
        • 1.Pandas Documents
          • 1.Introduction To Pandas
          • Reading and Loading Different Data
          • 2.Indexing and Slicing In Pandas
          • 3.Joining In Pandas
          • 4.Missing Values In Pandas
          • 5.Outliers In Pandas
          • 6.Aggregating Data
          • 7.DateTime In Pandas
          • 8.Validation In Pandas
          • 9.Fetching Data From SQL
          • 10. Automation In Pandas
          • 11.Matplotlib - Data Visualization
          • 12. Seaborn - Data Visualization
          • 13. Required Files
        • 3.Pandas Projects
          • Retail Sales Analysis
            • Retail Sales Step By Step
          • IMDB - Dataset Analysis - Basic
        • 2. Pandas Assignments
          • 1. Reading and Loading the Data
          • 2. Data frame Functions and Properties
          • 3. Series - Basic Operations
          • 4. Filtering in Pandas
          • 5. Advance Filtering
          • 6. Aggregate Functions & Groupby
          • 7. Pivot Tables
          • 8. Datetime
          • 9. String Functions
Powered by GitBook
On this page
  1. Python For Data Analytics
  2. 2.Data Analytics
  3. 1.Pandas
  4. 2. Pandas Assignments

8. Datetime

Previous7. Pivot TablesNext9. String Functions

Last updated 2 months ago

  1. Load the data and display.

Solution
import pandas as pd
df=pd.read_csv('bakery_sales.csv')
df

# Output
        datetime		total
0	11-07-2019 15:35	23800.0
1	11-07-2019 16:10	15800.0
2	12-07-2019 11:49	58000.0
3	13-07-2019 13:19	14800.0
4	13-07-2019 13:22	15600.0
...	...	...	...
2415	02-05-2020 11:37	19500.0
2416	02-05-2020 11:39	19800.0
2417	02-05-2020 12:15	14300.0
2418	02-05-2020 13:45	15000.0
2419	02-05-2020 14:45	24100.0
2420 rows × 3 columns
  1. Convert the datetime column into proper datetime format, considering the data in the column is in (dd-mm-yyyy hh:mm) format.

Solution
df['datetime']=pd.to_datetime(df['datetime'],format="%d-%m-%Y %H:%M")
df

# Output

        datetime		total
0	2019-07-11 15:35:00	23800.0
1	2019-07-11 16:10:00	15800.0
2	2019-07-12 11:49:00	58000.0
3	2019-07-13 13:19:00	14800.0
4	2019-07-13 13:22:00	15600.0
...	...	...	...
2415	2020-05-02 11:37:00	19500.0
2416	2020-05-02 11:39:00	19800.0
2417	2020-05-02 12:15:00	14300.0
2418	2020-05-02 13:45:00	15000.0
2419	2020-05-02 14:45:00	24100.0
2420 rows × 3 columns
  1. Separate day, month name, and year in different columns from datetime.

Solution
df['date']=df['datetime'].dt.date
df['time']=df['datetime'].dt.time
df

# Output
        datetime		total	date	        time
0	2019-07-11 15:35:00	23800.0	2019-07-11	15:35:00
1	2019-07-11 16:10:00	15800.0	2019-07-11	16:10:00
2	2019-07-12 11:49:00	58000.0	2019-07-12	11:49:00
3	2019-07-13 13:19:00	14800.0	2019-07-13	13:19:00
4	2019-07-13 13:22:00	15600.0	2019-07-13	13:22:00
...	...	...	...	...	...
2415	2020-05-02 11:37:00	19500.0	2020-05-02	11:37:00
2416	2020-05-02 11:39:00	19800.0	2020-05-02	11:39:00
2417	2020-05-02 12:15:00	14300.0	2020-05-02	12:15:00
2418	2020-05-02 13:45:00	15000.0	2020-05-02	13:45:00
2419	2020-05-02 14:45:00	24100.0	2020-05-02	14:45:00
  1. Separate date, month and year from date column.

Solution
df['day']=df['datetime'].dt.day
df['month']=df['datetime'].dt.month_name()
df['year']=df['datetime'].dt.year
df

# Output
        datetime	        total	day	month	year
0	2019-07-11 15:35:00	23800	11	July	2019
1	2019-07-11 16:10:00	15800	11	July	2019
2	2019-07-12 11:49:00	58000	12	July	2019
3	2019-07-13 13:19:00	14800	13	July	2019
4	2019-07-13 13:22:00	15600	13	July	2019
...	...	...	...	...	...
2415	2020-05-02 11:37:00	19500	2	May	2020
2416	2020-05-02 11:39:00	19800	2	May	2020
2417	2020-05-02 12:15:00	14300	2	May	2020
2418	2020-05-02 13:45:00	15000	2	May	2020
2419	2020-05-02 14:45:00	24100	2	May	2020
2420 rows × 5 columns
  1. Extract time from the datetime column.

Solution
df['time']=df['datetime'].dt.time
df

# Output
        datetime        	total	date	month	year	time
0	2019-07-11 15:35:00	23800	11	July	2019	15:35:00
1	2019-07-11 16:10:00	15800	11	July	2019	16:10:00
2	2019-07-12 11:49:00	58000	12	July	2019	11:49:00
3	2019-07-13 13:19:00	14800	13	July	2019	13:19:00
4	2019-07-13 13:22:00	15600	13	July	2019	13:22:00
...	...	...	...	...	...	...
2415	2020-05-02 11:37:00	19500	2	May	2020	11:37:00
2416	2020-05-02 11:39:00	19800	2	May	2020	11:39:00
2417	2020-05-02 12:15:00	14300	2	May	2020	12:15:00
2418	2020-05-02 13:45:00	15000	2	May	2020	13:45:00
2419	2020-05-02 14:45:00	24100	2	May	2020	14:45:00
2420 rows × 6 columns
  1. Create a sample of 700 rows from the dataset and find the oldest date in the dataset.

Solution
ndf=df.sample(700)
oldest_date=df['datetime'].max()
oldest_date
  1. Create a sample of 700 rows from the dataset and find the latest date in the dataset.

Solution
ndf=df.sample(700)
latest_date=df['datetime'].min()
latest_date
  1. Create a sample of 700 rows from the dataset and arrange them in ascending order on behalf of date.

Solution
ndf=df.sample(700)
ndf=ndf.sort_values(by='datetime')
ndf
  1. Display the data of the sale that happened between 1st Aug 2019 and 01 Dec 2019.

Solution
df_filtered = df[(df['datetime'] > '2019-08-01') & (df['datetime']< '2019-12-01')]
df_filtered

# Output
        datetime	        total	day	month	year	time
135	2019-08-01 11:03:00	14500	1	August	2019	11:03:00
136	2019-08-01 11:17:00	13100	1	August	2019	11:17:00
137	2019-08-01 11:25:00	16100	1	August	2019	11:25:00
138	2019-08-01 11:50:00	14800	1	August	2019	11:50:00
139	2019-08-01 12:02:00	21100	1	August	2019	12:02:00
...	...	...	...	...	...	...
1082	2019-11-30 13:01:00	38200	30	November	2019	13:01:00
1083	2019-11-30 13:44:00	25100	30	November	2019	13:44:00
1084	2019-11-30 13:49:00	20300	30	November	2019	13:49:00
1085	2019-11-30 14:24:00	15800	30	November	2019	14:24:00
1086	2019-11-30 17:04:00	27600	30	November	2019	17:04:00
952 rows × 6 columns
  1. Find out the total sales on each date.

Solution
df['date']=df['datetime'].dt.date
ndf=df.groupby('date').agg(
    total_sale=('total','sum'),
)
ndf.reset_index()

# Output

        date	    total_sale
0	2019-07-11	39600
1	2019-07-12	58000
2	2019-07-13	117400
3	2019-07-14	212000
4	2019-07-15	30900
...	...	...
244	2020-04-27	264600
245	2020-04-29	118100
246	2020-04-30	134100
247	2020-05-01	215700
248	2020-05-02	92700
249 rows × 2 columns
  1. Find out total sale on each day.

Solution
df['day_of_the_week']=df['datetime'].dt.day_name()
ndf=df.groupby('day_of_the_week').agg(
    total_sale=('total','sum')
)
ndf.reset_index()

# output
    day_of_the_week	total_sale
0	Friday	8177700
1	Monday	6887000
2	Saturday	9098000
3	Sunday	11287500
4	Thursday	8140700
5	Tuesday	56000
6	Wednesday	7590600
  1. Find rows where the transaction happened in the afternoon (12 PM - 6 PM)

Solution
df['hour']=df['datetime'].dt.hour
ndf=df[(df['hour']>12) & (df['hour']<18)]
ndf

# Output
                datetime	total	day	month	year	time	date	day_of_the_week	hour
0	2019-07-11 15:35:00	23800	11	July	2019	15:35:00	2019-07-11	Thursday	15
1	2019-07-11 16:10:00	15800	11	July	2019	16:10:00	2019-07-11	Thursday	16
3	2019-07-13 13:19:00	14800	13	July	2019	13:19:00	2019-07-13	Saturday	13
4	2019-07-13 13:22:00	15600	13	July	2019	13:22:00	2019-07-13	Saturday	13
5	2019-07-13 14:54:00	15800	13	July	2019	14:54:00	2019-07-13	Saturday	14
...	...	...	...	...	...	...	...	...	...
2412	2020-05-01 13:55:00	21300	1	May	2020	13:55:00	2020-05-01	Friday	13
2413	2020-05-01 15:03:00	14800	1	May	2020	15:03:00	2020-05-01	Friday	15
2414	2020-05-01 15:19:00	14500	1	May	2020	15:19:00	2020-05-01	Friday	15
2418	2020-05-02 13:45:00	15000	2	May	2020	13:45:00	2020-05-02	Saturday	13
2419	2020-05-02 14:45:00	24100	2	May	2020	14:45:00	2020-05-02	Saturday	14
1160 rows × 9 columns
  1. Find the sale happened on weekends(sat and sun).

Solution
ndf=df.loc[(df['day_of_the_week']=='Saturday') | (df['day_of_the_week']=='Sunday')]
ndf

# Output

datetime	total	day	month	year	time	date	day_of_the_week	hour
3	2019-07-13 13:19:00	14800	13	July	2019	13:19:00	2019-07-13	Saturday	13
4	2019-07-13 13:22:00	15600	13	July	2019	13:22:00	2019-07-13	Saturday	13
5	2019-07-13 14:54:00	15800	13	July	2019	14:54:00	2019-07-13	Saturday	14
6	2019-07-13 15:08:00	15800	13	July	2019	15:08:00	2019-07-13	Saturday	15
7	2019-07-13 15:09:00	14000	13	July	2019	15:09:00	2019-07-13	Saturday	15
...	...	...	...	...	...	...	...	...	...
2415	2020-05-02 11:37:00	19500	2	May	2020	11:37:00	2020-05-02	Saturday	11
2416	2020-05-02 11:39:00	19800	2	May	2020	11:39:00	2020-05-02	Saturday	11
2417	2020-05-02 12:15:00	14300	2	May	2020	12:15:00	2020-05-02	Saturday	12
2418	2020-05-02 13:45:00	15000	2	May	2020	13:45:00	2020-05-02	Saturday	13
2419	2020-05-02 14:45:00	24100	2	May	2020	14:45:00	2020-05-02	Saturday	14
1007 rows × 9 columns

57KB
bakery_sales.csv