Dates and Time – AICorr.com


Timestamp

This is a dates and time tutorial.

Pandas provides powerful methods for dealing with dates and time. The data structure for dates and time in Pandas is a Timestamp. A timestamp is a single point in time. Let’s see how to create and deal with timestamps in Pandas.

The timestamp data structure uses the constructor “Timestamp()”. The capital T is required: Python names are case-sensitive, so “pd.timestamp()” raises an AttributeError. Let’s create a timestamp for the current time, for a specific date, and for a specific date and time.

import pandas as pd

# Current date and time
current_timestamp = pd.Timestamp.now()
print(current_timestamp)

# Specific date
timestamp1 = pd.Timestamp('2024-04-06')
print(timestamp1)

# Specific date and time
timestamp2 = pd.Timestamp(2024, 4, 6, 12, 30, 0)
print(timestamp2)
2024-04-06 16:42:59.073490

2024-04-06 00:00:00

2024-04-06 12:30:00
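
To illustrate the point about the capital T, here is a minimal sketch. The variable names are ours, chosen for illustration. It tries the lowercase spelling, catches the resulting error, and then shows that the string form and the integer form of the constructor produce the same timestamp.

```python
import pandas as pd

# The class is named "Timestamp"; the lowercase spelling is not a pandas attribute.
try:
    pd.timestamp('2024-04-06')
    lowercase_error = None
except AttributeError as exc:
    lowercase_error = exc
    print("AttributeError:", exc)

# The correctly capitalised constructor accepts a string or integer parts.
from_string = pd.Timestamp('2024-04-06 12:30:00')
from_parts = pd.Timestamp(2024, 4, 6, 12, 30, 0)

# Both spellings describe the same point in time.
print(from_string == from_parts)
```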

Access timestamp components

Each component of a timestamp is available as an attribute, such as year, month, or hour.

import pandas as pd

timestamp = pd.Timestamp(2024, 4, 6, 12, 30, 0)
print(timestamp)

print("Year:", timestamp.year)
print("Month:", timestamp.month)
print("Day:", timestamp.day)
print("Hour:", timestamp.hour)
print("Minute:", timestamp.minute)
print("Second:", timestamp.second)
Year: 2024
Month: 4
Day: 6
Hour: 12
Minute: 30
Second: 0
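
Beyond the basic fields, a timestamp exposes a few calendar helpers. The sketch below is an addition to the example above and uses the same date. The variable names are illustrative.

```python
import pandas as pd

timestamp = pd.Timestamp(2024, 4, 6, 12, 30, 0)

weekday_name = timestamp.day_name()    # name of the weekday
weekday_number = timestamp.dayofweek   # Monday is 0, Sunday is 6
quarter = timestamp.quarter            # quarter of the year, 1 to 4
date_only = timestamp.date()           # datetime.date, without the time part

print("Weekday:", weekday_name)        # Saturday
print("Weekday number:", weekday_number)
print("Quarter:", quarter)
print("Date only:", date_only)
```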

Strings to timestamp

In real projects, data often comes from external sources and may store dates as strings. The Pandas function “to_datetime()” converts string data (text) into the timestamp data structure. Let’s explore it.

import pandas as pd

# Sample date and time as a string
timestamp_str = '2024-04-06 12:30:00'
# Convert to Timestamp
parsed_timestamp = pd.to_datetime(timestamp_str)

# Display outcome
print(parsed_timestamp)

# Outcome: 2024-04-06 12:30:00
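
Strings from external sources are not always in the year-month-day layout. As a hedged sketch, the example below passes the “format” parameter of “to_datetime()” to state the layout explicitly. The sample string is made up for illustration.

```python
import pandas as pd

# Made-up example: is this 6 April or 4 June?
ambiguous = '06/04/2024'

# Without a format, pandas reads the month first.
month_first = pd.to_datetime(ambiguous)
print(month_first)

# With an explicit day/month/year format, the same string is 6 April.
day_first = pd.to_datetime(ambiguous, format='%d/%m/%Y')
print(day_first)
```

Stating the format removes the guesswork when the day and month could be swapped.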

Convert DataFrame

Let’s try a more realistic example, using a Pandas dataframe. First, we create a sample dataframe containing some values and dates stored as strings. Then we convert the date column to the datetime type.

import pandas as pd

# Sample DataFrame
df = pd.DataFrame({
    'date': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'income': [100.79, 200.00, 305.95]
})

print(df)
print(df.dtypes)
# Data
         date  income
0  2024-01-01  100.79
1  2024-01-02  200.00
2  2024-01-03  305.95

# Data types
date       object
income    float64
dtype: object

The date column has the data type object, which means Pandas treats the dates as plain text. Now we convert it to the datetime data type. Only the date column is converted; the income column is left unchanged.

# Convert the 'date' column to datetime type
df['date'] = pd.to_datetime(df['date'])

print(df.dtypes)
date      datetime64[ns]
income           float64
dtype: object
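
Real data can contain entries that are not valid dates. The sketch below assumes a hypothetical column with one bad value and uses the “errors='coerce'” option of “to_datetime()”, which turns unparseable entries into NaT instead of raising an error. The dataframe and its names are made up for illustration.

```python
import pandas as pd

# Hypothetical data: the second entry is not a valid date.
raw = pd.DataFrame({
    'date': ['2024-01-01', 'unknown', '2024-01-03'],
    'income': [100.79, 200.00, 305.95]
})

# Invalid strings become NaT rather than stopping the conversion.
raw['date'] = pd.to_datetime(raw['date'], errors='coerce')

unparsed_count = int(raw['date'].isna().sum())
print("Unparsed rows:", unparsed_count)

# Keep only the rows whose date was parsed successfully.
clean = raw.dropna(subset=['date'])
print(clean)
```

Whether to drop or repair the failed rows depends on the project; dropping is only one option.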

Extract timestamp components

In this section, extracting means splitting a date into its components and storing each one in its own column.

We continue the above example. The date column must be converted to datetime before extracting the components, because the “.dt” accessor only works on datetime values. On a column of strings it raises an AttributeError. Let’s explore the example.

import pandas as pd

# Sample DataFrame
df = pd.DataFrame({
    'date': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'income': [100.79, 200.00, 305.95]
})

# Convert the 'date' column to datetime type
df['date'] = pd.to_datetime(df['date'])

# Extract components
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day

print(df)
        date  income  year  month  day
0 2024-01-01  100.79  2024      1    1
1 2024-01-02  200.00  2024      1    2
2 2024-01-03  305.95  2024      1    3
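
To show why the conversion has to come first, here is a small sketch. It tries the “.dt” accessor on the unconverted string column, catches the error, and then repeats the call after the conversion. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'date': ['2024-01-01', '2024-01-02', '2024-01-03']})

# The column still holds strings, so the .dt accessor is not available.
try:
    df['date'].dt.year
    dt_error = None
except AttributeError as exc:
    dt_error = exc
    print("AttributeError:", exc)

# After the conversion, the same call works.
df['date'] = pd.to_datetime(df['date'])
years = df['date'].dt.year.tolist()
print(years)  # [2024, 2024, 2024]
```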

Other methods

Once we have a datetime column, we can apply various Pandas methods, such as filtering, resampling and grouping, shifting dates, calculating differences, and many more.

Let’s dive into some of them, continuing with the dataframe from the “Convert DataFrame” example (columns date and income).

# Filter by day
df_day = df[df['date'].dt.day == 2]
print(df_day)

# Resample data by month ('M' means month end; pandas 2.2+ deprecates it in favour of 'ME')
monthly_data = df.resample('M', on='date').sum()
print(monthly_data)

# Shift dates by 1 period
df['previous_date'] = df['date'].shift(1)
print(df)

# Calculate the difference between dates
df['date_diff'] = df['date'] - df['previous_date']
print(df)

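The shift method accepts further options, such as the number of periods and a frequency (see the Pandas documentation). Here, shift(1) moves every value in the date column down by one row, so each row of previous_date holds the date from the row above. The first row has no row above it, so its previous date is NaT (the Pandas missing value for dates). Subtracting the two columns gives the difference between consecutive dates, which is 1 day in this data. The outputs of the four print statements follow.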

        date  income
1 2024-01-02   200.0

            income
date              
2024-01-31  606.74

        date  income previous_date
0 2024-01-01  100.79           NaT
1 2024-01-02  200.00    2024-01-01
2 2024-01-03  305.95    2024-01-02

        date  income previous_date date_diff
0 2024-01-01  100.79           NaT       NaT
1 2024-01-02  200.00    2024-01-01    1 days
2 2024-01-03  305.95    2024-01-02    1 days
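
Grouping is mentioned above but not demonstrated, so here is a minimal sketch on made-up data spanning two months. It groups income by calendar month and also expresses the gap between consecutive dates as a number of days. The dataframe and column names are assumptions for illustration.

```python
import pandas as pd

# Made-up data spanning two months
sales = pd.DataFrame({
    'date': pd.to_datetime(['2024-01-01', '2024-01-15', '2024-02-03']),
    'income': [100.0, 200.0, 50.0]
})

# Group by calendar month (1 = January) and sum the income
monthly_income = sales.groupby(sales['date'].dt.month)['income'].sum()
print(monthly_income)

# Gap between each row and the row above, as whole days (first row is missing)
sales['gap_days'] = sales['date'].diff().dt.days
print(sales)
```

Grouping by month number merges the same month across different years. For data covering several years, resampling by month is usually the safer choice.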

This is an original dates and time educational material created by aicorr.com.

Next: Time Series Analysis
