Nail the use of dates in Python
"Time is money", goes the old saying. In the fast-paced world of finance, this has never been truer. Dates and time series are the lifeblood of financial markets, influencing every decision, every analysis, every transaction. Whether tracking the performance of a stock over a decade or analyzing market fluctuations minute by minute, precise mastery of dates is absolutely crucial.
In this chapter, we'll unravel the mysteries of date manipulation in Python, giving you the keys to navigate through time with ease.
1. Introduction to Dates in Python
1.1 The datetime Module
In Python, date management is primarily handled by the datetime module. It provides classes for manipulating both dates (year, month, day) and specific moments in time (a date plus hours, minutes, seconds).
Here's how to create your first date and datetime objects in Python:
import datetime
# Creating a simple date
d = datetime.date(2023, 8, 21)
print(d) # Displays "2023-08-21"
# Creating a precise moment (timestamp)
dt = datetime.datetime(2023, 8, 21, 14, 30)
print(dt) # Displays "2023-08-21 14:30:00"
To get the current date or the exact current moment:
today = datetime.date.today()
print(today) # Displays today's date
now = datetime.datetime.now()
print(now) # Displays the current precise moment
2. Common Date Formats
2.1 String
Dates are often represented as strings, especially when read from files or databases.
date_string = "2023-08-21"
# Convert a string to a datetime object
date_object = datetime.datetime.strptime(date_string, "%Y-%m-%d")
print(date_object)
# Another example with day/month/year and hours:minutes
date_string2 = "21/08/2023 14:30"
date_object2 = datetime.datetime.strptime(date_string2, "%d/%m/%Y %H:%M")
print(date_object2)
2.2 Unix Timestamp
The Unix timestamp represents the number of seconds (or milliseconds) elapsed since January 1, 1970 at 00:00 UTC, a reference point known as the "epoch". It's a compact, universal format widely used in finance.
import time
# Get the current timestamp
timestamp = time.time()
print(timestamp)
# Convert a timestamp to datetime
date_from_timestamp = datetime.datetime.fromtimestamp(timestamp)
print(date_from_timestamp)
# Example with milliseconds
timestamp_millis = timestamp * 1000
date_from_millis = datetime.datetime.fromtimestamp(timestamp_millis / 1000)
print(date_from_millis)
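One detail worth keeping in mind: datetime.datetime.fromtimestamp() without a time zone returns a naive datetime in the machine's local time, so the same timestamp can print differently on two machines. The sketch below uses a fixed timestamp (1692628200, a value chosen here purely for illustration) and passes tz=datetime.timezone.utc to get a result that does not depend on local settings.

```python
import datetime

# Illustrative value: 2023-08-21 14:30:00 UTC expressed in seconds since the epoch
ts = 1692628200

# Timezone-aware conversion: the result is the same on every machine
utc_dt = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)
print(utc_dt)  # 2023-08-21 14:30:00+00:00

# The same moment expressed in milliseconds: divide by 1000 before converting
ts_ms = ts * 1000
utc_dt_from_ms = datetime.datetime.fromtimestamp(ts_ms / 1000, tz=datetime.timezone.utc)
print(utc_dt_from_ms == utc_dt)  # True
```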
2.3 Summary of Date Types
To summarize, there are primarily four formats for storing a date:
String format, e.g., "2023-08-28", which is a string.
Timestamp format, e.g., 1693233091.2038012, which is a float and represents the number of seconds (or milliseconds) since 1970.
Datetime format, commonly used in Python for storing dates with hours and seconds, e.g., datetime.datetime(2023, 8, 21, 14, 30); this object is of type datetime.datetime.
Date format, commonly used in Python for storing dates without hours or seconds, e.g., datetime.date(2023, 8, 21); this object is of type datetime.date.
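To make the four formats concrete, here is a small sketch that expresses one and the same day in each of them and prints the Python type of each value. The variable names are illustrative, and the timestamp is computed by treating the date as UTC, which is an assumption made here so that the number is identical on every machine.

```python
import datetime

# The same day expressed in the four representations
as_string = "2023-08-21"
as_datetime = datetime.datetime.strptime(as_string, "%Y-%m-%d")
as_date = as_datetime.date()
# Assumption: interpret the naive datetime as UTC to get a reproducible timestamp
as_timestamp = as_datetime.replace(tzinfo=datetime.timezone.utc).timestamp()

# Print the type name next to each value
for value in (as_string, as_timestamp, as_datetime, as_date):
    print(type(value).__name__, repr(value))
```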
3. Converting Between Date Formats
Navigating between different formats is a common requirement.
# From datetime to string
formatted_date = date_object.strftime("%Y-%m-%d")
print(formatted_date) # "2023-08-21"
# Another format
formatted_date2 = date_object.strftime("%d %B %Y")
print(formatted_date2) # "21 August 2023"
# From datetime to timestamp (seconds since epoch)
timestamp_from_date = date_object.timestamp()
print(timestamp_from_date)
# And in milliseconds
timestamp_millis_from_date = timestamp_from_date * 1000
print(timestamp_millis_from_date)
The examples above show several ways of converting one date format to another. They are not exhaustive, so you'll need to look up other conversions according to your specific needs.
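Two conversions not covered so far are going from a datetime to a date and back, and parsing ISO-formatted strings without writing a format string. A minimal sketch using only the standard datetime module (the variable names are illustrative):

```python
import datetime

dt = datetime.datetime(2023, 8, 21, 14, 30)

# From datetime to date: the time component is dropped
d = dt.date()
print(d)  # 2023-08-21

# From date back to datetime: a time must be supplied (midnight here)
dt_midnight = datetime.datetime.combine(d, datetime.time(0, 0))
print(dt_midnight)  # 2023-08-21 00:00:00

# From an ISO 8601 string directly to date or datetime, no format string needed
print(datetime.date.fromisoformat("2023-08-21"))  # 2023-08-21
print(datetime.datetime.fromisoformat("2023-08-21 14:30:00"))  # 2023-08-21 14:30:00
```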
4. Using Dates with Pandas
Having explored the basics of date manipulation in Python, it's time to integrate these skills with Pandas. If you're in the financial world, Pandas will be your go-to tool for a multitude of tasks, especially processing financial time series. In this episode, we'll see how Pandas can aid us in manipulating dates for specific finance-related cases, like time series of Bitcoin (BTC) prices.
4.1 Creating the DataFrame and Converting the Date Format
For these examples, we'll load our DataFrame from a CSV file previously introduced in the training: https://github.com/RobotTraders/Python_For_Finance/blob/main/BTC-USDT.csv. This CSV represents the hourly price evolution of Bitcoin in the OHLCV (Open, High, Low, Close, Volume) format.
import pandas as pd
df = pd.read_csv("BTC-USDT.csv")
As you can see, the date column is not immediately readable. However, if you recall the start of the episode, this format might ring a bell. You may have recognized it: these dates are timestamps. They represent the number of milliseconds elapsed since 1970 (the numerous zeros at the end are a giveaway).
Here's how to convert timestamps into datetime (more readable and manageable):
df = pd.read_csv("BTC-USDT.csv")
print(df["date"].dtype)
df["date"] = pd.to_datetime(df["date"], unit="ms")
print(df["date"].dtype)
The dates are immediately more legible. In the code, we also included two lines to display the column type before and after the transformation. We notice that before the transformation, the column was of type int, and now it's datetime.
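If you don't have the CSV at hand, the same conversion can be tried on a tiny hand-made DataFrame. The three millisecond values below are hypothetical sample data, chosen to be one hour apart; they are not taken from the BTC-USDT file.

```python
import pandas as pd

# Hypothetical sample: three hourly timestamps expressed in milliseconds
sample = pd.DataFrame({"date": [1692626400000, 1692630000000, 1692633600000]})
print(sample["date"].dtype)  # an integer dtype before conversion

# unit="ms" tells Pandas the integers count milliseconds since the epoch
sample["date"] = pd.to_datetime(sample["date"], unit="ms")
print(sample["date"].dtype)  # a datetime dtype after conversion
print(sample)
```

Passing the right unit matters: the same integers interpreted with unit="s" would describe dates tens of thousands of years in the future.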
Now we can start performing manipulations on the dates.
4.2 Extracting the Day of the Week
Pandas makes it easy to extract the day of the week with the .dt.weekday attribute (.dt.dayofweek is an equivalent alias).
df["weekday"] = df["date"].dt.weekday
The days of the week are displayed from 0 to 6, with 0 representing Monday and 6 representing Sunday.
If you wish to see them as strings like "Monday", you'll need to do what's known as mapping (matching one value to another). This is done using a dictionary and the Pandas .map() method.
day_name_map = {
    0: 'Monday',
    1: 'Tuesday',
    2: 'Wednesday',
    3: 'Thursday',
    4: 'Friday',
    5: 'Saturday',
    6: 'Sunday'
}
df['weekday'] = df['weekday'].map(day_name_map)
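As an alternative to the dictionary mapping above, Pandas also offers the .dt.day_name() method, which returns the English day names directly. The sketch below uses a small hypothetical sample (a Monday, a Saturday and a Sunday) and stores the names in a separate column so that the numeric weekday stays available.

```python
import pandas as pd

# Hypothetical sample covering a Monday, a Saturday and a Sunday
sample = pd.DataFrame(
    {"date": pd.to_datetime(["2023-08-21", "2023-08-26", "2023-08-27"])}
)

# Numeric day of the week: 0 is Monday, 6 is Sunday
sample["weekday"] = sample["date"].dt.weekday
# Day names without a manual dictionary
sample["weekday_name"] = sample["date"].dt.day_name()
print(sample)
```

The dictionary approach remains useful when you want custom labels, for example abbreviations or another language.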
4.3 Subtracting Between Two Dates
Let's now verify that we indeed have an hourly interval between our data points. We can do this by checking the time difference between each data point and its predecessor.
df["date_diff"] = df["date"] - df["date"].shift(1)
print(df["date_diff"].value_counts())
Here, we simply calculate the difference between the value in the date column and the value in the previous row of the date column using shift(1). We then display all the unique values and their associated number of occurrences.
We notice that, in the vast majority of cases, we indeed have an hour between our two values. However, there are some exceptions:
26 cases with no gap (possibly duplicates)
About 30 cases with a difference of at least two hours up to a maximum of more than a day. It might be interesting to investigate the cause.
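To investigate such exceptions, one possible approach is to filter the rows whose difference is not exactly one hour. The sketch below works on a hypothetical five-row sample containing one duplicate and one missing hour; the names sample, duplicates and gaps are illustrative.

```python
import pandas as pd

# Hypothetical sample: hourly data with one duplicate row and one missing hour
sample = pd.DataFrame({"date": pd.to_datetime([
    "2023-08-21 00:00", "2023-08-21 01:00", "2023-08-21 01:00",
    "2023-08-21 02:00", "2023-08-21 04:00",
])})

# Difference with the previous row (equivalent to date - date.shift(1))
sample["date_diff"] = sample["date"].diff()

# Rows with no gap at all: likely duplicates
duplicates = sample[sample["date_diff"] == pd.Timedelta(0)]
# Rows that follow a hole of more than one hour
gaps = sample[sample["date_diff"] > pd.Timedelta(hours=1)]

print(duplicates)
print(gaps)
```

The first row has no predecessor, so its difference is NaT and it matches neither filter.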
4.4 Grouping by Date
From our data, it would be interesting to know if the average trading volume of Bitcoin varies depending on the day of the week.
To do this, we can group our data by the day of the week. Here we'll use the groupby() function to group our data by weekday, then the agg() function to aggregate the grouped data, in this case by taking the mean of the volume column.
This can be done in just one line of code:
df.groupby('weekday').agg({"volume": "mean"})
As expected, trading volumes on weekends (Saturday and Sunday) are significantly lower than on weekdays – nearly 50% lower. This indicates that a major portion of Bitcoin trading is conducted by
Practical Exercise
Let's explore a practical question: Are there noticeable differences in Bitcoin behavior based on the operational hours of major financial markets?
For instance, we can look at the European (Euronext), New York (NYSE), and Tokyo stock exchanges. The European exchange operates between 9 AM and 5 PM Paris time. You are encouraged to find out the operating hours for the other exchanges.
The first step will be to create a column for each stock exchange and mark it as True if the exchange is open at the time of the data point, and False if not.
Once this is done, you can compare the average trading volume depending on whether the exchange is open or closed. You can also compare the average variation of a candlestick by, for example, looking at the percentage difference between the high and low of each candlestick. You are really encouraged to conduct your own analyses.
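As a starting point, here is a rough sketch for the European exchange only, on hypothetical sample data. It rests on several assumptions that you should check for yourself: the timestamps are in UTC, the exchange is open from 9 AM (inclusive) to 5 PM (exclusive) Paris time, it is closed on weekends, and public holidays are ignored. The column name euronext_open is illustrative.

```python
import pandas as pd

# Hypothetical sample: twelve hourly candles on Monday 2023-08-21, times in UTC
sample = pd.DataFrame({
    "date": pd.date_range("2023-08-21 05:00", periods=12, freq="h"),
    "volume": [10, 12, 30, 35, 40, 38, 36, 41, 39, 37, 15, 11],
})

# Assumption: timestamps are UTC; convert them to Paris local time
paris_time = sample["date"].dt.tz_localize("UTC").dt.tz_convert("Europe/Paris")

# Assumption: open 9 AM to 5 PM Paris time, Monday to Friday, holidays ignored
sample["euronext_open"] = (
    (paris_time.dt.hour >= 9)
    & (paris_time.dt.hour < 17)
    & (paris_time.dt.weekday < 5)
)

# Average volume when the exchange is open versus closed
print(sample.groupby("euronext_open").agg({"volume": "mean"}))
```

Converting to the exchange's local time zone before testing the hour lets Pandas handle daylight saving time, instead of hard-coding a fixed UTC offset.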
An example solution can be found here: https://github.com/RobotTraders/Python_For_Finance/blob/main/exercise_correction_chapter_8.ipynb