Develop a comprehensive analysis of Bitcoin’s price trends from 2013 to 2023 using Python libraries and interactive visualization tools.
Delivered an analytical report with interactive visualizations of Bitcoin’s open, high, low, and close values, plus an examination of its closing price on several scales and time periods. The project provides an in-depth analysis of Bitcoin’s price changes over ten years.
The project, which included data collection, cleaning, analysis, and visualization, took around a week to complete.
For this project, we download Bitcoin price data for the previous decade from Yahoo Finance, save it as a CSV file, and perform various analyses on it.
We are setting up a robust data analysis toolkit:
Pandas for data management,
Numpy for numerical operations,
Matplotlib for basic plotting,
and Seaborn for advanced visualizations.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import yfinance as yf
# Fetch Bitcoin data
btc_data = yf.download('BTC-USD', start='2013-01-01', end='2023-01-01')
# Save to CSV
btc_data.to_csv('bitcoin_data.csv')
import yfinance as yf
: This line brings in the yfinance library and gives it a shorter name, yf. yfinance is a popular tool that allows you to pull financial market data from Yahoo Finance.
btc_data = yf.download('BTC-USD', start='2013-01-01', end='2023-01-01')
: Here, we use yfinance to download historical data for Bitcoin (‘BTC-USD’ is the ticker for its price in US dollars). We specify a date range from January 1, 2013, to January 1, 2023. The data includes daily opening and closing prices, highs and lows, and trading volume, and is stored in a DataFrame called btc_data.
btc_data.to_csv('bitcoin_data.csv')
: This line takes the Bitcoin data you just downloaded and saves it as a CSV file named ‘bitcoin_data.csv’. It’s a way of storing this data on your machine, so you don’t have to download it every time you run your analysis.
In summary, we are downloading a decade’s worth of Bitcoin price data from Yahoo Finance for convenient access and analysis on your computer.
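One way to make the "download once, reuse later" idea explicit is to check for the CSV before fetching. The sketch below is only an illustration, not the project's code: load_or_fetch and its fetch argument are hypothetical names, and a stand-in function with made-up numbers replaces the real yf.download call so the example runs without network access.

```python
from pathlib import Path
import tempfile

import pandas as pd


def load_or_fetch(csv_path, fetch):
    """Return cached data if csv_path exists; otherwise call fetch() and cache it."""
    csv_path = Path(csv_path)
    if csv_path.exists():
        # Reuse the local copy instead of downloading again.
        return pd.read_csv(csv_path, index_col=0)
    frame = fetch()
    frame.to_csv(csv_path)
    return frame


def fake_fetch():
    # Stand-in for yf.download('BTC-USD', ...); the values are made up.
    return pd.DataFrame(
        {"Open": [100.0, 110.0], "Close": [105.0, 108.0]},
        index=pd.Index(["2020-01-01", "2020-01-02"], name="Date"),
    )


cache_file = Path(tempfile.mkdtemp()) / "bitcoin_data.csv"
first = load_or_fetch(cache_file, fake_fetch)   # calls fake_fetch and writes the CSV
second = load_or_fetch(cache_file, fake_fetch)  # reads the CSV written above
```

In the real workflow, fetch could be something like lambda: yf.download('BTC-USD', start='2013-01-01', end='2023-01-01').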
df = pd.read_csv(r'C:\Users\msr-h\OneDrive\Python\7 - Btc Analysis/bitcoin_data.csv')
This line loads the Bitcoin data from a CSV file into a pandas DataFrame named df.
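By default, pd.read_csv leaves a date column as plain strings. If datetime values are wanted from the start, the parse_dates argument can convert the column while reading. The sketch below uses a small in-memory CSV with made-up rows and assumes the file has a ‘Date’ column, as the later steps of this walkthrough do.

```python
import io

import pandas as pd

# Made-up rows in the same shape as the saved CSV (Date plus price columns).
csv_text = """Date,Open,High,Low,Close
2020-01-01,100.0,110.0,95.0,105.0
2020-01-02,105.0,112.0,101.0,108.0
"""

# parse_dates converts the 'Date' column to datetime64 while reading.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
print(sample.dtypes)
```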
df.head(5)
df.head(5)
: Displays the first 5 rows of the DataFrame df. It’s a quick way to get a glimpse of the data.
df.columns
df.shape
df.info()
df.columns
: Lists all the column names in the DataFrame.
df.shape
: Gives the dimensions of the DataFrame, showing the number of rows and columns.
df.info()
: Provides a concise summary of the DataFrame, including the number of non-null entries in each column and data types.
df.describe().T
df.describe().T
: Generates descriptive statistics that summarize the central tendency, dispersion, and shape of the dataset’s distribution.
df.dtypes
df['Date'].min()
df['Date'].max()
df['Date']
type(df['Date'][0])
df.head(5)
df.isnull().sum() # check for missing values in each column
df.duplicated().sum()
df.dtypes
: Shows the data types of each column in the DataFrame.
df['Date'].min()
: Gets the earliest date in the ‘Date’ column.
df['Date'].max()
: Gets the most recent date in the ‘Date’ column.
df['Date']
: Displays the entire ‘Date’ column.
type(df['Date'][0])
: Checks the data type of the first element in the ‘Date’ column.
df.isnull().sum()
: Counts missing values in each column.
df.duplicated().sum()
: Counts duplicated rows in the DataFrame.
It looks like we don’t have any missing values.
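To see what these two checks report when a dataset is not clean, here is a small sketch on a toy DataFrame with made-up values. The cleanup step at the end is one common option, not something the Bitcoin data needed.

```python
import numpy as np
import pandas as pd

# Toy frame with one missing value and one fully duplicated row (made-up numbers).
toy = pd.DataFrame(
    {
        "Date": ["2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03"],
        "Close": [105.0, 108.0, 108.0, np.nan],
    }
)

missing_per_column = toy.isnull().sum()   # Series: count of NaN per column
duplicate_rows = toy.duplicated().sum()   # number of rows repeating an earlier row

# A typical cleanup if either count were non-zero.
cleaned = toy.drop_duplicates().dropna()
```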
The sort_values() method is usually used to sort a DataFrame by the values in one or more columns. Here, the rows are instead reordered by their index using sort_index().
df.head(5)
df.head(5)
: Displays the first 5 rows of the DataFrame to show a sample of the data.
df.tail(5)
df.tail(5)
: Shows the last 5 rows of the DataFrame, providing a quick view of the most recent data entries.
data = df.sort_index(ascending=False).reset_index()
data = df.sort_index(ascending=False).reset_index()
: This line sorts the DataFrame df by its index in descending order (newest data first) and then resets the index with reset_index(), which adds a new sequential index to the DataFrame and moves the old index to a column. The result is stored in a new DataFrame named data.
Examining a stock’s price history usually calls for time series analysis. Data of this kind typically has a date or timestamp column giving the time of each observation, together with one or more columns describing the stock at that moment (e.g., opening price, closing price, volume of shares traded).
data.drop('index' , axis=1 , inplace=True)
data.drop('index', axis=1, inplace=True)
: This line removes the ‘index’ column from the DataFrame data. The axis=1 parameter specifies that a column (not a row) should be dropped. inplace=True means the change is made directly in data without creating a new DataFrame.
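The reset-then-drop sequence can also be written in one step, because reset_index accepts drop=True to discard the old index instead of keeping it as a column. The sketch below compares both routes on a toy DataFrame with made-up values. It is an alternative spelling, not a change to the result.

```python
import pandas as pd

toy = pd.DataFrame({"Date": ["2020-01-01", "2020-01-02", "2020-01-03"],
                    "Close": [105.0, 108.0, 101.0]})

# Two steps, as in the walkthrough: reset_index() then drop the old index column.
two_step = toy.sort_index(ascending=False).reset_index()
two_step = two_step.drop("index", axis=1)

# One step: drop=True discards the old index instead of keeping it as a column.
one_step = toy.sort_index(ascending=False).reset_index(drop=True)
```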
data
data.columns
data.columns
: This line lists all the column names in the DataFrame data. It’s a quick way to see what data you have available.
plt.figure(figsize=(20,12))
: Sets up a plotting area for your visualizations. The figsize=(20,12) parameter defines the size of the figure (20 inches wide by 12 inches tall). It’s like preparing a canvas for painting.
The for loop with plotting:
for index, col in enumerate(['Open', 'High', 'Low', 'Close'], 1)
: This loop iterates over each column name in the list ['Open', 'High', 'Low', 'Close']. The enumerate function adds a counter (index) starting from 1.
plt.subplot(2,2,index)
: Creates a subplot in a 2×2 grid. For each iteration, a new subplot is created at the position indicated by index.
plt.plot(df['Date'], df[col])
: Plots the data from the col column against the ‘Date’ column. For each iteration, it plots one of the ‘Open’, ‘High’, ‘Low’, or ‘Close’ values over time.
plt.title(col)
: Sets the title of each subplot to the name of the column being plotted.
These are our results:
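Put together, the figure setup and the loop described above could look like the sketch below. The DataFrame here is a synthetic stand-in with made-up prices so the example runs on its own. The Agg backend line and the tight_layout call are additions for running outside a notebook and are not part of the original steps.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for df: 30 days of made-up prices.
dates = pd.date_range("2020-01-01", periods=30, freq="D")
base = np.linspace(100.0, 130.0, 30)
df = pd.DataFrame({"Date": dates, "Open": base, "High": base + 5,
                   "Low": base - 5, "Close": base + 1})

fig = plt.figure(figsize=(20, 12))
for index, col in enumerate(["Open", "High", "Low", "Close"], 1):
    plt.subplot(2, 2, index)          # position 1..4 in the 2x2 grid
    plt.plot(df["Date"], df[col])     # price column against date
    plt.title(col)
plt.tight_layout()
```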
data.columns
data.shape
: Shows the number of rows and columns in the data DataFrame.
bitcoin_sample = data[0:50]
bitcoin_sample = data[0:50]
: Takes the first 50 rows from data to create a smaller sample dataset.
!pip install chart-studio
!pip install plotly
!pip install chart-studio and !pip install plotly
: These commands install the chart-studio and plotly packages, used for creating interactive charts and visualizations.
import chart_studio.plotly as py
import plotly.graph_objs as go
import plotly.express as px
from plotly.offline import download_plotlyjs , init_notebook_mode , plot , iplot
Import Statements: Import necessary modules from plotly for interactive plotting.
init_notebook_mode(connected=True)
Initializes Plotly’s notebook mode for interactive plots within a Jupyter notebook.
trace = go.Candlestick(x=bitcoin_sample['Date'] ,
high =bitcoin_sample['High'] ,
open = bitcoin_sample['Open'] ,
close = bitcoin_sample['Close'] ,
low = bitcoin_sample['Low'])
This code sets up a candlestick chart for Bitcoin prices using Plotly, showing open, high, low, and close values for each date in your sample data.
candle_data = [trace]
layout = {
'title':'Bitcoin Historical Price' ,
'xaxis':{'title':'Date'}
}
We are setting up the data and layout for the Plotly chart: candle_data contains the candlestick trace for the Bitcoin data, and layout defines the chart’s title and labels the x-axis as ‘Date’.
fig = go.Figure(data = candle_data , layout=layout)
fig.update_layout(xaxis_rangeslider_visible = False)
fig.show()
This code creates and displays a Plotly figure with the Bitcoin candlestick data, custom layout settings, and hides the range slider for the x-axis.
data['Close']
The closing prices of Bitcoin for various dates are shown in this column.
data['Close'].plot()
Makes a simple plot with the closing prices on the y-axis and the DataFrame’s default numerical index on the x-axis.
Because we plotted data[‘Close’] with pandas’ .plot(), the x-axis in the plot above shows row numbers. pandas uses the row index for the x-axis, so to get dates there we first set ‘Date’ as the index.
data.set_index('Date' , inplace=True)
Sets the date as the x-axis for all upcoming plots by changing the DataFrame’s index to the ‘Date’ column.
data
data['Close'].plot()
“data[‘Close’].plot()” is used again to plot the closing prices against dates.
np.log1p(data['Close']).plot()
Plots the closing prices after applying a logarithmic transformation, which is helpful for data containing large ranges or outliers.
plt.figure(figsize=(20,6))
plt.subplot(1,2,1)
data['Close'].plot()
plt.title('No scaling')
plt.subplot(1,2,2)
np.log1p(data['Close']).plot()
plt.title('Log scaling')
# np.log1p already applies the log transform, so plt.yscale('log') is not needed here
Using plt.figure() and plt.subplot(), two plots are created side by side: one showing the original closing prices and the other showing the logarithmically transformed prices. This helps in comparing the data on both normal and logarithmic scales.
If your data contains outliers or spans several orders of magnitude, a log scale is often preferable.
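A few numbers make the effect of the transform concrete. The sketch below applies np.log1p, which computes log(1 + x), to made-up prices spanning several orders of magnitude, and uses np.expm1 to map the values back to the original scale.

```python
import numpy as np

# Made-up prices spanning several orders of magnitude.
prices = np.array([1.0, 10.0, 1_000.0, 60_000.0])

logged = np.log1p(prices)      # log(1 + x), safe even when x is 0
restored = np.expm1(logged)    # inverse transform back to the original scale

# The raw range is 59,999 wide; the transformed range is about 10.3.
raw_spread = prices.max() - prices.min()
log_spread = logged.max() - logged.min()
```

On the raw scale the early, low prices are squeezed flat against the axis. After the transform they take up a visible share of the plot.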
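To resample the data over different time periods, the date feature must first be set as the row index. Common resampling frequencies include:
a. yearly (‘Y’),
b. quarterly (‘Q’),
c. monthly (‘M’),
d. weekly (‘W’),
e. daily (‘D’),
f. 3-minute bins (‘3T’),
g. 30-second bins (‘30S’),
h. 17-minute bins (‘17min’).
Recent pandas releases (2.2 and later) rename some of these aliases, for example ‘YE’, ‘QE’, ‘ME’, ‘min’, and ‘s’.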
data.head(5)
data.index = pd.to_datetime(data.index)
Given that it enables time-based resampling, this is crucial for time series analysis.
## finding the average price of bitcoin on an annual basis
data['Close'].resample('Y').mean()
data['Close'].resample('Y').mean()
: This line calculates the mean (average) for each year by resampling the ‘Close’ column to an annual (‘Y’) frequency. In essence, it determines the mean closing price of Bitcoin per year.
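The sketch below shows yearly resampling on synthetic data with made-up values, where the expected averages are obvious. It assumes nothing about the real dataset. The try/except is there only because pandas 2.2 and later prefer the ‘YE’ alias while older versions accept only ‘Y’, and the groupby line is an independent cross-check of the same averages.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes over two calendar years (made-up values).
idx = pd.date_range("2020-01-01", "2021-12-31", freq="D")
close = pd.Series(np.where(idx.year == 2020, 100.0, 200.0), index=idx, name="Close")

try:
    yearly = close.resample("YE").mean()   # alias used by pandas 2.2 and later
except ValueError:
    yearly = close.resample("Y").mean()    # alias used by older pandas versions

# Cross-check: grouping by calendar year gives the same averages.
by_year = close.groupby(close.index.year).mean()
```

The same pattern applies to the quarterly and monthly cases by swapping the frequency alias.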
data['Close'].resample('Y').mean().plot()
data['Close'].resample('Y').mean().plot()
: This line plots the yearly average closing prices, illustrating how the average price varies from year to year.
# finding the quarterly average price of bitcoin
data['Close'].resample('Q').mean()
Performs a quarterly (‘Q’) resampling of the ‘Close’ column and determines the mean for each quarter.
# calculating the average price of bitcoin each month
data['Close'].resample('M').mean()
This command calculates the mean closing price for each month by resampling the data to a monthly (‘M’) frequency.
data['Close'].resample('M').mean().plot()
To determine your daily earnings or losses on a particular stock, you can use the daily return formula. First find the difference between the stock’s closing price and its opening price, then multiply this difference by the number of shares you hold:
Daily Return = (Closing Price − Opening Price) × Number of Shares
This calculation gives the profit or loss you made on that stock for a specific day. The code that follows tracks a related quantity instead: the day-over-day percentage change in the closing price.
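As a worked example of the formula, the sketch below wraps it in a small function and applies it to made-up numbers. The function name daily_return and the figures are illustrative only.

```python
def daily_return(open_price, close_price, shares):
    """Profit or loss for one day: (close - open) * shares."""
    return (close_price - open_price) * shares


# Made-up example: 0.5 BTC held on a day that opened at 20,000 and closed at 20,500.
gain = daily_return(open_price=20_000.0, close_price=20_500.0, shares=0.5)
loss = daily_return(open_price=20_500.0, close_price=20_000.0, shares=0.5)
```

A positive result is a gain for the day and a negative result is a loss.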
data['Close']
data['Close_price_pct_change'] = data['Close'].sort_index().pct_change()*100
This line computes the percentage change in the closing price from the previous day. Because data was sorted newest first earlier, sort_index() puts the closes back in chronological order before pct_change() compares each row with the one before it. The result is multiplied by 100 to express it as a percentage and is kept in a new column called data['Close_price_pct_change'].
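pct_change() compares each row with the row before it, so the order of the rows matters: for a conventional daily change, the series should run oldest to newest. The sketch below uses three made-up closes in chronological order and checks the result against the calculation written out by hand.

```python
import pandas as pd

# Made-up closes in chronological order (oldest first).
close = pd.Series([100.0, 110.0, 99.0],
                  index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]))

pct = close.pct_change() * 100

# Same calculation written out: (today - yesterday) / yesterday * 100.
manual = (close - close.shift(1)) / close.shift(1) * 100
```

The first entry is NaN because the first day has no previous close to compare against. The next two are a rise of roughly 10% and a fall of roughly 10%.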
data['Close_price_pct_change'].plot(color='#FF0000')
import chart_studio.plotly as py
## Chart_studio offers a web-based graph hosting solution
import plotly.graph_objs as go
import plotly.express as px
from plotly.offline import download_plotlyjs , init_notebook_mode , plot , iplot
init_notebook_mode(connected=True)
init_notebook_mode(connected=True)
: Initializes the notebook mode for interactive Plotly plots within a Jupyter notebook environment.
import cufflinks as cf
import cufflinks as cf
: Imports the cufflinks library, which links Plotly with pandas.
cf.go_offline()
cf.go_offline()
: Configures cufflinks to work offline for creating interactive charts.
data['Close_price_pct_change']
Displays the values in the ‘Close_price_pct_change’ column.
type(data['Close_price_pct_change'])
Confirms that the ‘Close_price_pct_change’ column is a pandas Series.
data['Close_price_pct_change'].iplot()
The Bitcoin Price Analysis project provides a detailed and interactive examination of Bitcoin’s price history over a significant period. It offers useful insights into the trends and performance of the cryptocurrency, proving to be a practical resource for investors, analysts, and those interested in understanding Bitcoin’s market dynamics.