import numpy as np
import pandas as pd
from pathlib import Path
%matplotlib inline

Return Forecasting: Read Historical Daily Yen Futures Data

In this notebook, you will load historical Dollar-Yen exchange rate futures data and apply time series analysis and modeling to determine whether there is any predictable behavior.

# Futures contract on the Yen-dollar exchange rate:
# This is the continuous chain of the futures contracts that are 1 month to expiration
yen_futures = pd.read_csv(
    Path("yen.csv"), index_col="Date", infer_datetime_format=True, parse_dates=True
)
yen_futures.head()
Date        Open    High    Low     Last    Change  Settle  Volume  Previous Day Open Interest
1976-08-02  3398.0  3401.0  3398.0  3401.0  NaN     3401.0  2.0     1.0
1976-08-03  3401.0  3401.0  3401.0  3401.0  NaN     3401.0  0.0     1.0
1976-08-04  3401.0  3401.0  3401.0  3401.0  NaN     3401.0  0.0     1.0
1976-08-05  3401.0  3401.0  3401.0  3401.0  NaN     3401.0  0.0     1.0
1976-08-06  3401.0  3401.0  3401.0  3401.0  NaN     3401.0  0.0     1.0
# Trim the dataset to begin on January 1st, 1990
yen_futures = yen_futures.loc["1990-01-01":, :]
yen_futures.head()
Date        Open    High    Low     Last    Change  Settle  Volume   Previous Day Open Interest
1990-01-02  6954.0  6954.0  6835.0  6847.0  NaN     6847.0  48336.0  51473.0
1990-01-03  6877.0  6910.0  6865.0  6887.0  NaN     6887.0  38206.0  53860.0
1990-01-04  6937.0  7030.0  6924.0  7008.0  NaN     7008.0  49649.0  55699.0
1990-01-05  6952.0  6985.0  6942.0  6950.0  NaN     6950.0  29944.0  53111.0
1990-01-08  6936.0  6972.0  6936.0  6959.0  NaN     6959.0  19763.0  52072.0

Return Forecasting: Initial Time-Series Plotting

Start by plotting the "Settle" price. Do you see any patterns, long-term or short-term?

# Plot just the "Settle" column from the dataframe:
yen_futures.Settle.plot(figsize=(15,8), fontsize = 15, colormap='cool', title="Yen Futures Settle Prices")
[Figure: Yen Futures Settle Prices]


Decomposition Using a Hodrick-Prescott Filter

Using a Hodrick-Prescott Filter, decompose the Settle price into a trend and noise.
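
As a rough illustration of what the filter computes, the sketch below solves the Hodrick-Prescott problem directly for a tiny series: the trend minimizes the squared deviations from the data plus `lamb` times the squared second differences of the trend, and the noise (cycle) is whatever is left over. The helper name `hp_filter_sketch` and the toy inputs are hypothetical; `lamb=1600` mirrors the statsmodels default. This is a dense pure-Python solve meant for intuition only, not a substitute for `sm.tsa.filters.hpfilter`.

```python
def hp_filter_sketch(y, lamb=1600.0):
    """Return (cycle, trend) for a short series (hypothetical helper)."""
    n = len(y)
    # Build A = I + lamb * D'D, where each row of D takes a second difference.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        d = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, di in d.items():
            for j, dj in d.items():
                a[i][j] += lamb * di * dj
    # Solve A * trend = y by Gaussian elimination with partial pivoting.
    b = [float(v) for v in y]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    trend = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * trend[c] for c in range(r + 1, n))
        trend[r] = (b[r] - s) / a[r][r]
    cycle = [yi - ti for yi, ti in zip(y, trend)]
    return cycle, trend


# A perfectly linear series has no curvature, so the trend is the series itself.
line_cycle, line_trend = hp_filter_sketch([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# A wiggly toy series: the trend is smoothed and the cycle holds the wiggles.
toy = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
toy_cycle, toy_trend = hp_filter_sketch(toy, lamb=10.0)
```

A larger `lamb` penalizes curvature more heavily and pushes the trend toward a straight line; in both cases the cycle and trend add back up to the original series.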

import statsmodels.api as sm
# Apply the Hodrick-Prescott Filter by decomposing the "Settle" price into two separate series:

yen_futures_noise, yen_futures_trend = sm.tsa.filters.hpfilter(yen_futures["Settle"])
# Create a dataframe of just the settle price, and add columns for "noise" and "trend" series from above:
data = {'Settle': yen_futures.Settle, 'Noise': yen_futures_noise, 'Trend': yen_futures_trend}
df = pd.DataFrame(data)
df.index
DatetimeIndex(['1990-01-02', '1990-01-03', '1990-01-04', '1990-01-05',
               '1990-01-08', '1990-01-09', '1990-01-10', '1990-01-11',
               '1990-01-12', '1990-01-15',
               ...
               '2019-10-02', '2019-10-03', '2019-10-04', '2019-10-07',
               '2019-10-08', '2019-10-09', '2019-10-10', '2019-10-11',
               '2019-10-14', '2019-10-15'],
              dtype='datetime64[ns]', name='Date', length=7515, freq=None)
# Plot the Settle Price vs. the Trend for 2015 to the present
df[['Settle', 'Trend']].loc['2015':].plot(fontsize = 15, colormap='cool', figsize=(15,8), title="Settle vs. Trend, 2015 to Present")
[Figure: Settle price vs. Hodrick-Prescott trend, 2015 onward]

# Plot the Settle Noise
yen_futures_noise.plot(figsize=(15,8), colormap='cool_r', title="Settle Price Noise", fontsize= 15)
[Figure: Settle Price Noise]


Forecasting Returns using an ARMA Model

Using futures Settle returns, estimate an ARMA model.

  1. ARMA: Create an ARMA model and fit it to the returns data. Note: Set the AR and MA ("p" and "q") parameters to p=2 and q=1: order=(2, 1).
  2. Output the ARMA summary table and take note of the p-values of the lags. Based on the p-values, is the model a good fit (p < 0.05)?
  3. Plot the 5-day returns forecast produced by the ARMA model.
# Create a series using "Settle" price percentage returns, drop any NaNs, and check the results:
# (Make sure to multiply the pct_change() results by 100)
# In this case, you may have to replace inf and -inf values with np.nan
returns = (yen_futures[["Settle"]].pct_change() * 100)
returns = returns.replace([np.inf, -np.inf], np.nan).dropna()
returns.head()
Date        Settle
1990-01-03  0.584197
1990-01-04  1.756933
1990-01-05  -0.827626
1990-01-08  0.129496
1990-01-09  -0.632275
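
To make the arithmetic behind `pct_change() * 100` concrete, here is a minimal pure-Python sketch that recomputes the first few returns from the Settle prices in the 1990 table above. The names `settle_sample` and `pct_returns` are illustrative and not part of the notebook.

```python
# First five Settle prices from the trimmed dataset (1990-01-02 to 1990-01-08).
settle_sample = [6847.0, 6887.0, 7008.0, 6950.0, 6959.0]


def pct_returns(prices):
    """Percentage change between consecutive prices, expressed in percent."""
    return [(curr / prev - 1.0) * 100.0 for prev, curr in zip(prices, prices[1:])]


returns_sample = pct_returns(settle_sample)
print([round(r, 6) for r in returns_sample])
```

The first observation has no previous price, which is why `pct_change()` yields a NaN there and the returns series starts one day later than the price series.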
# Note: this legacy ARMA class was removed in statsmodels 0.13;
# this notebook was run with an older release.
from statsmodels.tsa.arima_model import ARMA

# Estimate an ARMA model using statsmodels (use order=(2, 1))
model = ARMA(returns.values, order=(2, 1))

# Fit the model and assign it to a variable called results
results = model.fit()
# Output model summary results:
results.summary()
ARMA Model Results
Dep. Variable:   y                  No. Observations:     7514
Model:           ARMA(2, 1)         Log Likelihood        -7894.071
Method:          css-mle            S.D. of innovations   0.692
Date:            Sun, 23 Aug 2020   AIC                   15798.142
Time:            19:03:52           BIC                   15832.765
Sample:          0                  HQIC                  15810.030

          coef      std err   z        P>|z|   [0.025   0.975]
const     0.0063    0.008     0.804    0.422   -0.009   0.022
ar.L1.y   -0.3059   1.278     -0.239   0.811   -2.810   2.198
ar.L2.y   -0.0019   0.019     -0.099   0.921   -0.040   0.036
ma.L1.y   0.2944    1.278     0.230    0.818   -2.210   2.798

Roots
       Real        Imaginary   Modulus    Frequency
AR.1   -3.3382     +0.0000j    3.3382     0.5000
AR.2   -157.3438   +0.0000j    157.3438   0.5000
MA.1   -3.3973     +0.0000j    3.3973     0.5000
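
To see where the P>|z| column comes from, the sketch below recomputes the z-statistic (coefficient divided by standard error) and the two-sided normal p-value for the lag terms in the summary above. The helper `two_sided_p` is hypothetical, and because the inputs are the rounded values printed in the table, the results only approximately match the reported column.

```python
from statistics import NormalDist


def two_sided_p(coef, std_err):
    """Return (z, p) for the null hypothesis that the coefficient is zero."""
    z = coef / std_err
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# (coef, std err) pairs copied from the ARMA(2, 1) summary table.
arma_rows = {
    "ar.L1.y": (-0.3059, 1.278),
    "ar.L2.y": (-0.0019, 0.019),
    "ma.L1.y": (0.2944, 1.278),
}

arma_pvalues = {}
for name, (coef, std_err) in arma_rows.items():
    z, p = two_sided_p(coef, std_err)
    arma_pvalues[name] = p
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: z={z:.3f}, p={p:.3f} ({verdict} at the 5% level)")
```

Every lag has a p-value far above 0.05, so none of the coefficients can be distinguished from zero at the 5% level.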
# Plot the 5 Day Returns Forecast
pd.DataFrame(results.forecast(steps=5)[0]).plot(figsize=(15,8), colormap='cool_r', fontsize= 15, title= "5 Day Returns Forecast")
[Figure: 5 Day Returns Forecast]

results.forecast(steps=5)
(array([0.01229407, 0.00543711, 0.0066175 , 0.00626945, 0.00637368]),
 array([0.69187027, 0.69191656, 0.69191748, 0.69191756, 0.69191757]),
 array([[-1.34374675,  1.36833489],
        [-1.35069442,  1.36156865],
        [-1.34951585,  1.36275085],
        [-1.34986405,  1.36240295],
        [-1.34975984,  1.3625072 ]]))

Forecasting the Settle Price using an ARIMA Model

  1. Using the raw Yen Settle Price, estimate an ARIMA model.
    1. Set P=5, D=1, and Q=1 in the model (e.g., ARIMA(df, order=(5,1,1)))
    2. P= # of Auto-Regressive Lags, D= # of Differences (this is usually =1), Q= # of Moving Average Lags
  2. Output the ARIMA summary table and take note of the p-values of the lags. Based on the p-values, is the model a good fit (p < 0.05)?
  3. Construct a 5 day forecast for the Settle Price. What does the model forecast will happen to the Japanese Yen in the near term?
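
The D=1 setting means the model is fit to first differences of the price rather than to the price itself, and a price forecast is recovered by adding forecast differences back onto the last observed level. A minimal sketch of that round trip, using the first five Settle prices from 1990 (the helper names `difference` and `integrate` are hypothetical):

```python
def difference(series):
    """First differences: change from each observation to the next."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def integrate(start_level, diffs):
    """Undo differencing by cumulatively adding changes to a starting level."""
    levels = [start_level]
    for d in diffs:
        levels.append(levels[-1] + d)
    return levels


settle_levels = [6847.0, 6887.0, 7008.0, 6950.0, 6959.0]
settle_diffs = difference(settle_levels)            # what an ARIMA(p, 1, q) model sees
rebuilt_levels = integrate(settle_levels[0], settle_diffs)

print(settle_diffs)
print(rebuilt_levels == settle_levels)
```

Differencing removes the level of the series, which is why the summary reports the dependent variable as `D.y` and one fewer observation than the number of prices.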
from statsmodels.tsa.arima_model import ARIMA

# Estimate an ARIMA Model:
# Hint: ARIMA(df, order=(p, d, q))

model = ARIMA(yen_futures.Settle.values,  order=(5,1,1))
# Fit the model
result = model.fit()
# Output model summary results:
result.summary()
ARIMA Model Results
Dep. Variable:   D.y                No. Observations:     7514
Model:           ARIMA(5, 1, 1)     Log Likelihood        -41944.619
Method:          css-mle            S.D. of innovations   64.281
Date:            Sun, 23 Aug 2020   AIC                   83905.238
Time:            19:03:54           BIC                   83960.635
Sample:          1                  HQIC                  83924.259

            coef      std err   z        P>|z|   [0.025   0.975]
const       0.3158    0.700     0.451    0.652   -1.056   1.688
ar.L1.D.y   0.2814    0.699     0.402    0.688   -1.090   1.652
ar.L2.D.y   0.0007    0.016     0.042    0.966   -0.030   0.032
ar.L3.D.y   -0.0127   0.012     -1.032   0.302   -0.037   0.011
ar.L4.D.y   -0.0137   0.015     -0.890   0.374   -0.044   0.016
ar.L5.D.y   -0.0012   0.018     -0.066   0.948   -0.036   0.034
ma.L1.D.y   -0.2964   0.699     -0.424   0.672   -1.667   1.074

Roots
       Real       Imaginary   Modulus   Frequency
AR.1   1.8905     -1.3790j    2.3400    -0.1003
AR.2   1.8905     +1.3790j    2.3400    0.1003
AR.3   -2.2637    -3.0253j    3.7785    -0.3522
AR.4   -2.2637    +3.0253j    3.7785    0.3522
AR.5   -10.8643   -0.0000j    10.8643   -0.5000
MA.1   3.3740     +0.0000j    3.3740    0.0000
# Plot the 5 Day Price Forecast
pd.DataFrame(result.forecast(steps=5)[0]).plot(figsize=(15,8), colormap='cool_r', fontsize= 15, title= "5 Day Futures Price Forecast")
[Figure: 5-day Settle price forecast]

result.forecast(steps=5)
(array([9224.00824179, 9225.50147784, 9226.57944549, 9227.66558386,
        9228.20532462]),
 array([ 64.28055074,  90.22570432, 110.09282009, 126.45210579,
        140.43685431]),
 array([[9098.02067743, 9349.99580615],
        [9048.66234689, 9402.34060878],
        [9010.80148316, 9442.35740782],
        [8979.82401074, 9475.50715699],
        [8952.95414807, 9503.45650118]]))
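
The three arrays returned above are the point forecasts, their standard errors, and the 95% confidence bounds. As a quick check on how they relate, the sketch below rebuilds the bounds as forecast plus or minus the 97.5% normal quantile times the standard error, using the numbers printed in the output. Variable names are illustrative only.

```python
from statistics import NormalDist

# Point forecasts and standard errors copied from result.forecast(steps=5).
point = [9224.00824179, 9225.50147784, 9226.57944549, 9227.66558386, 9228.20532462]
stderr = [64.28055074, 90.22570432, 110.09282009, 126.45210579, 140.43685431]

# Two-sided 95% interval: about 1.96 standard errors on each side.
z_975 = NormalDist().inv_cdf(0.975)
bounds = [(f - z_975 * s, f + z_975 * s) for f, s in zip(point, stderr)]

for day, (low, high) in enumerate(bounds, start=1):
    print(f"day {day}: [{low:.2f}, {high:.2f}] (width {high - low:.2f})")
```

The interval is already about 250 points wide on day 1 and widens each day, while the point forecast moves by only about 4 points over the whole horizon.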

Volatility Forecasting with GARCH

Rather than predicting returns, let's forecast near-term volatility of Japanese Yen futures returns. Being able to accurately predict volatility will be extremely useful if we want to trade in derivatives or quantify our maximum loss.

Using futures Settle returns, estimate a GARCH model.

  1. GARCH: Create a GARCH model and fit it to the returns data. Note: Set the parameters to p=2 and q=1.
  2. Output the GARCH summary table and take note of the p-values of the lags. Based on the p-values, is the model a good fit (p < 0.05)?
  3. Plot the 5-day forecast of the volatility.
from arch import arch_model
# Estimate a GARCH model:

model = arch_model(returns, mean="Zero", vol="GARCH", p=2, q=1)

# Fit the model
results = model.fit(disp="off")
# Summarize the model results
results.summary()
Zero Mean - GARCH Model Results
Dep. Variable:   Settle               R-squared:          0.000
Mean Model:      Zero Mean            Adj. R-squared:     0.000
Vol Model:       GARCH                Log-Likelihood:     -7461.93
Distribution:    Normal               AIC:                14931.9
Method:          Maximum Likelihood   BIC:                14959.6
                                      No. Observations:   7514
Date:            Sun, Aug 23 2020     Df Residuals:       7510
Time:            19:03:55             Df Model:           4

Volatility Model
           coef         std err     t        P>|t|       95.0% Conf. Int.
omega      4.2896e-03   2.057e-03   2.085    3.708e-02   [2.571e-04,8.322e-03]
alpha[1]   0.0381       1.282e-02   2.970    2.974e-03   [1.295e-02,6.321e-02]
alpha[2]   0.0000       1.703e-02   0.000    1.000       [-3.338e-02,3.338e-02]
beta[1]    0.9536       1.420e-02   67.135   0.000       [ 0.926, 0.981]

Covariance estimator: robust
# Find the last day of the dataset
last_day = returns.index.max().strftime('%Y-%m-%d')
last_day
'2019-10-15'
# Create a 5 day forecast of volatility
forecast_horizon = 5
# Start the forecast using the last_day calculated above
forecasts = results.forecast(start=last_day, horizon=forecast_horizon)
forecasts
<arch.univariate.base.ARCHModelForecast at 0x7fb92a2d5050>
# Annualize the forecast
intermediate = np.sqrt(forecasts.variance.dropna() * 252)
intermediate.head()
Date        h.1       h.2       h.3       h.4       h.5
2019-10-15  7.434048  7.475745  7.516867  7.557426  7.597434
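
The shape of this forecast can be reproduced approximately by hand. Because alpha[2] is estimated at 0, the multi-step variance forecast reduces to the GARCH(1,1) recursion: next variance = omega + (alpha[1] + beta[1]) * current variance. The sketch below starts from the h.1 value above, converts it back to a daily variance, iterates the recursion, and annualizes with the same sqrt(variance * 252) step. It uses the rounded coefficients from the summary table, so it matches the printed forecast only to about three decimal places; all variable names are illustrative.

```python
import math

# Rounded estimates from the GARCH summary table; alpha[2] is 0 and drops out.
omega, alpha1, beta1 = 4.2896e-03, 0.0381, 0.9536
persistence = alpha1 + beta1

# Recover the daily variance behind the annualized h.1 forecast.
h1_annualized = 7.434048
daily_variances = [h1_annualized ** 2 / 252]

# Iterate the variance recursion for horizons 2 through 5.
for _ in range(4):
    daily_variances.append(omega + persistence * daily_variances[-1])

annualized_vol = [math.sqrt(v * 252) for v in daily_variances]

# Level the variance forecast drifts toward when persistence < 1.
# Sensitive to rounding in the coefficients, so treat it as a rough figure.
long_run_variance = omega / (1.0 - persistence)

print([round(v, 4) for v in annualized_vol])
```

The forecast rises with the horizon because the current variance sits below the long-run level implied by the estimates, and persistence just under 1 makes the drift back toward that level slow.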
# Transpose the forecast so that it is easier to plot
final = intermediate.dropna().T
final.head()
Date 2019-10-15 00:00:00
h.1 7.434048
h.2 7.475745
h.3 7.516867
h.4 7.557426
h.5 7.597434
# Plot the final forecast
final.plot()
[Figure: 5-day annualized volatility forecast]


Conclusions

Based on your time series analysis, would you buy the yen now?

Is the risk of the yen expected to increase or decrease?

Based on the model evaluation, would you feel confident in using these models for trading?

Based on your time series analysis, would you buy the yen now? The overall trend is upward: the ARIMA model forecasts the Settle price rising from about 9224 to about 9228 over the next five days, so I would buy the yen.

Is the risk of the yen expected to increase or decrease? The GARCH forecast of annualized volatility rises from about 7.43 to about 7.60 over the five-day horizon, so risk is expected to increase.

Based on the model evaluation, would you feel confident in using these models for trading? No. In the ARMA model (p=2, q=1), none of the coefficients is significant (all p > 0.05), so it does not support a sound judgement call. The same holds for the ARIMA model (p=5, d=1, q=1), so I would not use it for price estimates either. In the GARCH model, omega, alpha[1], and beta[1] are significant (p < 0.05) while alpha[2] is not (p = 1.000); this gives more confidence in the volatility forecast, but volatility alone does not produce a buy/sell call. I would not be confident trading on these models, at least not on the ARMA and ARIMA specifications used here.