Comments on Homework Assignment
For homework one:
1) Use compounded returns instead of simple returns. With log returns R,
the cumulative (compounded) log return is
cumsum(R)
2) Two ways of calculating daily returns from price data P:
log return:
R = diff(log(P))
or simple return:
R_t = (P_t - P_{t-1}) / P_{t-1}
3) When applying a strategy, the strategy return is the element-wise product
Rs = strategy .* R
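The three comments can be sketched together in a few lines. This is a minimal illustration in Python rather than the MATLAB-style notation of the notes; the price series and the +1/-1 `strategy` vector are made-up example data, and the helper names are assumptions, not part of the assignment.

```python
import math

def log_returns(prices):
    """Daily log returns, the analogue of R = diff(log(P))."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def simple_returns(prices):
    """Daily simple returns, R_t = (P_t - P_{t-1}) / P_{t-1}."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def cumsum(values):
    """Running sum, the analogue of cumsum(R)."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out

# Hypothetical example data: four prices and one position per return period.
prices = [100.0, 102.0, 101.0, 105.0]
strategy = [1, -1, 1]

R = log_returns(prices)
cum_log = cumsum(R)                          # cumulative log return
growth = [math.exp(c) for c in cum_log]      # compounded growth factor P_t / P_0
Rs = [s * r for s, r in zip(strategy, R)]    # element-wise product, strategy .* R
```

Summing works for compounding only because log returns are additive. With simple returns, the compounded growth is the cumulative product of (1 + R_t) instead.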
Evaluation of Models
Measures of Error
Notation:
\hat{Y}_t : forecast data
Y_t : true data
ENMS : Normalized Mean Square Error (L2 norm):
ENMS = \sum_t (\hat{Y}_t - Y_t)^2 / \sum_t (\bar{Y} - Y_t)^2
where \bar{Y} is the mean of Y_t over either the training set or the test set.
MAD (Mean of Absolute Difference), also called
MAE (Mean of Absolute Error) (L1 norm):
replace ( ... )^2 with | ... | in the formula above.
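Both error measures can be written directly from the definitions above. The following is a minimal sketch; the function names, the optional `reference_mean` argument (for using the training-set mean instead of the test-set mean), and the sample numbers are assumptions made for illustration.

```python
def nmse(forecast, actual, reference_mean=None):
    """Normalized mean square error: sum (f - y)^2 / sum (ybar - y)^2."""
    if reference_mean is None:
        # Default to the mean of the data being scored (e.g. the test set).
        reference_mean = sum(actual) / len(actual)
    num = sum((f - y) ** 2 for f, y in zip(forecast, actual))
    den = sum((reference_mean - y) ** 2 for y in actual)
    return num / den

def normalized_abs_error(forecast, actual, reference_mean=None):
    """Same ratio with each square replaced by an absolute value (L1 norm)."""
    if reference_mean is None:
        reference_mean = sum(actual) / len(actual)
    num = sum(abs(f - y) for f, y in zip(forecast, actual))
    den = sum(abs(reference_mean - y) for y in actual)
    return num / den

def mae(forecast, actual):
    """Plain (unnormalized) mean of absolute errors."""
    return sum(abs(f - y) for f, y in zip(forecast, actual)) / len(actual)

# Hypothetical example data.
actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.0, 2.5, 4.0]

e_nms = nmse(forecast, actual)
e_abs = normalized_abs_error(forecast, actual)
e_mae = mae(forecast, actual)
```

A forecast that always predicts the reference mean scores exactly 1 on both normalized measures, so values below 1 indicate a model that beats that baseline.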
The L2 norm is minimized by the mean of the error distribution, while the
L1 norm is minimized by the median. The L1 norm is therefore more robust
and does not put too much emphasis on outliers.
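The mean/median statement can be checked numerically. The sketch below uses a made-up list of errors with one added outlier and a coarse grid search over candidate constants; the data and the grid are assumptions chosen only for illustration.

```python
import statistics

def l2_cost(c, errors):
    """Sum of squared deviations from the constant c."""
    return sum((e - c) ** 2 for e in errors)

def l1_cost(c, errors):
    """Sum of absolute deviations from the constant c."""
    return sum(abs(e - c) for e in errors)

# Hypothetical errors, then the same errors plus one large outlier.
errors = [-0.2, -0.1, 0.0, 0.1, 0.2]
with_outlier = errors + [10.0]

mean_clean = statistics.mean(errors)
mean_out = statistics.mean(with_outlier)
median_clean = statistics.median(errors)
median_out = statistics.median(with_outlier)

# Grid of candidate constants from -1.00 to 11.00 in steps of 0.01.
grid = [k / 100.0 for k in range(-100, 1101)]
best_l2 = min(grid, key=lambda c: l2_cost(c, with_outlier))
best_l1_cost = min(l1_cost(c, with_outlier) for c in grid)

mean_shift = abs(mean_out - mean_clean)
median_shift = abs(median_out - median_clean)
```

On this data the single outlier moves the mean far more than the median, which is the robustness property the notes refer to.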
Combining Forecasts
Why not simply use the best model?
Forecasts are noisy.
Why not use all parameters?
Overfitting.
Error goes down when the models' errors are uncorrelated:
averaging N such models reduces the error by a factor of 1/sqrt(N).
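The 1/sqrt(N) factor can be illustrated with a small simulation. This sketch assumes each model's error is an independent, zero-mean, unit-variance Gaussian draw and that the forecasts are combined by an equal-weight average; the function name, trial count, and seed are arbitrary choices.

```python
import math
import random
import statistics

def combined_error_std(n_models, n_trials=5000, seed=0):
    """Standard deviation of the averaged error of n_models uncorrelated models."""
    rng = random.Random(seed)
    combined = []
    for _ in range(n_trials):
        # One independent error draw per model (assumed unit variance).
        errors = [rng.gauss(0.0, 1.0) for _ in range(n_models)]
        combined.append(sum(errors) / n_models)
    return statistics.pstdev(combined)

# Compare the simulated error with the 1/sqrt(N) prediction.
for n in (1, 4, 16):
    print(n, round(combined_error_std(n), 3), round(1 / math.sqrt(n), 3))
```

If the model errors are positively correlated, the reduction is smaller than 1/sqrt(N), which is why the notes stress uncorrelated models.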
Error Sources
Statistical Error
Sampling
Hypothesis (data snooping)
IT Implementation Risk
Execution Risk
Market liquidity
Mistakes in Data
Quantization
Timing (resolution)
Outliers
Rollover
(need data cleaning and cleansing)
Model Class Error
Assumption
Controlled variables
(try different models, and evaluate models with different
measures)
Regimes in Time
Non-stationarity
(smart preprocessing)
Error Propagation
(scenario analysis)
Missing Variables
as sources of randomness
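As one possible illustration of the data cleaning step mentioned under "Mistakes in Data", the sketch below flags suspicious returns using the median and the median absolute deviation. This is an assumed technique, not one prescribed in the notes, and the median absolute deviation here is not the same quantity as the MAD error measure defined earlier. The threshold of 5 and the sample returns are arbitrary.

```python
import statistics

def flag_outliers(returns, threshold=5.0):
    """Indices whose distance from the median exceeds threshold * median abs deviation."""
    med = statistics.median(returns)
    dev = statistics.median(abs(r - med) for r in returns)
    if dev == 0:
        # Degenerate case: more than half the points are identical; flag nothing.
        return []
    return [i for i, r in enumerate(returns) if abs(r - med) > threshold * dev]

# Hypothetical daily returns; 0.9 mimics a bad tick or an unadjusted rollover.
returns = [0.01, -0.02, 0.015, 0.9, -0.005, 0.0]
suspect = flag_outliers(returns)
```

Flagged points should be inspected rather than deleted automatically, since a large return can also be a genuine market move.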
Knowledge Discovery Process
Data <==> Model