Comments on Homework Assignment

For homework one:
1) Use compounded returns instead of simple returns.
With log returns R, cumsum(R) gives the cumulative (compounded) log return.
2) There are different ways of calculating daily returns from price data:
R = diff(log(P))   (log return) or
R(t) = (P(t) - P(t-1)) / P(t-1)   (simple return)
3) When applying a strategy, the strategy return is Rs = strategy .* R (element-wise product).
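
The three comments above can be sketched in plain Python, as a stand-in for the MATLAB-style expressions in the notes. The price series and the position vector are made-up illustration values, and the helper names (log_returns, simple_returns, cumulative_sum, strategy_returns) are assumptions of this sketch, not part of the assignment.

```python
import math

def log_returns(prices):
    """Log returns, the analogue of diff(log(P))."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def simple_returns(prices):
    """Simple returns, (P(t) - P(t-1)) / P(t-1)."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def cumulative_sum(values):
    """Running total, the analogue of cumsum(R)."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out

def strategy_returns(positions, returns):
    """Element-wise product, the analogue of strategy .* R."""
    return [s * r for s, r in zip(positions, returns)]

# Hypothetical prices and positions (+1 long, -1 short).
prices = [100.0, 102.0, 101.0, 105.0]
positions = [1, -1, 1]

r_log = log_returns(prices)
r_simple = simple_returns(prices)

# Summing log returns compounds correctly: exp(sum) recovers P_end / P_start.
compounded = math.exp(cumulative_sum(r_log)[-1]) - 1.0
# Summing simple returns ignores compounding and only approximates it.
naive = sum(r_simple)

# Return of the strategy on each day.
rs = strategy_returns(positions, r_log)
```

Here the compounded figure matches the actual price change from 100 to 105, while the plain sum of simple returns is slightly off, which is the point of comment 1.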

Evaluation of Models

Measures of Error

For forecast data $\hat{Y}_t$ and true data $Y_t$:
ENMS: Normalized Mean Square Error (L2 norm):
$E_{NMS} = \sum_t (\hat{Y}_t - Y_t)^2 / \sum_t (\bar{Y} - Y_t)^2$
$\bar{Y}$ is the mean of either the training set or the test set.

MAD (Mean of Absolute Difference) (L1 norm) or
MAE (Mean of Absolute Error):
replace each squared term (...)^2 with its absolute value |...|
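
A minimal sketch of these error measures in plain Python. The function names and the sample numbers are illustrative assumptions. The notes leave open whether the mean comes from the training set or the test set, so reference_mean is an optional argument that defaults to the mean of the supplied true values.

```python
def nmse(forecast, actual, reference_mean=None):
    """Normalized mean square error: forecast error relative to the
    error of always predicting the reference mean."""
    if reference_mean is None:
        reference_mean = sum(actual) / len(actual)
    num = sum((f - y) ** 2 for f, y in zip(forecast, actual))
    den = sum((reference_mean - y) ** 2 for y in actual)
    return num / den

def mae(forecast, actual):
    """Mean of absolute error (L1 measure)."""
    return sum(abs(f - y) for f, y in zip(forecast, actual)) / len(actual)

def normalized_abs_error(forecast, actual, reference_mean=None):
    """Literal L1 counterpart of nmse: squares replaced by absolute values."""
    if reference_mean is None:
        reference_mean = sum(actual) / len(actual)
    num = sum(abs(f - y) for f, y in zip(forecast, actual))
    den = sum(abs(reference_mean - y) for y in actual)
    return num / den

# Made-up true values and forecasts.
actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.0, 2.5, 4.0]

e_nms = nmse(forecast, actual)
e_mae = mae(forecast, actual)
e_l1 = normalized_abs_error(forecast, actual)

# A forecast that always predicts the mean scores exactly 1.
baseline = nmse([2.5] * 4, actual)
```

A normalized error below 1 means the forecast beats the trivial predictor that always outputs the mean; a value above 1 means it does worse.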

Minimizing the L2 norm yields the mean of the error distribution, while minimizing the L1 norm yields its median. The L1 norm is therefore more robust and places less emphasis on outliers.
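
The mean-versus-median point can be checked numerically. The sketch below uses a made-up error sample containing one outlier and a simple grid search (an assumption of this example, chosen for transparency rather than efficiency) to find the constant that minimizes each cost.

```python
import statistics

def l2_cost(c, errors):
    """Sum of squared deviations from the constant c."""
    return sum((e - c) ** 2 for e in errors)

def l1_cost(c, errors):
    """Sum of absolute deviations from the constant c."""
    return sum(abs(e - c) for e in errors)

# Hypothetical errors; the last value is an outlier.
errors = [-1.0, 0.0, 1.0, 2.0, 50.0]

mean_e = statistics.mean(errors)
median_e = statistics.median(errors)

# Search candidate constants from -2.0 to 60.0 in steps of 0.1.
candidates = [i / 10 for i in range(-20, 601)]
best_l2 = min(candidates, key=lambda c: l2_cost(c, errors))
best_l1 = min(candidates, key=lambda c: l1_cost(c, errors))
```

The L2 minimizer lands on the mean, which the single outlier drags far from the bulk of the data, while the L1 minimizer lands on the median and stays with the typical errors.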

Combining Forecasting

Why not simply use the best model?
Forecasts are noisy: the model that looks best on past data may be best only by chance.

Why not use all parameters?
Overfitting.

The error of the combined forecast goes down when the models' errors are uncorrelated:
averaging N such models reduces the error by a factor of 1/sqrt(N).
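
The 1/sqrt(N) factor can be illustrated with a small simulation. This is a sketch under idealized assumptions that are not stated in the notes: every model's error is drawn independently from the same zero-mean Gaussian, and the combination is a plain average. The function name and parameters are hypothetical.

```python
import random
import statistics

def combined_error_std(n_models, n_trials=20000, sigma=1.0, seed=0):
    """Standard deviation of the error of an equal-weight average of
    n_models forecasts with independent, identically distributed errors."""
    rng = random.Random(seed)
    combined = []
    for _ in range(n_trials):
        errors = [rng.gauss(0.0, sigma) for _ in range(n_models)]
        combined.append(sum(errors) / n_models)
    return statistics.pstdev(combined)

single = combined_error_std(1)     # error of one model
averaged = combined_error_std(16)  # error of the average of 16 models
ratio = averaged / single          # should be close to 1/sqrt(16) = 0.25
```

If the model errors were correlated, the reduction would be smaller than this idealized factor, which is why uncorrelated models matter for combining.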

Error Sources

Statistical Error

Sampling
Hypothesis (data snooping)

IT Implementation Risk

Execution Risk

Market liquidity

Mistakes in Data

Quantization
Timing (resolution)
Outliers
Rollover
(need data cleaning and cleansing)

Model Class Error

Assumption
Controlled variables
(try different models, evaluate each model with different measures)

Regimes in Time

Non-stationarity

(smart preprocessing)

Error Propagation

(scenario analysis)

Missing Variables

as sources of randomness

Knowledge Discovery Process

Data <==> Model