For a set of samples (patterns) with input X and output Y, the relationship between input and output can be depicted as a network of nodes. The number of input nodes equals the dimension of X (the input dimension); the number of output nodes equals the dimension of Y (the output dimension). In the following, we consider one input dimension and one output dimension.
In a linear model, the input node X and the output node Y are directly connected by a line, representing the weight w.
Y^ = w * X
For polynomial estimation, the input node X and the output node Y are linked through a set of middle nodes. Each middle node represents one polynomial term, such as x, x^2, x^3, etc., and has an associated weight. All weights can be represented in a weight matrix: W = {w1, w2, w3, ...}.
Y^ = W * X, where X now stands for the vector of polynomial terms.
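Because the polynomial estimate is linear in the weights, the weights can be found by ordinary least squares on the matrix of powers. The following is a minimal sketch under assumptions: the synthetic data and the names Phi, W, and yhat are illustrative and do not come from the notes.

```matlab
% Illustrative only: synthetic data and variable names are assumptions.
x = linspace(-1, 1, 21)';            % 21 input samples (column vector)
y = 2*x - 3*x.^2 + 0.5*x.^3;         % noise-free target built from known weights
Phi = [x x.^2 x.^3];                 % middle nodes: x, x^2, x^3 (one column each)
W = Phi \ y;                         % least-squares estimate of the weights
yhat = Phi * W;                      % polynomial estimate Y^ for every sample
```

With noise-free data the recovered W matches the weights used to build y, up to rounding.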
In a neural network, the input X and the output Y are connected through a set of so-called hidden units.
Each hidden unit takes the weighted sum of the inputs that are connected to it and then “pipes” it through some nonlinearity.
The weight vectors can be represented in a weight matrix: W = {w1, w2, w3, ...}. Unlike in polynomial estimation, the meaning of the hidden units is not defined.
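Typical activation function: h = tanh(w * x + w0)
This hyperbolic tangent activation function is bounded (between -1 and +1), monotonically increasing, and smooth.
Overall output: Y^ = Sum_j [W_j * tanh(w_j * x + w0_j)] + W0
Here w_j and w0_j are the input weight and offset of hidden unit j, and W_j and W0 are the output weights and the output offset.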
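The forward pass of such a network can be sketched in a few lines. This sketch assumes the usual form in which the offset sits inside the tanh and the output is a weighted sum of the hidden activations plus an output offset; all weight values and variable names below are made up for illustration.

```matlab
% Illustrative forward pass; every weight value here is an assumption.
x  = linspace(-2, 2, 5);        % 1-by-5 row of input values
w  = [0.5; -1.0; 2.0];          % input-to-hidden weights (3 hidden units)
w0 = [0.0;  0.5; -0.5];         % hidden-unit offsets
W  = [1.0, -2.0, 0.5];          % hidden-to-output weights (1-by-3)
W0 = 0.1;                       % output offset
h    = tanh(w * x + repmat(w0, 1, numel(x)));   % 3-by-5 hidden activations
yhat = W * h + W0;                               % 1-by-5 network output
```

Each column of h holds the three hidden-unit activations for one input value, and every entry lies strictly between -1 and +1.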
For polynomials, it is possible to choose a metric (orthogonality) such that adding a new term (a higher power) does not change the lower terms (the previously computed weights or coefficients).
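This property can be checked numerically. The sketch below orthogonalizes the columns of the power matrix on the sample points with a QR decomposition, which is one possible construction and an assumption here, since the notes do not say which orthogonal basis is meant. The data and variable names are illustrative.

```matlab
% Illustrative only: data, names, and the QR construction are assumptions.
x  = linspace(-1, 1, 50)';
y  = exp(x);                           % smooth target to be approximated
P3 = [ones(size(x)) x x.^2 x.^3];      % raw powers up to x^3

% Raw powers: refitting with the x^3 term changes the lower coefficients.
c2_raw = P3(:,1:3) \ y;                % fit with terms up to x^2
c3_raw = P3 \ y;                       % fit with terms up to x^3

% Orthogonalized columns: each coefficient is a separate projection,
% so adding the fourth column leaves the first three untouched.
[Q, R] = qr(P3, 0);                    % economy-size QR, Q has orthonormal columns
c2_orth = Q(:,1:3)' * y;
c3_orth = Q' * y;

change_raw  = max(abs(c3_raw(1:3)  - c2_raw));   % clearly nonzero
change_orth = max(abs(c3_orth(1:3) - c2_orth));  % zero up to rounding
```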
|                         | Neural Network                     | Polynomial                          |
| Parameters              | Parameters are after non-linearity | Parameters are before non-linearity |
| Estimating the weights  | “Training”                         | “Fitting”                           |
| Meaning of middle nodes | Difficult                          | Clear                               |
During network training, the weights are gradually adjusted to reduce the in-sample error.
Easiest way: local linearization, i.e., take a small step along the negative gradient (downhill) in weight space.
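The idea of repeated small gradient steps can be sketched on the simplest case, the one-weight linear model Y^ = w * X with mean squared error as the in-sample error. The data, the step size eta, and the number of steps are assumed values chosen for illustration.

```matlab
% Illustrative gradient descent; data, eta, and nsteps are assumptions.
x = linspace(-1, 1, 20)';
y = 3 * x;                        % target generated with true weight 3
w = 0;                            % initial weight
eta = 0.1;                        % step size (learning rate)
nsteps = 200;
err = zeros(nsteps, 1);           % in-sample error before each step
for k = 1:nsteps
    r = w * x - y;                % residuals of the current model
    err(k) = mean(r.^2);          % in-sample mean squared error
    grad = 2 * mean(r .* x);      % derivative of the error with respect to w
    w = w - eta * grad;           % small step against the gradient
end
```

For this quadratic error surface the error decreases at every step and w approaches the true weight. For a network with hidden units the same update is applied to every weight, but the error surface is no longer quadratic.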
function SR = sharpe(x,Rf)
%To compute annualized Sharpe ratio of daily log returns series x
%Optional argument: annual risk free interest rate (Rf)
%Note: we assume Rf to be constant
%If used in bootstrapping, then bootstrap x
N = length(x); %length of data set
T = 253; %number of trading days per year
if nargin==1
Rf = 0.05; %default for risk free interest rate
end
%NUMERATOR
%First make annual Rf daily:
dailyRf = (1+Rf)^(1/T) - 1;
%Then compute daily excess return:
excessRet = x - dailyRf;
%Now annualize by compounding these excess daily returns:
AnnExcessRet = (mean(excessRet) + 1) ^ T - 1;
%DENOMINATOR
%std(x) is the (daily) standard deviation since we use daily returns
%to annualize, need to multiply with sqrt(T) (scaling for standard deviations)
AnnStd = sqrt(T) * std(x);
SR = AnnExcessRet / AnnStd;
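The two scaling steps used in sharpe can be checked in isolation. This is a small sketch: T and Rf take the values used in the function, and the daily standard deviation of 1% is an assumed number for illustration.

```matlab
% Illustrative check of the rate conversion and the sqrt(T) scaling.
T  = 253;                             % trading days per year, as in sharpe
Rf = 0.05;                            % annual risk free rate, as in sharpe
dailyRf = (1 + Rf)^(1/T) - 1;         % annual rate converted to a daily rate
backToAnnual = (1 + dailyRf)^T - 1;   % compounding the daily rate recovers Rf
dailyStd = 0.01;                      % assumed daily standard deviation (1%)
annStd = sqrt(T) * dailyStd;          % annualized standard deviation, about 0.159
```

Returns compound over T days, while standard deviations scale with sqrt(T), which is why the numerator and the denominator are annualized differently.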
function MD = maxdrawdown(x)
%To compute maximum relative drawdown of return series x
%generate price time series from return series
P = exp(cumsum(x));
%Cumulative maximum price
N = length(P);
M = zeros(N,1);
M(1) = P(1);
for i=2:N
M(i) = max(M(i-1),P(i));
end
%drawdown for each step
drawdown = (M - P) ./ M;
%maximum drawdown
MD = max(drawdown);
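A short worked example makes the drawdown computation concrete. The three log returns below are made up. The built-in cummax is used here in place of the explicit loop, which is a substitution and not the listing's own method.

```matlab
% Illustrative only: made-up log returns; cummax replaces the loop above.
x = log([1.2; 0.75; 110/90]);    % three daily log returns
P = exp(cumsum(x));              % price path: 1.2, 0.9, 1.1
M = cummax(P);                   % running maximum: 1.2, 1.2, 1.2
drawdown = (M - P) ./ M;         % relative drop from the running maximum
MD = max(drawdown);              % largest drop: from 1.2 to 0.9, i.e. 0.25
```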
function [sample] = resample(nsamp)
% usage: [sample] = resample(nsamp)
% bootstrap sampling building block
% nsamp = sample size
% draw a new sample with replacement
sample = fix(rand(nsamp,1) * nsamp) + 1;
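The index vector returned by resample is meant to be used inside a bootstrap loop. The sketch below inlines the same index rule so that it runs on its own. The synthetic returns, the number of replications, and the choice of the mean as the statistic are assumptions for illustration.

```matlab
% Illustrative bootstrap of the mean; data and nboot are assumptions.
x = 0.01 * randn(500, 1);            % synthetic daily returns
n = length(x);
nboot = 1000;                        % number of bootstrap replications
bootMean = zeros(nboot, 1);
for b = 1:nboot
    idx = fix(rand(n, 1) * n) + 1;   % same index rule as resample(n)
    bootMean(b) = mean(x(idx));      % statistic on the resampled series
end
se = std(bootMean);                  % bootstrap standard error of the mean
```

Replacing mean with another statistic, such as the Sharpe ratio or the maximum drawdown, gives the bootstrap distribution of that statistic.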
%Further analysis for HOMEWORK 3
%MATLAB SCRIPT
cd 'D:\My Documents\DMFChenggang'
load HW03Results.mat
%MDBoot contains 100k replications of max drawdown
x = MDBoot; %pick the data set to be analyzed
%x=[randn(1,10000) - 10 2 * randn(1,10000) + 10];
%%PDF
nbin = 200;
[nh,xh] = hist(x,nbin);
plot(xh,nh)
grid
xlabel('Value');
ylabel('PDF (distribution, counts)');
%%CDF
p = (0.5:1:(length(x)-0.5))/length(x); %percentile of each point
plot(sort(x),p)
xlabel('Value');
ylabel('CDF');
grid;
%COMPARE TO NORMAL (via percentiles)
p = (0.5:1:(length(x)-0.5))/length(x); %as before
plot(mean(x) + std(x) * norminv(p),sort(x),'r.')
grid;
xlabel('Normal distribution'); ylabel('Data')
minx = min(x); maxx = max(x);
line([minx maxx],[minx maxx]); axis equal
%COMPARE TO NORMAL (via realizations)
plot(sort(randn(size(x))),sort(x),'.')
xlabel('drawings from randn'); ylabel('Data')
%See also the MATLAB FUNCTION qqplot
qqplot(norminv(p),sort(x))
Copy over the files given in http://www.stern.nyu.edu/~aweigend/DMFS99/Notes/Class07
Run the Matlab script yielddemo
You won’t need the polyfit part of the code, so you can delete that.
Vary the number of data points you generate (ndata) and vary the level of the noise you add (noise).
Add some code that computes an “out-of-sample” performance.
In one plot, show three curves for the out-of-sample performance.
The first curve corresponds to a noise level of 0.1, the second to 0.3, the third to 1.0.
In each case, evaluate the out-of-sample performance for 10, 30, 100, 300, and 1000 training data points.
Hand in the plot on a semilogarithmic axis (use semilogx instead of plot): the x-axis corresponds to the number of data points available.
The y-axis corresponds to the performance, measured as the squared error normalized by the squared error you obtain when using the mean of the training set as the prediction.
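The normalized error measure can be sketched as follows. Everything in this sketch is an assumption: the sine target, the cubic least-squares fit used as a stand-in model, and all variable names are illustrative and are not the contents of yielddemo.

```matlab
% Illustrative only: target function, stand-in model, and names are assumptions.
noiseList = [0.1 0.3 1.0];                 % noise levels from the assignment
ndataList = [10 30 100 300 1000];          % training set sizes from the assignment
ntest = 2000;                              % size of the out-of-sample set (assumed)
nmse = zeros(numel(noiseList), numel(ndataList));
for i = 1:numel(noiseList)
    for j = 1:numel(ndataList)
        noise = noiseList(i);
        ndata = ndataList(j);
        xtr = rand(ndata, 1);  ytr = sin(2*pi*xtr) + noise * randn(ndata, 1);
        xte = rand(ntest, 1);  yte = sin(2*pi*xte) + noise * randn(ntest, 1);
        c = [ones(ndata,1) xtr xtr.^2 xtr.^3] \ ytr;       % fit on training data only
        ypred = [ones(ntest,1) xte xte.^2 xte.^3] * c;     % predict unseen data
        % squared error of the model divided by the squared error of
        % predicting the training-set mean everywhere
        nmse(i,j) = sum((yte - ypred).^2) / sum((yte - mean(ytr)).^2);
    end
end
```

A value below 1 means the model beats the training-set mean on unseen data. Calling semilogx(ndataList, nmse') would then draw one curve per noise level.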