Stern School of Business

 

Data Mining in Finance

Andreas S. Weigend

Spring 1999

About This Course

Data Mining in Finance develops links between Information Systems, Statistics and Finance. It covers the foundations of modern modeling, knowledge discovery and data mining techniques, as well as specific methods including neural networks, hidden Markov models, and model-based clustering. This course highlights the assumptions of different model classes, stresses critical evaluation and comparison with established methods, and offers techniques for interpreting the results. Applications include current problems in finance such as understanding and managing risk, and building and evaluating trading models. Weekly computer assignments are based on MATLAB. In-depth group projects in conjunction with major financial firms are possible. Towards the end of the semester, several Wall Street practitioners present their perspectives on data mining in finance.

Prerequisites: One of the following two courses is required:

  • Statistical Inference and Regression Analysis (B90.3302)
  • Regression and Multivariate Data Analysis (B90.2301 = B90.3311)
Furthermore, some programming experience (in any language) is helpful.

Logistics

The logistics page contains general information about the instructor and teaching assistant, readings, software, and grading.

Schedule (session-by-session)

# Date Day Topic
1 1/20/99 Wed Introduction: Data Mining and Data Snooping
2 1/25/99 Mon Learning from Data: Bayesian Learning, 7 Steps of Modeling
3 1/27/99 Wed Bootstrapping
4 2/1/99 Mon Evaluation, Model Risk
5 2/3/99 Wed Data and Representation, Linear Models
6 2/8/99 Mon Nonlinear Models (see also the Information Theory notes) 
7 2/10/99 Wed Neural Networks: Introduction, Yield Curve Demo
PD 2/15/99 Mon (no class)
8 2/17/99 Wed Neural Networks: Theory
9 2/22/99 Mon Conditional Normal Predictions
10 2/24/99 Wed Conditional Non-normal Predictions
11 3/1/99 Mon Tail Predictions
12 3/3/99 Wed Computing Non-normal Implied Densities
13 3/8/99 Mon Principal Component Analysis
14 3/10/99 Wed Independent Component Analysis
SB 3/15/99 Mon (no class)
SB 3/17/99 Wed (no class)
The schedule after spring break is likely to shift.
There will be one or two additional guest speakers.
15 3/22/99 Mon  
16 3/24/99 Wed  
17 3/29/99 Mon (Martin, Robust Approaches)
18 3/31/99 Wed Markov Models
19 4/5/99 Mon Hidden Markov Models
20 4/7/99 Wed (Dhar, How Investors Look at Financial Data Mining Models)
WS 4/10/99 Sat (Zimmermann, Portfolio Workshop)
21 4/12/99 Mon Classification
22 4/14/99 Wed (Grody, Practical Implications of Implementing Enterprise Wide Risk Management Systems)
23 4/19/99 Mon Style Analysis: Traders (Clustering)
24 4/21/99 Wed Style Analysis: Mutual Funds
25 4/26/99 Mon Style Analysis: Hedge Funds
26 4/28/99 Wed (Li, Risk in Practice)
27 5/3/99 Mon The Big Picture
F 5/10/99 Mon Final
 

Homework Assignments

# Due Assignment Solution
1 1/28/99 Profit and Loss Curve Matlab
2 2/4/99 Bayes Rule, Data Snooping (Best Performer) Matlab
3 2/16/99 Bootstrap (Sharpe Ratio, Maximum Drawdown)
4 2/25/99 Neural Network (Noise, Sample Size)
5 TBA TBA
6 TBA TBA
7 ... ...
 
 

Related Courses at Stern

In addition to calculus, probability, and the IS, Finance and Statistics core courses, this course uses material from the following two courses:
  • Statistical Inference and Regression Analysis (B90.3302) (the second half of B90.3302 is particularly important), 
  • Regression and Multivariate Data Analysis (B90.2301 = B90.3311 for PhDs).

Forecasting Time Series Data (B90.2302 = B90.3312 for PhDs = C22.0018 for undergraduates), or another time series course, provides a useful background for the prediction part of this course.

Deeper insights can be gained if this course is taken after, or concurrently with:
  • Bayesian Inference and Statistical Decision Theory (B90.3305), and
  • Stochastic Processes I (B90.3321).

Compared with related IS courses, this course is statistically more advanced than Knowledge Systems in Organizations (B20.3336 = C20.3336), and it is more theoretical and covers a larger area than Risk Management Systems (B20.3351).

A more detailed general description includes the teaching philosophy. Furthermore, the book chapter Data Mining in Finance: Report from the Post-NNCM-96 Workshop on Teaching Computer Intensive Methods for Financial Modeling and Data Analysis (as pdf, as ps) provides some history of this course.

For a wealth of interesting courses, please visit the list of courses in the Finance Department, and the complete list of all Stern courses and syllabi.

 
Please address comments and questions to Andreas S. Weigend at aweigend@stern.nyu.edu.