Data Mining in Finance

 Andreas S. Weigend

B20.3355/B90.3355  -  Spring 1999

About This Course

 

Data Mining in Finance is a new course that develops links between Information Systems, Statistics, and Finance. It covers the fundamentals of modern modeling, knowledge discovery, and data mining techniques. Among the topics surveyed are neural networks, hidden Markov models, and clustering techniques. The course highlights the assumptions of different model classes, stresses critical evaluation and comparison with established methods, and offers techniques for interpreting the results. These methods are useful for current problems in finance, including measuring and understanding risk and building and evaluating trading models. Seven hands-on assignments, based on MATLAB, help develop intuition and serve as starting points for solving real problems. In-depth group projects in conjunction with major financial firms are possible. Towards the end of the semester, several Wall Street practitioners present their perspectives on data mining in finance.

 

Prerequisites: One of the following two courses is required:

  • Statistical Inference and Regression Analysis (B90.3302)
  • Regression and Multivariate Data Analysis (B90.2301 = B90.3311)

Furthermore, some programming experience (in any language) is helpful.

 

Information about related courses at Stern.

Logistics

 

The logistics page contains general information about the instructor and teaching assistant, readings, software, and grading for this course.



 

Schedule (session-by-session)

 

| # | Date | Day | Topic | Readings |
|---|------|-----|-------|----------|
| 1 | 1/20/99 | Wed | Introduction: Data Mining and Data Snooping | |
| 2 | 1/25/99 | Mon | Learning from Data; Seven Steps of Modeling | |
| 3 | 1/27/99 | Wed | Bootstrapping | |
| 4 | 2/1/99 | Mon | Evaluation, Model Risk | |
| 5 | 2/3/99 | Wed | Data and Representation, Linear Models | |
| 6 | 2/8/99 | Mon | Nonlinear Models (see also the Information Theory notes) | |
| 7 | 2/10/99 | Wed | Neural Network Introduction: Yield Curve Demo | |
| PD | 2/15/99 | Mon | (no class) | |
| 8 | 2/17/99 | Wed | Neural Network Learning: Error Backpropagation, Overfitting Problem | |
| 9 | 2/22/99 | Mon | Neural Network Theory: Maximum Likelihood Framework | |
| 10 | 2/24/99 | Wed | Predicting Conditional Normal Distributions (Local Error Bars) | |
| 11 | 3/1/99 | Mon | Predicting Conditional Non-normal Distributions (Gated Experts) | Mangeas |
| 12 | 3/3/99 | Wed | Predicting Quantiles, Tails of Distributions, and Rare Events | Chang |
| 13 | 3/8/99 | Mon | Extracting Risk-Neutral Densities from Options Prices (Mixture Binomial Trees) | Pirkner |
| 14 | 3/10/99 | Wed | Summary of Approaches to Nonlinear Prediction and Risk Management | |
| SB | 3/15/99 | Mon | (no class) | |
| SB | 3/17/99 | Wed | (no class) | |
| 15 | 3/22/99 | Mon | Reducing the Dimensionality of the Data (Principal Component Analysis) | B310-317, 454-456; C182-186, 431-436 |
| 16 | 3/24/99 | Wed | Discovering Statistically Independent Sources (Independent Component Analysis) | Back |
| 17 | 3/29/99 | Mon | Doug Martin, MathSoft, Trellis Graphics and Robust Approaches for Financial Modeling | |
| 18 | 3/31/99 | Wed | Fidelio Tata, Vice President, CSFB, Mining for Short-Term Micro-Arbitrage | |
| 19 | 4/5/99 | Mon | Discovering Hidden States in the Market (Hidden Markov Experts) | |
| 20 | 4/7/99 | Wed | Vasant Dhar, How Decision Makers View Data Mining in Finance | |
| WS | 4/10/99 | Sat | Georg Zimmermann, Building Trading Models: Tricks of the Trade [9am-5pm] | |
| 21 | 4/12/99 | Mon | Classification | |
| 22 | 4/14/99 | Wed | Credit Risk and Bankruptcy Prediction | |
| 23 | 4/19/99 | Mon | David Modest, Principal, Long Term Capital Management, The Crisis at Long Term Capital Management: An Insider's View | |
| 24 | 4/21/99 | Wed | Style Analysis: Traders (Clustering) | |
| 25 | 4/26/99 | Mon | Style Analysis: Funds | |
| 26 | 4/28/99 | Wed | Allan Grody, Implementing Enterprise-Wide Risk Management Systems | |
| 27 | 5/3/99 | Mon | Review | |
| F | 5/10/99 | Mon | Final | |

 

Homework Assignments

 

| # | Due | Assignment | Solution |
|---|-----|------------|----------|
| 1 | 1/28/99 | Profit and Loss Curve | HW01 Solution |
| 2 | 2/4/99 | Bayes Rule, Data Snooping (Best Performing Stock) | HW02 Solution |
| 3 | 2/16/99 | Bootstrap (Sharpe Ratio, Maximum Drawdown) | HW03 Solution |
| 4 | 3/4/99 | Modeling Yield Curves With Neural Networks (Noise, Sample Size, Learning Behavior) | HW04 Solution |
| 5 | 3/25/99 | Modeling Tails of Distributions and Rare Events | HW05 Solution |
| 6 | 4/13/99 | Comparing PCA and ICA on Foreign Exchange Data | HW06 Solution |
| 7 | 5/10/99 | Feedback form (html) (doc) | |
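
For readers unfamiliar with the technique named in Assignment 3, here is a minimal sketch of bootstrapping the Sharpe ratio and the maximum drawdown. It is not the course's assignment or its solution. It is written in Python rather than the MATLAB used in class, and every function name, the synthetic returns, and the parameter values are assumptions made for illustration. It also assumes returns are independent across periods, since plain resampling with replacement discards their time ordering, and it ignores the risk-free rate.

```python
import random
import statistics


def sharpe_ratio(returns):
    """Per-period mean return divided by the sample standard deviation."""
    return statistics.mean(returns) / statistics.stdev(returns)


def max_drawdown(returns):
    """Largest peak-to-trough drop of the cumulative (additive) P&L curve."""
    cumulative = peak = worst = 0.0
    for r in returns:
        cumulative += r
        peak = max(peak, cumulative)
        worst = max(worst, peak - cumulative)
    return worst


def bootstrap(returns, statistic, n_resamples=1000, seed=0):
    """Evaluate `statistic` on resamples drawn with replacement; return sorted values."""
    rng = random.Random(seed)
    n = len(returns)
    values = []
    for _ in range(n_resamples):
        resample = [returns[rng.randrange(n)] for _ in range(n)]
        values.append(statistic(resample))
    return sorted(values)


def percentile_interval(sorted_values, level=0.90):
    """Percentile interval covering the central `level` fraction of the values."""
    k = len(sorted_values)
    lo = round((1 - level) / 2 * k)
    hi = round((1 + level) / 2 * k) - 1
    return sorted_values[lo], sorted_values[hi]


if __name__ == "__main__":
    # Synthetic daily returns, made up for illustration only.
    data_rng = random.Random(42)
    returns = [data_rng.gauss(0.0005, 0.01) for _ in range(250)]
    for name, stat in [("Sharpe ratio", sharpe_ratio), ("max drawdown", max_drawdown)]:
        low, high = percentile_interval(bootstrap(returns, stat))
        print(f"{name}: estimate {stat(returns):.4f}, 90% interval [{low:.4f}, {high:.4f}]")
```

The width of each interval gives a rough sense of how much a single observed Sharpe ratio or drawdown could vary by chance. For serially dependent returns, a block bootstrap would be a more appropriate choice than the independent resampling assumed here.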

 

 

 


Please address comments and questions to Andreas Weigend at aweigend@stern.nyu.edu.