ITCS6162/ITCS8162

Knowledge Discovery in Databases - KDD


Prerequisites: ITCS6160, full graduate standing or consent of the department.
Textbook: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Addison Wesley 2005.


Course Outline

Lectures

Class 1-1 Overview I (PPT format)

Class 1-2 Overview II (PPT format)

Class 1-3 Overview III (PPT format)

Class 2 (Association Rules)(WORD format)

Association Rules (PPT format)

Class 3 (ID3, Gini Index)(PPT format)

Class 3.1 (Rosetta)(WORD format)

Class 3.2 (LERS/ERID)(PPT format)

Class 3.3 (LERS vs ERID)(WORD format)

Class 3.4 (TV-Trees)(PPT)

Class 3.5 (TV-Trees, Pages: 23-42)(PDF Format)

Class 3.6 (Discretization in Rosetta)(WORD format)

Class 4 (Action Rules I)(PPT format)

Class 4.1 (Action Rules II)(PPT format)

Class 5.1 (Extracting Rules from Incomplete Tables)(WORD format)

Class 5.2 (Mining Incomplete Data)(WORD format)

Class 5.3 (Working Version) (Mining Incomplete Data)(WORD format)

Class 6 (Clustering Methods)(WORD format)

Class 6.1 (Clustering Methods - Part 2)(WORD format)

Class 6.2 (Textbook: Clustering I)(PPT format)

Class 6.3 (Textbook: Clustering II)(PPT format)

Class 6.4 (AQ Clustering, AQ18)(WORD format)

Class 6.5 (Textbook: Anomaly Detection)(PPT format)

Class 7 (Temporal DB Mining)(WORD format)

Class 8 (Rule Discovery based on Hyper-Planes)(WORD format)

Class 9 (Chase Methods)


Sample Problems (WORD format)

Solutions (WORD format)


Sample Problems II (WORD format)


Sample Problems (Final Exam) (WORD format)


Group Project (maximum 3 students in a group):
Implement the ARED (ARD, Action Rules Discovery) algorithm described in the paper and in Class 4.1.
Students working individually may implement the ARAS algorithm described in Class 4.
[deadline to submit: December 8 (Monday), 2008]





Instructor:       Zbigniew W. Ras

Office Location: Woodward Hall 430C
Telephone: 704-687-8574
Office Hours: Monday, Wednesday: 4:30-6:00pm
e-mail: ras@uncc.edu


Test: October 29, Final: 11:30am, December 13 (Saturday), 2008.
Points: 30 points Test, 40 points Final, 30 points Project.
Grades: A [86-100], B [71-85], C [50-70].


TA:       Wenxin Jiang

Office Location: Stech 432 (KDD Lab.)
Telephone: 704-687-8546
Office Hours: Monday, Wednesday [3:00-4:30pm]
e-mail: wjiang3@uncc.edu

LISp-Miner (by Jan Rauch)

Rough Set Exploration System (RSES)

Bratko's ORANGE

Random Forests

WEKA

iAQ

LERS - Version for PC (Manual) and LERS System (software)

More software for data mining

Repository of large datasets

Medical Data

GMU KDD Software