A review of the book on the C4.5 algorithm, one of the most widely known decision tree algorithms.
Source: Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann.
Review: C4.5: Programs for Machine Learning
1. Overview
Algorithms for constructing decision trees are among the most well known and widely used of all machine learning methods. Among decision tree algorithms, J. Ross Quinlan's ID3 and its successor, C4.5, are probably the most popular in the machine learning community. These algorithms and variations on them have been the subject of numerous research papers since Quinlan introduced ID3. Until recently, most researchers looking for an introduction to decision trees turned to Quinlan's seminal 1986 Machine Learning journal article [Quinlan, 1986]. In this book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much needed description of his complete system, including the latest developments. As such, this book will be a welcome addition to the library of many researchers and students.

Quinlan discusses a wide range of issues related to decision trees, from the core algorithm for building an initial tree to methods for pruning, converting trees to rules, and handling various problems such as missing attribute values. For each of these issues, he gives a clear description of the problem, usually accompanied by an example, and he describes how C4.5 handles it. The detailed examples are usually drawn from real data sets, and they help greatly to illustrate each problem.
2. Summary of contents
Decision tree algorithms begin with a set of cases, or examples, and create a tree data structure that can be used to classify new cases. Each case is described by a set of attributes (or features), which can have numeric or symbolic values. Associated with each training case is a label representing the name of its class. Each internal node of a decision tree contains a test, the result of which is used to decide which branch to follow from that node. For example, a test might ask "is x > 4 for attribute x?" If the test is true, then the case will proceed down the left branch, and if not, it will follow the right branch. The leaf nodes contain class labels instead of tests. In classification mode, when a test case (which has no label) reaches a leaf node, C4.5 classifies it using the label stored there.
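The structure described above can be sketched in a few lines of code. This is a minimal illustration, not Quinlan's actual C4.5 implementation: the `Node`, `Leaf`, and `classify` names are hypothetical, and only a single numeric "greater-than" test is modeled.

```python
class Leaf:
    """A leaf node holds a class label instead of a test."""
    def __init__(self, label):
        self.label = label

class Node:
    """An internal node holds a test: 'is attribute > threshold?'"""
    def __init__(self, attribute, threshold, left, right):
        self.attribute = attribute   # attribute the test examines
        self.threshold = threshold   # numeric threshold of the test
        self.left = left             # branch followed when the test is true
        self.right = right           # branch followed when the test is false

def classify(tree, case):
    """Route an unlabeled case (dict of attribute -> value) down to a leaf."""
    while isinstance(tree, Node):
        if case[tree.attribute] > tree.threshold:
            tree = tree.left
        else:
            tree = tree.right
    return tree.label

# The example test from the text: "is x > 4 for attribute x?"
tree = Node("x", 4, Leaf("class A"), Leaf("class B"))
print(classify(tree, {"x": 7}))  # x > 4 is true, so the left branch: "class A"
print(classify(tree, {"x": 2}))  # test fails, so the right branch: "class B"
```

A full tree simply nests further `Node` objects in the `left` and `right` positions, so classification is a walk from the root to whichever leaf the tests select.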