MeritBadges/DecisionTrees

From Noisebridge
Revision as of 13:00, 11 January 2009 by Jbm (talk | contribs)

Introduction

Decision trees are among the most approachable and fundamental kinds of machine-learned labelling (classification) algorithms.

Subject Matter Expert

Josh

Requirements

  1. Explain the idea behind a decision tree, including converting a set of decision criteria into a graphical representation
  2. Discuss the strengths and weaknesses of decision trees
  3. Explain fundamental machine learning concepts relevant to decision trees
    1. Explain the process of discretization of data
    2. Explain the causes of, and problems resulting from, an overfit model
  4. Describe the relationship between decision trees and entropy
    1. Demonstrate an understanding of information-theoretic entropy, including at least 3 computations by hand
    2. Explain information gain and how it relates to entropy
    3. Explain how entropy guides the learning of a decision tree
  5. Demonstrate decision tree creation
    1. Demonstrate the creation of a decision tree by hand on a small dataset (all attributes nominal)
    2. Demonstrate the creation of a decision tree on a larger dataset, using computer tools (off-the-shelf or custom)
  6. Demonstrate converting a set of criteria into executable code in any programming language, and validate with a test set
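As a rough illustration of requirement 4, the sketch below computes entropy and information gain in Python and picks the attribute a tree learner would split on first. It is a minimal sketch, not a prescribed method for the badge: the function names (`entropy`, `information_gain`, `best_attribute`) and the four-row toy dataset are invented for this example, and it assumes all attributes are nominal.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum of p * log2(1/p) over the classes that actually occur.
    return sum((n / total) * log2(total / n) for n in counts.values())


def information_gain(rows, attribute, target):
    """Entropy of the target before a split minus the weighted entropy after it."""
    before = entropy([row[target] for row in rows])
    # Group the target labels by the value each row has for the attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    after = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return before - after


def best_attribute(rows, attributes, target):
    """Attribute with the highest information gain (the greedy split choice)."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy data, made up for this example (all nominal).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

print(entropy([row["play"] for row in rows]))            # 1.0 bit: two yes, two no
print(information_gain(rows, "outlook", "play"))         # 1.0: outlook separates the labels perfectly
print(information_gain(rows, "windy", "play"))           # 0.0: windy tells us nothing
print(best_attribute(rows, ["outlook", "windy"], "play"))  # outlook
```

Here the labels start at 1 bit of entropy. Splitting on `outlook` leaves two pure groups, so the gain is the full 1 bit, while splitting on `windy` leaves each group as mixed as before, so the gain is 0. Repeating this choice on each resulting group is how entropy guides the growth of the tree.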

Resources