Optimal Generalized Decision Trees via Integer Programming.
Decision trees have been a very popular class of predictive models for decades due to their interpretability and good performance on categorical features. However, they are not always robust and tend to overfit the data. Additionally, if allowed to grow large, they lose interpretability. In this paper, we present a novel mixed integer programming formulation to construct optimal decision trees of specified size. We take special structure of categorical features into account and allow combinatorial decisions (based on subsets of values of such a feature) at each node. We show that very good accuracy can be achieved with small trees using moderately-sized training sets. The optimization problems we solve are easily tractable with modern solvers.
Publisher URL: http://arxiv.org/abs/1612.03225
Keeping up-to-date with research can feel impossible, with papers being published faster than you'll ever be able to read them. That's where Researcher comes in: we're simplifying discovery and making important discussions happen. With over 19,000 sources, including peer-reviewed journals, preprints, blogs, universities, podcasts and Live events across 10 research areas, you'll never miss what's important to you. It's like social media, but better. Oh, and we should mention - it's free.
Researcher displays publicly available abstracts and doesn’t host any full article content. If the content is open access, we will direct clicks from the abstracts to the publisher website and display the PDF copy on our platform. Clicks to view the full text will be directed to the publisher website, where only users with subscriptions or access through their institution are able to view the full article.