## Optimal Sparse Regression Trees

Regression trees are among the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been little effort towards full provable optimization, mainly due to the computational hardness of the problem. This work proposes a dynamic-programming-with-bounds approach to the construction of provably optimal sparse regression trees. We leverage a novel lower bound based on an optimal solution to the k-Means clustering problem on one-dimensional data. We are often able to find optimal sparse trees in seconds, even for challenging datasets that involve large numbers of samples and highly correlated features.
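To make the one-dimensional k-Means idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a squared-error loss and ignores any sparsity penalty. A tree with at most `k` leaves splits the samples into at most `k` groups and predicts one constant per group, so its squared error cannot be lower than the optimal `k`-cluster cost on the labels alone. The function name `kmeans_1d_cost` and the simple O(k·n²) dynamic program are illustrative choices.

```python
from typing import Sequence


def kmeans_1d_cost(values: Sequence[float], k: int) -> float:
    """Minimum total squared error over partitions of 1-D values into at most k clusters.

    In one dimension an optimal clustering uses contiguous runs of the
    sorted values, so dynamic programming over sorted prefixes is exact.
    """
    ys = sorted(values)
    n = len(ys)
    if n == 0 or k <= 0:
        return 0.0 if n == 0 else float("inf")
    k = min(k, n)

    # Prefix sums of y and y^2 give the squared error of any run in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, y in enumerate(ys):
        s1[i + 1] = s1[i] + y
        s2[i + 1] = s2[i] + y * y

    def sse(i: int, j: int) -> float:
        """Squared error of ys[i:j] around its own mean (requires j > i)."""
        total = s1[j] - s1[i]
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    inf = float("inf")
    # best[j]: minimum cost of covering the first j sorted values with the
    # clusters allowed so far (zero clusters can only cover zero values).
    best = [0.0] + [inf] * n
    for _ in range(k):
        nxt = [0.0] + [inf] * n
        for j in range(1, n + 1):
            # The last cluster is ys[i:j]; earlier clusters cover ys[:i].
            nxt[j] = min(best[i] + sse(i, j) for i in range(j))
        best = nxt
    return best[n]


labels = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
bound_two_leaves = kmeans_1d_cost(labels, 2)
```

In a bounds-based search, a value like `bound_two_leaves` could be compared against the best objective found so far to discard subproblems that cannot improve on it. How the paper combines this bound with its regularized objective is not reproduced here.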

### Citation

*Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023* (Vol. 37, pp. 11270–11279).
