
Ronald Parr

Professor of Computer Science
Computer Science
Box 90129, Durham, NC 27708-0129
D209 Levine Science Research Center, Durham, NC 27708

Selected Publications


Position: Amazing Things Come From Having Many Good Models

Conference · Proceedings of Machine Learning Research · January 1, 2024. The Rashomon Effect, coined by Leo Breiman, describes the phenomenon that there exist many equally good predictive models for the same dataset. This phenomenon happens for many real datasets and when it does, it sparks both magic and consternation, but ...

A Path to Simpler Models Starts With Noise.

Journal Article · Advances in Neural Information Processing Systems · December 2023. The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabular ...
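
As a rough illustration of the definitions above (not code from the paper), the sketch below estimates a Rashomon ratio for a small, hypothetical hypothesis space of depth-1 decision stumps on synthetic data: the Rashomon set is every stump whose empirical loss is within epsilon of the best stump, and the ratio is the fraction of the enumerated space that lands in that set.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # hypothetical tabular dataset
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# Hypothesis space: all stumps "predict 1 iff X[:, j] > t" over a coarse grid of thresholds.
stumps = [(j, t) for j in range(X.shape[1]) for t in np.linspace(-2, 2, 41)]
losses = np.array([np.mean((X[:, j] > t).astype(int) != y) for j, t in stumps])

epsilon = 0.02                            # Rashomon parameter: tolerance above the best loss
in_rashomon_set = losses <= losses.min() + epsilon
rashomon_ratio = in_rashomon_set.mean()   # fraction of the hypothesis space in the Rashomon set
print(f"best loss {losses.min():.3f}, Rashomon ratio {rashomon_ratio:.3f}")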

On the Existence of Simpler Machine Learning Models

Conference · ACM International Conference Proceeding Series · June 21, 2022. It is almost always easier to find an accurate-but-complex model than an accurate-yet-simple model. Finding optimal, sparse, accurate models of various forms (linear models with integer coefficients, decision sets, rule lists, decision trees) is generally ...

Deep Radial-Basis Value Functions for Continuous Control

Journal Article · 35th AAAI Conference on Artificial Intelligence, AAAI 2021 · January 1, 2021. A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned value function. This operation is often challenging when the learned value function takes continuous actions as input. We introduce deep ...
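
To make the continuous-action maximization issue concrete, here is a minimal, hypothetical sketch (not the paper's architecture) of a radial-basis value function over a 1-D action space: Q(s, a) is a weighted sum of Gaussian bumps centered at a few candidate actions, so an approximate greedy action can be read off by evaluating Q at the centers instead of solving a general continuous optimization.

import numpy as np

def rbf_q(action, centers, weights, beta=4.0):
    # Q(s, a) for a fixed state s: sum_i w_i * exp(-beta * (a - c_i)^2)
    return np.sum(weights * np.exp(-beta * (action - centers) ** 2))

# Hypothetical per-state outputs; in a deep version these would come from a network.
centers = np.array([-0.8, 0.1, 0.6])     # candidate action centers
weights = np.array([0.2, 1.3, 0.9])      # value weights for each center

# Approximate greedy action: the best center is close to the maximizer of this smooth surface.
greedy = centers[np.argmax([rbf_q(c, centers, weights) for c in centers])]
print("approximate greedy action:", greedy)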

Policy Caches with Successor Features

Conference · Proceedings of Machine Learning Research · January 1, 2021. Transfer in reinforcement learning is based on the idea that it is possible to use what is learned in one task to improve the learning process in another task. For transfer between tasks which share transition dynamics but differ in reward function, ...
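
A small sketch of the successor-feature idea behind such policy caches, with made-up toy dimensions rather than anything from the paper: if each cached policy stores successor features psi_i(s, a) so that Q_i(s, a) = psi_i(s, a) . w for reward weights w, then evaluating every cached policy against a new task's weights and acting on the best Q-value is generalized policy improvement.

import numpy as np

rng = np.random.default_rng(1)
n_policies, n_actions, d = 3, 4, 5       # hypothetical cache size, action count, feature dimension

# psi[i, a, :] = successor features of cached policy i for action a in the current state.
psi = rng.normal(size=(n_policies, n_actions, d))
w_new = rng.normal(size=d)               # reward weights of the new task (reward = phi . w_new)

q = psi @ w_new                          # Q_i(s, a) for every cached policy and action
best_policy, best_action = np.unravel_index(np.argmax(q), q.shape)
print("generalized policy improvement picks policy", best_policy, "action", best_action)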

Revisiting the Softmax Bellman Operator: New Benefits and New Perspective

Conference · 36th International Conference on Machine Learning, ICML 2019 · January 1, 2019. The impact of softmax on the value function itself in reinforcement learning (RL) is often viewed as problematic because it leads to sub-optimal value (or Q) functions and interferes with the contraction properties of the Bellman operator. Surprisingly, ...
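
For readers unfamiliar with the operator being revisited, the sketch below applies a softmax Bellman backup to a small tabular MDP: instead of backing up max over next-state actions, it backs up a Boltzmann-weighted average of Q(s', .). The MDP itself is made up for illustration; only the backup reflects the operator under discussion.

import numpy as np

def softmax_backup(Q, R, P, gamma=0.9, tau=1.0):
    # Softmax Bellman operator: replace max_a' Q(s', a') with a Boltzmann-weighted average.
    weights = np.exp(Q / tau)
    weights /= weights.sum(axis=1, keepdims=True)                  # pi_soft(a' | s')
    soft_values = (weights * Q).sum(axis=1)                        # sum_a' pi_soft(a'|s') Q(s', a')
    return R + gamma * P @ soft_values                             # shape (n_states, n_actions)

n_states, n_actions = 4, 2
rng = np.random.default_rng(2)
R = rng.uniform(size=(n_states, n_actions))                        # toy rewards r(s, a)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))   # transition probabilities P[s, a, s']
Q = np.zeros((n_states, n_actions))
for _ in range(200):
    Q = softmax_backup(Q, R, P)
print(np.round(Q, 3))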

Computing Optimal Strategies to Commit to in Stochastic Games

Conference · Proceedings of the 26th AAAI Conference on Artificial Intelligence, AAAI 2012 · January 1, 2012. Significant progress has been made recently in the following two lines of research in the intersection of AI and game theory: (1) the computation of optimal strategies to commit to (Stackelberg strategies), and (2) the computation of correlated equilibria ...
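
As background on line of research (1), here is a minimal sketch of computing a Stackelberg commitment strategy in a one-shot bimatrix game, using the standard approach of solving one linear program per follower response and keeping the best; the payoff matrices are hypothetical, and the stochastic-game setting treated in the paper is considerably more involved.

import numpy as np
from scipy.optimize import linprog

# Leader payoffs A[i, j] and follower payoffs B[i, j] for leader action i, follower action j.
A = np.array([[2.0, 4.0], [1.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
n, m = A.shape

best_value, best_x = -np.inf, None
for j in range(m):
    # LP over leader mixed strategies x: maximize x . A[:, j]
    # subject to j being a follower best response: x . B[:, j] >= x . B[:, k] for all k.
    A_ub = np.array([B[:, k] - B[:, j] for k in range(m) if k != j])
    b_ub = np.zeros(len(A_ub))
    res = linprog(-A[:, j], A_ub=A_ub, b_ub=b_ub,
                  A_eq=np.ones((1, n)), b_eq=[1.0], bounds=[(0, 1)] * n)
    if res.success and -res.fun > best_value:
        best_value, best_x = -res.fun, res.x

print("leader commitment:", best_x, "leader value:", best_value)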

Non-Parametric Approximate Linear Programming for MDPs

Conference · Proceedings of the 25th AAAI Conference on Artificial Intelligence, AAAI 2011 · August 11, 2011. The Approximate Linear Programming (ALP) approach to value function approximation for MDPs is a parametric value function approximation method, in that it represents the value function as a linear combination of features which are chosen a priori. Choosing ...
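
For context, the parametric ALP being generalized can be written, for a tabular MDP with feature matrix Phi, as: minimize c^T Phi w subject to (Phi w)(s) >= r(s, a) + gamma * sum_s' P(s'|s,a) (Phi w)(s') for all state-action pairs. The sketch below solves that LP on a made-up MDP with scipy; the non-parametric extension contributed by the paper is not shown.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
n_states, n_actions, k, gamma = 6, 2, 3, 0.9

P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))   # P[s, a, s']
R = rng.uniform(size=(n_states, n_actions))                         # r(s, a)
Phi = rng.normal(size=(n_states, k))                                 # a priori features
Phi[:, 0] = 1.0                                                      # include a constant feature
c = np.ones(n_states) / n_states                                     # state-relevance weights

# Constraints: for every (s, a),  (Phi w)(s) - gamma * E[(Phi w)(s')] >= r(s, a).
rows, rhs = [], []
for s in range(n_states):
    for a in range(n_actions):
        rows.append(-(Phi[s] - gamma * P[s, a] @ Phi))   # sign flipped for A_ub w <= b_ub form
        rhs.append(-R[s, a])
res = linprog(c @ Phi, A_ub=np.array(rows), b_ub=np.array(rhs), bounds=[(None, None)] * k)
print("ALP weights:", res.x)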

Complexity of Computing Optimal Stackelberg Strategies in Security Resource Allocation Games

Conference · Proceedings of the 24th AAAI Conference on Artificial Intelligence, AAAI 2010 · July 15, 2010. Recently, algorithms for computing game-theoretic solutions have been deployed in real-world security applications, such as the placement of checkpoints and canine units at Los Angeles International Airport. These algorithms assume that the defender ...

Linear Complementarity for Regularized Policy Evaluation and Improvement

Conference · Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010 · January 1, 2010. Recent work in reinforcement learning has emphasized the power of L1 regularization to perform feature selection and prevent overfitting. We propose formulating the L1-regularized linear fixed point problem as a linear complementarity problem (LCP). ...
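
For context, the L1-regularized linear fixed point referred to above is commonly defined (this is a sketch of the standard setup from prior work, not the paper's LCP construction) as the weight vector w satisfying

w \;=\; \arg\min_{u}\; \tfrac{1}{2}\,\bigl\lVert \Phi u - \bigl(R + \gamma \Phi' w\bigr)\bigr\rVert_2^2 \;+\; \beta \lVert u \rVert_1 ,

where \Phi and \Phi' hold the features of sampled states and their successor states, R the sampled rewards, \gamma the discount factor, and \beta the L1 penalty. Casting the optimality (complementarity) conditions of this problem as an LCP is what allows off-the-shelf LCP solvers and warm starts to be applied.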

Learning in Zero-Sum Team Markov Games Using Factored Value Functions

Conference · NIPS 2002: Proceedings of the 15th International Conference on Neural Information Processing Systems · January 1, 2002. We present a new method for learning good strategies in zero-sum Markov games in which each side is composed of multiple agents collaborating against an opposing team of agents. Our method requires full observability and communication during learning, but ...
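
The per-state computation underlying value-function methods for zero-sum Markov games is a matrix-game solve. As a point of reference (a generic sketch, not the factored method of the paper), the linear program below finds a maximin mixed strategy for the row team in a small, hypothetical payoff matrix.

import numpy as np
from scipy.optimize import linprog

# Payoff matrix G[i, j]: reward to the row team when it plays i and the column team plays j.
G = np.array([[1.0, -2.0, 0.5],
              [0.0,  1.0, -1.0]])
n, m = G.shape

# Variables (x_1..x_n, v): maximize v subject to sum_i x_i G[i, j] >= v for every column j,
# with x a probability distribution over row actions.
c = np.zeros(n + 1); c[-1] = -1.0                    # linprog minimizes, so minimize -v
A_ub = np.hstack([-G.T, np.ones((m, 1))])            # v - x . G[:, j] <= 0 for each j
b_ub = np.zeros(m)
A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1))])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
              bounds=[(0, 1)] * n + [(None, None)])
print("maximin strategy:", res.x[:n], "game value:", res.x[-1])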

Selecting the Right Algorithm

Conference · AAAI Fall Symposium - Technical Report · January 1, 2001.

Making Rational Decisions using Adaptive Utility Elicitation

Conference · Proceedings of the 17th National Conference on Artificial Intelligence and 12th Conference on Innovative Applications of Artificial Intelligence, AAAI 2000 · January 1, 2000. Rational decision making requires full knowledge of the utility function of the person affected by the decisions. However, in many cases, the task of acquiring such knowledge is not feasible due to the size of the outcome space and the complexity of the ...

Bayesian Fault Detection and Diagnosis in Dynamic Systems

Conference · Proceedings of the 17th National Conference on Artificial Intelligence and 12th Conference on Innovative Applications of Artificial Intelligence, AAAI 2000 · January 1, 2000. This paper addresses the problem of tracking and diagnosing complex systems with mixtures of discrete and continuous variables. This problem is a difficult one, particularly when the system dynamics are nondeterministic, not all aspects of the system are ...

Approximating Optimal Policies for Partially Observable Stochastic Domains

Conference · IJCAI International Joint Conference on Artificial Intelligence · January 1, 1995. The problem of making optimal decisions in uncertain conditions is central to Artificial Intelligence. If the state of the world is known at all times, the world can be modeled as a Markov Decision Process (MDP). MDPs have been studied extensively and many ...
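
For the fully observable case mentioned in the abstract, the standard MDP solution method is value iteration; a minimal sketch on a made-up MDP appears below. The paper's contribution concerns approximating policies for the partially observable case, which this sketch does not address.

import numpy as np

rng = np.random.default_rng(4)
n_states, n_actions, gamma = 5, 3, 0.95
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))   # P[s, a, s']
R = rng.uniform(size=(n_states, n_actions))                         # r(s, a)

V = np.zeros(n_states)
for _ in range(500):
    Q = R + gamma * P @ V          # Bellman backup: Q[s, a] = r(s, a) + gamma * E[V(s')]
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new
policy = Q.argmax(axis=1)          # greedy policy with respect to the converged values
print("optimal values:", np.round(V, 3), "policy:", policy)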