George Dimitri Konidaris

Conference Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2024 Deploying robots in real-world environments, such as households and manufacturing lines, requires generalization across novel task specifications without violating safety constraints. Linear temporal logic (LTL) is a widely used task specification language ... Full text Cite

Composable Interaction Primitives: A Structured Policy Class for Efficiently Learning Sustained-Contact Manipulation Skills

Conference Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2024 We propose a new policy class, Composable Interaction Primitives (CIPs), specialized for learning sustained-contact manipulation skills like opening a drawer, pulling a lever, turning a wheel, or shifting gears. CIPs have two primary design goals: to minim ... Full text Cite

Robot Task Planning under Local Observability

Conference Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2024 Real-world robot task planning is intractable in part due to partial observability. A common approach to reducing complexity is introducing additional structure into the decision process, such as mixed-observability, factored states, or temporally-extended ... Full text Cite

Language-guided Skill Learning with Temporal Variational Inference

Conference Proceedings of Machine Learning Research · January 1, 2024 We present an algorithm for skill discovery from expert demonstrations. The algorithm first utilizes Large Language Models (LLMs) to propose an initial segmentation of the trajectories. Following that, a hierarchical variational inference framework incorpo ... Cite

Model-based Reinforcement Learning for Parameterized Action Spaces

Conference Proceedings of Machine Learning Research · January 1, 2024 We propose a novel model-based reinforcement learning algorithm, Dynamics Learning and predictive control with Parameterized Actions (DLPA), for Parameterized Action Markov Decision Processes (PAMDPs). The agent learns a parameterized-action-conditioned dynam ... Cite

Lang2LTL-2: Grounding Spatiotemporal Navigation Commands Using Large Language and Vision-Language Models

Conference IEEE International Conference on Intelligent Robots and Systems · January 1, 2024 Grounding spatiotemporal navigation commands to structured task specifications enables autonomous robots to understand a broad range of natural language and solve long-horizon tasks with safety guarantees. Prior works mostly focus on grounding spatial or t ... Full text Cite

EPO: Hierarchical LLM Agents with Environment Preference Optimization

Conference EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference · January 1, 2024 Long-horizon decision-making tasks present significant challenges for LLM-based agents due to the need for extensive planning over multiple steps. In this paper, we propose a hierarchical framework that decomposes complex tasks into manageable subgoals, ut ... Cite

Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy

Conference Advances in Neural Information Processing Systems · January 1, 2024 Reinforcement learning algorithms typically rely on the assumption that the environment dynamics and value function can be expressed in terms of a Markovian state representation. However, when state information is only partially observable, how can an agen ... Cite

Q-functionals for Value-Based Continuous Control

Conference Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 · June 27, 2023 We present Q-functionals, an alternative architecture for continuous control deep reinforcement learning. Instead of returning a single value for a state-action pair, our network transforms a state into a function that can be rapidly evaluated in parallel ... Full text Cite

Automatic encoding and repair of reactive high-level tasks with learned abstract representations

Journal Article International Journal of Robotics Research · April 1, 2023 We present a framework for the automatic encoding and repair of high-level tasks. Given a set of skills a robot can perform, our approach first abstracts sensor data into symbols and then automatically encodes the robot’s capabilities in Linear Temporal Lo ... Full text Cite

A domain-agnostic approach for characterization of lifelong learning systems

Journal Article Neural Networks: The Official Journal of the International Neural Network Society · March 2023 Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original trainin ... Full text Cite

Coarse-Grained Smoothness for Reinforcement Learning in Metric Spaces

Conference Proceedings of Machine Learning Research · January 1, 2023 Principled decision-making in continuous state-action spaces is impossible without some assumptions. A common approach is to assume Lipschitz continuity of the Q-function. We show that, unfortunately, this property fails to hold in many typical domains. We ... Cite

Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning

Conference Proceedings of Machine Learning Research · January 1, 2023 We propose a new method for count-based exploration in high-dimensional state spaces. Unlike previous work which relies on density models, we show that counts can be derived by averaging samples from the Rademacher distribution (or coin flips). This insigh ... Cite

Meta-Learning Parameterized Skills

Conference Proceedings of Machine Learning Research · January 1, 2023 We propose a novel parameterized skill-learning algorithm that aims to learn transferable parameterized skills and synthesize them into a new action space that supports efficient learning in long-horizon tasks. We propose to leverage off-policy Meta-RL com ... Cite

RLang: A Declarative Language for Describing Partial World Knowledge to Reinforcement Learning Agents

Conference Proceedings of Machine Learning Research · January 1, 2023 We introduce RLang, a domain-specific language (DSL) for communicating domain knowledge to an RL agent. Unlike existing RL DSLs that ground to single elements of a decision-making formalism (e.g., the reward function or policy), RLang can specify informati ... Cite

Constrained Dynamic Movement Primitives for Collision Avoidance in Novel Environments

Conference IEEE International Conference on Intelligent Robots and Systems · January 1, 2023 Dynamic movement primitives are widely used for learning skills that can be demonstrated to a robot by a skilled human or controller. While their generalization capabilities and simple formulation make them very appealing to use, they possess no strong gua ... Full text Cite

Skill Generalization with Verbs

Conference IEEE International Conference on Intelligent Robots and Systems · January 1, 2023 It is imperative that robots can understand natural language commands issued by humans. Such commands typically contain verbs that signify what action should be performed on a given object and that are applicable to many objects. We propose a method for ge ... Full text Cite

Improved Inference of Human Intent by Combining Plan Recognition and Language Feedback

Conference IEEE International Conference on Intelligent Robots and Systems · January 1, 2023 Conversational assistive robots can aid people, especially those with cognitive impairments, to accomplish various tasks such as cooking meals, performing exercises, or operating machines. However, to interact with people effectively, robots must recognize ... Full text Cite

Synthesizing Navigation Abstractions for Planning with Portable Manipulation Skills

Conference Proceedings of Machine Learning Research · January 1, 2023 We address the problem of efficiently learning high-level abstractions for task-level robot planning. Existing approaches require large amounts of data and fail to generalize learned abstractions to new environments. To address this, we propose to exploit ... Cite

Effectively Learning Initiation Sets in Hierarchical Reinforcement Learning

Conference Advances in Neural Information Processing Systems · January 1, 2023 An agent learning an option in hierarchical reinforcement learning must solve three problems: identify the option's subgoal (termination condition), learn a policy, and learn where that policy will succeed (initiation set). The termination condition is typ ... Cite

Performance Bounds for Model and Policy Transfer in Hidden-Parameter MDPs

Conference 11th International Conference on Learning Representations, ICLR 2023 · January 1, 2023 In the Hidden-Parameter MDP (HiP-MDP) framework, a family of reinforcement learning tasks is generated by varying hidden parameters specifying the dynamics and reward function for each individual task. The HiP-MDP is a natural model for families of tasks i ... Cite

Innovation Paths for Machine Learning in Robotics [Industry Activities]

Journal Article IEEE Robotics and Automation Magazine · December 1, 2022 Full text Cite

IKFlow: Generating Diverse Inverse Kinematics Solutions

Journal Article IEEE Robotics and Automation Letters · July 1, 2022 Inverse kinematics - finding joint poses that reach a given Cartesian-space end-effector pose - is a fundamental operation in robotics, since goals and waypoints are typically defined in Cartesian space, but robots must be controlled in joint space. Howeve ... Full text Cite

Optimistic Initialization for Exploration in Continuous Control

Conference Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 · June 30, 2022 Optimistic initialization underpins many theoretically sound exploration schemes in tabular domains; however, in the deep function approximation setting, optimism can quickly disappear if initialized naïvely. We propose a framework for more effectively inc ... Full text Cite

Automatic Encoding and Repair of Reactive High-Level Tasks with Learned Abstract Representations

Conference Springer Proceedings in Advanced Robotics · January 1, 2022 We present a framework that, given a set of skills a robot can perform, abstracts sensor data into symbols that are used to automatically encode the robot’s capabilities in Linear Temporal Logic (LTL). We specify reactive high-level tasks based on these ca ... Full text Cite

RMPs for Safe Impedance Control in Contact-Rich Manipulation

Conference Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2022 Variable impedance control in operation-space is a promising approach to learning contact-rich manipulation behaviors. One of the main challenges with this approach is producing a manipulation behavior that ensures the safety of the arm and the environment ... Full text Cite

Generalizing to New Domains by Mapping Natural Language to Lifted LTL

Conference Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2022 Recent work on using natural language to specify commands to robots has grounded that language to LTL. However, mapping natural language task specifications to LTL task specifications using language models requires probability distributions over finite voca ... Full text Cite

Learning to Infer Kinematic Hierarchies for Novel Object Instances

Conference Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2022 Manipulating an articulated object requires perceiving its kinematic hierarchy: its parts, how each can move, and how those motions are coupled. Previous work has explored perception for kinematics, but none infers a complete kinematic hierarchy on never-b ... Full text Cite

Using Language to Generate State Abstractions for Long-Range Planning in Outdoor Environments

Conference Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2022 Robots that process navigation instructions in large outdoor environments will need to operate at different levels of abstraction. For example, a land-surveying aerial robot receiving the instruction 'go to Boston and go through the state forest on the way ... Full text Cite

Towards Optimal Correlational Object Search

Conference Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2022 In realistic applications of object search, robots will need to locate target objects in complex environments while coping with unreliable sensors, especially for small or hard-to-detect objects. In such settings, correlational information can be valuable ... Full text Cite

Autonomous Learning of Object-Centric Abstractions for High-Level Planning

Conference ICLR 2022 - 10th International Conference on Learning Representations · January 1, 2022 We propose a method for autonomously learning an object-centric representation of a continuous and high-dimensional environment that is suitable for planning. Such representations can immediately be transferred between tasks that share the same types of ob ... Cite

Model-based Lifelong Reinforcement Learning with Bayesian Exploration

Conference Advances in Neural Information Processing Systems · January 1, 2022 We propose a model-based lifelong reinforcement-learning approach that estimates a hierarchical Bayesian posterior distilling the common structure shared across different tasks. The learned posterior combined with a sample-based Bayesian exploration proced ... Cite

Evaluation Beyond Task Performance: Analyzing Concepts in AlphaZero in Hex

Conference Advances in Neural Information Processing Systems · January 1, 2022 AlphaZero, an approach to reinforcement learning that couples neural networks and Monte Carlo tree search (MCTS), has produced state-of-the-art strategies for traditional board games like chess, Go, shogi, and Hex. While researchers and game commentators h ... Cite

Effects of Data Geometry in Early Deep Learning

Conference Advances in Neural Information Processing Systems · January 1, 2022 Deep neural networks can approximate functions on different types of data, from images to graphs, with varied underlying structure. This underlying structure can be viewed as the geometry of the data manifold. By extending recent advances in the theoretica ... Cite

A review of robot learning for manipulation: Challenges, representations, and algorithms

Journal Article Journal of Machine Learning Research · January 1, 2021 A key challenge in intelligent robotics is creating robots that are capable of directly interacting with the world around them to achieve their goals. The last decade has seen substantial growth in research on the problem of robot manipulation, which aims ... Cite

Deep Radial-Basis Value Functions for Continuous Control

Journal Article 35th AAAI Conference on Artificial Intelligence, AAAI 2021 · January 1, 2021 A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned value function. This operation is often challenging when the learned value function takes continuous actions as input. We introduce deep radial-b ... Full text Cite

Bootstrapping Motor Skill Learning with Motion Planning

Journal Article IEEE International Conference on Intelligent Robots and Systems · January 1, 2021 Learning a robot motor skill from scratch is impractically slow; so much so that in practice, learning must typically be bootstrapped using human demonstration. However, relying on human demonstration necessarily degrades the autonomy of robots that must l ... Full text Cite

Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion

Conference 35th AAAI Conference on Artificial Intelligence, AAAI 2021 · January 1, 2021 We introduce Wasserstein Adversarial Proximal Policy Optimization (WAPPO), a novel algorithm for visual transfer in Reinforcement Learning that explicitly learns to align the distributions of extracted features between a source and target task. WAPPO appro ... Full text Cite

Learning to Detect Multi-Modal Grasps for Dexterous Grasping in Dense Clutter

Conference IEEE International Conference on Intelligent Robots and Systems · January 1, 2021 We propose an approach to multi-modal grasp detection that jointly predicts the probabilities that several types of grasps succeed at a given grasp pose. Given a partial point cloud of a scene, the algorithm proposes a set of feasible grasp candidates, the ... Full text Cite

Multi-Resolution POMDP Planning for Multi-Object Search in 3D

Conference IEEE International Conference on Intelligent Robots and Systems · January 1, 2021 Robots operating in households must find objects on shelves, under tables, and in cupboards. In such environments, it is crucial to search efficiently at 3D scale while coping with limited field of view and the complexity of searching for multiple objects. ... Full text Cite

Robustly Learning Composable Options in Deep Reinforcement Learning

Conference IJCAI International Joint Conference on Artificial Intelligence · January 1, 2021 Hierarchical reinforcement learning (HRL) is only effective for long-horizon problems when high-level skills can be reliably sequentially executed. Unfortunately, learning reliably composable skills is difficult, because all the components of every skill a ... Full text Cite

Efficient Black-Box Planning Using Macro-Actions with Focused Effects

Conference IJCAI International Joint Conference on Artificial Intelligence · January 1, 2021 The difficulty of deterministic planning increases exponentially with search-tree depth. Black-box planning presents an even greater challenge, since planners must operate without an explicit model of the domain. Heuristics can make search more efficient, ... Full text Cite

Learning Collaborative Pushing and Grasping Policies in Dense Clutter

Conference Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2021 Robots must reason about pushing and grasping in order to engage in flexible manipulation in cluttered environments. Earlier works on learning pushing and grasping only consider each operation in isolation or are limited to top-down grasping and bin-pickin ... Full text Cite

Learning Markov State Abstractions for Deep Reinforcement Learning

Conference Advances in Neural Information Processing Systems · January 1, 2021 A fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov. However, when MDPs have rich observations, agents typically learn by way of an abstract state representation, ... Cite

Skill Discovery for Exploration and Planning using Deep Skill Graphs

Conference Proceedings of Machine Learning Research · January 1, 2021 We introduce a new skill-discovery algorithm that builds a discrete graph representation of large continuous MDPs, where nodes correspond to skill subgoals and the edges to skill policies. The agent constructs this graph during an unsupervised training pha ... Cite

Roadmap subsampling for changing environments

Conference IEEE International Conference on Intelligent Robots and Systems · October 24, 2020 Precomputed roadmaps can enable effective multi-query motion planning: a roadmap can be built for a robot as if no obstacles were present, and then after edges invalidated by obstacles observed at query time are deleted, path search through the remaining r ... Full text Cite

Building plannable representations with mixed reality

Conference IEEE International Conference on Intelligent Robots and Systems · October 24, 2020 We propose Action-Oriented Semantic Maps (AOSMs), a representation that enables a robot to acquire object manipulation behaviors and semantic information about the environment from a human teacher with a Mixed Reality Head-Mounted Display (MR-HMD). AOSMs a ... Full text Cite

Optical Coherence Tomography-Guided Robotic Ophthalmic Microsurgery via Reinforcement Learning from Demonstration

Journal Article IEEE Trans Robot · August 2020 Ophthalmic microsurgery is technically difficult because the scale of required surgical tool manipulations challenges the limits of the surgeon's visual acuity, sensory perception, and physical dexterity. Intraoperative optical coherence tomography (OCT) im ... Full text Link to item Cite

Learning portable representations for high-level planning

Journal Article 37th International Conference on Machine Learning, ICML 2020 · January 1, 2020 We present a framework for autonomously learning a portable representation that describes a collection of low-level continuous environments. We show that these abstract representations can be learned in a task-independent egocentric space specific to the a ... Cite

Communicating Robot Arm Motion Intent Through Mixed Reality Head-Mounted Displays

Conference · January 1, 2020 Efficient motion intent communication is necessary for safe and collaborative work environments with collocated humans and robots. Humans efficiently communicate their motion intent to other humans through gestures, gaze, and social cues. However, robots o ... Full text Cite

Comparing Robot Grasping Teleoperation Across Desktop and Virtual Reality with ROS Reality

Conference · January 1, 2020 Teleoperation allows a human to remotely operate a robot to perform complex and potentially dangerous tasks such as defusing a bomb, repairing a nuclear reactor, or maintaining the exterior of a space station. Existing teleoperation approaches generally re ... Full text Cite

Exploration in Reinforcement Learning with Deep Covering Options

Conference 8th International Conference on Learning Representations, ICLR 2020 · January 1, 2020 While many option discovery methods have been proposed to accelerate exploration in reinforcement learning, they are often heuristic. Recently, covering options was proposed to discover a set of options that provably reduce the upper bound of the environme ... Cite

Task Scoping for Efficient Planning in Open Worlds

Conference AAAI 2020 - 34th AAAI Conference on Artificial Intelligence · January 1, 2020 We propose an abstraction method for open-world environments expressed as Factored Markov Decision Processes (FMDPs) with very large state and action spaces. Our method prunes state and action variables that are irrelevant to the optimal value function on ... Cite

Simultaneously Learning Transferable Symbols and Language Groundings from Perceptual Data for Instruction Following

Conference Robotics: Science and Systems · January 1, 2020 Enabling robots to learn tasks and follow instructions as easily as humans is important for many real-world robot applications. Previous approaches have applied machine learning to teach the mapping from language to low dimensional symbolic representations ... Full text Cite

Option Discovery Using Deep Skill Chaining

Conference 8th International Conference on Learning Representations, ICLR 2020 · January 1, 2020 Autonomously discovering temporally extended actions, or skills, is a longstanding goal of hierarchical reinforcement learning. We propose a new algorithm that combines skill chaining with deep neural networks to autonomously discover skills in high-dimens ... Cite

Grounding Language Attributes to Objects using Bayesian Eigenobjects

Journal Article IEEE International Conference on Intelligent Robots and Systems · November 1, 2019 We develop a system to disambiguate object instances within the same class based on simple physical descriptions. The system takes as input a natural language phrase and a depth image containing a segmented object and predicts how similar the observed obje ... Full text Cite

Bounded-Error LQR-Trees

Conference IEEE International Conference on Intelligent Robots and Systems · November 1, 2019 We present a feedback motion planning algorithm, Bounded-Error LQR-Trees, that leverages reinforcement learning theory to find a policy with a bounded amount of error. The algorithm composes locally valid linear-quadratic regulators (LQR) into a nonlinear ... Full text Cite

On the necessity of abstraction

Journal Article Current Opinion in Behavioral Sciences · October 1, 2019 A generally intelligent agent faces a dilemma: it requires a complex sensorimotor space to be capable of solving a wide range of problems, but many tasks are only feasible given the right problem-specific formulation. I argue that a necessary but understud ... Full text Cite

Communicating and controlling robot arm motion intent through mixed-reality head-mounted displays

Journal Article International Journal of Robotics Research · October 1, 2019 Efficient motion intent communication is necessary for safe and collaborative work environments with co-located humans and robots. Humans efficiently communicate their motion intent to other humans through gestures, gaze, and other non-verbal cues, and can ... Full text Cite

A programmable architecture for robot motion planning acceleration

Conference Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors · July 1, 2019 We have designed a programmable architecture to accelerate collision detection and graph search, two of the principal components of robotic motion planning. The programmability enables the architecture to be applied to a wide range of different robots and ... Full text Cite

End-user robot programming using mixed reality

Conference Proceedings - IEEE International Conference on Robotics and Automation · May 1, 2019 Mixed Reality (MR) is a promising interface for robot programming because it can project an immersive 3D visualization of a robot's intended movement onto the real world. MR can also support hand gestures, which provide an intuitive way for users to constr ... Full text Cite

Scanning the internet for ROS: A view of security in robotics research

Conference Proceedings - IEEE International Conference on Robotics and Automation · May 1, 2019 Security is particularly important in robotics, as robots can directly perceive and affect the physical world. We describe the results of a scan of the entire IPv4 address space of the Internet for instances of the Robot Operating System (ROS), a widely us ... Full text Cite

Modeling and planning with macro-actions in decentralized POMDPs

Journal Article Journal of Artificial Intelligence Research · March 1, 2019 Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for decentralized multi-agent decision making under uncertainty. However, they typically model a problem at a low level of granularity, where each agent’s actions ... Full text Cite

DeepMellow: Removing the need for a target network in deep q-learning

Conference IJCAI International Joint Conference on Artificial Intelligence · January 1, 2019 Deep Q-Network (DQN) is an algorithm that achieves human-level performance in complex domains like Atari games. One of the important elements of DQN is its use of a target network, which is necessary to stabilize learning. We argue that using a target netw ... Full text Cite

Removing the target network from deep Q-networks with the mellowmax operator

Conference Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS · January 1, 2019 Deep Q-Network (DQN) is a learning algorithm that achieves human-level performance in high-dimensional domains like Atari games. We propose that using a softmax operator, Mellowmax, in DQN reduces its need for a separate target network, which is otherwise ... Cite

Learning multi-level hierarchies with hindsight

Conference 7th International Conference on Learning Representations, ICLR 2019 · January 1, 2019 Multi-level hierarchies have the potential to accelerate learning in sparse reward tasks because they can divide a problem into a set of short horizon subproblems. In order to realize this potential, Hierarchical Reinforcement Learning (HRL) algorithms nee ... Cite

Learning to Generalize Kinematic Models to Novel Objects

Conference Proceedings of Machine Learning Research · January 1, 2019 Robots operating in human environments must be capable of interacting with a wide variety of articulated objects such as cabinets, refrigerators, and drawers. Existing approaches require human demonstration or minutes of interaction to fit kinematic models ... Cite

Learning Symbolic Representations for Planning with Parameterized Skills

Conference IEEE International Conference on Intelligent Robots and Systems · December 27, 2018 A critical capability required for generally intelligent robot behavior is the ability to sequence motor skills to reach a goal. This requires a (typically abstract) representation that supports goal-directed planning, which raises the question of how to c ... Full text Cite

Hybrid Bayesian Eigenobjects: Combining Linear Subspace and Deep Network Methods for 3D Robot Vision

Conference IEEE International Conference on Intelligent Robots and Systems · December 27, 2018 We introduce Hybrid Bayesian Eigenobjects (HBEOs), a novel representation for 3D objects designed to allow a robot to jointly estimate the pose, class, and full 3D geometry of a novel object observed from a single viewpoint in a single practical framework. ... Full text Cite

Representing, learning, and controlling complex object interactions

Journal Article Autonomous Robots · October 1, 2018 We present a framework for representing scenarios with complex object interactions, where a robot cannot directly interact with the object it wishes to control and must instead influence it via intermediate objects. For instance, a robot learning to drive ... Full text Cite

Handedness and Reach-to-Place Kinematics in Adults: Left-Handers Are Not Reversed Right-Handers

Journal Article Journal of motor behavior · July 2018 The primary goal of this study was to examine the relations between limb control and handedness in adults. Participants were categorized as left or right handed for analyses using the Edinburgh Handedness Inventory. Three-dimensional recordings were made o ... Full text Cite

From skills to symbols: Learning symbolic representations for abstract high-level planning

Journal Article Journal of Artificial Intelligence Research · January 1, 2018 We consider the problem of constructing abstract representations for planning in high-dimensional, continuous environments. We assume an agent equipped with a collection of high-level actions, and construct representations provably capable of evaluating pla ... Full text Cite

Policy and Value Transfer in Lifelong Reinforcement Learning

Conference 35th International Conference on Machine Learning, ICML 2018 · January 1, 2018 We consider the problem of how best to use prior experience to bootstrap lifelong learning, where an agent faces a series of task instances drawn from some task distribution. First, we identify the initial policy that optimizes expected performance over th ... Cite

Robust and efficient transfer learning with hidden parameter Markov decision processes

Conference 31st AAAI Conference on Artificial Intelligence, AAAI 2017 · January 1, 2017 An intriguing application of transfer learning emerges when tasks arise with similar, but not identical, dynamics. Hidden Parameter Markov Decision Processes (HiP-MDP) embed these tasks into a low-dimensional space; given the embedding parameters one can i ... Cite

An analysis of Monte Carlo tree search

Conference 31st AAAI Conference on Artificial Intelligence, AAAI 2017 · January 1, 2017 Monte Carlo Tree Search (MCTS) is a family of directed search algorithms that has gained widespread attention in recent years. Despite the vast amount of research into MCTS, the effect of modifications on the algorithm, as well as the manner in which it pe ... Cite

Robust and efficient transfer learning with hidden parameter Markov decision processes

Conference Advances in Neural Information Processing Systems · January 1, 2017 We introduce a new formulation of the Hidden Parameter Markov Decision Process (HiP-MDP), a framework for modeling families of related tasks using low-dimensional latent embeddings. Our new framework correctly models the joint uncertainty in the latent par ... Cite

Active exploration for learning symbolic representations

Conference Advances in Neural Information Processing Systems · January 1, 2017 We introduce an online active exploration algorithm for data-efficiently learning an abstract symbolic model of an environment. Our algorithm is divided into two parts: the first part quickly generates an intermediate Bayesian symbolic model from the data ... Cite

Bayesian Eigenobjects: A unified framework for 3D robot perception

Conference Robotics: Science and Systems · January 1, 2017 We introduce Bayesian Eigenobjects (BEOs), a novel object representation that is the first technique able to perform joint classification, pose estimation, and 3D geometric completion on previously unencountered and partially observed query objects. BEOs e ... Full text Cite

The microarchitecture of a real-time robot motion planning accelerator

Conference Proceedings of the Annual International Symposium on Microarchitecture, MICRO · December 14, 2016 We have developed a hardware accelerator for motion planning, a critical operation in robotics. In this paper, we present the microarchitecture of our accelerator and describe a prototype implementation on an FPGA. We experimentally show that the accelerat ... Full text Cite

Policy search for multi-robot coordination under uncertainty

Conference International Journal of Robotics Research · December 1, 2016 We introduce a principled method for multi-robot coordination based on a general model (termed a MacDec-POMDP) of multi-robot cooperative planning in the presence of stochasticity, uncertain sensing, and communication limitations. A new MacDec-POMDP planni ... Full text Cite

Hidden parameter Markov decision processes: A semiparametric regression approach for discovering latent task parametrizations

Conference IJCAI International Joint Conference on Artificial Intelligence · January 1, 2016 Control applications often feature tasks with similar, but not identical, dynamics. We introduce the Hidden Parameter Markov Decision Process (HiP-MDP), a framework that parametrizes a family of related dynamical systems with a low-dimensional set of latent ... Cite

Constructing abstraction hierarchies using a skill-symbol loop

Conference IJCAI International Joint Conference on Artificial Intelligence · January 1, 2016 We describe a framework for building abstraction hierarchies whereby an agent alternates skill- and representation-construction phases to construct a sequence of increasingly abstract Markov decision processes. Our formulation builds on recent results show ... Cite

Reinforcement learning with parameterized actions

Conference 30th AAAI Conference on Artificial Intelligence, AAAI 2016 · January 1, 2016 We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions: discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. ... Cite

Robot motion planning on a chip

Conference Robotics: Science and Systems · January 1, 2016 We describe a process that constructs robot-specific circuitry for motion planning, capable of generating motion plans approximately three orders of magnitude faster than existing methods. Our method is based on building collision detection circuits for a ... Cite

Representing and learning complex object interactions

Conference Robotics: Science and Systems · January 1, 2016 We present a framework for representing scenarios with complex object interactions, in which a robot cannot directly interact with the object it wishes to control, but must instead do so via intermediate objects. For example, a robot learning to drive a ca ... Full text Cite

Nonparametric Bayesian reward segmentation for skill discovery using inverse reinforcement learning

Conference IEEE International Conference on Intelligent Robots and Systems · December 11, 2015 We present a method for segmenting a set of unstructured demonstration trajectories to discover reusable skills using inverse reinforcement learning (IRL). Each skill is characterised by a latent reward function which the demonstrator is assumed to be opti ... Full text Cite

Regularized feature selection in reinforcement learning

Journal Article Machine Learning · September 17, 2015 We introduce feature regularization during feature selection for value function approximation. Feature regularization introduces a prior into the selection process, improving function approximation accuracy and reducing overfitting. We show that the smooth ... Full text Cite

Planning for decentralized control of multiple robots under uncertainty

Journal Article Proceedings - IEEE International Conference on Robotics and Automation · June 29, 2015 This paper presents a probabilistic framework for synthesizing control policies for general multi-robot systems that is based on decentralized partially observable Markov decision processes (Dec-POMDPs). Dec-POMDPs are a general model of decision-making wh ... Full text Cite

Learning grounded finite-state representations from unstructured demonstrations

Journal Article International Journal of Robotics Research · March 3, 2015 Robots exhibit flexible behavior largely in proportion to their degree of knowledge about the world. Such knowledge is often meticulously hand-coded for a narrow class of tasks, limiting the scope of possible robot competencies. Thus, the primary limiting ... Full text Cite

Reports of the AAAI 2014 conference workshops

Conference AI Magazine · March 1, 2015 The AAAI-14 Workshop program was held Sunday and Monday, July 27-28, 2014, at the Québec City Convention Centre in Québec, Canada. The AAAI-14 workshop program included 15 workshops covering a wide range of topics in artificial intelligence. The titles of ... Full text Cite

Policy search for multi-robot coordination under uncertainty

Conference Robotics: Science and Systems · January 1, 2015 We introduce a principled method for multi-robot coordination based on a generic model (termed a MacDec-POMDP) of multi-robot cooperative planning in the presence of stochasticity, uncertain sensing and communication limitations. We present a new MacDec-PO ... Full text Cite

Policy evaluation using the Ω-return

Conference Advances in Neural Information Processing Systems · January 1, 2015 We propose the Ω-return as an alternative to the λ-return currently used by the TD(λ) family of algorithms. The benefit of the Ω-return is that it accounts for the correlation of different length returns. Because it is difficult to compute exactly, we suggest ... Cite

Probabilistic planning for decentralized multi-robot systems

Conference AAAI Fall Symposium - Technical Report · January 1, 2015 Multi-robot systems are an exciting application domain for AI research and Dec-POMDPs, specifically. MacDec-POMDP methods can produce high-quality general solutions for realistic heterogeneous multi-robot coordination problems by automatically generating ... Cite

Symbol acquisition for probabilistic high-level planning

Conference IJCAI International Joint Conference on Artificial Intelligence · January 1, 2015 We introduce a framework that enables an agent to autonomously learn its own symbolic representation of a low-level, continuous environment. Propositional symbols are formalized as names for probability distributions, providing a natural means of dealing w ... Cite

Hand preference status and reach kinematics in infants.

Journal Article Infant behavior & development · November 2014 Infants show age-related improvements in reach straightness and smoothness over the first years of life as well as a decrease in average movement speed. This period of changing kinematics overlaps the emergence of handedness. We examined whether infant han ... Full text Cite

Learning parameterized motor skills on a humanoid robot

Journal Article Proceedings - IEEE International Conference on Robotics and Automation · September 22, 2014 We demonstrate a sample-efficient method for constructing reusable parameterized skills that can solve families of related motor tasks. Our method uses learned policies to analyze the policy space topology and learn a set of regression models which, given ... Full text Cite

Planning with macro-actions in decentralized POMDPs

Conference 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014 · January 1, 2014 Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for decentralized decision making under uncertainty. However, they typically model a problem at a low level of granularity, where each agent's actions are primitiv ... Cite

Active learning of parameterized skills

Journal Article 31st International Conference on Machine Learning, ICML 2014 · January 1, 2014 We introduce a method for actively learning parameterized skills. Parameterized skills are flexible behaviors that can solve any task drawn from a distribution of parameterized reinforcement learning problems. Approaches to learning such skills have been p ... Cite

Behavioral hierarchy: Exploration and representation

Journal Article · January 1, 2014 Behavioral modules are units of behavior providing reusable building blocks that can be composed sequentially and hierarchically to generate extensive ranges of behavior. Hierarchies of behavioral modules facilitate learning complex skills and planning at ... Full text Cite

Constructing symbolic representations for high-level planning

Journal Article Proceedings of the National Conference on Artificial Intelligence · January 1, 2014 We consider the problem of constructing a symbolic description of a continuous, low-level environment for use in planning. We show that symbols that can represent the preconditions and effects of an agent's actions are both necessary and sufficient for hig ... Cite

Hidden parameter Markov decision processes: An emerging paradigm for modeling families of related tasks

Journal Article AAAI Fall Symposium - Technical Report · January 1, 2014 The goal of transfer is to use knowledge obtained by solving one task to improve a robot's (or software agent's) performance in future tasks. In general, we do not expect this to work; for transfer to be feasible, there must be something in common between ... Cite

Optimizing a start-stop controller using policy search

Journal Article Proceedings of the National Conference on Artificial Intelligence · January 1, 2014 We applied a policy search algorithm to the problem of optimizing a start-stop controller, a controller used in a car to turn off the vehicle's engine, and thus save energy, when the vehicle comes to a temporary halt. We were able to improve the existing po ... Cite

Optimal sampling-based planning for linear-quadratic kinodynamic systems

Journal Article Proceedings - IEEE International Conference on Robotics and Automation · November 14, 2013 We propose a new method for applying RRT* to kinodynamic motion planning problems by using finite-horizon linear quadratic regulation (LQR) to measure cost and to extend the tree. First, we introduce the method in the context of arbitrary affine dynamical ... Full text Cite

Robots, skills, and symbols

Journal Article ACM International Conference Proceeding Series · August 30, 2013 This extended abstract summarizes recent work on skill acquisition, which shows that autonomous robot skill acquisition is feasible, and that a robot can thereby improve its own problem-solving capabilities; and on the symbolic representation of plans comp ... Full text Cite

Reports of the 2013 AAAI Spring Symposium Series

Journal Article AI Magazine · January 1, 2013 The Association for the Advancement of Artificial Intelligence was pleased to present the AAAI 2013 Spring Symposium Series, held Monday through Wednesday, March 25-27, 2013. The titles of the eight symposia were Analyzing Microtext; Creativity and (Early) ... Full text Cite

Symbol acquisition for task-level planning

Journal Article AAAI Workshop - Technical Report · January 1, 2013 We consider the problem of how to plan efficiently in low-level, continuous state spaces with temporally abstract actions (or skills), by constructing abstract representations of the problem suitable for task-level planning. The central question this effor ... Cite

The AAAI-13 conference workshops

Journal Article AI Magazine · January 1, 2013 The AAAI-13 Workshop Program, a part of the 27th AAAI Conference on Artificial Intelligence, was held Sunday and Monday, July 14-15, 2013, at the Hyatt Regency Bellevue Hotel in Bellevue, Washington, USA. The program included 12 workshops covering a wide r ... Full text Cite

Primal decomposition and online algorithms for flow optimization in wireless DTNs

Conference Proceedings - IEEE Global Communications Conference, GLOBECOM · January 1, 2013 We study flow optimization in wireless Delay Tolerant Networks (DTNs), using Capacity Region Evolving Graphs (CREGs). CREGs comprise cascaded subgraphs which represent the network topology at consecutive time intervals called epochs. The data flows jointly ... Full text Cite

Learning and generalization of complex tasks from unstructured demonstrations

Journal Article IEEE International Conference on Intelligent Robots and Systems · December 1, 2012 We present a novel method for segmenting demonstrations, recognizing repeated skills, and generalizing complex tasks from unstructured demonstrations. This method combines many of the advantages of recent automatic segmentation methods for learning from de ... Full text Cite

Learning parameterized skills

Journal Article Proceedings of the 29th International Conference on Machine Learning, ICML 2012 · October 10, 2012 We introduce a method for constructing skills capable of solving tasks drawn from a distribution of parameterized reinforcement learning problems. The method draws example tasks from a distribution of interest and uses the corresponding learned policies to ... Cite

Flow optimization in delay tolerant networks using dual decomposition

Conference 2012 10th International Symposium on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks, WiOpt 2012 · October 5, 2012 We study flow optimization in Delay Tolerant Networks (DTNs), which we model using Capacity Region Evolving Graphs (CREGs). CREGs consist of different instances (called replicas) of the network graph in cascade; each replica is associated with a distinct t ... Cite

Kinematics of reaching and implications for handedness in rhesus monkey infants.

Journal Article Developmental psychobiology · May 2012 Kinematic studies of reaching in human infants using two-dimensional (2-D) and three-dimensional (3-D) recordings have complemented behavioral studies of infant handedness by providing additional evidence of early right asymmetries. Right hand reaches have ... Full text Cite

Transfer in reinforcement learning via shared features

Journal Article Journal of Machine Learning Research · May 1, 2012 We present a framework for transfer in reinforcement learning based on the idea that related tasks share some common features, and that transfer can be achieved via those shared features. The framework attempts to capture the notion of tasks that are relat ... Cite

Robot learning from demonstration by constructing skill trees

Journal Article International Journal of Robotics Research · March 1, 2012 We describe CST, an online algorithm for constructing skill trees from demonstration trajectories. CST segments a demonstration trajectory into a chain of component skills, where each skill has a goal and is assigned a suitable abstraction from an abstract ... Full text Cite

LQR-RRT*: Optimal sampling-based motion planning with automatically derived extension heuristics

Journal Article Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2012 The RRT* algorithm has recently been proposed as an optimal extension to the standard RRT algorithm [1]. However, like RRT, RRT* is difficult to apply in problems with complicated or underactuated dynamics because it requires the design of a two domain-spe ... Full text Cite

Reports of the AAAI 2012 spring symposia

Journal Article AI Magazine · January 1, 2012 The focus of the AI, The Fundamental Social Aggregation Challenge, and the Autonomy of Hybrid Agent Groups symposium was to explore issues associated with the control of teams of humans, autonomous machines, and robots working together as hybrid agent grou ... Full text Cite

TDγ: Re-evaluating complex backups in temporal difference learning

Conference Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011 · December 1, 2011 We show that the λ-return target used in the TD(λ) family of algorithms is the maximum likelihood estimator for a specific model of how the variance of an n-step return estimate increases with n. We introduce the γ-return estimator, an alternative target b ... Cite

Autonomous skill acquisition on a mobile manipulator

Journal Article Proceedings of the National Conference on Artificial Intelligence · November 2, 2011 We describe a robot system that autonomously acquires skills through interaction with its environment. The robot learns to sequence the execution of a set of innate controllers to solve a task, extracts and retains components of that solution as portable s ... Cite

Value function approximation in reinforcement learning using the Fourier basis

Journal Article Proceedings of the National Conference on Artificial Intelligence · November 2, 2011 We describe the Fourier basis, a linear value function approximation scheme based on the Fourier series. We empirically demonstrate that it performs well compared to radial basis functions and the polynomial basis, the two most popular fixed bases for line ... Cite

Value Function Approximation in Reinforcement Learning Using the Fourier Basis

Conference Proceedings of the 25th AAAI Conference on Artificial Intelligence, AAAI 2011 · August 11, 2011 We describe the Fourier basis, a linear value function approximation scheme based on the Fourier series. We empirically demonstrate that it performs well compared to radial basis functions and the polynomial basis, the two most popular fixed bases for line ... Cite

Autonomous Skill Acquisition on a Mobile Manipulator

Conference Proceedings of the 25th AAAI Conference on Artificial Intelligence, AAAI 2011 · August 11, 2011 We describe a robot system that autonomously acquires skills through interaction with its environment. The robot learns to sequence the execution of a set of innate controllers to solve a task, extracts and retains components of that solution as portable s ... Cite

TDγ: Re-evaluating complex backups in temporal difference learning

Journal Article Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011 · 2011 We show that the λ-return target used in the TD(λ) family of algorithms is the maximum likelihood estimator for a specific model of how the variance of an n-step return estimate increases with n. We introduce the γ-return estimator, an alternative target b ... Cite

TDγ: Re-evaluating complex backups in temporal difference learning

Conference Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011 · January 1, 2011 We show that the λ-return target used in the TD(λ) family of algorithms is the maximum likelihood estimator for a specific model of how the variance of an n-step return estimate increases with n. We introduce the γ-return estimator, an alternative target b ... Cite

Constructing skill trees for reinforcement learning agents from demonstration trajectories

Journal Article Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010 · December 1, 2010 We introduce CST, an algorithm for constructing skill trees from demonstration trajectories in continuous reinforcement learning domains. CST uses a changepoint detection method to segment each trajectory into a skill chain by detecting a change of appropr ... Cite

Constructing skill trees for reinforcement learning agents from demonstration trajectories

Conference Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010 · January 1, 2010 We introduce CST, an algorithm for constructing skill trees from demonstration trajectories in continuous reinforcement learning domains. CST uses a changepoint detection method to segment each trajectory into a skill chain by detecting a change of appropr ... Cite

Efficient skill learning using abstraction selection

Journal Article IJCAI International Joint Conference on Artificial Intelligence · January 1, 2009 We present an algorithm for selecting an appropriate abstraction when learning a new skill. We show empirically that it can consistently select an appropriate abstraction using very little sample data, and that it significantly improves skill learning perf ... Cite

Skill discovery in continuous reinforcement learning domains using skill chaining

Journal Article Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference · January 1, 2009 We introduce a skill discovery method for reinforcement learning in continuous domains that constructs chains of skills leading to an end-of-task reward. We demonstrate experimentally that it creates appropriate skills and achieves performance benefits in ... Cite

Autonomous robot skill acquisition

Journal Article Proceedings of the National Conference on Artificial Intelligence · December 23, 2008 Cite

Sensorimotor abstraction selection for efficient, autonomous robot skill acquisition

Journal Article 2008 IEEE 7th International Conference on Development and Learning, ICDL · December 1, 2008 To achieve truly autonomous robot skill acquisition, a robot can use neither a single large general state space (because learning is not feasible), nor a small problem-specific state space (because it is not general). We propose that instead a robot should h ... Full text Cite

Autonomous Robot Skill Acquisition

Conference Proceedings of the 23rd AAAI Conference on Artificial Intelligence, AAAI 2008 · January 1, 2008 Cite

A forward model of optic flow for detecting external forces

Conference IEEE International Conference on Intelligent Robots and Systems · December 1, 2007 Robot positioning is an important function of autonomous intelligent robots. However, the application of external forces to a robot can disrupt its normal operation and cause localisation errors. We present a novel approach for detecting external disturban ... Full text Cite

Building portable options: Skill transfer in reinforcement learning

Journal Article IJCAI International Joint Conference on Artificial Intelligence · December 1, 2007 The options framework provides methods for reinforcement learning agents to build new high-level skills. However, since options are usually learned in the same state space as the problem the agent is solving, they cannot be used in other tasks that are sim ... Cite

Language performance at high school and success in first year computer science

Journal Article Proceedings of the Thirty-Seventh SIGCSE Technical Symposium on Computer Science Education · December 1, 2007 We describe the first part of a study investigating the usefulness of high school language results as a predictor of success in first year computer science courses at a university where students have widely varying English language skills. Our results indi ... Full text Cite

Autonomous shaping: Knowledge transfer in reinforcement learning

Journal Article ACM International Conference Proceeding Series · December 1, 2006 We introduce the use of learned shaping rewards in reinforcement learning tasks, where an agent uses prior experience on a sequence of tasks to learn a portable predictor that estimates intermediate rewards, resulting in accelerated learning in later tasks ... Full text Cite

Autonomous shaping: Knowledge transfer in reinforcement learning

Journal Article ICML 2006 - Proceedings of the 23rd International Conference on Machine Learning · October 6, 2006 We introduce the use of learned shaping rewards in reinforcement learning tasks, where an agent uses prior experience on a sequence of tasks to learn a portable predictor that estimates intermediate rewards, resulting in accelerated learning in later tasks ... Cite

An adaptive robot motivational system

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2006 We present a robot motivational system design framework. The framework represents the underlying (possibly conflicting) goals of the robot as a set of drives, while ensuring comparable drive levels and providing a mechanism for drive priority adaptation du ... Full text Cite

An architecture for behavior-based reinforcement learning

Journal Article Adaptive Behavior · October 17, 2005 This paper introduces an integration of reinforcement learning and behavior-based control designed to produce real-time learning in situated agents. The model layers a distributed and asynchronous reinforcement learning algorithm over a learned topological ... Full text Cite

METAMorph: Experimenting with genetic regulatory networks for artificial development

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2005 We introduce METAMorph, an open source software platform for the experimental design of simulated cellular development processes using genomes encoded as genetic regulatory networks (GRNs). METAMorph allows researchers to design GRNs by hand and to visuali ... Full text Cite