Conference · Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2024
Deploying robots in real-world environments, such as households and manufacturing lines, requires generalization across novel task specifications without violating safety constraints. Linear temporal logic (LTL) is a widely used task specification language ...
Conference · Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2024
We propose a new policy class, Composable Interaction Primitives (CIPs), specialized for learning sustained-contact manipulation skills like opening a drawer, pulling a lever, turning a wheel, or shifting gears. CIPs have two primary design goals: to minim ...
Conference · Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2024
Real-world robot task planning is intractable in part due to partial observability. A common approach to reducing complexity is introducing additional structure into the decision process, such as mixed-observability, factored states, or temporally-extended ...
Conference · Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 · June 27, 2023
We present Q-functionals, an alternative architecture for continuous control deep reinforcement learning. Instead of returning a single value for a state-action pair, our network transforms a state into a function that can be rapidly evaluated in parallel ...
Journal Article · International Journal of Robotics Research · April 1, 2023
We present a framework for the automatic encoding and repair of high-level tasks. Given a set of skills a robot can perform, our approach first abstracts sensor data into symbols and then automatically encodes the robot’s capabilities in Linear Temporal Lo ...
Journal Article · Neural Networks: The Official Journal of the International Neural Network Society · March 2023
Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original trainin ...
Conference · Proceedings of Machine Learning Research · January 1, 2023
Principled decision-making in continuous state-action spaces is impossible without some assumptions. A common approach is to assume Lipschitz continuity of the Q-function. We show that, unfortunately, this property fails to hold in many typical domains. We ...
Conference · Proceedings of Machine Learning Research · January 1, 2023
We propose a new method for count-based exploration in high-dimensional state spaces. Unlike previous work which relies on density models, we show that counts can be derived by averaging samples from the Rademacher distribution (or coin flips). This insigh ...
Conference · Proceedings of Machine Learning Research · January 1, 2023
We propose a novel parameterized skill-learning algorithm that aims to learn transferable parameterized skills and synthesize them into a new action space that supports efficient learning in long-horizon tasks. We propose to leverage off-policy Meta-RL com ...
Conference · Proceedings of Machine Learning Research · January 1, 2023
We introduce RLang, a domain-specific language (DSL) for communicating domain knowledge to an RL agent. Unlike existing RL DSLs that ground to single elements of a decision-making formalism (e.g., the reward function or policy), RLang can specify informati ...
Conference · IEEE International Conference on Intelligent Robots and Systems · January 1, 2023
Dynamic movement primitives are widely used for learning skills that can be demonstrated to a robot by a skilled human or controller. While their generalization capabilities and simple formulation make them very appealing to use, they possess no strong gua ...
Conference · IEEE International Conference on Intelligent Robots and Systems · January 1, 2023
It is imperative that robots can understand natural language commands issued by humans. Such commands typically contain verbs that signify what action should be performed on a given object and that are applicable to many objects. We propose a method for ge ...
Conference · IEEE International Conference on Intelligent Robots and Systems · January 1, 2023
Conversational assistive robots can aid people, especially those with cognitive impairments, to accomplish various tasks such as cooking meals, performing exercises, or operating machines. However, to interact with people effectively, robots must recognize ...
Conference · Proceedings of Machine Learning Research · January 1, 2023
We address the problem of efficiently learning high-level abstractions for task-level robot planning. Existing approaches require large amounts of data and fail to generalize learned abstractions to new environments. To address this, we propose to exploit ...
Conference · Advances in Neural Information Processing Systems · January 1, 2023
An agent learning an option in hierarchical reinforcement learning must solve three problems: identify the option's subgoal (termination condition), learn a policy, and learn where that policy will succeed (initiation set). The termination condition is typ ...
Conference · 11th International Conference on Learning Representations, ICLR 2023 · January 1, 2023
In the Hidden-Parameter MDP (HiP-MDP) framework, a family of reinforcement learning tasks is generated by varying hidden parameters specifying the dynamics and reward function for each individual task. The HiP-MDP is a natural model for families of tasks i ...
Journal Article · IEEE Robotics and Automation Letters · July 1, 2022
Inverse kinematics - finding joint poses that reach a given Cartesian-space end-effector pose - is a fundamental operation in robotics, since goals and waypoints are typically defined in Cartesian space, but robots must be controlled in joint space. Howeve ...
Conference · Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 · June 30, 2022
Optimistic initialization underpins many theoretically sound exploration schemes in tabular domains; however, in the deep function approximation setting, optimism can quickly disappear if initialized naïvely. We propose a framework for more effectively inc ...
Conference · Springer Proceedings in Advanced Robotics · January 1, 2022
We present a framework that, given a set of skills a robot can perform, abstracts sensor data into symbols that are used to automatically encode the robot’s capabilities in Linear Temporal Logic (LTL). We specify reactive high-level tasks based on these ca ...
Conference · Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2022
Variable impedance control in operation-space is a promising approach to learning contact-rich manipulation behaviors. One of the main challenges with this approach is producing a manipulation behavior that ensures the safety of the arm and the environment ...
Conference · Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2022
Recent work on using natural language to specify commands to robots has grounded that language to LTL. However, mapping natural language task specifications to LTL task specifications using language models requires probability distributions over finite voca ...
Conference · Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2022
Manipulating an articulated object requires perceiving its kinematic hierarchy: its parts, how each can move, and how those motions are coupled. Previous work has explored perception for kinematics, but none infers a complete kinematic hierarchy on never-b ...
Conference · Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2022
Robots that process navigation instructions in large outdoor environments will need to operate at different levels of abstraction. For example, a land-surveying aerial robot receiving the instruction 'go to Boston and go through the state forest on the way ...
Conference · Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2022
In realistic applications of object search, robots will need to locate target objects in complex environments while coping with unreliable sensors, especially for small or hard-to-detect objects. In such settings, correlational information can be valuable ...
Conference · ICLR 2022 - 10th International Conference on Learning Representations · January 1, 2022
We propose a method for autonomously learning an object-centric representation of a continuous and high-dimensional environment that is suitable for planning. Such representations can immediately be transferred between tasks that share the same types of ob ...
Conference · Advances in Neural Information Processing Systems · January 1, 2022
We propose a model-based lifelong reinforcement-learning approach that estimates a hierarchical Bayesian posterior distilling the common structure shared across different tasks. The learned posterior combined with a sample-based Bayesian exploration proced ...
Conference · Advances in Neural Information Processing Systems · January 1, 2022
AlphaZero, an approach to reinforcement learning that couples neural networks and Monte Carlo tree search (MCTS), has produced state-of-the-art strategies for traditional board games like chess, Go, shogi, and Hex. While researchers and game commentators h ...
Conference · Advances in Neural Information Processing Systems · January 1, 2022
Deep neural networks can approximate functions on different types of data, from images to graphs, with varied underlying structure. This underlying structure can be viewed as the geometry of the data manifold. By extending recent advances in the theoretica ...
Journal Article · Journal of Machine Learning Research · January 1, 2021
A key challenge in intelligent robotics is creating robots that are capable of directly interacting with the world around them to achieve their goals. The last decade has seen substantial growth in research on the problem of robot manipulation, which aims ...
Journal Article · 35th AAAI Conference on Artificial Intelligence, AAAI 2021 · January 1, 2021
A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned value function. This operation is often challenging when the learned value function takes continuous actions as input. We introduce deep radial-b ...
Journal Article · IEEE International Conference on Intelligent Robots and Systems · January 1, 2021
Learning a robot motor skill from scratch is impractically slow; so much so that in practice, learning must typically be bootstrapped using human demonstration. However, relying on human demonstration necessarily degrades the autonomy of robots that must l ...
Conference · 35th AAAI Conference on Artificial Intelligence, AAAI 2021 · January 1, 2021
We introduce Wasserstein Adversarial Proximal Policy Optimization (WAPPO), a novel algorithm for visual transfer in Reinforcement Learning that explicitly learns to align the distributions of extracted features between a source and target task. WAPPO appro ...
Conference · IEEE International Conference on Intelligent Robots and Systems · January 1, 2021
We propose an approach to multi-modal grasp detection that jointly predicts the probabilities that several types of grasps succeed at a given grasp pose. Given a partial point cloud of a scene, the algorithm proposes a set of feasible grasp candidates, the ...
Conference · IEEE International Conference on Intelligent Robots and Systems · January 1, 2021
Robots operating in households must find objects on shelves, under tables, and in cupboards. In such environments, it is crucial to search efficiently at 3D scale while coping with limited field of view and the complexity of searching for multiple objects. ...
Conference · IJCAI International Joint Conference on Artificial Intelligence · January 1, 2021
Hierarchical reinforcement learning (HRL) is only effective for long-horizon problems when high-level skills can be reliably sequentially executed. Unfortunately, learning reliably composable skills is difficult, because all the components of every skill a ...
Conference · IJCAI International Joint Conference on Artificial Intelligence · January 1, 2021
The difficulty of deterministic planning increases exponentially with search-tree depth. Black-box planning presents an even greater challenge, since planners must operate without an explicit model of the domain. Heuristics can make search more efficient, ...
Conference · Proceedings - IEEE International Conference on Robotics and Automation · January 1, 2021
Robots must reason about pushing and grasping in order to engage in flexible manipulation in cluttered environments. Earlier works on learning pushing and grasping only consider each operation in isolation or are limited to top-down grasping and bin-pickin ...
Conference · Advances in Neural Information Processing Systems · January 1, 2021
A fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov. However, when MDPs have rich observations, agents typically learn by way of an abstract state representation, ...
Conference · Proceedings of Machine Learning Research · January 1, 2021
We introduce a new skill-discovery algorithm that builds a discrete graph representation of large continuous MDPs, where nodes correspond to skill subgoals and the edges to skill policies. The agent constructs this graph during an unsupervised training pha ...
Conference · IEEE International Conference on Intelligent Robots and Systems · October 24, 2020
Precomputed roadmaps can enable effective multi-query motion planning: a roadmap can be built for a robot as if no obstacles were present, and then after edges invalidated by obstacles observed at query time are deleted, path search through the remaining r ...
Conference · IEEE International Conference on Intelligent Robots and Systems · October 24, 2020
We propose Action-Oriented Semantic Maps (AOSMs), a representation that enables a robot to acquire object manipulation behaviors and semantic information about the environment from a human teacher with a Mixed Reality Head-Mounted Display (MR-HMD). AOSMs a ...
Journal Article · IEEE Transactions on Robotics · August 2020
Ophthalmic microsurgery is technically difficult because the scale of required surgical tool manipulations challenges the limits of the surgeon's visual acuity, sensory perception, and physical dexterity. Intraoperative optical coherence tomography (OCT) im ...
Journal Article · 37th International Conference on Machine Learning, ICML 2020 · January 1, 2020
We present a framework for autonomously learning a portable representation that describes a collection of low-level continuous environments. We show that these abstract representations can be learned in a task-independent egocentric space specific to the a ...
Conference · January 1, 2020
Efficient motion intent communication is necessary for safe and collaborative work environments with collocated humans and robots. Humans efficiently communicate their motion intent to other humans through gestures, gaze, and social cues. However, robots o ...
Conference · January 1, 2020
Teleoperation allows a human to remotely operate a robot to perform complex and potentially dangerous tasks such as defusing a bomb, repairing a nuclear reactor, or maintaining the exterior of a space station. Existing teleoperation approaches generally re ...
Conference · 8th International Conference on Learning Representations, ICLR 2020 · January 1, 2020
While many option discovery methods have been proposed to accelerate exploration in reinforcement learning, they are often heuristic. Recently, covering options was proposed to discover a set of options that provably reduce the upper bound of the environme ...
Conference · AAAI 2020 - 34th AAAI Conference on Artificial Intelligence · January 1, 2020
We propose an abstraction method for open-world environments expressed as Factored Markov Decision Processes (FMDPs) with very large state and action spaces. Our method prunes state and action variables that are irrelevant to the optimal value function on ...
Conference · Robotics: Science and Systems · January 1, 2020
Enabling robots to learn tasks and follow instructions as easily as humans is important for many real-world robot applications. Previous approaches have applied machine learning to teach the mapping from language to low dimensional symbolic representations ...
Conference · 8th International Conference on Learning Representations, ICLR 2020 · January 1, 2020
Autonomously discovering temporally extended actions, or skills, is a longstanding goal of hierarchical reinforcement learning. We propose a new algorithm that combines skill chaining with deep neural networks to autonomously discover skills in high-dimens ...
Journal Article · IEEE International Conference on Intelligent Robots and Systems · November 1, 2019
We develop a system to disambiguate object instances within the same class based on simple physical descriptions. The system takes as input a natural language phrase and a depth image containing a segmented object and predicts how similar the observed obje ...
Conference · IEEE International Conference on Intelligent Robots and Systems · November 1, 2019
We present a feedback motion planning algorithm, Bounded-Error LQR-Trees, that leverages reinforcement learning theory to find a policy with a bounded amount of error. The algorithm composes locally valid linear-quadratic regulators (LQR) into a nonlinear ...
Journal Article · Current Opinion in Behavioral Sciences · October 1, 2019
A generally intelligent agent faces a dilemma: it requires a complex sensorimotor space to be capable of solving a wide range of problems, but many tasks are only feasible given the right problem-specific formulation. I argue that a necessary but understud ...
Journal Article · International Journal of Robotics Research · October 1, 2019
Efficient motion intent communication is necessary for safe and collaborative work environments with co-located humans and robots. Humans efficiently communicate their motion intent to other humans through gestures, gaze, and other non-verbal cues, and can ...
Conference · Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors · July 1, 2019
We have designed a programmable architecture to accelerate collision detection and graph search, two of the principal components of robotic motion planning. The programmability enables the architecture to be applied to a wide range of different robots and ...
Conference · Proceedings - IEEE International Conference on Robotics and Automation · May 1, 2019
Mixed Reality (MR) is a promising interface for robot programming because it can project an immersive 3D visualization of a robot's intended movement onto the real world. MR can also support hand gestures, which provide an intuitive way for users to constr ...
Conference · Proceedings - IEEE International Conference on Robotics and Automation · May 1, 2019
Security is particularly important in robotics, as robots can directly perceive and affect the physical world. We describe the results of a scan of the entire IPv4 address space of the Internet for instances of the Robot Operating System (ROS), a widely us ...
Journal Article · Journal of Artificial Intelligence Research · March 1, 2019
Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for decentralized multi-agent decision making under uncertainty. However, they typically model a problem at a low level of granularity, where each agent’s actions ...
Conference · IJCAI International Joint Conference on Artificial Intelligence · January 1, 2019
Deep Q-Network (DQN) is an algorithm that achieves human-level performance in complex domains like Atari games. One of the important elements of DQN is its use of a target network, which is necessary to stabilize learning. We argue that using a target netw ...
Conference · Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS · January 1, 2019
Deep Q-Network (DQN) is a learning algorithm that achieves human-level performance in high-dimensional domains like Atari games. We propose that using a softmax operator, Mellowmax, in DQN reduces its need for a separate target network, which is otherwise ...
Conference · 7th International Conference on Learning Representations, ICLR 2019 · January 1, 2019
Multi-level hierarchies have the potential to accelerate learning in sparse reward tasks because they can divide a problem into a set of short horizon subproblems. In order to realize this potential, Hierarchical Reinforcement Learning (HRL) algorithms nee ...
Conference · Proceedings of Machine Learning Research · January 1, 2019
Robots operating in human environments must be capable of interacting with a wide variety of articulated objects such as cabinets, refrigerators, and drawers. Existing approaches require human demonstration or minutes of interaction to fit kinematic models ...
Conference · IEEE International Conference on Intelligent Robots and Systems · December 27, 2018
A critical capability required for generally intelligent robot behavior is the ability to sequence motor skills to reach a goal. This requires a (typically abstract) representation that supports goal-directed planning, which raises the question of how to c ...
Conference · IEEE International Conference on Intelligent Robots and Systems · December 27, 2018
We introduce Hybrid Bayesian Eigenobjects (HBEOs), a novel representation for 3D objects designed to allow a robot to jointly estimate the pose, class, and full 3D geometry of a novel object observed from a single viewpoint in a single practical framework. ...
Journal Article · Autonomous Robots · October 1, 2018
We present a framework for representing scenarios with complex object interactions, where a robot cannot directly interact with the object it wishes to control and must instead influence it via intermediate objects. For instance, a robot learning to drive ...
Journal Article · Journal of Motor Behavior · July 2018
The primary goal of this study was to examine the relations between limb control and handedness in adults. Participants were categorized as left or right handed for analyses using the Edinburgh Handedness Inventory. Three-dimensional recordings were made o ...
Journal Article · Journal of Artificial Intelligence Research · January 1, 2018
We consider the problem of constructing abstract representations for planning in high-dimensional, continuous environments. We assume an agent equipped with a collection of high-level actions, and construct representations provably capable of evaluating pla ...
Conference · 35th International Conference on Machine Learning, ICML 2018 · January 1, 2018
We consider the problem of how best to use prior experience to bootstrap lifelong learning, where an agent faces a series of task instances drawn from some task distribution. First, we identify the initial policy that optimizes expected performance over th ...
Conference · 31st AAAI Conference on Artificial Intelligence, AAAI 2017 · January 1, 2017
An intriguing application of transfer learning emerges when tasks arise with similar, but not identical, dynamics. Hidden Parameter Markov Decision Processes (HiP-MDP) embed these tasks into a low-dimensional space; given the embedding parameters one can i ...
Conference · 31st AAAI Conference on Artificial Intelligence, AAAI 2017 · January 1, 2017
Monte Carlo Tree Search (MCTS) is a family of directed search algorithms that has gained widespread attention in recent years. Despite the vast amount of research into MCTS, the effect of modifications on the algorithm, as well as the manner in which it pe ...
Conference · Advances in Neural Information Processing Systems · January 1, 2017
We introduce a new formulation of the Hidden Parameter Markov Decision Process (HiP-MDP), a framework for modeling families of related tasks using low-dimensional latent embeddings. Our new framework correctly models the joint uncertainty in the latent par ...
Conference · Advances in Neural Information Processing Systems · January 1, 2017
We introduce an online active exploration algorithm for data-efficiently learning an abstract symbolic model of an environment. Our algorithm is divided into two parts: the first part quickly generates an intermediate Bayesian symbolic model from the data ...
ConferenceRobotics: Science and Systems · January 1, 2017
We introduce Bayesian Eigenobjects (BEOs), a novel object representation that is the first technique able to perform joint classification, pose estimation, and 3D geometric completion on previously unencountered and partially observed query objects. BEOs e ...
ConferenceProceedings of the Annual International Symposium on Microarchitecture, MICRO · December 14, 2016
We have developed a hardware accelerator for motion planning, a critical operation in robotics. In this paper, we present the microarchitecture of our accelerator and describe a prototype implementation on an FPGA. We experimentally show that the accelerat ...
ConferenceInternational Journal of Robotics Research · December 1, 2016
We introduce a principled method for multi-robot coordination based on a general model (termed a MacDec-POMDP) of multi-robot cooperative planning in the presence of stochasticity, uncertain sensing, and communication limitations. A new MacDec-POMDP planni ...
ConferenceIJCAI International Joint Conference on Artificial Intelligence · January 1, 2016
Control applications often feature tasks with similar, but not identical, dynamics. We introduce the Hidden Parameter Markov Decision Process (HiP-MDP), a framework that parametrizes a family of related dynamical systems with a low-dimensional set of latent ...
ConferenceIJCAI International Joint Conference on Artificial Intelligence · January 1, 2016
We describe a framework for building abstraction hierarchies whereby an agent alternates skill- and representation-construction phases to construct a sequence of increasingly abstract Markov decision processes. Our formulation builds on recent results show ...
Conference30th AAAI Conference on Artificial Intelligence, AAAI 2016 · January 1, 2016
We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions: discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. ...
ConferenceRobotics: Science and Systems · January 1, 2016
We describe a process that constructs robot-specific circuitry for motion planning, capable of generating motion plans approximately three orders of magnitude faster than existing methods. Our method is based on building collision detection circuits for a ...
ConferenceRobotics: Science and Systems · January 1, 2016
We present a framework for representing scenarios with complex object interactions, in which a robot cannot directly interact with the object it wishes to control, but must instead do so via intermediate objects. For example, a robot learning to drive a ca ...
ConferenceIEEE International Conference on Intelligent Robots and Systems · December 11, 2015
We present a method for segmenting a set of unstructured demonstration trajectories to discover reusable skills using inverse reinforcement learning (IRL). Each skill is characterised by a latent reward function which the demonstrator is assumed to be opti ...
Journal ArticleMachine Learning · September 17, 2015
We introduce feature regularization during feature selection for value function approximation. Feature regularization introduces a prior into the selection process, improving function approximation accuracy and reducing overfitting. We show that the smooth ...
Journal ArticleProceedings - IEEE International Conference on Robotics and Automation · June 29, 2015
This paper presents a probabilistic framework for synthesizing control policies for general multi-robot systems that is based on decentralized partially observable Markov decision processes (Dec-POMDPs). Dec-POMDPs are a general model of decision-making wh ...
Journal ArticleInternational Journal of Robotics Research · March 3, 2015
Robots exhibit flexible behavior largely in proportion to their degree of knowledge about the world. Such knowledge is often meticulously hand-coded for a narrow class of tasks, limiting the scope of possible robot competencies. Thus, the primary limiting ...
ConferenceAI Magazine · March 1, 2015
The AAAI-14 Workshop program was held Sunday and Monday, July 27-28, 2014, at the Québec City Convention Centre in Québec, Canada. The AAAI-14 workshop program included 15 workshops covering a wide range of topics in artificial intelligence. The titles of ...
ConferenceRobotics: Science and Systems · January 1, 2015
We introduce a principled method for multi-robot coordination based on a generic model (termed a MacDec-POMDP) of multi-robot cooperative planning in the presence of stochasticity, uncertain sensing and communication limitations. We present a new MacDec-PO ...
ConferenceAdvances in Neural Information Processing Systems · January 1, 2015
We propose the Ω-return as an alternative to the λ-return currently used by the TD(λ) family of algorithms. The benefit of the Ω-return is that it accounts for the correlation of different length returns. Because it is difficult to compute exactly, we suggest ...
ConferenceAAAI Fall Symposium - Technical Report · January 1, 2015
Multi-robot systems are an exciting application domain for AI research and Dec-POMDPs, specifically. MacDec-POMDP methods can produce high-quality general solutions for realistic heterogeneous multi-robot coordination problems by automatically generating ...
ConferenceIJCAI International Joint Conference on Artificial Intelligence · January 1, 2015
We introduce a framework that enables an agent to autonomously learn its own symbolic representation of a low-level, continuous environment. Propositional symbols are formalized as names for probability distributions, providing a natural means of dealing w ...
Journal ArticleInfant behavior & development · November 2014
Infants show age-related improvements in reach straightness and smoothness over the first years of life as well as a decrease in average movement speed. This period of changing kinematics overlaps the emergence of handedness. We examined whether infant han ...
Journal ArticleProceedings - IEEE International Conference on Robotics and Automation · September 22, 2014
We demonstrate a sample-efficient method for constructing reusable parameterized skills that can solve families of related motor tasks. Our method uses learned policies to analyze the policy space topology and learn a set of regression models which, given ...
Conference13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014 · January 1, 2014
Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for decentralized decision making under uncertainty. However, they typically model a problem at a low level of granularity, where each agent's actions are primitiv ...
Journal Article31st International Conference on Machine Learning, ICML 2014 · January 1, 2014
We introduce a method for actively learning parameterized skills. Parameterized skills are flexible behaviors that can solve any task drawn from a distribution of parameterized reinforcement learning problems. Approaches to learning such skills have been p ...
Journal Article · January 1, 2014
Behavioral modules are units of behavior providing reusable building blocks that can be composed sequentially and hierarchically to generate extensive ranges of behavior. Hierarchies of behavioral modules facilitate learning complex skills and planning at ...
Journal ArticleProceedings of the National Conference on Artificial Intelligence · January 1, 2014
We consider the problem of constructing a symbolic description of a continuous, low-level environment for use in planning. We show that symbols that can represent the preconditions and effects of an agent's actions are both necessary and sufficient for hig ...
Journal ArticleAAAI Fall Symposium - Technical Report · January 1, 2014
The goal of transfer is to use knowledge obtained by solving one task to improve a robot's (or software agent's) performance in future tasks. In general, we do not expect this to work; for transfer to be feasible, there must be something in common between ...
Journal ArticleProceedings of the National Conference on Artificial Intelligence · January 1, 2014
We applied a policy search algorithm to the problem of optimizing a start-stop controller, a controller used in a car to turn off the vehicle's engine, and thus save energy, when the vehicle comes to a temporary halt. We were able to improve the existing po ...
Journal ArticleProceedings - IEEE International Conference on Robotics and Automation · November 14, 2013
We propose a new method for applying RRT* to kinodynamic motion planning problems by using finite-horizon linear quadratic regulation (LQR) to measure cost and to extend the tree. First, we introduce the method in the context of arbitrary affine dynamical ...
Journal ArticleACM International Conference Proceeding Series · August 30, 2013
This extended abstract summarizes recent work on skill acquisition, which shows that autonomous robot skill acquisition is feasible, and that a robot can thereby improve its own problem-solving capabilities; and on the symbolic representation of plans comp ...
Journal ArticleAI Magazine · January 1, 2013
The Association for the Advancement of Artificial Intelligence was pleased to present the AAAI 2013 Spring Symposium Series, held Monday through Wednesday, March 25-27, 2013. The titles of the eight symposia were Analyzing Microtext; Creativity and (Early) ...
Journal ArticleAAAI Workshop - Technical Report · January 1, 2013
We consider the problem of how to plan efficiently in low-level, continuous state spaces with temporally abstract actions (or skills), by constructing abstract representations of the problem suitable for task-level planning. The central question this effor ...
Journal ArticleAI Magazine · January 1, 2013
The AAAI-13 Workshop Program, a part of the 27th AAAI Conference on Artificial Intelligence, was held Sunday and Monday, July 14-15, 2013, at the Hyatt Regency Bellevue Hotel in Bellevue, Washington, USA. The program included 12 workshops covering a wide r ...
ConferenceProceedings - IEEE Global Communications Conference, GLOBECOM · January 1, 2013
We study flow optimization in wireless Delay Tolerant Networks (DTNs), using Capacity Region Evolving Graphs (CREGs). CREGs comprise cascaded subgraphs which represent the network topology at consecutive time intervals called epochs. The data flows jointly ...
Journal ArticleIEEE International Conference on Intelligent Robots and Systems · December 1, 2012
We present a novel method for segmenting demonstrations, recognizing repeated skills, and generalizing complex tasks from unstructured demonstrations. This method combines many of the advantages of recent automatic segmentation methods for learning from de ...
Journal ArticleProceedings of the 29th International Conference on Machine Learning, ICML 2012 · October 10, 2012
We introduce a method for constructing skills capable of solving tasks drawn from a distribution of parameterized reinforcement learning problems. The method draws example tasks from a distribution of interest and uses the corresponding learned policies to ...
Conference2012 10th International Symposium on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks, WiOpt 2012 · October 5, 2012
We study flow optimization in Delay Tolerant Networks (DTNs), which we model using Capacity Region Evolving Graphs (CREGs). CREGs consist of different instances (called replicas) of the network graph in cascade; each replica is associated with a distinct t ...
Journal ArticleDevelopmental psychobiology · May 2012
Kinematic studies of reaching in human infants using two-dimensional (2-D) and three-dimensional (3-D) recordings have complemented behavioral studies of infant handedness by providing additional evidence of early right asymmetries. Right hand reaches have ...
Journal ArticleJournal of Machine Learning Research · May 1, 2012
We present a framework for transfer in reinforcement learning based on the idea that related tasks share some common features, and that transfer can be achieved via those shared features. The framework attempts to capture the notion of tasks that are relat ...
Journal ArticleInternational Journal of Robotics Research · March 1, 2012
We describe CST, an online algorithm for constructing skill trees from demonstration trajectories. CST segments a demonstration trajectory into a chain of component skills, where each skill has a goal and is assigned a suitable abstraction from an abstract ...
Journal ArticleProceedings - IEEE International Conference on Robotics and Automation · January 1, 2012
The RRT* algorithm has recently been proposed as an optimal extension to the standard RRT algorithm [1]. However, like RRT, RRT* is difficult to apply in problems with complicated or underactuated dynamics because it requires the design of two domain-spe ...
Journal ArticleAI Magazine · January 1, 2012
The focus of the AI, The Fundamental Social Aggregation Challenge, and the Autonomy of Hybrid Agent Groups symposium was to explore issues associated with the control of teams of humans, autonomous machines, and robots working together as hybrid agent grou ...
ConferenceAdvances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011 · December 1, 2011
We show that the λ-return target used in the TD(λ) family of algorithms is the maximum likelihood estimator for a specific model of how the variance of an n-step return estimate increases with n. We introduce the γ-return estimator, an alternative target b ...
Journal ArticleProceedings of the National Conference on Artificial Intelligence · November 2, 2011
We describe a robot system that autonomously acquires skills through interaction with its environment. The robot learns to sequence the execution of a set of innate controllers to solve a task, extracts and retains components of that solution as portable s ...
Journal ArticleProceedings of the National Conference on Artificial Intelligence · November 2, 2011
We describe the Fourier basis, a linear value function approximation scheme based on the Fourier series. We empirically demonstrate that it performs well compared to radial basis functions and the polynomial basis, the two most popular fixed bases for line ...
ConferenceProceedings of the 25th AAAI Conference on Artificial Intelligence, AAAI 2011 · August 11, 2011
We describe the Fourier basis, a linear value function approximation scheme based on the Fourier series. We empirically demonstrate that it performs well compared to radial basis functions and the polynomial basis, the two most popular fixed bases for line ...
ConferenceProceedings of the 25th AAAI Conference on Artificial Intelligence, AAAI 2011 · August 11, 2011
We describe a robot system that autonomously acquires skills through interaction with its environment. The robot learns to sequence the execution of a set of innate controllers to solve a task, extracts and retains components of that solution as portable s ...
Journal ArticleAdvances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010 · December 1, 2010
We introduce CST, an algorithm for constructing skill trees from demonstration trajectories in continuous reinforcement learning domains. CST uses a changepoint detection method to segment each trajectory into a skill chain by detecting a change of appropr ...
Journal ArticleIJCAI International Joint Conference on Artificial Intelligence · January 1, 2009
We present an algorithm for selecting an appropriate abstraction when learning a new skill. We show empirically that it can consistently select an appropriate abstraction using very little sample data, and that it significantly improves skill learning perf ...
Journal ArticleAdvances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference · January 1, 2009
We introduce a skill discovery method for reinforcement learning in continuous domains that constructs chains of skills leading to an end-of-task reward. We demonstrate experimentally that it creates appropriate skills and achieves performance benefits in ...
Journal Article2008 IEEE 7th International Conference on Development and Learning, ICDL · December 1, 2008
To achieve truly autonomous robot skill acquisition, a robot can use neither a single large general state space (because learning is not feasible), nor a small problem-specific state space (because it is not general). We propose that instead a robot should h ...
ConferenceIEEE International Conference on Intelligent Robots and Systems · December 1, 2007
Robot positioning is an important function of autonomous intelligent robots. However, the application of external forces to a robot can disrupt its normal operation and cause localisation errors. We present a novel approach for detecting external disturban ...
Journal ArticleIJCAI International Joint Conference on Artificial Intelligence · December 1, 2007
The options framework provides methods for reinforcement learning agents to build new high-level skills. However, since options are usually learned in the same state space as the problem the agent is solving, they cannot be used in other tasks that are sim ...
Journal ArticleProceedings of the Thirty-Seventh SIGCSE Technical Symposium on Computer Science Education · December 1, 2007
We describe the first part of a study investigating the usefulness of high school language results as a predictor of success in first year computer science courses at a university where students have widely varying English language skills. Our results indi ...
Journal ArticleACM International Conference Proceeding Series · December 1, 2006
We introduce the use of learned shaping rewards in reinforcement learning tasks, where an agent uses prior experience on a sequence of tasks to learn a portable predictor that estimates intermediate rewards, resulting in accelerated learning in later tasks ...
Journal ArticleICML 2006 - Proceedings of the 23rd International Conference on Machine Learning · October 6, 2006
We introduce the use of learned shaping rewards in reinforcement learning tasks, where an agent uses prior experience on a sequence of tasks to learn a portable predictor that estimates intermediate rewards, resulting in accelerated learning in later tasks ...
Journal ArticleLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2006
We present a robot motivational system design framework. The framework represents the underlying (possibly conflicting) goals of the robot as a set of drives, while ensuring comparable drive levels and providing a mechanism for drive priority adaptation du ...
Journal ArticleAdaptive Behavior · October 17, 2005
This paper introduces an integration of reinforcement learning and behavior-based control designed to produce real-time learning in situated agents. The model layers a distributed and asynchronous reinforcement learning algorithm over a learned topological ...
Journal ArticleLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2005
We introduce METAMorph, an open source software platform for the experimental design of simulated cellular development processes using genomes encoded as genetic regulatory networks (GRNs). METAMorph allows researchers to design GRNs by hand and to visuali ...