Hai "Helen" Li

Professor in the Department of Electrical and Computer Engineering
RM130 Hudson Hall, Box 90291, Durham, NC 27701
#407 Wilkinson Building, 534 Research Drive, Durham, NC 27701
Office hours: Appointment by email (hai.li@duke.edu)

Selected Publications


TFSRAM: A 249.8TOPS/W Timing-to-First-Spike Compute-in-Memory Neuromorphic Processing Engine With Twin-Column SRAM Synapses

Journal Article IEEE Transactions on Circuits and Systems for Artificial Intelligence · September 2024 Full text Cite

Processing-in-Memory Designs Based on Emerging Technology for Efficient Machine Learning Acceleration

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · June 12, 2024 The unprecedented success of artificial intelligence (AI) enriches machine learning (ML)-based applications. The availability of big data and compute-intensive algorithms empowers versatility and high accuracy in ML approaches. However, the data processing ... Full text Cite

Neural architecture search for in-memory computing-based deep learning accelerators

Journal Article Nature Reviews Electrical Engineering · May 20, 2024 Full text Cite

NDRec: A Near-Data Processing System for Training Large-Scale Recommendation Models

Journal Article IEEE Transactions on Computers · May 1, 2024 Recent advances in deep neural networks (DNNs) have enabled highly effective recommendation models for diverse web services. In such DNN-based recommendation models, the embedding layer comprises the majority of model parameters. As these models scale rapi ... Full text Cite

Efficient, Direct, and Restricted Black-Box Graph Evasion Attacks to Any-Layer Graph Neural Networks via Influence Function

Conference WSDM 2024 - Proceedings of the 17th ACM International Conference on Web Search and Data Mining · March 4, 2024 Graph neural network (GNN), the mainstream method to learn on graph data, is vulnerable to graph evasion attacks, where an attacker slightly perturbing the graph structure can fool trained GNN models. Existing work has at least one of the following drawbac ... Full text Cite

Neuro-Symbolic Computing: Advancements and Challenges in Hardware-Software Co-Design

Journal Article IEEE Transactions on Circuits and Systems II: Express Briefs · March 1, 2024 The rapid progress of artificial intelligence (AI) has led to the emergence of a highly promising field known as neuro-symbolic (NeSy) computing. This approach combines the strengths of neural networks, which excel at data-driven learning, with the reasoni ... Full text Cite

Block-Wise Mixed-Precision Quantization: Enabling High Efficiency for Practical ReRAM-based DNN Accelerators

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · January 1, 2024 Resistive random access memory (ReRAM)-based processing-in-memory (PIM) architectures have demonstrated great potential to accelerate Deep Neural Network (DNN) training/inference. However, the computational accuracy of analog PIM is compromised due to the ... Full text Cite

Hybrid Digital/Analog Memristor-based Computing Architecture for Sparse Deep Learning Acceleration

Conference Proceedings - IEEE International Symposium on Circuits and Systems · January 1, 2024 Fine-grained sparsity in recent bio-inspired models such as attention-based model could reduce the computation complexity dramatically. However, the unique sparsity pattern challenges the mapping efficiency of the conventional pure analog memristor-based c ... Full text Cite

NDSEARCH: Accelerating Graph-Traversal-Based Approximate Nearest Neighbor Search through Near Data Processing

Conference Proceedings - International Symposium on Computer Architecture · January 1, 2024 Approximate nearest neighbor search (ANNS) is a key retrieval technique for vector database and many data center applications, such as person re-identification and recommendation systems. It is also fundamental to retrieval augmented generation (RAG) for l ... Full text Cite

Monolithic 3D stacking for neural network acceleration

Journal Article Nature Electronics · December 1, 2023 Full text Cite

Guest Editorial Special Issue on the International Symposium on Integrated Circuits and Systems - ISICAS 2023

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · December 1, 2023 Full text Cite

Outgoing Editorial

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · December 1, 2023 Full text Cite

Si-Kintsugi: Towards Recovering Golden-Like Performance of Defective Many-Core Spatial Architectures for AI

Conference Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023 · October 28, 2023 The growing demand for higher compute and memory capacity driven by artificial intelligence (AI) applications pushes higher core counts in modern systems. Many-core architectures exhibiting spatial interconnects with high on-chip bandwidth are ideal for th ... Full text Cite

EMS-i: An Efficient Memory System Design with Specialized Caching Mechanism for Recommendation Inference

Journal Article ACM Transactions on Embedded Computing Systems · September 9, 2023 Recommendation systems have been widely embedded into many Internet services. For example, Meta's deep learning recommendation model (DLRM) shows high predictive accuracy of click-through rate in processing large-scale embedding tables. The SparseLengthSum ... Full text Cite

ESSENCE: Exploiting Structured Stochastic Gradient Pruning for Endurance-Aware ReRAM-Based In-Memory Training Systems

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · July 1, 2023 Processing-in-memory (PIM) enables energy-efficient deployment of convolutional neural networks (CNNs) from edge to cloud. Resistive random-access memory (ReRAM) is one of the most commonly used technologies for PIM architectures. One of the primary limita ... Full text Cite

SpikeSen: Low-Latency In-Sensor-Intelligence Design With Neuromorphic Spiking Neurons

Journal Article IEEE Transactions on Circuits and Systems II: Express Briefs · June 1, 2023 In-sensor-processing (ISP) paradigm has been exploited in state-of-the-art vision system designs to pave the way towards power-efficient sensing and processing. The redundant data transmission between sensors and processors is significantly minimized by lo ... Full text Cite

NASRec: Weight Sharing Neural Architecture Search for Recommender Systems

Conference ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023 · April 30, 2023 The rise of deep neural networks offers new opportunities in optimizing recommender systems. However, optimizing recommender systems using deep neural networks requires delicate architecture fabrication. We propose NASRec, a paradigm that trains a single s ... Full text Cite

ReaLPrune: ReRAM Crossbar-Aware Lottery Ticket Pruning for CNNs

Journal Article IEEE Transactions on Emerging Topics in Computing · April 1, 2023 Training machine learning (ML) models at the edge (on-chip training on end user devices) can address many pressing challenges including data privacy/security, increase the accessibility of ML applications to different parts of the world by reducing the dep ... Full text Cite

DefT: Boosting Scalability of Deformable Convolution Operations on GPUs

Conference International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS · March 25, 2023 Deformable Convolutional Networks (DCN) have been proposed as a powerful tool to boost the representation power of Convolutional Neural Networks (CNN) in computer vision tasks via adaptive sampling of the input feature map. Much like vision transformers, D ... Full text Cite

DyNNamic: Dynamically Reshaping, High Data-Reuse Accelerator for Compact DNNs

Journal Article IEEE Transactions on Computers · March 1, 2023 Convolutional layers dominate the computation and energy costs of Deep Neural Network (DNN) inference. Recent algorithmic works attempt to reduce these bottlenecks via compact DNN structures and model compression. Likewise, state-of-the-art accelerator des ... Full text Cite

ISLPED 2022: An Experience of a Hybrid Conference in the Time of COVID-19

Journal Article IEEE Design and Test · February 1, 2023 Full text Cite

MWSCAS Guest Editorial Special Issue Based on the 64th International Midwest Symposium on Circuits and Systems

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · January 1, 2023 Full text Cite

PIDS: Joint Point Interaction-Dimension Search for 3D Point Cloud

Conference Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 · January 1, 2023 The interaction and dimension of points are two important axes in designing point operators to serve hierarchical 3D models. Yet, these two axes are heterogeneous and challenging to fully explore. Existing works craft point operator under a single axis and ... Full text Cite

INCA: Input-stationary Dataflow at Outside-the-box Thinking about Deep Learning Accelerators

Conference Proceedings - International Symposium on High-Performance Computer Architecture · January 1, 2023 This paper first presents an input-stationary (IS) implemented crossbar accelerator (INCA), supporting inference and training for deep neural networks (DNNs). Processing-in-memory (PIM) accelerators for DNNs have been actively researched, specifically, wit ... Full text Cite

Mixture Outlier Exposure: Towards Out-of-Distribution Detection in Fine-grained Environments

Conference Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 · January 1, 2023 Many real-world scenarios in which DNN-based recognition systems are deployed have inherently fine-grained attributes (e.g., bird-species recognition, medical image classification). In addition to achieving reliable accuracy, a critical subtask for these m ... Full text Cite

Dynamic Task Remapping for Reliable CNN Training on ReRAM Crossbars

Conference Proceedings - Design, Automation and Test in Europe, DATE · January 1, 2023 A ReRAM crossbar-based computing system (RCS) can accelerate CNN training. However, hardware faults due to manufacturing defects and limited endurance impede the widespread adoption of RCS. We propose a dynamic task remapping-based technique for reliable C ... Full text Cite

On a New Type of Neural Computation for Probabilistic Symbolic Reasoning

Conference Proceedings of the International Joint Conference on Neural Networks · January 1, 2023 New types of neural computations, i.e., methods of computing neuron activities based on other neurons' activities and connectivity strengths, are continuously pushing the boundary of more powerful neural networks. For example, the attention mechanism with ... Full text Cite

Accelerating Sparse Attention with a Reconfigurable Non-volatile Processing-In-Memory Architecture

Conference Proceedings - Design Automation Conference · January 1, 2023 Attention-based neural networks have shown superior performance in a wide range of tasks. Non-volatile processing-in-memory (NVPIM) architecture shows its great potential to accelerate the dense attention model. However, the unique unstructured and dynamic ... Full text Cite

Refloat: Low-Cost Floating-Point Processing in ReRAM for Accelerating Iterative Linear Solvers

Conference International Conference for High Performance Computing, Networking, Storage and Analysis, SC · January 1, 2023 Resistive random access memory (ReRAM) is a promising technology that can perform low-cost and in-situ matrix-vector multiplication (MVM) in analog domain. Scientific computing requires high-precision floating-point (FP) processing. However, performing flo ... Full text Cite

Global Vision Transformer Pruning with Hessian-Aware Saliency

Conference Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 2023 Transformers yield state-of-the-art results across many tasks. However, their heuristically designed architecture impose huge computational costs during inference. This work aims on challenging the common design philosophy of the Vision Transformer (ViT) m ... Full text Cite

Introduction to the Special Issue on Accelerating AI on the Edge - Part 2

Journal Article ACM Transactions on Embedded Computing Systems · December 12, 2022 Full text Cite

Rethinking normalization methods in federated learning

Conference DistributedML 2022 - Proceedings of the 3rd International Workshop on Distributed Machine Learning, Part of CoNEXT 2022 · December 9, 2022 Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data. In this work, we explicitly uncover external covariate shift problem in FL, which is caused by the independent local t ... Full text Cite

Guest Editorial Special Issue on the International Symposium on Integrated Circuits and Systems - ISICAS 2022

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · December 1, 2022 Full text Cite

FedSEA: A Semi-Asynchronous Federated Learning Framework for Extremely Heterogeneous Devices

Conference SenSys 2022 - Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems · November 6, 2022 Federated learning (FL) has attracted increasing attention as a promising technique to drive a vast number of edge devices with artificial intelligence. However, it is very challenging to guarantee the efficiency of a FL system in practice due to the heter ... Full text Cite

Accelerating Large-Scale Graph Neural Network Training on Crossbar Diet

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · November 1, 2022 Resistive random-access memory (ReRAM)-based manycore architectures enable acceleration of graph neural network (GNN) inference and training. GNNs exhibit characteristics of both DNNs and graph analytics. Hence, GNN training/inferencing on ReRAM-based many ... Full text Cite

Approximate computing and the efficient machine learning expedition

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · October 30, 2022 Approximate computing (AxC) has been long accepted as a design alternative for efficient system implementation at the cost of relaxed accuracy requirements. Despite the AxC research activities in various application domains, AxC thrived the past decade whe ... Full text Cite

Cascading Structured Pruning: Enabling High Data Reuse for Sparse DNN Accelerators

Conference Proceedings - International Symposium on Computer Architecture · June 18, 2022 Performance and efficiency of running modern Deep Neural Networks (DNNs) are heavily bounded by data movement. To mitigate the data movement bottlenecks, recent DNN inference accelerator designs widely adopt aggressive compression techniques and sparse-skipp ... Full text Cite

Toward Efficient and Adaptive Design of Video Detection System with Deep Neural Networks

Journal Article ACM Transactions on Embedded Computing Systems · May 1, 2022 In the past decade, Deep Neural Networks (DNNs), e.g., Convolutional Neural Networks, achieved human-level performance in vision tasks such as object classification and detection. However, DNNs are known to be computationally expensive and thus hard to be ... Full text Cite

Guest Editors' Introduction: Near-Memory and In-Memory Processing

Journal Article IEEE Design and Test · April 1, 2022 Full text Cite

The Untapped Potential of Off-the-Shelf Convolutional Neural Networks

Conference Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022 · January 1, 2022 Over recent years, a myriad of novel convolutional network architectures have been developed to advance state-of-the-art performance on challenging recognition tasks. As computational resources improve, a great deal of effort has been placed on efficiently ... Full text Cite

Privacy Leakage of Adversarial Training Models in Federated Learning Systems

Conference IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · January 1, 2022 Adversarial Training (AT) is crucial for obtaining deep neural networks that are robust to adversarial attacks, yet recent works found that it could also make models more vulnerable to privacy attacks. In this work, we further reveal this unsettling proper ... Full text Cite

CMOS Implementation of Spiking Equilibrium Propagation for Real-Time Learning

Conference Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022 · January 1, 2022 Equilibrium propagation (EqProp) and its adaptations for spiking neural networks (SNN) are presented as biologically plausible alternatives to back-propagation (BP) which describe a potential low-energy means of learning complex tasks in neuromorphic hardw ... Full text Cite

NashAE: Disentangling Representations Through Adversarial Covariance Minimization

Chapter · January 1, 2022 We present a self-supervised method to disentangle factors of variation in high-dimensional data that does not rely on prior knowledge of the underlying variation profile (e.g., no assumptions on the number or distribution of the individual latent variable ... Full text Cite

FedCor: Correlation-Based Active Client Selection Strategy for Heterogeneous Federated Learning

Conference Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 2022 Client-wise data heterogeneity is one of the major issues that hinder effective training in federated learning (FL). Since the data distribution on each client may vary dramatically, the client selection strategy can significantly influence the convergence ... Full text Cite

Bionic Robust Memristor-Based Artificial Nociception System for Robotics

Conference Proceedings - IEEE International Symposium on Circuits and Systems · January 1, 2022 Nociception is an important ability for robots to interact safely with humans or work in hostile environments. By referring to previous research in mechanical receptors with ring oscillators and memristor-based nociceptors, we propose a complete robotic se ... Full text Cite

RRAM-based Neuromorphic Computing: Data Representation, Architecture, Logic, and Programming

Conference Proceedings - 2022 25th Euromicro Conference on Digital System Design, DSD 2022 · January 1, 2022 RRAM crossbars provide a promising hardware platform to accelerate matrix-vector multiplication in deep neural networks (DNNs). To exploit the efficiency of RRAM crossbars, extensive research examining architecture, data representation, logic design as ... Full text Cite

On Building Efficient and Robust Neural Network Designs

Conference Conference Record - Asilomar Conference on Signals, Systems and Computers · January 1, 2022 Neural network models have demonstrated outstanding performance in a variety of applications, from image classification to natural language processing. However, deploying the models to hardware raises efficiency and reliability issues. From the efficiency ... Full text Cite

Next Generation Federated Learning for Edge Devices: An Overview

Conference Proceedings - 2022 IEEE 8th International Conference on Collaboration and Internet Computing, CIC 2022 · January 1, 2022 Federated learning (FL) is a popular distributed machine learning paradigm involving numerous edge devices with enhanced privacy protection. Recently, an extensive literature has been developing on the research which aims at promoting the innovations of FL ... Full text Cite

Neuromorphic Algorithm-hardware Codesign for Temporal Pattern Learning

Conference Proceedings - Design Automation Conference · December 5, 2021 Neuromorphic computing and spiking neural networks (SNN) mimic the behavior of biological systems and have drawn interest for their potential to perform cognitive tasks with high energy efficiency. However, some factors such as temporal dynamics and spike ... Full text Cite

Learning to Train CNNs on Faulty ReRAM-based Manycore Accelerators

Journal Article ACM Transactions on Embedded Computing Systems · October 31, 2021 The growing popularity of convolutional neural networks (CNNs) has led to the search for efficient computational platforms to accelerate CNN training. Resistive random-access memory (ReRAM)-based manycore architectures offer a promisin ... Full text Cite

ESCALATE: Boosting the efficiency of sparse CNN accelerator with kernel decomposition

Conference Proceedings of the Annual International Symposium on Microarchitecture, MICRO · October 18, 2021 The ever-growing parameter size and computation cost of Convolutional Neural Network (CNN) models hinder their deployment onto resource-constrained platforms. Network pruning techniques are proposed to remove the redundancy in CNN parameters and produce a ... Full text Cite

Dynamic Regularization on Activation Sparsity for Neural Network Efficiency Improvement

Journal Article ACM Journal on Emerging Technologies in Computing Systems · October 1, 2021 When deploying deep neural networks in embedded systems, it is crucial to decrease the model size and computational complexity for improving the execution speed and efficiency. In addition to conventional compression techniques, e.g., weight pruning and qu ... Full text Cite

Privacy-Preserving Representation Learning on Graphs: A Mutual Information Perspective

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 14, 2021 Learning with graphs has attracted significant attention recently. Existing representation learning methods on graphs have achieved state-of-the-art performance on various graph-related tasks such as node classification, link prediction, etc. However, we o ... Full text Cite

The Fifth International Workshop on Automation in Machine Learning

Conference Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining · August 14, 2021 Full text Cite

Defending against GAN-based DeepFake Attacks via Transformation-aware Adversarial Faces

Conference Proceedings of the International Joint Conference on Neural Networks · July 18, 2021 DeepFake represents a category of face-swapping attacks that leverage machine learning models such as autoencoders or generative adversarial networks. Although the concept of the face-swapping is not new, its recent technical advances make fake content (e ... Full text Cite

TPrune: Efficient Transformer Pruning for Mobile Devices

Journal Article ACM Transactions on Cyber-Physical Systems · July 1, 2021 The invention of Transformer model structure boosts the performance of Neural Machine Translation (NMT) tasks to an unprecedented level. Many previous works have been done to make the Transformer model more execution-friendly on resource-constrained platfo ... Full text Cite

Efficient FPGA Implementation of a Convolutional Neural Network for Radar Signal Processing

Conference 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems, AICAS 2021 · June 6, 2021 Although neural networks, especially convolutional neural networks (CNNs), have been successfully applied to many domains, there have not found many radar applications mainly due to a paucity of available training data. Focusing on fixed-site radars, this ... Full text Cite

An Overview of Hardware Security and Trust: Threats, Countermeasures, and Design Tools

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · June 1, 2021 Hardware security and trust have become a pressing issue during the last two decades due to the globalization of the semiconductor supply chain and ubiquitous network connection of computing devices. Computing hardware is now an attractive attack surface f ... Full text Cite

AccuReD: High Accuracy Training of CNNs on ReRAM/GPU Heterogeneous 3-D Architecture

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · May 1, 2021 The growing popularity of convolutional neural networks (CNNs) along with their complexity has led to the search for efficient computational platforms suitable for them. Resistive random-access memory (ReRAM)-based architectures offer a promising alternati ... Full text Cite

An Efficient 3D ReRAM Convolution Processor Design for Binarized Weight Networks

Journal Article IEEE Transactions on Circuits and Systems II: Express Briefs · May 1, 2021 Convolutional neural networks (CNNs) have been evolving with tremendous success in visual recognition, obtaining human-level accuracy. The conventional hardware architecture, however, is facing difficulty in realizing real-time and energy-efficient operati ... Full text Cite

BitSystolic: A 26.7 TOPS/W 2b8b NPU with Configurable Data Flows for Edge Devices

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · March 1, 2021 Efficient deployment of deep neural networks (DNNs) emerges with the exploding demand for artificial intelligence on edge devices. Mixed-precision inference with both compressed model and reduced computation cost enlightens a way for accurate and efficient ... Full text Cite

Marvel: A Vertical Resistive Accelerator for Low-Power Deep Learning Inference in Monolithic 3D

Conference Proceedings - Design, Automation and Test in Europe, DATE · February 1, 2021 Resistive memory (ReRAM) based Deep Neural Network (DNN) accelerators have achieved state-of-the-art DNN inference throughput. However, the power efficiency of such resistive accelerators is greatly limited by their peripheral circuitry including analog-to ... Full text Cite

RAISE: A Resistive Accelerator for Subject-Independent EEG Signal Classification

Conference Proceedings - Design, Automation and Test in Europe, DATE · February 1, 2021 State-of-the-art deep neural networks (DNNs) for electroencephalography (EEG) signals classification focus on subject-related tasks, in which the test data and the training data needs to be collected from the same subject. In addition, due to limited compu ... Full text Cite

Efficient AUTOSAR-Compliant CAN-FD Frame Packing with Observed Optimality

Conference Proceedings - Design, Automation and Test in Europe, DATE · February 1, 2021 With the trend towards automated driving, Controller Area Network (CAN) is migrating to CAN with Flexible Data-Rate (CAN-FD), where frame packing (i.e., packing signals of various periods, deadlines, and payloads into frames following the standard CAN-FD ... Full text Cite

An Efficient Programming Framework for Memristor-based Neuromorphic Computing

Conference Proceedings - Design, Automation and Test in Europe, DATE · February 1, 2021 Memristor-based crossbars are considered to be promising candidates to accelerate vector-matrix computation in deep neural networks. Before being applied for inference, memristors in the crossbars should be programmed to conductances corresponding to the ... Full text Cite

Efficient neural network using pointwise convolution kernels with linear phase constraint

Journal Article Neurocomputing · January 29, 2021 In current efficient convolutional neural networks, 1 × 1 convolution is widely used. However, the amount of computation and the number of parameters of 1 × 1 convolution layers account for a large part of these neural network models. In this paper, we pro ... Full text Cite

Connection-based Processing-In-Memory Engine Design Based on Resistive Crossbars

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 18, 2021 Deep neural networks have successfully been applied to various fields. The efficient deployment of neural network models emerges as a new challenge. Processing-in-memory (PIM) engines that carry out computation within memory structures are widely studied f ... Full text Cite

Exploring Applications of STT-RAM in GPU Architectures

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · January 1, 2021 Use of modern GPUs has been extended from traditional 3D graphic processing to computing acceleration of many scientific, engineering, and enterprise applications. In modern GPUs, on-chip memory capacity keeps increasing to support thousands of chip-reside ... Full text Cite

NASGEM: Neural Architecture Search via Graph Embedding Method

Conference 35th AAAI Conference on Artificial Intelligence, AAAI 2021 · January 1, 2021 Neural Architecture Search (NAS) automates and prospers the design of neural networks. Estimator-based NAS has been proposed recently to model the relationship between architectures and their performance to enable scalable and flexible search. However, exi ... Full text Cite

Line Art Correlation Matching Feature Transfer Network for Automatic Animation Colorization

Conference 2021 IEEE Winter Conference on Applications of Computer Vision (WACV) · January 2021 Full text Cite

1S1R-based stable learning through single-spike-encoded spike-timing-dependent plasticity

Conference Proceedings - IEEE International Symposium on Circuits and Systems · January 1, 2021 Spike-timing-dependent plasticity (STDP) is emerging as a simple and biologically-plausible approach to learning, and specialized digital implementations are readily available. Memristor technology has been embraced as a much denser solution than digital s ... Full text Cite

Soteria: Provable Defense against Privacy Leakage in Federated Learning from Representation Perspective

Conference Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 2021 Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data. However, recent works have demonstrated that sharing model updates makes FL vulnerable to inference attack. In this wo ... Full text Cite

REREC: In-ReRAM Acceleration with Access-Aware Mapping for Personalized Recommendation

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2021 Personalized recommendation systems are widely used in many Internet services. The sparse embedding lookup in recommendation models dominates the computational cost of inference due to its intensive irregular memory accesses. Applying resistive random acce ... Full text Cite

Heterogeneous Manycore Architectures Enabled by Processing-in-Memory for Deep Learning: From CNNs to GNNs

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2021 Resistive random-access memory (ReRAM)-based processing-in-memory (PIM) architectures have recently become a popular architectural choice for deep-learning applications. ReRAM-based architectures can accelerate inferencing and training of deep learning alg ... Full text Cite

Multi-Objective Optimization of ReRAM Crossbars for Robust DNN Inferencing under Stochastic Noise

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2021 Resistive random-access memory (ReRAM) is a promising technology for designing hardware accelerators for deep neural network (DNN) inferencing. However, stochastic noise in ReRAM crossbars can degrade the DNN inferencing accuracy. We propose the design and ... Full text Cite

Peripheral Circuitry Assisted Mapping Framework for Resistive Logic-In-Memory Computing

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2021 In-memory computing has been applied in different fields due to its superior speed and energy efficiency. Among a variety of memory technologies that have been explored, resistive memory has widely been adopted for various purposes, including Processing-In ... Full text Cite

LotteryFL: Empower Edge Intelligence with Personalized and Communication-Efficient Federated Learning

Conference 6th ACM/IEEE Symposium on Edge Computing, SEC 2021 · January 1, 2021 With the proliferation of mobile computing and Internet of Things (IoT), massive mobile and IoT devices are connected to the Internet. These devices are generating a huge amount of data every second at the network edge. Many artificial intelligence applica ... Full text Cite

Improving Gradient Regularization using Complex-Valued Neural Networks

Conference Proceedings of Machine Learning Research · January 1, 2021 Gradient regularization is a neural network defense technique that requires no prior knowledge of an adversarial attack and that brings only limited increase in training computational complexity. A form of complex-valued neural network (CVNN) is proposed t ... Cite

AI-Powered IoT System at the Edge

Conference Proceedings - 2021 IEEE 3rd International Conference on Cognitive Machine Intelligence, CogMI 2021 · January 1, 2021 The proliferation of low-cost and low-power IoT devices is constantly generating gigabytes of data at the network edge. Bridging AI with IoT is a natural option to unleash the data on devices. AI-powered IoT systems can boost many novel applications and serv ... Full text Cite

Brain Inspired Computing: The Extraordinary Voyages in Known and Unknown Worlds

Conference 2021 IEEE International Symposium on Smart Electronic Systems (ISES 2021) · 2021 Cite

FL-WBC: Enhancing Robustness against Model Poisoning Attacks in Federated Learning from a Client Perspective

Conference Advances in Neural Information Processing Systems · January 1, 2021 Federated learning (FL) is a popular distributed learning framework that trains a global model through iterative communications between a central server and edge devices. Recent works have demonstrated that FL is vulnerable to model poisoning attacks. Seve ... Cite

RED: A ReRAM-Based Efficient Accelerator for Deconvolutional Computation

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · December 1, 2020 Deconvolution is a key component in contemporary neural networks, especially, generative adversarial networks (GANs) and fully convolutional networks (FCNs). Due to extra operations of deconvolution compared to convolution, considerable degradation of perf ... Full text Cite

FCDM: A Methodology Based on Sensor Pattern Noise Fingerprinting for Fast Confidence Detection to Adversarial Attacks

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · December 1, 2020 Deep neural networks (DNNs) have shown phenomenal success in many real-world applications. However, a concerning weakness of DNNs is their vulnerability to adversarial attacks. Although there exist some methods to detect adversarial attacks, they often suf ... Full text Cite

Introduction to the Special Issue on the 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS 2020)

Journal Article IEEE Journal on Emerging and Selected Topics in Circuits and Systems · December 1, 2020 This issue of the IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS) includes the highlighted papers from the 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS 2020), which was originally pl ... Full text Cite

A Case for 3D Integrated System Design for Neuromorphic Computing and AI Applications

Journal Article International Journal of Semantic Computing · December 1, 2020 Over the last decade, artificial intelligence (AI) has found many application areas in society. As AI solutions have become more sophisticated and the use cases grew, they highlighted the need to address performance and energy efficiency challenges f ... Full text Cite

Fast IR Drop Estimation with Machine Learning: Invited Paper

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 2, 2020 IR drop constraint is a fundamental requirement enforced in almost all chip designs. However, its evaluation takes a long time, and mitigation techniques for fixing violations may require numerous iterations. As such, fast and accurate IR drop prediction b ... Full text Cite

MobiLattice: A Depth-wise DCNN Accelerator with Hybrid Digital/Analog Nonvolatile Processing-In-Memory Block

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 2, 2020 Nonvolatile Processing-In-Memory (NVPIM) architecture is a promising technology to enable energy-efficient inference of Deep Convolutional Neural Networks (DCNNs). One major advantage of NVPIM is that the vector dot-product operations can be completed effi ... Full text Cite

ReTransformer: ReRAM-based Processing-in-Memory Architecture for Transformer Acceleration

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 2, 2020 Transformer has emerged as a popular deep neural network (DNN) model for Natural Language Processing (NLP) applications and demonstrated excellent performance in neural machine translation, entity recognition, etc. However, its scaled dot-product attention ... Full text Cite

Thwarting Replication Attack against Memristor-Based Neuromorphic Computing System

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · October 1, 2020 Neuromorphic architectures are widely used in many applications for advanced data processing and often implement proprietary algorithms. However, in an adversarial scenario, such systems may face elaborate security attacks including learning attack. In thi ... Full text Cite

AutoGrow: Automatic Layer Growing in Deep Convolutional Networks

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 23, 2020 Depth is a key component of Deep Neural Networks (DNNs), however, designing depth is heuristic and requires many human efforts. We propose AutoGrow to automate depth discovery in DNNs: starting from a shallow seed architecture, AutoGrow grows new layers if t ... Full text Cite

Lifetime Enhancement for RRAM-based Computing-In-Memory Engine Considering Aging and Thermal Effects

Conference Proceedings - 2020 IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2020 · August 1, 2020 RRAM-based computing-in-memory engines provide a promising platform to accelerate deep neural networks. The programming process imposes high voltages onto the RRAM cells and thus degrades their valid conductance ranges from the fresh state, an effect calle ... Full text Cite

Adversarial Attack: A New Threat to Smart Devices and How to Defend It

Journal Article IEEE Consumer Electronics Magazine · July 1, 2020 This article introduces adversarial attack, a recently-unveiled security threat to consumer electronics, especially those utilizing machine learning techniques. We start with the fundamental knowledge including what are adversarial examples, how to realize ... Full text Cite

Leveraging 3D vertical RRAM to developing neuromorphic architecture for pattern classification

Conference Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI · July 1, 2020 The crossbar architecture with resistive random-access memory (RRAM) devices presents many advantages in realizing matrix-based computations and achieves success in neural network implementation. However, the rapid growth of network size demands even dense ... Full text Cite

Conditional Transferring Features: Scaling GANs to Thousands of Classes with 30% Less High-Quality Data for Training

Conference Proceedings of the International Joint Conference on Neural Networks · July 1, 2020 Generative adversarial network (GAN) can greatly improve the quality of unsupervised image generation. Previous GAN-based methods often require a large amount of high-quality training data. This work aims to reduce the use of high-quality data in training, ... Full text Cite

Redistributing and Re-Stylizing Features for Training a Fast Photorealistic Stylizer

Conference Proceedings of the International Joint Conference on Neural Networks · July 1, 2020 Style transfer studies can be categorized into two types - artistic and photorealistic. The high-speed transfer has been well-studied for artistic styles but remains challenging for photorealistic styles. To guarantee semantic accuracy and style faithfulne ... Full text Cite

Lattice: An ADC/DAC-less ReRAM-based processing-in-memory architecture for accelerating deep convolution neural networks

Conference Proceedings - Design Automation Conference · July 1, 2020 Nonvolatile Processing-In-Memory (NVPIM) has demonstrated its great potential in accelerating Deep Convolution Neural Networks (DCNN). However, most of existing NVPIM designs require costly analog-digital conversions and often rely on excessive data copies ... Full text Cite

ReSiPE: ReRAM-based single-spiking processing-in-memory engine

Conference Proceedings - Design Automation Conference · July 1, 2020 Processing-in-memory (PIM) designs that leverage emerging nanotechnologies like resistive random access memory (ReRAM) have demonstrated enormous potential in accelerating deep learning applications due to high energy efficiency and integration density. Th ... Full text Cite

Learning low-rank deep neural networks via singular vector orthogonality regularization and singular value sparsification

Conference IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · June 1, 2020 Modern deep neural networks (DNNs) often require high memory consumption and large computational loads. In order to deploy DNN algorithms efficiently on edge or mobile devices, a series of DNN compression algorithms have been explored, including factorizat ... Full text Cite

Structural sparsification for far-field speaker recognition with Intel® GNA

Conference ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings · May 1, 2020 Recently, deep neural networks (DNN) have been widely used in speaker recognition area. In order to achieve fast response time and high accuracy, the requirements for hardware resources increase rapidly. However, as the speaker recognition application is o ... Full text Cite

A low-cost and high-speed hardware implementation of spiking neural network

Journal Article Neurocomputing · March 21, 2020 Spiking neural network (SNN) is a neuromorphic system based on the information processing and storage procedures of biological neurons. In this paper, a low-cost and high-speed implementation for a spiking neural network based on FPGA is proposed. The LIF (Leaky ... Full text Cite

ReBoc: Accelerating Block-Circulant Neural Networks in ReRAM

Conference Proceedings of the 2020 Design, Automation and Test in Europe Conference and Exhibition, DATE 2020 · March 1, 2020 Deep neural networks (DNNs) emerge as a key component in various applications. However, the ever-growing DNN size hinders efficient processing on hardware. To tackle this problem, on the algorithmic side, compressed DNN models are explored, of which block- ... Full text Cite

GRAMARCH: A GPU-ReRAM based Heterogeneous Architecture for Neural Image Segmentation

Conference Proceedings of the 2020 Design, Automation and Test in Europe Conference and Exhibition, DATE 2020 · March 1, 2020 Deep Neural Networks (DNNs) employed for image segmentation are computationally more expensive and complex compared to the ones used for classification. However, manycore architectures to accelerate the training of these DNNs are relatively unexplored. Res ... Full text Cite

A Pulse-width Modulation Neuron with Continuous Activation for Processing-In-Memory Engines

Conference Proceedings of the 2020 Design, Automation and Test in Europe Conference and Exhibition, DATE 2020 · March 1, 2020 Processing-in-memory engines have successfully been applied to accelerate deep neural networks. For improving computing efficiency, spiking-based designs are widely explored. However, spiking-based designs quantize inter-layer signals naturally, leading to ... Full text Cite

AccPar: Tensor partitioning for heterogeneous deep learning accelerators

Conference Proceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020 · February 1, 2020 Deep neural network (DNN) accelerators as an example of domain-specific architecture have demonstrated great success in DNN inference. However, the architecture acceleration for equally important DNN training has not yet been fully studied. With data forwa ... Full text Cite

3D-ReG: A 3D ReRAM-based Heterogeneous Architecture for Training Deep Neural Networks

Journal Article ACM Journal on Emerging Technologies in Computing Systems · January 29, 2020 Deep neural network (DNN) models are being expanded to a broader range of applications. The computational capability of traditional hardware platforms cannot accommodate the growth of model complexity. Among recent technologies to accelerate DNN, resistive ... Full text Cite

PARC: A Processing-in-CAM Architecture for Genomic Long Read Pairwise Alignment using ReRAM

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 1, 2020 Technological advances in long read sequences have greatly facilitated the development of genomics. However, managing and analyzing the raw genomic data that outpaces Moore's Law requires extremely high computational efficiency. On the one hand, existing s ... Full text Cite

Enhancing Generalization of Wafer Defect Detection by Data Discrepancy-aware Preprocessing and Contrast-varied Augmentation

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 1, 2020 Wafer inspection locates defects at early fabrication stages and traditionally focuses on pixel-level defects. However, there are very few solutions that can effectively detect large-scale defects. In this work, we leverage Convolutional Neural Networks (CN ... Full text Cite

Parallelism in Deep Learning Accelerators

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 1, 2020 Deep learning is the core of artificial intelligence and it achieves state-of-the-art in a wide range of applications. The intensity of computation and data in deep learning processing poses significant challenges to the conventional computing platforms. T ... Full text Cite

AutoShrink: A topology-aware NAS for discovering efficient neural architecture

Conference AAAI 2020 - 34th AAAI Conference on Artificial Intelligence · January 1, 2020 Resource is an important constraint when deploying Deep Neural Networks (DNNs) on mobile and edge devices. Existing works commonly adopt the cell-based search approach, which limits the flexibility of network patterns in learned cell structures. Moreover, ... Cite

PENNI: Pruned kernel sharing for efficient cnn inference

Conference 37th International Conference on Machine Learning, ICML 2020 · January 1, 2020 Although state-of-the-art (SOTA) CNNs achieve outstanding performance on various tasks, their high computation demand and massive number of parameters make it difficult to deploy these SOTA CNNs onto resource-constrained devices. Previous works on CNN acce ... Cite

Highly efficient neuromorphic computing systems with emerging nonvolatile memories

Conference Proceedings of SPIE - The International Society for Optical Engineering · January 1, 2020 Increased interest in artificial intelligence coupled with a surge in nonvolatile memory research and the inevitable hitting of the "memory wall" in von Neumann computing has set the stage for a new flavor of computing systems to flourish: neuromorphic compu ... Full text Cite

Thread batching for high-performance energy-efficient GPU memory design

Journal Article ACM Journal on Emerging Technologies in Computing Systems · December 1, 2019 Massive multi-threading in GPU imposes tremendous pressure on memory subsystems. Due to rapid growth in thread-level parallelism of GPU and slowly improved peak memory bandwidth, memory becomes a bottleneck of GPU’s performance and energy efficiency. In th ... Full text Cite

On Designing Efficient and Reliable Nonvolatile Memory-Based Computing-In-Memory Accelerators

Conference Technical Digest - International Electron Devices Meeting, IEDM · December 1, 2019 Nonvolatile memory (NVM)-based computing-in-memory (CIM) features nonvolatile storage, in-place computing and reduction in data traffic. However, the development of NVM-based CIM is hampered by immature fabrication processes and inevitable operational faul ... Full text Cite

Exploring Bit-Slice Sparsity in Deep Neural Networks for Efficient ReRAM-Based Deployment

Conference Proceedings - 5th Workshop on Energy Efficient Machine Learning and Cognitive Computing, EMC2-NIPS 2019 · December 1, 2019 Emerging resistive random-access memory (ReRAM) has recently been intensively investigated to accelerate the processing of deep neural networks (DNNs). Due to the in-situ computation capability, analog ReRAM crossbars yield significant throughput improveme ... Full text Cite

ReBNN: in-situ acceleration of binarized neural networks in ReRAM using complementary resistive cell

Journal Article CCF Transactions on High Performance Computing · December 1, 2019 Resistive random access memory (ReRAM) has been proven capable of efficiently performing in-situ matrix-vector computations in convolutional neural network (CNN) processing. The computations are often conducted on multi-level cells (MLCs) that have limited prec ... Full text Cite

How to obtain and run light and efficient deep learning networks

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 1, 2019 As the model size of deep neural networks (DNNs) grows for better performance, the increase in computational cost associated with training and testing makes it extremely difficult to deploy DNNs on end/edge devices with limited resources while also satisf ... Full text Cite

DASNet: Dynamic activation sparsity for neural network efficiency improvement

Conference Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI · November 1, 2019 To improve the execution speed and efficiency of neural networks in embedded systems, it is crucial to decrease the model size and computational complexity. In addition to conventional compression techniques, e.g., weight pruning and quantization, removing ... Full text Cite

Resistive Memory‐Based In‐Memory Computing: From Device and Large‐Scale Integration System Perspectives

Journal Article Advanced Intelligent Systems · November 2019 In‐memory computing is a computing scheme that integrates data storage and arithmetic computation functions. Resistive random access memory (RRAM) arrays with innovative peripheral circuitry provide the capability of perform ... Full text Cite

Taming extreme heterogeneity via machine learning based design of autonomous manycore systems

Conference Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis Companion, CODES/ISSS 2019 · October 13, 2019 To avoid rewriting software code for new computer architectures and to take advantage of the extreme heterogeneous processing, communication and storage technologies, there is an urgent need for determining the right amount and type of specialization while ... Full text Cite

MSNet: Structural wired neural architecture search for internet of things

Conference Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019 · October 1, 2019 The prosperity of Internet of Things (IoT) calls for efficient ways of designing extremely compact yet accurate DNN models. Both the cell-based neural architecture search methods and the recently proposed graph based methods fall short in finding high qual ... Full text Cite

MobiEye: An efficient cloud-based video detection system for real-time mobile applications

Conference Proceedings - Design Automation Conference · June 2, 2019 In recent years, machine learning research has largely shifted focus from the cloud to the edge. While the resulting algorithm- and hardware-level optimizations have enabled local execution for the majority of deep neural networks (DNNs) on edge devices, t ... Full text Cite

ZARA: A novel zero-free dataflow accelerator for generative adversarial networks in 3D ReRAM

Conference Proceedings - Design Automation Conference · June 2, 2019 Generative Adversarial Networks (GANs) recently demonstrated a great opportunity toward unsupervised learning with the intention to mitigate the massive human efforts on data labeling in supervised learning algorithms. GAN combines a generative model and a ... Full text Cite

RRAM-based Spiking Nonvolatile Computing-In-Memory Processing Engine with Precision-Configurable in Situ Nonlinear Activation

Conference Digest of Technical Papers - Symposium on VLSI Technology · June 1, 2019 This work presents a hybrid CMOS-RRAM integration of spiking nonvolatile computing-in-memory (nvCIM) processing engine (PE) that includes a 64Kb RRAM macro and a novel in situ nonlinear activation (ISNA) module. We integrate the computing controller and no ... Full text Cite

Feature space perturbations yield more transferable adversarial examples

Conference Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · June 1, 2019 Many recent works have shown that deep learning models are vulnerable to quasi-imperceptible input perturbations, yet practitioners cannot fully explain this behavior. This work describes a transfer-based blackbox targeted adversarial attack of deep featur ... Full text Cite

Aging-aware Lifetime Enhancement for Memristor-based Neuromorphic Computing

Conference Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019 · May 14, 2019 Memristor-based crossbars have been applied successfully to accelerate vector-matrix computations in deep neural networks. During the training process of neural networks, the conductances of the memristors in the crossbars must be updated repetitively. How ... Full text Cite

REGENT: A Heterogeneous ReRAM/GPU-based Architecture Enabled by NoC for Training CNNs

Conference Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019 · May 14, 2019 The growing popularity of Convolutional Neural Networks (CNNs) has led to the search for efficient computational platforms to enable these algorithms. Resistive random-access memory (ReRAM)-based architectures offer a promising alternative to commonly used ... Full text Cite

RED: A ReRAM-based Deconvolution Accelerator

Conference Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019 · May 14, 2019 Deconvolution has been widespread in neural networks. For example, it is essential for performing unsupervised learning in generative adversarial networks or constructing fully convolutional networks for semantic segmentation. Resistive RAM (ReRAM)-based p ... Full text Cite

An overview of in-memory processing with emerging non-volatile memory for data-intensive applications

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · May 13, 2019 The conventional von Neumann architecture has been revealed as a major performance and energy bottleneck for rising data-intensive applications. The decade-old idea of leveraging in-memory processing to eliminate substantial data movements has returned and ... Full text Cite

Efficient process-in-memory architecture design for unsupervised GAN-based deep learning using ReRAM

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · May 13, 2019 The ending of Moore's Law makes domain-specific architecture the future of computing. The most representative is the emergence of various deep learning accelerators. Among the proposed solutions, resistive random access memory (ReRAM) based process-in-m ... Full text Cite

Learning Efficient Sparse Structures in Speech Recognition

Conference ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings · May 1, 2019 Recurrent neural networks (RNNs), especially long short-term memories (LSTMs) have been widely used in speech recognition and natural language processing. As the sizes of RNN models grow for better performance, the computation cost and therefore the requir ... Full text Cite

HyPar: Towards hybrid parallelism for deep learning accelerator array

Conference Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019 · March 26, 2019 With the rise of artificial intelligence in recent years, Deep Neural Networks (DNNs) have been widely used in many domains. To achieve high performance and energy efficiency, hardware acceleration (especially inference) of DNNs is intensively studied both ... Full text Cite

Exploration of Automatic Mixed-Precision Search for Deep Neural Networks

Conference Proceedings 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2019 · March 1, 2019 Neural networks have shown great performance in cognitive tasks. When deploying network models on mobile devices with limited computation and storage resources, the weight quantization technique has been widely adopted. In practice, 8-bit or 16-bit quantiz ... Full text Cite

AdverQuil: An efficient adversarial detection and alleviation technique for black-box neuromorphic computing systems

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 21, 2019 In recent years, neuromorphic computing systems (NCS) have gained popularity in accelerating neural network computation because of their high energy efficiency. The known vulnerability of neural networks to adversarial attack, however, raises a severe secu ... Full text Cite

NeuralHMC: An efficient HMC-based accelerator for deep neural networks

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 21, 2019 In Deep Neural Network (DNN) applications, energy consumption and performance cost of moving data between memory hierarchy and computational units are significantly higher than that of the computation itself. Process-in-memory (PIM) architecture such as Hy ... Full text Cite

Build reliable and efficient neuromorphic design with memristor technology

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 21, 2019 Neuromorphic computing is a revolutionary approach of computation, which attempts to mimic the human brain's mechanism for extremely high implementation efficiency and intelligence. Latest research studies showed that the memristor technology has a great p ... Full text Cite

Exploiting spin-orbit torque devices as reconfigurable logic for circuit obfuscation

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · January 1, 2019 Circuit obfuscation is a frequently used approach to conceal logic functionalities in order to prevent reverse engineering attacks on fabricated chips. Efficient obfuscation implementations are expected with lower design complexity and overhead but higher ... Full text Cite

Enhance the robustness to time dependent variability of ReRAM-based neuromorphic computing systems with regularization and 2R synapse

Conference Proceedings - IEEE International Symposium on Circuits and Systems · January 1, 2019 Time Dependent Variability (TDV) is one of the major concerns in implementing a Neuromorphic Computing System (NCS) with Resistive Random Access Memory (ReRAM). In this work, we propose a variation-distribution aware training algorithm to enhance the robus ... Full text Cite

Towards decentralized deep learning with differential privacy

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2019 In distributed machine learning, while a great deal of attention has been paid to centralized systems that include a central parameter server, decentralized systems have not been fully explored. Decentralized systems have great potential in the future pra ... Full text Cite

Faster cnns with direct sparse convolutions and guided pruning

Conference 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings · January 1, 2019 Phenomenally successful in practical inference problems, convolutional neural networks (CNN) are widely deployed in mobile devices, data centers, and even supercomputers. The number of parame ... Cite

Defending neural backdoors via generative distribution modeling

Conference Advances in Neural Information Processing Systems · January 1, 2019 Neural backdoor attack is emerging as a severe security threat to deep learning, while the capability of existing defense methods is limited, especially for complex backdoor triggers. In this work, we explore the space formed by the pixel values of all poss ... Cite

SPN dash: Fast detection of adversarial attacks on mobile via sensor pattern noise fingerprinting

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 5, 2018 A concerning weakness of deep neural networks is their susceptibility to adversarial attacks. While methods exist to detect these attacks, they incur significant drawbacks, ignoring external features which could aid in the task of attack detection. In this ... Full text Cite

EMAT: An Efficient Multi-Task Architecture for Transfer Learning using ReRAM

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 5, 2018 Transfer learning has demonstrated a great success recently towards general supervised learning to mitigate expensive training efforts. However, existing neural network accelerators have been proven inefficient in executing transfer learning by failing to ... Full text Cite

Guest Editorial: Special Issue on Large-Scale Memristive Systems and Neurochips for Computational Intelligence

Journal Article IEEE Transactions on Emerging Topics in Computational Intelligence · October 1, 2018 Full text Cite

MAT: A multi-strength adversarial training method to mitigate adversarial attacks

Conference Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI · August 7, 2018 Some recent work revealed that deep neural networks (DNNs) are vulnerable to so-called adversarial attacks where input examples are intentionally perturbed to fool DNNs. In this work, we revisit the DNN training process that includes adversarial examples i ... Full text Cite

Message from the technical program chairs

Conference Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI · August 7, 2018 Full text Cite

Real-Time Cardiac Arrhythmia Classification Using Memristor Neuromorphic Computing System

Conference Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference · July 2018 Cardiac arrhythmia is known to be one of the most common causes of death worldwide. Therefore, development of efficient arrhythmia detection techniques is essential to save patients' lives. In this paper, we introduce a new real-time cardiac arrhythmia cla ... Full text Cite

AtomLayer: A universal ReRAM-based CNN accelerator with atomic layer computation

Conference Proceedings - Design Automation Conference · June 24, 2018 Although ReRAM-based convolutional neural network (CNN) accelerators have been widely studied, state-of-the-art solutions suffer from either incapability of training (e.g., ISAAC [1]) or inefficiency of inference (e.g., PipeLayer [2]) due to the pipeline d ... Full text Cite

A neuromorphic design using chaotic mott memristor with relaxation oscillation

Conference Proceedings - Design Automation Conference · June 24, 2018 The recently proposed nanoscale Mott memristor features negative differential resistance and chaotic dynamics. This work proposes a novel neuromorphic computing system that utilizes Mott memristors to simplify peripheral circuitry. According to the analytic ... Full text Cite

Shift-Optimized Energy-Efficient Racetrack-Based Main Memory

Journal Article Journal of Circuits, Systems and Computers · May 1, 2018 Recently developed spin-based, racetrack memory (RM) shows great promise in enabling nonvolatile memory with unprecedented density and energy efficiency. RM-based technology will leverage the power and cost limit of main memory. However, main memory has ra ... Full text Cite

A Quantized Training Method to Enhance Accuracy of ReRAM-based Neuromorphic Systems

Conference Proceedings - IEEE International Symposium on Circuits and Systems · April 26, 2018 Deep neural networks (DNNs) are widely applied in the artificial intelligence field. While the performance of DNNs is continuously improved by more complicated and deeper structures, the feasibility of deployment on embedded systems remains a critical ... Full text Cite

Pulse-Width Modulation based Dot-Product Engine for Neuromorphic Computing System using Memristor Crossbar Array

Conference Proceedings - IEEE International Symposium on Circuits and Systems · April 26, 2018 The Dot-Product Engine (DPE) is a critical circuit for implementing neural networks in hardware. The recently developed memristor crossbar array technology, which is able to efficiently carry out dot-product multiplication and update its weights in real time ... Full text Cite

Design and Data Management for Magnetic Racetrack Memory

Conference Proceedings - IEEE International Symposium on Circuits and Systems · April 26, 2018 Benefiting from its ultra-high storage density, high energy efficiency, and non-volatility, racetrack memory demonstrates great potential in replacing conventional SRAM as large on-chip memory. Integrating the tape-like racetrack memory, however, faces uni ... Full text Cite

Exploring the opportunity of implementing neuromorphic computing systems with spintronic devices

Conference Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018 · April 19, 2018 Many cognitive algorithms such as neural networks cannot be efficiently executed by von Neumann architectures, the performance of which is constrained by the memory wall between microprocessor and memory hierarchy. Hence, researchers started to investigate ... Full text Cite

Recom: An efficient resistive accelerator for compressed deep neural networks

Conference Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018 · April 19, 2018 Deep Neural Networks (DNNs) play a key role in prevailing machine learning applications. Resistive random-access memory (ReRAM) is capable of both computation and storage, contributing to the acceleration of DNNs by processing in memory. Besides, a signifi ... Full text Cite

ReRAM-based accelerator for deep learning

Conference Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018 · April 19, 2018 Big data computing applications such as deep learning and graph analytics usually incur a large amount of data movement. Deploying such applications on conventional von Neumann architecture that separates the processing units and memory components likely l ... Full text Cite

A compact model for selectors based on metal doped electrolyte

Journal Article Applied Physics A: Materials Science and Processing · April 1, 2018 A selector device that demonstrates high nonlinearity and low switching voltages was fabricated using HfOx as a solid electrolyte doped with Ag electrodes. The electronic conductance of the volatile conductive filaments responsible for the switching was st ... Full text Cite

RC-NVM: Enabling Symmetric Row and Column Memory Accesses for In-memory Databases

Conference Proceedings - International Symposium on High-Performance Computer Architecture · March 27, 2018 Ever increasing DRAM capacity has fostered the development of in-memory databases (IMDB). The massive performance improvements provided by IMDBs have enabled transactions and analytics on the same database. In other words, the integration of OLTP (on-line ... Full text Cite

GraphR: Accelerating Graph Processing Using ReRAM

Conference Proceedings - International Symposium on High-Performance Computer Architecture · March 27, 2018 Graph processing has recently received intensive interest in light of a wide range of needs to understand relationships. It is well-known for the poor locality and high memory bandwidth requirement. In conventional architectures, they incur a significant amou ... Full text Cite

Study of and in radiative decays to

Journal Article Physical Review D · March 6, 2018 Full text Cite

Neuromorphic computing's yesterday, today, and tomorrow – an evolutional view

Journal Article Integration · March 1, 2018 Neuromorphic computing was originally referred to as the hardware that mimics neuro-biological architectures to implement models of neural systems. The concept was then extended to the computing systems that can run bio-inspired computing models, e.g., neu ... Full text Cite

Low-Power, Adaptive Neuromorphic Systems: Recent Progress and Future Directions

Journal Article IEEE Journal on Emerging and Selected Topics in Circuits and Systems · March 1, 2018 In this paper, we present a survey of recent works in developing neuromorphic or neuro-inspired hardware systems. In particular, we focus on those systems which can either learn from data in an unsupervised or online supervised manner. We present algorithm ... Full text Cite

Modeling of biaxial magnetic tunneling junction for multi-level cell STT-RAM realization

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 In recent years, spin-transfer torque random access memory (STT-RAM) has been widely studied as a promising candidate to replace DRAM because of its fast access time, high endurance, and good CMOS compatibility. The improvement of tunneling magneto-resista ... Full text Cite

Neu-NoC: A high-efficient interconnection network for accelerated neuromorphic systems

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 A modern neuromorphic acceleration system could consist of hundreds of accelerators, which are often organized through a network-on-chip (NoC). Although the overall computing ability is greatly promoted by a large number of the accelerators, the power cons ... Full text Cite

Spintronics based stochastic computing for efficient Bayesian inference system

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 Bayesian inference is an effective approach for solving statistical learning problems especially with uncertainty and incompleteness. However, inference efficiencies are physically limited by the bottlenecks of conventional computing platforms. In this pap ... Full text Cite

Process variation aware data management for magnetic skyrmions racetrack memory

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 Skyrmions racetrack memory (SKM) has been identified as a promising candidate for future on-chip cache. Similar to many other nanoscale technologies, process variations also adversely impact the reliability and performance of SKM cache. In this work, we pr ... Full text Cite

Running sparse and low-precision neural network: When algorithm meets hardware

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 Deep Neural Networks (DNNs) are pervasively applied in many artificial intelligence (AI) applications. The high performance of DNNs comes at the cost of larger size and higher compute complexity. Recent studies show that DNNs have much redundancy, such as ... Full text Cite

Understanding the trade-offs of device, circuit and application in ReRAM-based neuromorphic computing systems

Conference Technical Digest - International Electron Devices Meeting, IEDM · January 23, 2018 Resistive memory (ReRAM) features nonvolatile storage, high resistance, dense structure, and analogy to the matrix-vector multiplication operation. These characteristics demonstrate the great potential of ReRAM in the development of neuromorphic computing ... Full text Cite

Guest editorial circuit and system design automation for internet of things

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · January 1, 2018 Full text Cite

Beyond CMOS: Memristor and its application for next generation storage and computing

Conference ECS Transactions · January 1, 2018 To break Moore's Law, "Beyond CMOS" post-silicon technologies have gained great attention in recent years, especially, facing the immensely expensive storage and computation in terms of speed and energy in the current big data environment. Novel devices, c ... Full text Cite

Coordinating Filters for Faster Deep Neural Networks

Conference Proceedings of the IEEE International Conference on Computer Vision · December 22, 2017 Very large-scale Deep Neural Networks (DNNs) have achieved remarkable successes in a large variety of computer vision tasks. However, the high computation intensity of DNNs makes it challenging to deploy these models on resource-limited systems. Some studi ... Full text Cite

MobiCore: An adaptive hybrid approach for power-efficient CPU management on Android devices

Conference International System on Chip Conference · December 18, 2017 Smartphones are becoming essential devices used for various types of applications in our daily life. To satisfy the ever-increasing performance requirement, the number of CPU cores in a phone keeps growing, which imposes a great impact on its power consump ... Full text Cite

A closed-loop design to enhance weight stability of memristor based neural network chips

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 13, 2017 Compared with the algorithm optimizations, brain-inspired neural network chips aim to fundamentally change the computer architecture and therefore enhance the computation capability and performance in advanced data processing. In recent years, memristor te ... Full text Cite

MeDNN: A distributed mobile system with enhanced partition and deployment for large-scale DNNs

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 13, 2017 Deep Neural Networks (DNNs) are pervasively used in a significant number of applications and platforms. To enhance the execution efficiency of large-scale DNNs, previous attempts focus mainly on client-server paradigms, relying on powerful external infrast ... Full text Cite

AdaLearner: An adaptive distributed mobile learning system for neural networks

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 13, 2017 Neural networks hold a critical domain in machine learning algorithms because of their self-adaptiveness and state-of-the-art performance. Before the testing (inference) phases in practical use, sophisticated training (learning) phases are required, callin ... Full text Cite

A compact DNN: Approaching GoogLeNet-level accuracy of classification and domain adaptation

Conference Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017 · November 6, 2017 Recently, DNN model compression based on network architecture design, e.g., SqueezeNet, attracted a lot of attention. Compared to well-known models, these extremely compact networks don't show any accuracy drop on image classification. An emerging question ... Full text Cite

A quantization-aware regularized learning method in multilevel memristor-based neuromorphic computing system

Conference NVMSA 2017 - 6th IEEE Non-Volatile Memory Systems and Applications Symposium · October 10, 2017 In this work, we propose a regularized learning method that is able to take into account the deviation of the memristor-mapped synaptic weights from the target values determined during the training process. Experimental results obtained when utilizing the ... Full text Cite

Brain-inspired computing accelerated by memristor technology

Conference Proceedings of the 4th ACM International Conference on Nanoscale Computing and Communication, NanoCom 2017 · September 27, 2017 The brain-inspired computing, known as neuromorphic computing has demonstrated great potential in revolutionizing computation for high efficiency. In the neuromorphic engine, tremendous computing and power efficiency are achieved on a single chip. However, ... Full text Cite

An Energy-Efficient GPGPU Register File Architecture Using Racetrack Memory

Journal Article IEEE Transactions on Computers · September 1, 2017 Extreme multi-threading and fast thread switching in modern GPGPU require a large, power-hungry register file (RF), which quickly becomes one of the major obstacles on the upscaling path of energy-efficient GPGPU computing. In this work, we propose to implemen ... Full text Cite

A Compact Memristor-Based Dynamic Synapse for Spiking Neural Networks

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · August 1, 2017 Recent advances in memristor technology lead to the feasibility of large-scale neuromorphic systems by leveraging the similarity between memristor devices and synapses. For instance, memristor cross-point arrays can realize dense synapse network among hund ... Full text Cite

FlexLevel NAND Flash Storage System Design to Reduce LDPC Latency

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · July 1, 2017 Aggressive technology scaling and adoption of multilevel-cell technique lead to progressive increase of bit error rate (BER) of NAND flash memory. Consequently, conventional error correction code is not adequate to guarantee system reliability. As an alter ... Full text Cite

An FPGA design framework for CNN sparsification and acceleration

Conference Proceedings - IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2017 · June 30, 2017 Convolutional neural networks (CNNs) have recently broken many performance records in image recognition and object detection problems. The success of CNNs, to a great extent, is enabled by the fast scaling-up of the networks that learn from a huge volume o ... Full text Cite

The new large-scale RNNLM system based on distributed neuron

Conference Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017 · June 30, 2017 RNNLM (Recurrent Neural Network Language Model) can save the historical information of the training dataset in the last hidden layer, which can also serve as input for training. It has become an interesting topic in the field of Natural Language Processing research ... Full text Cite

Hardware implementation of echo state networks using memristor double crossbar arrays

Conference Proceedings of the International Joint Conference on Neural Networks · June 30, 2017 Neuromorphic computing systems are inspired by human brains, where data are stored and processed at the same location. Contrary to von Neumann systems, neuromorphic computing systems offer excellent real-time processing for huge data sizes, at low costs a ... Full text Cite

Rescuing Memristor-based Neuromorphic Design with High Defects

Conference Proceedings - Design Automation Conference · June 18, 2017 Memristor-based synaptic network has been widely investigated and applied to neuromorphic computing systems for the fast computation and low design cost. As memristors continue to mature and achieve higher density, bit failures within crossbar arrays can b ... Full text Cite

Group Scissor: Scaling Neuromorphic Computing Design to Large Neural Networks

Conference Proceedings - Design Automation Conference · June 18, 2017 Synapse crossbar is an elementary structure in neuromorphic computing systems (NCS). However, the limited size of crossbars and heavy routing congestion impede the NCS implementation of large neural networks. In this paper, we propose a two-step framework ... Full text Cite

Giant Spin-Hall assisted STT-RAM and logic design

Journal Article Integration, the VLSI Journal · June 1, 2017 In recent years, Spin-Transfer Torque Random Access Memory (STT-RAM) has attracted significant attention from both industry and academia due to its attractive attributes such as small cell area and non-volatility. However, long switching time and large pr ... Full text Cite

Recent Technology Advances of Emerging Memories

Journal Article IEEE Design and Test · June 1, 2017 Phase change memory, spin-transfer torque random access memory, and resistive random access memory are three major emerging memory technologies that receive tremendous attention from both academia and industry. In this survey article, the authors summariz ... Full text Cite

Cross-layer optimization for multilevel cell STT-RAM caches

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · June 1, 2017 Spin-transfer torque random access memory (STT-RAM), as an emerging nonvolatile memory technology, provides very dense array structure and extremely low leakage power consumption. It demonstrates a great potential in replacing conventional static random ac ... Full text Cite

Neuromorphic Hardware Acceleration Enabled by Emerging Technologies

Chapter · May 15, 2017 This book describes the current state of the art in big-data analytics, from a technology and hardware architecture perspective. ... Cite

Hybrid spiking-based multi-layered self-learning neuromorphic system based on memristor crossbar arrays

Conference Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017 · May 11, 2017 Neuromorphic computing systems are under heavy investigation as a potential substitute for the traditional von Neumann systems in high-speed low-power applications. Recently, memristor crossbar arrays were utilized in realizing spiking-based neuromorphic s ... Full text Cite

Understanding the design of IBM neurosynaptic system and its tradeoffs: A user perspective

Conference Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017 · May 11, 2017 As a large-scale commercial spiking-based neuromorphic computing platform, the IBM TrueNorth processor has received tremendous attention in society. However, one of the known issues in TrueNorth design is the limited precision of synaptic weights. The current wor ... Full text Cite

PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning

Conference Proceedings - International Symposium on High-Performance Computer Architecture · May 5, 2017 Convolution neural networks (CNNs) are the heart of deep learning applications. Recent works PRIME [1] and ISAAC [2] demonstrated the promise of using resistive random access memory (ReRAM) to perform neural computations in memory. We found that training c ... Full text Cite

Welcome

Conference Proceedings - International Symposium on Quality Electronic Design, ISQED · May 2, 2017 Full text Cite

Classification accuracy improvement for neuromorphic computing systems with one-level precision synapses

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 16, 2017 Brain inspired neuromorphic computing has demonstrated remarkable advantages over traditional von Neumann architecture for its high energy efficiency and parallel data processing. However, the limited resolution of synaptic weights degrades system accuracy ... Full text Cite

Extending the lifetime of object-based NAND flash device with STT-RAM/DRAM hybrid buffer

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 16, 2017 A major limitation of NAND flash memory is erase-before-program characteristics. It incurs write amplification, severely degrading system performance and endurance. Previous works reveal that metadata update substantially contributes to write amplification ... Full text Cite

A memristor-based neuromorphic engine with a current sensing scheme for artificial neural network applications

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 16, 2017 Following the big data revolution, neuromorphic computing is making a comeback for its great potential in information processing capability. Despite many types of architectures reported in the conventional CMOS domain, the memristor, as an example of emerging de ... Full text Cite

Message from the Technical Program Chairs

Conference Proceedings - 2016 IEEE International Symposium on Nanoelectronic and Information Systems, iNIS 2016 · January 23, 2017 Full text Cite

Nonvolatile memory design: Magnetic, resistive, and phase change

Book · January 1, 2017 The manufacture of flash memory, which is the dominant nonvolatile memory technology, is facing severe technical barriers. So much so, that some emerging technologies have been proposed as alternatives to flash memory in the nano-regime. Nonvolatile Memory ... Full text Cite

Looking Ahead for Resistive Memory Technology: A broad perspective on ReRAM technology for future storage and computing

Journal Article IEEE Consumer Electronics Magazine · January 1, 2017 Resistive random-access memory (ReRAM) is regarded as one of the most promising alternative nonvolatile memory technologies for its advantages in very-high-storage density, simple structure, low power consumption, and long endurance, as well as good compat ... Full text Cite

In-place logic obfuscation for emerging nonvolatile FPGAs

Chapter · January 1, 2017 To enhance system integrity of FPGA-based embedded systems on hardware design and data communication, we propose a hardware security scheme for nonvolatile resistive random access memory (RRAM) based FPGA, in which internal block RAM (BRAMs) are used for c ... Full text Cite

Conventional and Neuromorphic Systems Leveraging Emerging Memory Technologies

Conference 2017 International Symposium on VLSI Design, Automation and Test (VLSI-DAT) · January 1, 2017 Link to item Cite

TernGrad: Ternary gradients to reduce communication in distributed deep learning

Conference Advances in Neural Information Processing Systems · January 1, 2017 High network communication cost for synchronizing gradients and parameters is the well-known bottleneck of distributed training. In this work, we propose TernGrad that uses ternary gradients to accelerate distributed deep learning in data parallelism. Our ... Cite

Nanoscale memory architectures for neuromorphic computing

Chapter · January 1, 2017 On one hand, machine learning has been widely used in data processing to help users understand the underlying property of the data [1]. As a popular type of machine learning model, neural network [2] processes input data by multiplying them with layers ... Full text Cite

RAM and TCAM designs by using STT-MRAM

Conference 2016 16th Non-Volatile Memory Technology Symposium, NVMTS 2016 · December 9, 2016 Spin-transfer torque magnetic random access memory (STT-MRAM) is a prospective candidate for cache and main memory designs. However, the reliable revision of magnetization using current requires high current density, which is hardly affordable in aggressiv ... Full text Cite

ApesNet: a pixel‐wise efficient segmentation network for embedded devices

Journal Article IET Cyber-Physical Systems: Theory & Applications · December 2016 Full text Cite

Design techniques of eNVM-enabled neuromorphic computing systems

Conference Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016 · November 22, 2016 The recently emerged research on 'neuromorphic computing', which stands for hardware acceleration of brain-inspired computing, has become one of the most active research areas in computer engineering. In this invited paper, we start with a background intro ... Full text Cite

Neural processor design enabled by memristor technology

Conference 2016 IEEE International Conference on Rebooting Computing, ICRC 2016 - Conference Proceedings · November 8, 2016 Matrix-vector multiplication is a key computing operation in neural processor design and hence greatly affects the execution efficiency. Memristor crossbar is highly attractive for the implementation of matrix-vector multiplication for its analog storage s ... Full text Cite

Security of neuromorphic computing: Thwarting learning attacks using memristor's obsolescence effect

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 7, 2016 Neuromorphic architectures are widely used in many applications for advanced data processing, and often implement proprietary algorithms. In this work, we prevent an attacker with physical access from learning the proprietary algorithm implemented by the ... Full text Cite

Security challenges in smart surveillance systems and the solutions based on emerging nano-devices

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 7, 2016 Modern smart surveillance systems can not only record the monitored environment but also identify the targeted objects and detect anomalous activities. These advanced functions are often facilitated by deep neural networks, achieving very high accuracy and l ... Full text Cite

A data locality-aware design framework for reconfigurable sparse matrix-vector multiplication kernel

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 7, 2016 Sparse matrix-vector multiplication (SpMV) is an important computational kernel in many applications. For performance improvement, software libraries designated for SpMV computation have been introduced, e.g., MKL library for CPUs and cuSPARSE library for ... Full text Cite

Guest Editorial: Design and Applications of Neuromorphic Computing System

Journal Article IEEE Transactions on Multi-Scale Computing Systems · October 1, 2016 Full text Cite

ApesNet: A pixel-wise efficient segmentation network

Conference Proceedings of the 14th ACM/IEEE Symposium on Embedded Systems for Real-Time Multimedia, ESTIMedia 2016 · October 1, 2016 Autonomous driving can effectively reduce traffic congestion and road accidents. Therefore, it is necessary to implement an efficient high-level, scene understanding model in an embedded device with limited power and sources. Toward this goal, we propose A ... Full text Cite

Exploring the optimal learning technique for IBM TrueNorth platform to overcome quantization loss

Conference Proceedings of the 2016 IEEE/ACM International Symposium on Nanoscale Architectures, NANOARCH 2016 · September 14, 2016 As the first large-scale commercial spiking-based neuromorphic computing platform, the IBM TrueNorth chip has received tremendous attention in society. However, one of the known issues in TrueNorth design is the limited precision of synaptic weights, each of whic ... Full text Cite

Message from the general chairs

Conference Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI · September 2, 2016 Full text Cite

A memristor crossbar based computing engine optimized for high speed and accuracy

Conference Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI · September 2, 2016 Matrix-vector multiplication, as a key computing operation, has been largely adopted in applications and hence greatly affects the execution efficiency. A common technique to enhance the performance of matrix-vector multiplication is increasing execution p ... Full text Cite

A Neuromorphic Architecture for Context Aware Text Image Recognition

Journal Article Journal of Signal Processing Systems · September 1, 2016 Although existing optical character recognition (OCR) tools can achieve excellent performance in text image detection and pattern recognition, they usually require a clean input image. Most of them do not perform well when the image is partially occluded o ... Full text Cite

ObjNandSim: Object-based NAND flash device simulator

Conference 2016 5th Non-Volatile Memory Systems and Applications Symposium, NVMSA 2016 · August 17, 2016 An object-based NAND flash storage system (ONFS) is proposed to overcome the architectural limitation of the existing block-based storage system. The ONFS can improve system performance by removing redundant software layers and reducing garbage collection ... Full text Cite

Design and Implementation of a 4Kb STT-MRAM with Innovative 200nm Nano-ring Shaped MTJ

Conference Proceedings of the International Symposium on Low Power Electronics and Design · August 8, 2016 Programmability is a severe challenge in the development of spin-transfer torque magnetic random access memory (STT-MRAM). Theoretical analyses have indicated that nano-ring shaped magnetic tunneling junction (NR-MTJ) can achieve lower write current and hig ... Full text Cite

A neuromorphic ASIC design using one-selector-one-memristor crossbar

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 29, 2016 The applications of memristors in neuromorphic computing have been extensively studied for their analogy to synapses. To overcome the sneak path issue, nonlinear resistive selectors have been introduced to the design of memristor crossbars, enabling a high integra ... Full text Cite

Heterogeneous systems with reconfigurable neuromorphic computing accelerators

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 29, 2016 Developing heterogeneous systems with hardware accelerators is a promising solution to implement high performance applications where explicitly programmed, rule-based algorithms are either infeasible or inefficient. However, mapping a neural network model to ... Full text Cite

Security of neuromorphic systems: Challenges and solutions

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 29, 2016 With the rapid growth of big-data applications, advanced data processing technologies, e.g., machine learning, are widely adopted in many industry fields. Although these technologies demonstrate powerful data analyzing and processing capability, there exis ... Full text Cite

Built-in selectors self-assembled into memristors

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 29, 2016 We demonstrate an approach to build a selector into ReRAM (memristors) using engineered materials. In this approach, a segment(s) of nonlinear material is self-assembled into the conduction channel (s) (filament) of a memristor. The nonlinear material exhi ... Full text Cite

Compact low-power instant store and restore D flip-flop using a selfcomplementing spintronic device

Journal Article Electronics Letters · July 7, 2016 To simplify power-gating requirements in ultra-low-power architectures, design strategies for low-power non-volatile flip-flops (F/Fs) are sought, for which the utilisation of spintronic devices offers a promising option. A D F/F that utilises a five-termi ... Full text Cite

A new learning method for inference accuracy, core occupation, and performance co-optimization on TrueNorth chip

Conference Proceedings - Design Automation Conference · June 5, 2016 IBM TrueNorth chip uses digital spikes to perform neuromorphic computing and achieves ultrahigh execution parallelism and power efficiency. However, in TrueNorth chip, low quantization resolution of the synaptic weights and spikes significantly limits the ... Full text Cite

TEMP: Thread batch enabled memory partitioning for GPU

Conference Proceedings - Design Automation Conference · June 5, 2016 As massive multi-threading in GPU imposes tremendous pressure on memory subsystems, efficient bandwidth utilization becomes a key factor affecting the GPU throughput. In this work, we propose thread batch enabled memory partitioning (TEMP), to improve GPU ... Full text Cite

Spin-hall assisted STT-RAM design and discussion

Conference Proceedings of the 18th ACM/IEEE System Level Interconnect Prediction 2016 Workshop, SLIP 2016 · June 4, 2016 In recent years, Spin-Transfer Torque Random Access Memory (STT-RAM) has attracted significant attention from both industry and academia due to its attractive attributes such as small cell area and non-volatility. However, long switching time and large pr ... Full text Cite

Leveraging Stochastic Memristor Devices in Neuromorphic Hardware Systems

Journal Article IEEE Journal on Emerging and Selected Topics in Circuits and Systems · June 1, 2016 As the fourth basic circuit element, the memristor has a unique synapse-alike feature which demonstrates great potential in neuromorphic circuit design. However, a large gap exists between the theoretical memristor characteristics and the actual device behavi ... Full text Cite

Spintronic Memristor as Interface between DNA and Solid State Devices

Journal Article IEEE Journal on Emerging and Selected Topics in Circuits and Systems · June 1, 2016 Recently biomolecular computing platforms have been widely investigated with great potentials in both biomedical research and practices, such as using molecular structures of DNA to present the data bits and to operate the logic. Emerging CMOS/molecular hy ... Full text Cite

Welcome to ISQED 2016

Conference Proceedings - International Symposium on Quality Electronic Design, ISQED · May 25, 2016 Full text Cite

The applications of NVM technology in hardware security

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · May 18, 2016 The emerging nonvolatile memory (NVM) technologies have demonstrated great potentials in revolutionizing modern memory hierarchy because of their many promising properties: nanosecond read/write time, small cell area, non-volatility, and easy CMOS integrat ... Full text Cite

Harmonica: A Framework of Heterogeneous Computing Systems with Memristor-Based Neuromorphic Computing Accelerators

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · May 1, 2016 Following technology scaling, on-chip heterogeneous architecture emerges as a promising solution to combat the power wall of microprocessors. This work presents Harmonica - a framework of heterogeneous computing system enhanced by memristor-based neuromorph ... Full text Cite

Library-based placement and routing in FPGAs with support of partial reconfiguration

Journal Article ACM Transactions on Design Automation of Electronic Systems · May 1, 2016 While traditional Field-Programmable Gate Array design flow usually employs fine-grained tile-based placement, modular placement is increasingly required to speed up the large-scale placement and save the synthesis time. Moreover, the commonly used modules ... Full text Cite

Small-world Hopfield neural networks with weight salience priority and memristor synapses for digit recognition

Journal Article Neural Computing and Applications · May 1, 2016 A novel systematic design of associative memory networks is addressed in this paper, by incorporating both the biological small-world effect and the recently acclaimed memristor into the conventional Hopfield neural network. More specifically, the original ... Full text Cite

A holistic tri-region MLC STT-RAM design with combined performance, energy, and reliability optimizations

Conference Proceedings of the 2016 Design, Automation and Test in Europe Conference and Exhibition, DATE 2016 · April 25, 2016 Multi-level cell spin-transfer torque random access memory (MLC STT-RAM) demonstrates great potential in on-chip cache design for its high storage density and non-volatility but also suffers from degraded access time, reliability and energy efficiency. ... Full text Cite

Sliding Basket: An adaptive ECC scheme for runtime write failure suppression of STT-RAM cache

Conference Proceedings of the 2016 Design, Automation and Test in Europe Conference and Exhibition, DATE 2016 · April 25, 2016 Write reliability is one of the major challenges in design of spin-transfer torque random access memory (STT-RAM) caches. To ensure design quality, error correction code (ECC) scheme is usually adopted in STT-RAM caches. However, it incurs significant hard ... Full text Cite

Hardware acceleration for neuromorphic computing: An evolving view

Conference 2015 15th Non-Volatile Memory Technology Symposium, NVMTS 2015 · April 20, 2016 The rapid growth of computing capacity of modern microprocessors enables the wide adoption of machine learning and neural network models. The ever-increasing demand for performance, combining with the concern on power budget, motivated the recent research ... Full text Cite

Array Organization and Data Management Exploration in Racetrack Memory

Journal Article IEEE Transactions on Computers · April 1, 2016 As the descendant of spin-transfer random access memory (STT-RAM), racetrack memory technology saves data in magnetic domains along nanoscopic wires. Such a unique structure can achieve unprecedentedly high storage density meanwhile inheriting the promisin ... Full text Cite

A novel PUF based on cell error rate distribution of STT-RAM

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 7, 2016 Physical Unclonable Functions (PUFs) have been widely proposed as security primitives to provide device identification and authentication. Recently, PUFs based on Non-volatile Memory (NVM) are widely proposed since the promise of NVMs' wide application. In ... Full text Cite

Radiation-induced soft error analysis of STT-MRAM: A device to circuit approach

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · March 1, 2016 Spin-transfer torque magnetic random access memory (STT-MRAM) is a promising emerging memory technology due to its various advantageous features such as scalability, nonvolatility, density, endurance, and fast speed. However, the reliability of STT-MRAM is ... Full text Cite

The evolutionary spintronic technologies and their usage in high performance computing

Conference International System on Chip Conference · February 12, 2016 This paper gives a comprehensive summary of our study in using the spintronic technologies for the on-chip cache density improvement of high performance computing systems. We will start with the spin-transfer torque random access memory (STT-RAM) at the ea ... Full text Cite

Memristor modeling - static, statistical, and stochastic methodologies

Chapter · January 1, 2016 Memristor, the fourth passive circuit element, has attracted increased attention since it was rediscovered by HP Lab in 2008. Its distinctive characteristic to record the historic profile of the voltage/current creates a great potential for future neuromor ... Full text Cite

Learning structured sparsity in deep neural networks

Conference Advances in Neural Information Processing Systems · January 1, 2016 High demand for computation resources severely hinders deployment of large-scale Deep Neural Networks (DNN) in resource constrained devices. In this work, we propose a Structured Sparsity Learning (SSL) method to regularize the structures (i.e., filters, c ... Cite

Welcome.

Conference ISQED · 2016 Cite

Synthesis and Characterization of 3-butyryloxy-16-(β-naphthylmethylene)-5α-androstane-17-one

Conference Proceedings of the 2016 International Conference on Biomedical and Biological Engineering · 2016 Link to item Cite

Hierarchical Library Based Power Estimator for Versatile FPGAs

Conference Proceedings - IEEE 9th International Symposium on Embedded Multicore/Manycore SoCs, MCSoC 2015 · November 11, 2015 FPGA is a promising hardware accelerator in modern high-performance computing systems, e.g. cloud computing, big-data processing, etc. In such a system, power is a key factor of the design requiring thermal and energy-saving considerations. Modern power es ... Full text Cite

An overview on memristor crossbar based neuromorphic circuit and architecture

Conference IEEE/IFIP International Conference on VLSI and System-on-Chip, VLSI-SoC · October 30, 2015 As technology advances, artificial intelligence becomes pervasive in society and ubiquitous in our lives, which stimulates the desire for embedded-everywhere and human-centric intelligent computation paradigm. However, conventional instruction-based comput ... Full text Cite

Hierarchical library based power estimator for versatile FPGAs

Conference 25th International Conference on Field Programmable Logic and Applications, FPL 2015 · October 7, 2015 FPGA is a promising hardware accelerator in modern high-performance computing systems. In such a system, power is a key factor in the design requiring thermal and energy-saving considerations. Modern power estimators for FPGA either support specific hardwa ... Full text Cite

Spiking-based matrix computation by leveraging memristor crossbar array

Conference 2015 IEEE Symposium on Computational Intelligence for Security and Defense Applications, CISDA 2015 - Proceedings · August 17, 2015 As process technology continues scaling down, the memory barrier becomes more severe. Thus, spiking neuromorphic computing that can significantly enhance computing and communication efficiencies has been widely studied. Both conventional CMOS technology an ... Full text Cite

The applications of memristor devices in next-generation cortical processor designs

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 27, 2015 Discovery of memristor opened a new era of the research on universal memory thanks to many attractive properties demonstrated by this emerging device. In this paper, we switch our research focus to neuromorphic computing, which, same as memory technology, ... Full text Cite

A new self-reference sensing scheme for TLC MRAM

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 27, 2015 Density is one of the major design factors of magnetic random access memory (MRAM). Very recently, a tri-level cell (TLC) structure was proposed to enhance the storage density of MRAM. In this work, we propose a new self-reference sensing scheme for the TL ... Full text Cite

Cloning your mind: Security challenges in cognitive system designs and their solutions

Conference Proceedings - Design Automation Conference · July 24, 2015 With the booming of big-data applications, cognitive information processing systems that leverage advanced data processing technologies, e.g., machine learning and data mining, are widely used in many industry fields. Although these technologies demonstrat ... Full text Cite

VWS: A versatile warp scheduler for exploring diverse cache localities of GPGPU applications

Conference Proceedings - Design Automation Conference · July 24, 2015 Massive multi-threading of GPGPU demands for efficient usage of caches with limited capacity. In this work, we propose a versatile warp scheduler (VWS) to reduce the cache miss rate in GPGPU. VWS retains the intra-warp cache locality using an efficient per ... Full text Cite

RENO: A high-efficient reconfigurable neuromorphic computing accelerator design

Conference Proceedings - Design Automation Conference · July 24, 2015 Neuromorphic computing is recently gaining significant attention as a promising candidate to conquer the well-known von Neumann bottleneck. In this work, we propose RENO - an efficient reconfigurable neuromorphic computing accelerator. RENO leverages the ex ... Full text Cite

FlexLevel: A novel NAND flash storage system design for LDPC latency reduction

Conference Proceedings - Design Automation Conference · July 24, 2015 LDPC code is introduced in NAND flash memory to handle high BER (bit error rate) incurred by technology scaling. Despite strong error correction capability, LDPC decoding induces long NAND flash read latency. In this work, we propose FlexLevel - a robust N ... Full text Cite

Vortex: Variation-aware training for memristor X-bar

Conference Proceedings - Design Automation Conference · July 24, 2015 Recent advances in development of memristor devices and cross-bar integration allow us to implement a low-power on-chip neuromorphic computing system (NCS) with small footprint. Training methods have been proposed to program the memristors in a crossbar by ... Full text Cite

A spiking neuromorphic design with resistive crossbar

Conference Proceedings - Design Automation Conference · July 24, 2015 Neuromorphic systems recently gained increasing attention for their high computation efficiency. Many designs have been proposed and realized with traditional CMOS technology or emerging devices. In this work, we proposed a spiking neuromorphic design buil ... Full text Cite

FPGA acceleration of recurrent neural network based language model

Conference Proceedings - 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2015 · July 15, 2015 Recurrent neural network (RNN) based language model (RNNLM) is a biologically inspired model for natural language processing. It records the historical information through additional recurrent connections and therefore is very effective in capturing semant ... Full text Cite

Spin-hall assisted STT-RAM design and discussion

Conference 2015 IEEE International Magnetics Conference, INTERMAG 2015 · July 14, 2015 Conventional spin-transfer torque random access memory (STT-RAM) is a promising technology due to its non-volatility and dense cell structure. However, the long switching time of magnetic tunneling junction (MTJ) limits the write speed of the STT-RAM. In o ... Full text Cite

Read performance: The newest barrier in scaled STT-RAM

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · June 1, 2015 Spin-torque transfer RAM (STT-RAM), a promising alternative to static RAM (SRAM) for reducing leakage power consumption, has been widely studied to mitigate the impact of its asymmetrically long write latency. However, physical effects of technology scalin ... Full text Cite

A high-speed robust NVM-TCAM design using body bias feedback

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · May 20, 2015 As manufacture process scales down rapidly, the design of ternary content-addressable memory (TCAM) requiring high storage density, fast access speed and low power consumption becomes very challenging. In recent years, many novel TCAM designs have been ins ... Full text Cite

Energy efficient RRAM spiking neural network for real time classification

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · May 20, 2015 Inspired by the human brain's function and efficiency, neuromorphic computing offers a promising solution for a wide set of tasks, ranging from brain machine interfaces to real-time classification. The spiking neural network (SNN), which encodes and proces ... Full text Cite

A novel true random number generator design leveraging emerging memristor technology

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · May 20, 2015 Memristor, the fourth basic circuit element, demonstrates obvious stochastic behaviors in both the static resistance states and the dynamic switching. In this work, a novel memristor-based true random number generator (MTRNG) is presented which leverages t ... Full text Cite

Giant spin hall effect (GSHE) logic design for low power application

Conference Proceedings - Design, Automation and Test in Europe, DATE · April 22, 2015 Conventional CMOS transistors will reach their power wall, where huge leakage power consumption limits performance growth as technology scales down, especially beyond the 45nm technology node. Spin based devices are one of the alternative computing technologi ... Full text Cite

Welcome to ISQED 2015

Conference Proceedings - International Symposium on Quality Electronic Design, ISQED · April 13, 2015 Full text Cite

An efficient STT-RAM-based register file in GPU architectures

Conference 20th Asia and South Pacific Design Automation Conference, ASP-DAC 2015 · March 11, 2015 Modern GPGPUs employ a large register file (RF) to efficiently process heavily parallel threads in single instruction multiple thread (SIMT) fashion. The up-scaling of RF capacity, however, is greatly constrained by large cell area and high leakage power c ... Full text Cite

Quantitative modeling of racetrack memory, a tradeoff among area, performance, and power

Conference 20th Asia and South Pacific Design Automation Conference, ASP-DAC 2015 · March 11, 2015 Recently, an emerging non-volatile memory called Racetrack Memory (RM) becomes promising to satisfy the requirement of increasing on-chip memory capacity. RM can achieve ultra-high storage density by integrating many bits in a tape-like racetrack, and also ... Full text Cite

Neuromorphic hardware acceleration enabled by emerging technologies (Invited paper)

Conference Proceedings of the 14th International Symposium on Integrated Circuits, ISIC 2014 · February 2, 2015 The explosion of big data applications imposes severe challenges of data processing speed and scalability on traditional computer systems. However, the performance of the von Neumann machine is greatly hindered by the increasing performance gap between CPU ... Full text Cite

Energy efficient spiking neural network design with RRAM devices

Conference Proceedings of the 14th International Symposium on Integrated Circuits, ISIC 2014 · February 2, 2015 The brain-inspired neural networks have demonstrated great potential in big data analysis. The spiking neural network (SNN), which encodes the real world data into spike trains, promises great performance in computational ability and energy efficiency. Mor ... Full text Cite

Reduction and IR-drop compensations techniques for reliable neuromorphic computing systems

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 5, 2015 Neuromorphic computing system (NCS) is a promising architecture to combat the well-known memory bottleneck in Von Neumann architecture. The recent breakthrough on memristor devices made an important step toward realizing a low-power, small-footprint NCS on ... Full text Cite

Consistency of surface pulse and reciprocity calibration of piezoelectric AE sensors

Conference Proceedings of the 2015 Symposium on Piezoelectricity, Acoustic Waves and Device Applications · January 1, 2015 Link to item Cite

Neuromorphic acceleration for context aware text image recognition

Conference IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation · December 15, 2014 Although existing optical character recognition (OCR) tools can achieve excellent performance in text image detection and pattern recognition, they usually require a clean input image. Most of them do not perform well when the image is partially occluded o ... Full text Cite

Optimizing MLC-based STT-RAM caches by dynamic block size reconfiguration

Conference 2014 32nd IEEE International Conference on Computer Design, ICCD 2014 · December 3, 2014 The use of STT-RAM as on-chip caches has been widely studied. However, existing works focused mainly on single-level cell (SLC) design while the potential of multi-level cell (MLC) STT-RAM has not yet been fully explored. It is expected that MLC STT-RAM ca ... Full text Cite

Emerging memristor technology enabled next generation cortical processor

Conference International System on Chip Conference · November 5, 2014 The explosion of 'big data' applications imposes severe challenges of data processing speed and scalability on traditional computer systems. However, the performance of von Neumann machine is greatly hindered by the increasing performance gap between CPU a ... Full text Cite

A novel self-reference technique for STT-RAM read and write reliability enhancement

Journal Article IEEE Transactions on Magnetics · November 1, 2014 Spin-transfer torque random access memory (STT-RAM) has demonstrated great potential in embedded and stand-alone applications. However, process variations and thermal fluctuations greatly influence the operation reliability of STT-RAM and limit its scalabi ... Full text Cite

Memristor crossbar-based neuromorphic computing system: a case study.

Journal Article IEEE Transactions on Neural Networks and Learning Systems · October 2014 By mimicking the highly parallel biological systems, neuromorphic hardware provides the capability of information processing within a compact and energy-efficient platform. However, traditional Von Neumann architecture and the limited signal connections ha ... Full text Cite

Memristor modeling - Static, statistical, and stochastic methodologies

Conference Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI · September 18, 2014 Memristor, the fourth passive circuit element, has attracted increased attention since it was rediscovered by HP Lab in 2008. Its distinctive characteristic to record the historic profile of the voltage/current creates a great potential for future neuromorphic ... Full text Cite

A weighted sensing scheme for ReRAM-based cross-point memory array

Conference Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI · September 18, 2014 In recent years, the design of cross-point array based on resistive random access memory (ReRAM) has been widely investigated because it offers extremely high storage density and low power consumption. However, the sneak-path leakage in such a resistive ne ... Full text Cite

An adjustable memristor model and its application in small-world neural networks

Conference Proceedings of the International Joint Conference on Neural Networks · September 3, 2014 This paper presents a novel mathematical model for the TiO2 thin-film memristor device discovered by Hewlett-Packard (HP) labs. Our proposed model considers the boundary conditions and the nonlinear ionic drift effects by using a piecewise linear window fu ... Full text Cite

STDP learning rule based on memristor with STDP property

Conference Proceedings of the International Joint Conference on Neural Networks · September 3, 2014 Spike-timing-dependent plasticity (STDP) learning ability has been observed in physical memristors, but whether the STDP is caused by the neuron or the memristor is unclear. In this paper, we proved the STDP property in the model for both symmetric and asy ... Full text Cite

The Prospect of STT-RAM Scaling

Chapter · August 4, 2014 Featuring contributions from well-known and respected industrial and academic experts, this cutting-edge work not only presents the latest research and developments but also: Describes spintronic applications in current and future magnetic ... Cite

Spintronic memristor as interface between DNA and solid state devices

Chapter · August 1, 2014 Magnetic sensing is widely used in various modern bio-medical devices since many physiological functions (e.g., nerve impulses) generate electrical currents that create magnetic field [24]. Monitoring such signals by detecting magnetic field is less invasi ... Full text Cite

Research on the Sol-Gel Method of Preparing Ternary Nano SiO2-Al2O3-TiO2 Materials

Conference Key Engineering Materials · April 2014 Tetraethyl orthosilicate (TEOS), butyl titanate [Ti(OBu)4] and aluminium isopropoxide were used as molecular precursors of ternary nano SiO2-Al2O3-TiO2 Full text Cite

The stochastic modeling of TiO2 memristor and its usage in neuromorphic system design

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 27, 2014 Memristor, the fourth basic circuit element, has shown great potential in neuromorphic circuit design for its unique synapse-like feature. However, though the continuous resistance state of memristor has been expected, obtaining and maintaining an arbitrar ... Full text Cite

A coherent hybrid SRAM and STT-RAM L1 cache architecture for shared memory multicores

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 27, 2014 STT-RAM is an emerging NVRAM technology that promises high density, low energy and a comparable access speed to conventional SRAM. This paper proposes a hybrid L1 cache architecture that incorporates both SRAM and STT-RAM. The key novelty of the proposal i ... Full text Cite

A heterogeneous computing system with memristor-based neuromorphic accelerators

Conference 2014 IEEE High Performance Extreme Computing Conference, HPEC 2014 · February 11, 2014 As technology scales, on-chip heterogeneous architecture emerges as a promising solution to combat the power wall of microprocessors. In this work, we propose a heterogeneous computing system with memristor-based neuromorphic computing accelerators (NCAs). ... Full text Cite

Bio-inspired computing with resistive memories - Models, architectures and applications

Conference Proceedings - IEEE International Symposium on Circuits and Systems · January 1, 2014 The traditional Von Neumann architecture has constrained the potential for applying massively parallel architecture to embedded high performance computing where we must optimize the size, weight and power of the system. Inspired by highly parallel biologic ... Full text Cite

A novel memristive multilayer feedforward small-world neural network with its applications in PID control.

Journal Article The Scientific World Journal · January 2014 In this paper, we present an implementation scheme of a memristor-based multilayer feedforward small-world neural network (MFSNN), inspired by the lack of hardware realization of the MFSNN on account of the need for a large number of electronic neurons a ... Full text Cite

Design exploration of racetrack lower-level caches

Conference Proceedings of the International Symposium on Low Power Electronics and Design · January 1, 2014 The recent successful integration of magnetic racetrack memory forecasts a new computing era with unprecedentedly high-density on-chip storage. However, racetrack memory accesses require frequent magnetic domain shifting, introducing overheads in access la ... Full text Cite

STT-RAM cache hierarchy design and exploration with emerging magnetic devices

Chapter · January 1, 2014 Spin-transfer torque random access memory (STT-RAM) is a promising new nonvolatile technology that has good scalability, zero standby power, and radiation hardness. The use of STT-RAM in last level on-chip caches has been proposed as it significantly reduc ... Full text Cite

ICE: Inline calibration for memristor crossbar-based computing engine

Conference Proceedings - Design, Automation and Test in Europe, DATE · January 1, 2014 The emerging neuromorphic computation provides a revolutionary solution to the alternative computing architecture and effectively extends Moore's Law. The discovery of the memristor presents a promising hardware realization of neuromorphic systems with inc ... Full text Cite

Accelerating graph computation with racetrack memory and pointer-assisted graph representation

Conference Proceedings - Design, Automation and Test in Europe, DATE · January 1, 2014 The poor performance of NAND Flash memory, such as long access latency and large granularity access, is the major bottleneck of graph processing. This paper proposes an intelligent storage for graph processing which is based on fast and low cost racetrack ... Full text Cite

Exploration of GPGPU register file architecture using domain-wall-shift-write based racetrack memory

Conference Proceedings - Design Automation Conference · January 1, 2014 SRAM based register file (RF) is one of the major factors limiting the scaling of GPGPU. In this work, we propose to use the emerging nonvolatile domain-wall-shift-write based racetrack memory (DWSW-RM) to implement a power-efficient GPGPU RF, of which the ... Full text Cite

A new field-assisted access scheme of STT-RAM with self-reference capability

Conference Proceedings - Design Automation Conference · January 1, 2014 Spin-transfer torque random access memory (STT-RAM) has demonstrated great potentials in embedded and stand-alone applications. However, process variations and thermal fluctuations greatly influence the operation reliability of STT-RAM and limit its scalab ... Full text Cite

GLSVLSI'14 chairs' welcome

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · January 1, 2014 Cite

STT-RAM cache hierarchy with multiretention MTJ designs

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · January 1, 2014 Spin-transfer torque random access memory (STT-RAM) is the most promising candidate to be universal memory due to its good scalability, zero standby power, and radiation hardness. Having a cell area only 1/9 to 1/3 that of SRAM, allows for a much larger ca ... Full text Cite

Optimizing MLC-based STT-RAM Caches by Dynamic Block Size Reconfiguration

Conference 2014 32nd IEEE International Conference on Computer Design (ICCD) · January 1, 2014 Link to item Cite

A practical low-power memristor-based analog neural branch predictor

Conference Proceedings of the International Symposium on Low Power Electronics and Design · December 11, 2013 Recently, the discovery of memristor brought the promise of high density, low energy, and combined memory/arithmetic capability into computing. This paper demonstrates a practical neural branch predictor based on memristor. By using analog computation tech ... Full text Cite

A neuromorphic architecture for anomaly detection in autonomous large-area traffic monitoring

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2013 The advanced sensing and imaging capability of today's sensor networks enables real time monitoring in a large area. In order to provide continuous monitoring and prompt situational awareness, an abstract-level autonomous information processing framework i ... Full text Cite

ADAMS: Asymmetric differential STT-RAM cell structure for reliable and high-performance applications

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2013 Spin-transfer torque random access memory (STT-RAM) is an emerging non-volatile memory technology offering many attractive characteristics like high integration density, nanosecond access time, and good CMOS compatibility. However, the performance and reli ... Full text Cite

Unleashing the potential of MLC STT-RAM caches

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2013 In this paper, we study the use of multi-level cell (MLC) spin-transfer torque RAM (STT-RAM) in cache design of embedded systems and microprocessors. Compared to the single level cell (SLC) design, a MLC STT-RAM cache is expected to offer higher density an ... Full text Cite

C1C: A configurable, compiler-guided STT-RAM L1 cache

Journal Article Transactions on Architecture and Code Optimization · December 1, 2013 Spin-Transfer Torque RAM (STT-RAM), a promising alternative to SRAM for reducing leakage power consumption, has been widely studied to mitigate the impact of its asymmetrically long write latency. Recently, STT-RAM has been proposed for L1 caches by relaxi ... Full text Cite

Memristor-based synapse design and a case study in reconfigurable systems

Conference Proceedings of the International Joint Conference on Neural Networks · December 1, 2013 Scientists have dreamed of an information system with cognitive human-like skills for years. However, constrained by the device characteristics and rapidly increasing design complexity under the traditional processing technology, little progress has been m ... Full text Cite

A pseudo-weighted sensing scheme for memristor based cross-point memory

Conference Proceedings of the 2013 IEEE/ACM International Symposium on Nanoscale Architectures, NANOARCH 2013 · November 6, 2013 To further extend the scaling trend of traditional CMOS technology, many hybrid architectures integrating emerging device technologies have been proposed recently. Among them, memristor based cross-point memory (MBCPM) demonstrates great potential in data ... Full text Cite

On-chip caches built on multilevel spin-transfer torque RAM cells and its optimizations

Journal Article ACM Journal on Emerging Technologies in Computing Systems · October 21, 2013 It has been predicted that a processor's caches could occupy as much as 90% of chip area a few technology nodes from the current ones. In this article, we investigate the use of multilevel spin-transfer torque RAM (STT-RAM) cells in the design of processor ... Full text Cite

BSB training scheme implementation on memristor-based circuit

Conference Proceedings of the 2013 IEEE Symposium on Computational Intelligence for Security and Defense Applications, CISDA 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013 · October 9, 2013 In this work, we propose a hardware realization of the Brain-State-in-a-Box (BSB) neural network model training algorithm. This method can be implemented as an analog/digital mixed-signal circuit to train memristor crossbar arrays within BSB circuits. The ... Full text Cite

Common-source-line array: An area efficient memory architecture for bipolar nonvolatile devices

Journal Article ACM Transactions on Design Automation of Electronic Systems · October 1, 2013 Traditional array organization of bipolar nonvolatile memories such as STT-MRAM and memristor utilizes two bitlines for cell manipulations. With technology scaling, such bitline pair will soon become the bottleneck for further density improvement. In this a ... Full text Cite

Cross-layer racetrack memory design for ultra high density and low power consumption

Conference Proceedings - Design Automation Conference · July 12, 2013 The racetrack memory technology utilizes magnetic domains along a nanoscopic wire to obtain ultra-high data storage density. The recent success in the planar racetrack nanowire promised its fabrication feasibility and future scalability, bringing more desi ... Full text Cite

Coordinating prefetching and STT-RAM based last-level cache management for multicore systems

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · May 30, 2013 Data prefetching is a common mechanism to mitigate the bottleneck of off-chip memory bandwidth in modern computing systems. Unfortunately, the side effects of prefetching are an additional burden on off-chip communication and increased cache write operatio ... Full text Cite

A hardware security scheme for RRAM-based FPGA

Conference 2013 23rd International Conference on Field Programmable Logic and Applications, FPL 2013 - Proceedings · January 1, 2013 To enhance the system integrity of FPGA-based embedded systems on hardware design, we propose a hardware security scheme for nonvolatile resistive random access memory (RRAM) based FPGA, in which internal block RAM (BRAMs) are used for configuration and te ... Full text Cite

STT-RAM designs supporting dual-port accesses

Conference Proceedings - Design, Automation and Test in Europe, DATE · January 1, 2013 The spin-transfer torque random access memory (STT-RAM) has been widely investigated as a promising candidate to replace the static random access memory (SRAM) as on-chip cache memories. However, the existing STT-RAM cell designs can be used for only singl ... Full text Cite

DA-RAID-5: A disturb aware data protection technique for NAND flash storage systems

Conference Proceedings - Design, Automation and Test in Europe, DATE · January 1, 2013 Program disturb, read disturb and retention time limit are three major reasons accounting for the bit errors in NAND flash memory. The adoption of multi-level cell (MLC) technology and technology scaling further aggravates this reliability issue by narrowi ... Full text Cite

Digital-assisted noise-eliminating training for memristor crossbar-based analog neuromorphic computing engine

Conference Proceedings - Design Automation Conference · January 1, 2013 The invention of neuromorphic computing architecture is inspired by the working mechanism of human-brain. Memristor technology revitalized neuromorphic computing system design by efficiently executing the analog Matrix-Vector multiplication on the memristo ... Full text Cite

C1C: A Configurable, Compiler-Guided STT-RAM L1 Cache

Journal Article ACM Transactions on Architecture and Code Optimization · January 1, 2013 Spin-Transfer Torque RAM (STT-RAM), a promising alternative to SRAM for reducing leakage power consumption, has been widely studied to mitigate the impact of its asymmetrically long write latency. Recently, STT-RAM has been proposed for L1 caches by relaxi ... Full text Cite

An Image Compression Method using Sparse Representation and Grey Relation

Conference Proceedings of 2013 IEEE International Conference on Grey Systems and Intelligent Services (GSIS) · 2013 Cite

Non-volatile 3D stacking RRAM-based FPGA

Conference Proceedings - 22nd International Conference on Field Programmable Logic and Applications, FPL 2012 · December 12, 2012 We demonstrate a novel Field-Programmable Gate Array (FPGA) structure based on Resistive Random Access Memory (RRAM) system. RRAM is a non-volatile memory device which is compatible with CMOS Back End of Line (BEOL) process with only 4F² area per cell. We u ... Full text Cite

uBRAM-based run-time reconfigurable FPGA and corresponding reconfiguration methodology

Conference FPT 2012 - 2012 International Conference on Field-Programmable Technology · December 1, 2012 With rising demands for high-performance computing and design flexibility of post-fabrication system, reconfigurable architecture has been drawing increasing attentions. However, reconfigurability, advantage of current Field-Programmable Gate Array (FPGA), ... Full text Cite

Spintronic devices: From memory to memristor

Conference ICSICT 2012 - 2012 IEEE 11th International Conference on Solid-State and Integrated Circuit Technology, Proceedings · December 1, 2012 This paper provides a broad overview of spintronic devices with emphasis on memory and spintronic memristive systems. The operational fundamentals of spintronic devices are presented, followed by brief descriptions of magnetic tunneling junctions (MTJs), s ... Full text Cite

STT-RAM cell design considering CMOS and MTJ temperature dependence

Journal Article IEEE Transactions on Magnetics · October 29, 2012 In spin-transfer torque random access memory (STT-RAM), the temperature fluctuations can significantly affect the characteristics of both electrical and magnetic devices. In this paper, we analyze their temperature dependence and investigate the impacts of ... Full text Cite

Analysis and optimization of thermal effect on STT-RAM based 3-D stacked cache design

Conference Proceedings - 2012 IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2012 · October 29, 2012 Spin-Transfer Torque Random Access Memory (STT-RAM) has been proved a promising emerging nonvolatile memory technology suitable for many applications such as cache memory of CPU. Simulation results show that the switching time of Magnetic Tunnel Junction ( ... Full text Cite

A novel peripheral circuit for RRAM-based LUT

Conference ISCAS 2012 - 2012 IEEE International Symposium on Circuits and Systems · September 28, 2012 Resistive random access memory (RRAM) is a promising candidate to substitute static random access memory (SRAM) in lookup table (LUT) design for its high density and non-volatility. RRAM cells are fabricated at backend CMOS process and have negligible area ... Full text Cite

The 3-D stacking bipolar RRAM for high density

Journal Article IEEE Transactions on Nanotechnology · September 17, 2012 For its simple structure, high density, and good scalability, the resistive random access memory (RRAM) has emerged as one of the promising candidates for large data storage in computing systems. Moreover, building up RRAM in a 3-D stacking structure furth ... Full text Cite

A dual-mode architecture for fast-switching STT-RAM

Conference Proceedings of the International Symposium on Low Power Electronics and Design · September 4, 2012 In the past, the spin-transfer torque RAM (STT-RAM) suffered from the slow write speed and the high write energy consumption. The latest progress in device engineering has dramatically reduced the write time to a few nanoseconds and hence enabled the fast- ... Full text Cite

Process variation aware data management for STT-RAM cache design

Conference Proceedings of the International Symposium on Low Power Electronics and Design · September 4, 2012 The spin-transfer torque random access memory (STT-RAM) has gained increasing attentions for its high density, fast read access, zero standby power, and good scalability. The recently proposed retention-relax design further improves STT-RAM write access pe ... Full text Cite

Voltage driven nondestructive self-reference sensing for STT-RAM yield enhancement

Journal Article SPIN · September 1, 2012 Spin-transfer torque random access memory (STT-RAM) has demonstrated great potentials as a universal memory for its fast access speed, zero standby power, excellent scalability and simplicity of cell structure. However, large process variations of both mag ... Full text Cite

Memristor crossbar based hardware realization of BSB recall function

Conference Proceedings of the International Joint Conference on Neural Networks · August 22, 2012 The Brain-State-in-a-Box (BSB) model is an auto-associative neural network that has been widely used in optical character recognition and image processing. Traditionally, the BSB model was realized at software level and carried out on high-performance comp ... Full text Cite

Memristor-based synapse design and training scheme for neuromorphic computing architecture

Conference Proceedings of the International Joint Conference on Neural Networks · August 22, 2012 Memristors have been rediscovered recently and then gained increasing attentions. Their unique properties, such as high density, nonvolatility, and recording historic behavior of current (or voltage) profile, have inspired the creation of memristor-based n ... Full text Cite

Statistical memristor modeling and case study in neuromorphic computing

Conference Proceedings - Design Automation Conference · July 11, 2012 Memristor, the fourth passive circuit element, has attracted increased attention since it was rediscovered by HP Lab in 2008. Its distinctive characteristic to record the historic profile of the voltage/current creates a great potential for future neuromor ... Full text Cite

Hardware realization of BSB recall function using memristor crossbar arrays

Conference Proceedings - Design Automation Conference · July 11, 2012 The Brain-State-in-a-Box (BSB) model is an auto-associative neural network that has been widely used in optical character recognition and image processing. Traditionally, the BSB model was realized at software level and carried out on high-performance comp ... Full text Cite

Statistical Memristor Model and Its Applications in Neuromorphic Computing

Chapter · June 28, 2012 This book presents a selection of the remarkable contributions given by the leaders of the field and it may serve as inspiration and future reference to all researchers that want to explore the extraordinary possibilities given by this ... Cite

Nonvolatile memories as the data storage system for implantable ECG recorder

Journal Article ACM Journal on Emerging Technologies in Computing Systems · June 1, 2012 In this article, we propose a data storage system with the emerging nonvolatile memory technologies used for the implantable electrocardiography (ECG) recorder. The proposed storage system can record the digitalized real-time ECG waveforms continuously insid ... Full text Cite

Spintronic memristor based temperature sensor design with CMOS current reference

Conference Proceedings - Design, Automation and Test in Europe, DATE · May 24, 2012 As the technology scales down, the increased power density brings in significant system reliability issues. Therefore, the temperature monitoring and the induced power management become more and more critical. The thermal fluctuation effects of the recentl ... Cite

Architecting a common-source-line array for bipolar non-volatile memory devices

Conference Proceedings - Design, Automation and Test in Europe, DATE · May 24, 2012 Traditional array organization of bipolar non-volatile memories such as STT-MRAM and memristor utilizes two bitlines for cell manipulations. With technology scaling, such bitline pair will soon become the bottleneck of density improvement. In this paper we ... Cite

Fine-grained dynamic voltage scaling on OLED display

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · April 26, 2012 Organic Light Emitting Diode (OLED) has emerged as the new generation display technique for mobile multimedia devices. Compared to existing technologies OLEDs are thinner, brighter, lighter, and cheaper. However, OLED panels are still the biggest contribut ... Full text Cite

A look up table design with 3D bipolar RRAMs

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · April 26, 2012 Look Up Table (LUT) is a basic configurable logic element in Field Programmable Gate Arrays (FPGAs). In a commercial product, Static Random Access Memory (SRAM) has been widely used in each LUT to store configured logic. Recently, emerging Resistive RAM (R ... Full text Cite

Magnetic tunnel junction design margin exploration for self-reference sensing scheme.

Journal Article Journal of applied physics · April 2012 This work investigates the magnetic tunnel junction (MTJ) design requirements for the application of nondestructive self-reference sensing scheme, a novel sensing scheme featuring high tolerance of process variations, fast sensing speed, and no impact on d ... Full text Cite

A 130 nm 1.2 V/3.3 V 16 Kb spin-transfer torque random access memory with nondestructive self-reference sensing scheme

Journal Article IEEE Journal of Solid-State Circuits · February 1, 2012 Among all the emerging memories, Spin-Transfer Torque Random Access Memory (STT-RAM) has demonstrated many promising features such as fast access speed, nonvolatility, excellent scalability, and compatibility to CMOS process. However, the large process var ... Full text Cite

Voltage driven nondestructive self-reference sensing scheme of spin-transfer torque memory

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · January 1, 2012 Spin-transfer torque random access memory (STT-RAM) has demonstrated great potentials as a universal memory for its fast access speed, zero standby power, excellent scalability, and simplicity of cell structure. However, large process variations of both ma ... Full text Cite

Probabilistic design methodology to improve run-time stability and performance of STT-RAM caches

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2012 Using the spin-transfer torque random access memory (STT-RAM) technology as lower level on-chip caches has been proposed to minimize leakage power consumption and enhance cache capacity at the scaled technologies. However, programming STT-RAM is a stochast ... Full text Cite

Consequences of inhibiting amyloid precursor protein processing enzymes on synaptic function and plasticity.

Journal Article Neural plasticity · January 2012 Alzheimer's disease (AD) is a neurodegenerative disease, one of whose major pathological hallmarks is the accumulation of amyloid plaques comprised of aggregated β-amyloid (Aβ) peptides. It is now recognized that soluble Aβ oligomers may lead to synaptic d ... Full text Cite

Multi retention level STT-RAM cache designs with a dynamic refresh scheme

Conference Proceedings of the Annual International Symposium on Microarchitecture, MICRO · December 1, 2011 Spin-transfer torque random access memory (STT-RAM) has received increasing attention because of its attractive features: good scalability, zero standby power, non-volatility and radiation hardness. The use of STT-RAM technology in the last level on-chip c ... Full text Cite

Fast statistical model of TiO2 thin-film memristor and design implication

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2011 The emerging memristor devices have recently received increased attention since HP Lab reported the first TiO2-based memristive structure. As it is at nano-scale geometry size, the uniformity of memristor device is difficult to control due to the process ... Full text Cite

Universal statistical cure for predicting memory loss

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2011 Novel nonvolatile memory (NVM) technologies are gaining significant attention from semiconductor industry in the competition of universal memory development. However, as nanoscale devices, these emerging NVMs suffer from the intrinsic technology challenges ... Full text Cite

Emerging non-volatile memories: Opportunities and challenges

Conference Embedded Systems Week 2011, ESWEEK 2011 - Proceedings of the 9th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS'11 · November 22, 2011 In recent years, non-volatile memory (NVM) technologies have emerged as candidates for future universal memory. NVMs generally have advantages such as low leakage power, high density, and fast read speed. At the same time, NVMs also have disadvantages. For ... Full text Cite

A 1.0V 45nm nonvolatile magnetic latch design and its robustness analysis

Conference Proceedings of the Custom Integrated Circuits Conference · November 9, 2011 A new nonvolatile latch design is proposed based on the magnetic tunneling junction (MTJ) devices. In the standby mode, the latched data can be retained in the MTJs without consuming any power. Two types of operation errors, namely, persistent and non-pers ... Full text Cite

Processor caches built using multi-level spin-transfer torque RAM cells

Conference Proceedings of the International Symposium on Low Power Electronics and Design · September 19, 2011 It has been predicted that a processor's caches could occupy as much as 90% of chip area a few technology nodes from the current ones. In this paper, we study the use of multi-level spin-transfer torque RAM (STT-RAM) cells in the design of processor caches. Compa ... Full text Cite

3D-HIM: A 3D High-density interleaved memory for bipolar RRAM design

Conference Proceedings of the 2011 IEEE/ACM International Symposium on Nanoscale Architectures, NANOARCH 2011 · August 11, 2011 Because of its simple structure, high density and good scalability, resistive random access memory (RRAM) is expected to be a promising candidate to substitute traditional data storage devices, e.g., hard-disk drive (HDD). In a conventional three-dimension ... Full text Cite

3D-ICML: A 3D bipolar ReRAM design with interleaved complementary memory layers

Conference Proceedings - Design, Automation and Test in Europe, DATE · May 31, 2011 Resistive random access memory (ReRAM) has been demonstrated as a promising non-volatile memory technology with features such as high density, low power, good scalability, easy fabrication and compatibility to the existing CMOS technology. The conventional ... Cite

Stacking magnetic random access memory atop microprocessors: An architecture-level evaluation

Journal Article IET Computers and Digital Techniques · May 1, 2011 Magnetic random access memory (MRAM) has been considered as a promising memory technology because of its attractive properties such as non-volatility, fast access, zero standby leakage and high density. Although integrating MRAM with complementary metal-ox ... Full text Cite

Geometry variations analysis of TiO2 thin-film and spintronic memristors

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 28, 2011 The fourth passive circuit element, memristor, has attracted increased attentions since the first real device was discovered by HP Lab in 2008. Its distinctive characteristic to record the historic profile of the voltage/current through itself creates grea ... Full text Cite

Emerging sensing techniques for emerging memories

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 28, 2011 Among all emerging memories, Spin-Transfer Torque Random Access Memory (STT-RAM) has shown many promising features such as fast access speed, nonvolatility, compatibility to CMOS process and excellent scalability. However, large process variations of both ... Full text Cite

Current switching in MgO-based magnetic tunneling junctions

Journal Article IEEE Transactions on Magnetics · January 1, 2011 Spin-transfer induced magnetization switching in a MgO-based magnetic tunneling junction (MTJ) has been measured over a wide time range. It was found that the switching current response is asymmetric going from the high resistance state to the low resistan ... Full text Cite

Spintronic memristor: Compact model and statistical analysis

Journal Article Journal of Low Power Electronics · January 1, 2011 The fourth fundamental passive circuit element - memristor, has received the increased attentions after a real device was demonstrated by HP Lab in 2008. The distinctive characteristic of a memristor to record the historical profile of the voltage/current ... Full text Cite

Nonpersistent errors optimization in spin-MOS logic and storage circuitry

Journal Article IEEE Transactions on Magnetics · January 1, 2011 By combining the flexibility of MOS logic and the nonvolatility of spintronic devices, Spin-MOS logic and storage circuitries offer a promising approach to implement a highly integrated, power-efficient, and nonvolatile computing and storage systems. Besid ... Full text Cite

STT-RAM cell optimization considering MTJ and CMOS variations

Journal Article IEEE Transactions on Magnetics · January 1, 2011 Spin-transfer torque random access memory (STT-RAM) becomes a promising technology for future computing systems for its fast access time, high density, nonvolatility, and small write current. However, like all the other nanotechnologies, STT-RAM suffers fr ... Full text Cite

Performance, power, and reliability tradeoffs of STT-RAM cell subject to architecture-level requirement

Journal Article IEEE Transactions on Magnetics · January 1, 2011 Large switching current and long switching time have significantly limited the adoption of spin-transfer torque random access memory (STT-RAM). Technology scaling, moreover, makes it very challenging to reduce the switching current while maintaining the re ... Full text Cite

Universal Statistical Cure For Predicting Memory Loss (Invited Paper)

Conference 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) · January 1, 2011 Link to item Cite

Design margin exploration of spin-transfer torque RAM (STT-RAM) in scaled technologies

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · December 1, 2010 We propose a magnetic and electric level spin-transfer torque random access memory (STT-RAM) cell model to simulate the write operation of an STT-RAM. The model of a magnetic tunneling junction (MTJ) is modified to take into account the electrical response ... Full text Cite

Applications of TMR devices in solid state circuits and systems

Conference 2010 International SoC Design Conference, ISOCC 2010 · December 1, 2010 Spintronic devices have recently attracted significant attentions in solid state circuit society as a promising device in the applications of nonvolatile memory and emerging circuit design, i.e., memristor-based system. In this paper, we introduce Tunnelin ... Full text Cite

Spintronic devices: From memory to memristor

Conference 2010 International Conference on Communications, Circuits and Systems, ICCCAS 2010 - Proceedings · November 19, 2010 In 1971, Professor Leon Chua in UC Berkeley predicted the fourth fundamental passive circuit element - memristor, based on the conceptual completeness of circuit theory. 37 years later, a team at HP Labs led by Dr. Stanley Williams announced the developmen ... Full text Cite

Combined magnetic- and circuit-level enhancements for the nondestructive self-reference scheme of STT-RAM

Conference Proceedings of the International Symposium on Low Power Electronics and Design · October 21, 2010 A nondestructive self-reference read scheme (NSRS) was recently proposed to overcome the bit-to-bit variation in Spin-Transfer Torque Random Access Memory (STT-RAM). In this work, we introduced three magnetic- and circuit-level techniques, including 1) R-I ... Full text Cite

Emerging non-volatile memory technologies: From materials, to device, circuit, and architecture

Conference Midwest Symposium on Circuits and Systems · September 20, 2010 The emerging nonvolatile memory technologies are gaining significant attention from the semiconductor industry in recent years. Multiple promising candidates, such as phase change memory, magnetic memory, resistive memory, and memristor, have gained substantial attent ... Full text Cite

Access scheme of multi-level cell spin-transfer torque random access memory and its optimization

Conference Midwest Symposium on Circuits and Systems · September 20, 2010 In this work, we study the access (read and write) scheme of the newly proposed Multi-Level Cell Spin-Transfer Torque Random Access Memory (MLC STT-RAM) from both the circuit design and architectural perspectives. Based on the physical principles of the re ... Full text Cite

The application of spintronic devices in magnetic bio-sensing

Conference Proceedings of the 2nd Asia Symposium on Quality Electronic Design, ASQED 2010 · September 17, 2010 Recently integrated magnetic/spintronic device microarrays have demonstrated great potentials in both biomedical research and practices. In this work, we discuss the physical mechanisms of three types of spintronic devices for magnetic signal sensing, incl ... Full text Cite

PCMO device with high switching stability

Journal Article IEEE Electron Device Letters · August 1, 2010 We studied the relationship between the resistive-switching properties of the Pr0.7Ca0.3MnO3 (PCMO) thin-film elements and their geometry dimensions below submicrometers. Our electrical test results of a series of PCMO-based resistive-switching devices wit ... Full text Cite

Patents relevant to cross-point memory array

Journal Article Recent Patents on Electrical Engineering · June 25, 2010 Patents relevant to cross-point memory array structure are reviewed. These patents are selected from the categories of cross-point, crossbar, memory array and emerging memory. These patents address the questions of how to build a cross-point memory, includ ... Full text Cite

Compact model of memristors and its application in computing systems

Conference Proceedings - Design, Automation and Test in Europe, DATE · June 9, 2010 In this paper, we present a compact model of the spintronic memristor based on the magnetic-domain-wall motion mechanism for circuit design. Our model also takes into account the variations of material parameters and fabrication process, which significantl ... Cite

A nondestructive self-reference scheme for spin-transfer torque random access memory (STT-RAM)

Conference Proceedings - Design, Automation and Test in Europe, DATE · June 9, 2010 We proposed a novel self-reference sensing scheme for Spin-Transfer Torque Random Access Memory (STT-RAM) to overcome the large bit-to-bit variation of Magnetic Tunneling Junction (MTJ) resistance. Different from all the existing schemes, our solution is n ... Cite

Scalability of PCMO-based resistive switch device in DSM technologies

Conference Proceedings of the 11th International Symposium on Quality Electronic Design, ISQED 2010 · May 28, 2010 This work systematically explores the relationship between the resistive switching properties of Pr0.7Ca0.3MnO3 (PCMO) thin film element and its geometry dimensions in deep submicron (DSM) technologies. A series of PCMO-based resistive switch devices (RSDs ... Full text Cite

Spin transfer torque memory with thermal assist mechanism: A case study

Journal Article IEEE Transactions on Magnetics · March 1, 2010 We have investigated spin transfer torque random access memory (STT-RAM) with a thermal-assist programming scheme using finite-element thermal simulation. We conducted the study on a specific memory element design to analyze the thermal dynamics and therma ... Full text Cite

Spintronic memristor temperature sensor

Journal Article IEEE Electron Device Letters · January 1, 2010 Thermal fluctuation effects on the electric behavior of a spintronic memristor based upon the spin-torque-induced domain-wall motion are explored. Depending upon material, geometry, and electric excitation strength, the device electric behavior can be eith ... Full text Cite

Variable-Latency Adder (VL-Adder) Designs for Low Power and NBTI Tolerance

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · January 1, 2010 In this paper, we proposed a new adder design called variable-latency adder (VL-adder). This technique allows the adder to work at a lower supply voltage than that required by a conventional adder while maintaining the same throughput. The VL-adder design ... Full text Cite

Variation tolerant sensing scheme of spin-transfer torque memory for yield improvement

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2010 Spin-Transfer Torque Random Access Memory (STT-RAM) demonstrated great potential as a universal memory for its fast access speed, zero standby power, excellent scalability and simplicity of cell structure. However, large process variations of both magneti ... Full text Cite

A hybrid solid-state storage architecture for the performance, energy consumption, and lifetime improvement

Conference Proceedings - International Symposium on High-Performance Computer Architecture · January 1, 2010 In recent years, many systems have employed NAND flash memory as storage devices because of its advantages of higher performance (compared to the traditional hard disk drive), high-density, random-access, increasing capacity, and falling cost. On the other ... Full text Cite

Gated decap: Gate leakage control of on-chip decoupling capacitors in scaled technologies

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · December 1, 2009 To minimize the leakage power dissipation of present-day on-chip Decaps, we propose a gated decoupling capacitor (GDecap) technique that deactivates a Decap when it is not needed. The application of the proposed GDecap technique on an eight-way clock-gated ... Full text Cite

The salvage cache: A fault-tolerant cache architecture for next-generation memory technologies

Conference Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors · December 1, 2009 There has been much work on the next generation of memory technologies such as MRAM, RRAM and PRAM. Most of these are non-volatile in nature, and compared to SRAM, they are often denser, just as fast, and have much lower energy consumption. Using 3-D stack ... Full text Cite

An overview of non-volatile memory technology and the implication for tools and architectures

Conference Proceedings - Design, Automation and Test in Europe, DATE · October 22, 2009 Novel nonvolatile memory technologies are gaining significant attention from the semiconductor industry in the competition of universal memory development. We used Spin-Transfer Torque Random Access Memory (STT-RAM) and Resistive Random Access Memory (R-RAM) ... Cite

Thermal-assisted spin transfer torque memory (STT-RAM) cell design exploration

Conference Proceedings of the 2009 IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2009 · October 5, 2009 Thermal-assisted spin-transfer torque random access memory (STT-RAM) has been considered as a promising candidate of next-generation nonvolatile memory technology. We conducted finite element simulation on thermal dynamics in the programming process of the ... Full text Cite

Tolerating process variations in large, set-associative caches: The buddy cache

Journal Article Transactions on Architecture and Code Optimization · June 1, 2009 One important trend in today's microprocessor architectures is the increase in size of the processor caches. These caches also tend to be set associative. As technology scales, process variations are expected to increase the fault rates of the SRAM cells t ... Full text Cite

Spintronic memristor through spin-torque-induced magnetization motion

Journal Article IEEE Electron Device Letters · February 12, 2009 Existence of spintronic memristor in nanoscale is demonstrated based upon spin-torque-induced magnetization switching and magnetic-domain-wall motion. Our examples show that memristive effects are quite universal for spin-torque spintronic device at the ti ... Full text Cite

Circuit and microarchitecture evaluation of 3D stacking magnetic RAM (MRAM) as a universal memory replacement

Conference Proceedings - Design Automation Conference · September 17, 2008 Magnetic Random Access Memory (MRAM) has been considered as a promising memory technology due to many attractive properties. Integrating MRAM with CMOS logic may incur extra manufacture cost, due to its hybrid magnetic-CMOS fabrication process. Stacking MR ... Full text Cite

Design margin exploration of Spin-Torque Transfer RAM (SPRAM)

Conference Proceedings of the 9th International Symposium on Quality Electronic Design, ISQED 2008 · August 25, 2008 We proposed a combined magnetic and circuit level technique to explore the design methodology of Spin-Torque Transfer RAM (SPRAM). A dynamic magnetic model of magnetic tunneling junction (MTJ), which is based upon measured spin torque induced magnetization ... Full text Cite

Design for Low Power

Chapter · January 7, 2008 After nearly six years as the field's leading reference, the second edition of this award-winning handbook reemerges with completely updated content and a brand new format. ... Cite

Spin torque random access memory down to 22 nm technology

Journal Article IEEE Transactions on Magnetics · January 1, 2008 Spin torque random access memory (ST-MRAM) design spaces down to CMOS 22 nm technology node are explored using a dynamic magnetic tunneling junction (MTJ)-CMOS model. The coupled dynamics of MTJ and CMOS is modeled by a combination of MTJ micromagnetic sim ... Full text Cite

Design margin exploration of spin-torque transfer RAM (SPRAM)

Conference ISQED 2008: Proceedings of the Ninth International Symposium on Quality Electronic Design · January 1, 2008 Full text Link to item Cite

Variable-latency adder (VL-adder): New arithmetic circuit design practice to overcome NBTI

Conference Proceedings of the International Symposium on Low Power Electronics and Design · December 17, 2007 Negative bias temperature instability (NBTI) has become a dominant reliability concern for nanoscale PMOS transistors. In this paper, we propose variable-latency adder (VL-adder) technique for NBTI tolerance. By detecting the circuit failure on-the-fly, th ... Full text Cite

VOSCH: Voltage scaled cache hierarchies

Conference 2007 IEEE International Conference on Computer Design, ICCD 2007 · December 1, 2007 The cache hierarchy of state-of-the-art - especially multicore - microprocessors consumes a significant amount of area and energy. A significant amount of research has been devoted especially to reducing the latter. One of the most important microarchitect ... Full text Cite

SAVS: A self-adaptive variable supply-voltage technique for process-tolerant and power-efficient multi-issue superscalar processor design

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · September 19, 2006 Technology scaling and sub-wavelength optical lithography is associated with significant process variations. We propose a self-adaptive variable supply-voltage scaling (SAVS) technique for multi-issue out-of-order pipeline to improve parametric yield with ... Cite

Cascaded carry-select adder (C2SA): A new structure for low-power CSA design

Conference Proceedings of the International Symposium on Low Power Electronics and Design · December 12, 2005 In this paper we propose a novel low-power Carry-Select Adder (CSA) design called Cascaded CSA (C2SA). Based on the prediction of the critical path delay of current operation, C2SA can automatically work with one or two clock-cycle latency and a scaled sup ... Cite

Combined circuit and architectural level variable supply-voltage scaling for low power

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · May 1, 2005 Energy-efficient processor design is becoming more and more important with technology scaling and with high performance requirements. Supply-voltage scaling is an efficient way to reduce energy by lowering the operating voltage and the clock frequency of p ... Full text Cite

Gated decap: Gate leakage control of on-chip decoupling capacitors in scaled technologies

Conference CICC: Proceedings of the IEEE 2005 Custom Integrated Circuits Conference · January 1, 2005 Link to item Cite

Gated Decap: Gate leakage control of on-chip decoupling capacitors in scaled technologies

Conference Proceedings of the Custom Integrated Circuits Conference · January 1, 2005 A novel on-chip Decoupling Capacitor (Decap) design - Gated Decoupling Capacitor (GDecap) - is proposed to minimize the leakage power dissipation associated with present-day on-chip decoupling capacitors. Experiments on the application of GDecap in an 8-wa ... Full text Cite

DCG: Deterministic Clock-Gating for Low-Power Microprocessor Design

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · March 1, 2004 With the scaling of technology and the need for higher performance and more functionality, power dissipation is becoming a major bottleneck for microprocessor designs. Because clock power can be significant in high-performance processors, we propose a dete ... Full text Cite

A single-Vt low-leakage gated-ground cache for deep submicron

Journal Article IEEE Journal of Solid-State Circuits · February 1, 2003 In this paper, we propose a novel integrated circuit and architectural level technique to reduce leakage power consumption in high-performance cache memories using single Vt (transistor threshold voltage) process. We utilize the concept of gated-Ground (nM ... Full text Cite

VSV: L2-miss-driven variable supply-voltage scaling for low power

Conference Proceedings of the Annual International Symposium on Microarchitecture, MICRO · January 1, 2003 Energy efficient processor design is becoming more and more important with technology scaling and with high performance requirements. Supply-voltage scaling is an efficient way to reduce energy by lowering the operating voltage and the clock frequency of p ... Full text Cite

Deterministic clock gating for microprocessor power reduction

Conference Proceedings - International Symposium on High-Performance Computer Architecture · January 1, 2003 With the scaling of technology and the need for higher performance and more functionality, power dissipation is becoming a major bottleneck for microprocessor designs. Pipeline balancing (PLB), a previous technique, is essentially a methodology to clock-ga ... Full text Cite

A high performance IDDQ testable cache for scaled CMOS technologies

Conference Proceedings of the Asian Test Symposium · January 1, 2002 Quiescent supply current (IDDQ) testing is a useful test method for static CMOS RAM and can be combined with functional testing to reduce total test time and to increase reliability. However the sensitivity of IDDQ testing deteriorates significantly with t ... Full text Cite

DRG-Cache: A data retention gated-ground cache for low power

Conference Proceedings - Design Automation Conference · January 1, 2002 In this paper we propose a novel integrated circuit and architectural level technique to reduce leakage power consumption in high performance cache memories using single Vt (transistor threshold voltage) process. We utilize the concept of Gated-Ground (NMO ... Full text Cite