Yiran Chen

John Cocke Distinguished Professor of Electrical and Computer Engineering
Electrical and Computer Engineering
130 Hudson Hall, PO Box 90291, Durham, NC 27708
405 Wilkinson Building, 534 Research Dr., Durham, NC 27705
Office hours: Appointment only.

Selected Publications


A memristive all-inclusive hypernetwork for parallel analog deployment of full search space architectures.

Journal Article Neural networks : the official journal of the International Neural Network Society · July 2024 In recent years, there has been a significant advancement in memristor-based neural networks, positioning them as a pivotal processing-in-memory deployment architecture for a wide array of deep learning applications. Within this realm of progress, the emer ... Full text Cite

Lithography Hotspot Detection Based on Heterogeneous Federated Learning With Local Adaptation and Feature Selection

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · May 1, 2024 Since the scaling of advanced technology nodes is pushing to its physical limit, lithography hotspot detection (LHD) has become more significant than ever in design for manufacturability. Recently, machine learning techniques have been deployed to greatly ... Full text Cite

NDRec: A Near-Data Processing System for Training Large-Scale Recommendation Models

Journal Article IEEE Transactions on Computers · May 1, 2024 Recent advances in deep neural networks (DNNs) have enabled highly effective recommendation models for diverse web services. In such DNN-based recommendation models, the embedding layer comprises the majority of model parameters. As these models scale rapi ... Full text Cite

Efficient, Direct, and Restricted Black-Box Graph Evasion Attacks to Any-Layer Graph Neural Networks via Influence Function

Conference WSDM 2024 - Proceedings of the 17th ACM International Conference on Web Search and Data Mining · March 4, 2024 Graph neural network (GNN), the mainstream method to learn on graph data, is vulnerable to graph evasion attacks, where an attacker slightly perturbing the graph structure can fool trained GNN models. Existing work has at least one of the following drawbac ... Full text Cite

Toward Fully Automated Machine Learning for Routability Estimator Development

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · March 1, 2024 The rise of machine learning (ML) technology inspires a boom of its applications in electronic design automation (EDA) and helps improve the degree of automation in chip designs. However, manually crafting ML models remains a complex and time-consuming pro ... Full text Cite

Neuro-Symbolic Computing: Advancements and Challenges in Hardware-Software Co-Design

Journal Article IEEE Transactions on Circuits and Systems II: Express Briefs · March 1, 2024 The rapid progress of artificial intelligence (AI) has led to the emergence of a highly promising field known as neuro-symbolic (NeSy) computing. This approach combines the strengths of neural networks, which excel at data-driven learning, with the reasoni ... Full text Cite

Athena – The NSF AI Institute for Edge Computing

Journal Article AI Magazine · March 1, 2024 The National Science Foundation (NSF) Artificial Intelligence (AI) Institute for Edge Computing Leveraging Next Generation Networks (Athena) seeks to foment a transformation in modern edge computing by advancing AI foundations, computing paradigms, network ... Full text Cite

Efficient Low-Bit Neural Network With Memristor-Based Reconfigurable Circuits

Journal Article IEEE Transactions on Circuits and Systems II: Express Briefs · January 1, 2024 As neural network models are developed and optimized, the use of neural networks in edge devices is increasing, where low-bit neural networks, such as binary neural networks and mixed-precision neural networks, are ideal for edge AI applications. Periphera ... Full text Cite

An Interview with Dr. Nicky Lu [Interview]

Journal Article IEEE Circuits and Systems Magazine · January 1, 2024 In October 2023, Fan Chen, an Associate Editor at IEEE Circuits and Systems Magazine, had the privilege of interviewing Dr. Nicky Lu, an icon in IC design and the semiconductor industry. This engaging interview covers Dr. Lu's educational journey, illustri ... Full text Cite

Tunable Hybrid Proposal Networks for the Open World

Conference Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024 · January 1, 2024 Current state-of-the-art object proposal networks are trained with a closed-world assumption, meaning they learn to only detect objects of the training classes. These models fail to provide high recall in open-world environments where important novel objec ... Full text Cite

Si-Kintsugi: Towards Recovering Golden-Like Performance of Defective Many-Core Spatial Architectures for AI

Conference Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023 · October 28, 2023 The growing demand for higher compute and memory capacity driven by artificial intelligence (AI) applications pushes higher core counts in modern systems. Many-core architectures exhibiting spatial interconnects with high on-chip bandwidth are ideal for th ... Full text Cite

Swirls: Sniffing Wi-Fi Using Radios with Low Sampling Rates

Conference Proceedings of the International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc) · October 23, 2023 Next-generation Wi-Fi systems embrace large signal bandwidth to achieve significantly improved data rates, while requiring efficient methods for network monitoring and spectrum sharing applications. A radio receiver (RX) operating at low sampling rates can ... Full text Cite

EMS-i: An Efficient Memory System Design with Specialized Caching Mechanism for Recommendation Inference

Journal Article ACM Transactions on Embedded Computing Systems · September 9, 2023 Recommendation systems have been widely embedded into many Internet services. For example, Meta's deep learning recommendation model (DLRM) shows high predictive accuracy of click-through rate in processing large-scale embedding tables. The SparseLengthSum ... Full text Cite

SpikeSen: Low-Latency In-Sensor-Intelligence Design With Neuromorphic Spiking Neurons

Journal Article IEEE Transactions on Circuits and Systems II: Express Briefs · June 1, 2023 In-sensor-processing (ISP) paradigm has been exploited in state-of-the-art vision system designs to pave the way towards power-efficient sensing and processing. The redundant data transmission between sensors and processors is significantly minimized by lo ... Full text Cite

NASRec: Weight Sharing Neural Architecture Search for Recommender Systems

Conference ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023 · April 30, 2023 The rise of deep neural networks offers new opportunities in optimizing recommender systems. However, optimizing recommender systems using deep neural networks requires delicate architecture fabrication. We propose NASRec, a paradigm that trains a single s ... Full text Cite

The Dark Side: Security and Reliability Concerns in Machine Learning for EDA

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · April 1, 2023 The growing integrated circuit complexity has led to a compelling need for design efficiency improvement through new electronic design automation (EDA) methodologies. In recent years, many unprecedented efficient EDA methods have been enabled by machine le ... Full text Cite

DefT: Boosting Scalability of Deformable Convolution Operations on GPUs

Conference International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS · March 25, 2023 Deformable Convolutional Networks (DCN) have been proposed as a powerful tool to boost the representation power of Convolutional Neural Networks (CNN) in computer vision tasks via adaptive sampling of the input feature map. Much like vision transformers, D ... Full text Cite

DyNNamic: Dynamically Reshaping, High Data-Reuse Accelerator for Compact DNNs

Journal Article IEEE Transactions on Computers · March 1, 2023 Convolutional layers dominate the computation and energy costs of Deep Neural Network (DNN) inference. Recent algorithmic works attempt to reduce these bottlenecks via compact DNN structures and model compression. Likewise, state-of-the-art accelerator des ... Full text Cite

Designing Efficient Bit-Level Sparsity-Tolerant Memristive Networks.

Journal Article IEEE transactions on neural networks and learning systems · March 2023 With the rapid progress of deep neural network (DNN) applications on memristive platforms, there has been a growing interest in the acceleration and compression of memristive networks. As an emerging model optimization technique for memristive platforms, b ... Full text Cite

DisP+V: A Unified Framework for Disentangling Prototype and Variation From Single Sample per Person.

Journal Article IEEE transactions on neural networks and learning systems · February 2023 Single sample per person face recognition (SSPP FR) is one of the most challenging problems in FR due to the extreme lack of enrolment data. To date, the most popular SSPP FR methods are the generic learning methods, which recognize query face images based ... Full text Cite

Rethink before Releasing Your Model: ML Model Extraction Attack in EDA

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 16, 2023 Machine learning (ML)-based techniques for electronic design automation (EDA) have boosted the performance of modern integrated circuits (ICs). Such achievement makes ML model to be of importance for the EDA industry. In addition, ML models for EDA are wid ... Full text Cite

Improving the Robustness and Efficiency of PIM-Based Architecture by SW/HW Co-Design

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 16, 2023 Processing-in-memory (PIM) based architecture shows great potential to process several emerging artificial intelligence workloads, including vision and language models. Cross-layer optimizations could bridge the gap between computing density and the availa ... Full text Cite

Fully Automated Machine Learning Model Development for Analog Placement Quality Prediction

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 16, 2023 Analog integrated circuit (IC) placement is a heavily manual and time-consuming task that has a significant impact on chip quality. Several recent studies apply machine learning (ML) techniques to directly predict the impact of placement on circuit perform ... Full text Cite

Photonic Bayesian Neural Network Using Programmed Optical Noises

Journal Article IEEE Journal of Selected Topics in Quantum Electronics · January 1, 2023 The Bayesian neural network (BNN) combines the strengths of neural networks and statistical modeling in that it simultaneously performs posterior predictions and quantifies the uncertainty of the predictions. Integrated photonics has emerged as a promising ... Full text Cite

Guest Editorial Machine Learning for Resilient Industrial Cyber-Physical Systems

Journal Article IEEE Transactions on Automation Science and Engineering · January 1, 2023 Full text Cite

Interpreting Disparate Privacy-Utility Tradeoff in Adversarial Learning via Attribute Correlation

Conference Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 · January 1, 2023 Adversarial learning is commonly used to extract latent data representations which are expressive to predict the target attribute but indistinguishable in the privacy attribute. However, whether they can achieve an expected privacy-utility tradeoff is of g ... Full text Cite

: Joint Point Interaction-Dimension Search for 3D Point Cloud

Conference Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 · January 1, 2023 The interaction and dimension of points are two important axes in designing point operators to serve hierarchical 3D models. Yet, these two axes are heterogeneous and challenging to fully explore. Existing works craft point operator under a single axis and ... Full text Cite

Deep Learning for Routability

Chapter · January 1, 2023 Design rule checking (DRC) clean is a fundamental chip manufacturing requirement. However, achieving this is increasingly challenging with the advance of semiconductor technology nodes and the increase of complicated design rules. To effectively mitigate D ... Full text Cite

Mixture Outlier Exposure: Towards Out-of-Distribution Detection in Fine-grained Environments

Conference Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 · January 1, 2023 Many real-world scenarios in which DNN-based recognition systems are deployed have inherently fine-grained attributes (e.g., bird-species recognition, medical image classification). In addition to achieving reliable accuracy, a critical subtask for these m ... Full text Cite

2023: The Golden Age of Semiconductors

Journal Article IEEE Circuits and Systems Magazine · January 1, 2023 Full text Cite

LISSNAS: Locality-based Iterative Search Space Shrinkage for Neural Architecture Search

Conference IJCAI International Joint Conference on Artificial Intelligence · January 1, 2023 Search spaces hallmark the advancement of Neural Architecture Search (NAS). Large and complex search spaces with versatile building operators and structures provide more opportunities to brew promising architectures, yet pose severe challenges on efficient ... Cite

ReAugKD: Retrieval-Augmented Knowledge Distillation For Pre-trained Language Models

Conference Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2023 Knowledge Distillation (KD) (Hinton et al., 2015) is one of the most effective approaches for deploying large-scale pre-trained language models in low-latency environments by transferring the knowledge contained in the large-scale models to smaller student ... Cite

PowerPruning: Selecting Weights and Activations for Power-Efficient Neural Network Acceleration

Conference Proceedings - Design Automation Conference · January 1, 2023 Deep neural networks (DNNs) have been successfully applied in various fields. A major challenge of deploying DNNs, especially on edge devices, is power consumption, due to the large number of multiply-and-accumulate (MAC) operations. To address this challe ... Full text Cite

Accelerating Sparse Attention with a Reconfigurable Non-volatile Processing-In-Memory Architecture

Conference Proceedings - Design Automation Conference · January 1, 2023 Attention-based neural networks have shown superior performance in a wide range of tasks. Non-volatile processing-in-memory (NVPIM) architecture shows its great potential to accelerate the dense attention model. However, the unique unstructured and dynamic ... Full text Cite

Refloat: Low-Cost Floating-Point Processing in ReRAM for Accelerating Iterative Linear Solvers

Conference International Conference for High Performance Computing, Networking, Storage and Analysis, SC · January 1, 2023 Resistive random access memory (ReRAM) is a promising technology that can perform low-cost and in-situ matrix-vector multiplication (MVM) in analog domain. Scientific computing requires high-precision floating-point (FP) processing. However, performing flo ... Full text Cite

Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction

Conference Proceedings of Machine Learning Research · January 1, 2023 Due to the often limited communication bandwidth of edge devices, most existing federated learning (FL) methods randomly select only a subset of devices to participate in training at each communication round. Compared with engaging all the available client ... Cite

Early Identification of Timing Critical RTL Components using ML based Path Delay Prediction

Conference 2023 ACM/IEEE 5th Workshop on Machine Learning for CAD, MLCAD 2023 · January 1, 2023 In chip design, it is crucial to identify timing critical components early on to preemptively fix any timing issues and avoid numerous design convergence iterations. However, obtaining this information requires one to run the time intensive physical design ... Full text Cite

DISQ: Dynamic Iteration Skipping for Variational Quantum Algorithms

Conference Proceedings - 2023 IEEE International Conference on Quantum Computing and Engineering, QCE 2023 · January 1, 2023 In the noisy intermediate scale quantum (NISQ) era, the Variational Quantum Algorithm (VQA) has emerged as one of the most promising approaches to harness the power of quantum computers. In VQA, a classical optimizer iteratively updates the parameters of a ... Full text Cite

PANDA: Architecture-Level Power Evaluation by Unifying Analytical and Machine Learning Solutions

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2023 Power efficiency is a critical design objective in modern microprocessor design. To evaluate the impact of architectural-level design decisions, an accurate yet efficient architecture-level power model is desired. However, widely adopted data-independent a ... Full text Cite

Invited Paper: Towards the Efficiency, Heterogeneity, and Robustness of Edge AI

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2023 Over the past decade, there has been a persistent trend in edge computing, driving the migration of intelligence closer to the edge. The increasing need to process data locally has fueled the deployment of highly efficient computing hardware and artificial ... Full text Cite

Fine-Grain Inference on Out-of-Distribution Data With Hierarchical Classification

Conference Proceedings of Machine Learning Research · January 1, 2023 Machine learning methods must be trusted to make appropriate decisions in real-world environments, even when faced with out-of-distribution (OOD) samples. Many current approaches simply aim to detect OOD examples and alert the user when an unrecognized inp ... Cite

Biologically Plausible Learning on Neuromorphic Hardware Architectures

Conference Midwest Symposium on Circuits and Systems · January 1, 2023 Movement of model parameters from memory to computing elements in deep learning (DL) has led to a growing imbalance known as the memory wall. Neuromorphic computation-in-memory (CIM) is an emerging paradigm that addresses this imbalance by performing compu ... Full text Cite

Communication-Efficient Vertical Federated Learning with Limited Overlapping Samples

Conference Proceedings of the IEEE International Conference on Computer Vision · January 1, 2023 Federated learning is a popular collaborative learning approach that enables clients to train a global model without sharing their local data. Vertical federated learning (VFL) deals with scenarios in which the data on clients have different feature spaces ... Full text Cite

Stable and Causal Inference for Discriminative Self-supervised Deep Visual Representations

Conference Proceedings of the IEEE International Conference on Computer Vision · January 1, 2023 In recent years, discriminative self-supervised methods have made significant strides in advancing various visual tasks. The central idea of learning a data encoder that is robust to data distortions/augmentations is straightforward yet highly effective. A ... Full text Cite

Rethinking normalization methods in federated learning

Conference DistributedML 2022 - Proceedings of the 3rd International Workshop on Distributed Machine Learning, Part of CoNEXT 2022 · December 9, 2022 Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data. In this work, we explicitly uncover external covariate shift problem in FL, which is caused by the independent local t ... Full text Cite

IVQ: In-Memory Acceleration of DNN Inference Exploiting Varied Quantization

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · December 1, 2022 Weight quantization is well adapted to cope with the ever-growing complexity of the deep neural network (DNN) model. Diversified quantization schemes lead to diverse quantized bit width and formats of the weights, thereby, subject to different hardware imp ... Full text Cite

PIMulator-NN: An Event-Driven, Cross-Level Simulation Framework for Processing-In-Memory-Based Neural Network Accelerators

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · December 1, 2022 Processing-in-memory (PIM) architecture has been proposed to accelerate state-of-the-art neuro-inspired algorithms, such as deep neural networks. In this article, we present PIMulator-NN, an event-driven, cross-level simulation framework for PIM-based neur ... Full text Cite

A Novel Architecture Design for Output Significance Aligned Flow with Adaptive Control in ReRAM-based Neural Network Accelerator

Journal Article ACM Transactions on Design Automation of Electronic Systems · November 22, 2022 Resistive-RAM-based (ReRAM-based) computing shows great potential on accelerating DNN inference by its highly parallel structure. Regrettably, computing accuracy in practical is much lower than expected due to the non-ideal ReRAM device. Conventional compu ... Full text Cite

Boosting the sensing granularity of acoustic signals by exploiting hardware non-linearity

Conference HotNets 2022 - Proceedings of the 2022 21st ACM Workshop on Hot Topics in Networks · November 14, 2022 Acoustic sensing is a new sensing modality that senses the contexts of human targets and our surroundings using acoustic signals. It becomes a hot topic in both academia and industry owing to its finer sensing granularity and the wide availability of micro ... Full text Cite

FedSEA: A Semi-Asynchronous Federated Learning Framework for Extremely Heterogeneous Devices

Conference SenSys 2022 - Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems · November 6, 2022 Federated learning (FL) has attracted increasing attention as a promising technique to drive a vast number of edge devices with artificial intelligence. However, it is very challenging to guarantee the efficiency of a FL system in practice due to the heter ... Full text Cite

Preplacement Net Length and Timing Estimation by Customized Graph Neural Network

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · November 1, 2022 Net length is a key proxy metric for optimizing timing and power across various stages of a standard digital design flow. However, the bulk of net length information is not available until cell placement, and hence, it is a significant challenge to explici ... Full text Cite

Robustify ML-based lithography hotspot detectors

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · October 30, 2022 Deep learning has been widely applied in various VLSI design automation tasks, from layout quality estimation to design optimization. Though deep learning has shown state-of-the-art performance in several applications, recent studies reveal that deep neura ... Full text Cite

DEEP: Developing extremely efficient runtime on-chip power meters

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · October 30, 2022 Accurate and efficient on-chip power modeling is crucial to runtime power, energy, and voltage management. Such power monitoring can be achieved by designing and integrating on-chip power meters (OPMs) into the target design. In this work, we propose a new ... Full text Cite

How good is your verilog RTL code? A quick answer from machine learning

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · October 30, 2022 Hardware Description Language (HDL) is a common entry point for designing digital circuits. Differences in HDL coding styles and design choices may lead to considerably different design quality and performance-power tradeoff. In general, the impact of HDL ... Full text Cite

Space-Time-Efficient Modeling of Large-Scale 3-D Cross-Point Memory Arrays by Operation Adaption and Network Compaction

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · October 1, 2022 Three-dimensional (3-D) integrated cross-point memory arrays can be used to build high-density storage-class memory systems. However, the coupled network topology caused by sharing word lines or bit lines between adjacent memory layers significantly enlarg ... Full text Cite

Introduction to the Special Section on Energy-Efficient AI Chips

Journal Article ACM Transactions on Design Automation of Electronic Systems · September 21, 2022 Full text Cite

The 5th Artificial Intelligence of Things (AIoT) Workshop

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 14, 2022 With advancement of recent network and chip technologies, IoT devices are becoming smarter with increasing compute power, bandwidth, and storage available on the device. This enables intelligent decision making and information transferring on the devices a ... Full text Cite

ASTERS: Adaptable Threshold Spike-timing Neuromorphic Design with Twin-Column ReRAM Synapses

Conference Proceedings - Design Automation Conference · July 10, 2022 Complex event-driven neuron dynamics was an obstacle to implementing efficient brain-inspired computing architectures with VLSI circuits. To solve this problem and harness the event-driven advantage, we propose ASTERS, a resistive random-access memory (ReR ... Full text Cite

HERO: Hessian-Enhanced Robust Optimization for Unifying and Improving Generalization and Quantization Performance

Conference Proceedings - Design Automation Conference · July 10, 2022 With the recent demand of deploying neural network models on mobile and edge devices, it is desired to improve the model's generalizability on unseen testing data, as well as enhance the model's robustness under fixed-point quantization for efficient deplo ... Full text Cite

Towards Collaborative Intelligence: Routability Estimation based on Decentralized Private Data

Conference Proceedings - Design Automation Conference · July 10, 2022 Applying machine learning (ML) in design flow is a popular trend in Electronic Design Automation (EDA) with various applications from design quality predictions to optimizations. Despite its promise, which has been demonstrated in both academic researches ... Full text Cite

Cascading Structured Pruning: Enabling High Data Reuse for Sparse DNN Accelerators

Conference Proceedings - International Symposium on Computer Architecture · June 18, 2022 Performance and efficiency of running modern Deep Neural Networks (DNNs) are heavily bounded by data movement. To mitigate the data movement bottlenecks, recent DNN inference accelerator designs widely adopt aggressive compression techniques and sparse-skipp ... Full text Cite

A Hybrid-Grained Remapping Defense Scheme Against Hard Failures for Row-Column-NVM

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · June 1, 2022 Row-column-NVM (RC-NVM) is a new architecture for emerging nonvolatile memory (NVM), such as ReRAM, PCM, and STT-RAM. It leverages the symmetry of crossbar structure and supports both row and column memory accesses. The new architecture is well fit for the ... Full text Cite

Processing-in-Memory Technology for Machine Learning: From Basic to ASIC

Journal Article IEEE Transactions on Circuits and Systems II: Express Briefs · June 1, 2022 Due to the need for computing models that can process large quantities of data efficiently and with high throughput in many state-of-the-art machine learning algorithms, the processing-in-memory (PIM) paradigm is emerging as a potential replacement for sta ... Full text Cite

Research Progress on Memristor: From Synapses to Computing Systems

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · May 1, 2022 As the limits of transistor technology are approached, feature size in integrated circuit transistors has been reduced very near to the minimum physically-realizable channel length, and it has become increasingly difficult to meet expectations outlined by ... Full text Cite

Toward Efficient and Adaptive Design of Video Detection System with Deep Neural Networks

Journal Article ACM Transactions on Embedded Computing Systems · May 1, 2022 In the past decade, Deep Neural Networks (DNNs), e.g., Convolutional Neural Networks, achieved human-level performance in vision tasks such as object classification and detection. However, DNNs are known to be computationally expensive and thus hard to be ... Full text Cite

A survey of architectures of neural network accelerators

Journal Article Scientia Sinica Informationis · April 1, 2022 Nowadays, with the growth in data demand and the improvement in hardware computing power, artificial intelligence (AI) can be applied to a wide range of applications. Among them, neural network algorithms have successfully solved some practical problems, s ... Full text Cite

ISLPED 2021: The 25th Anniversary!

Conference IEEE Design and Test · February 1, 2022 The ISLPED 2021 Conference was again held online on July 26-28, 2021, due to the COVID-19 pandemic. The program of the conference was arranged between 10 A.M. and 1 P.M., which is the most friendly time period to the attendants from America, Asia, and Euro ... Full text Cite

Harnessing optoelectronic noises in a photonic generative network.

Journal Article Science advances · January 2022 Integrated optoelectronics is emerging as a promising platform of neural network accelerator, which affords efficient in-memory computing and high bandwidth interconnectivity. The inherent optoelectronic noises, however, make the photonic systems error-pro ... Full text Cite

The Untapped Potential of Off-the-Shelf Convolutional Neural Networks

Conference Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022 · January 1, 2022 Over recent years, a myriad of novel convolutional network architectures have been developed to advance state-of-the-art performance on challenging recognition tasks. As computational resources improve, a great deal of effort has been placed on efficiently ... Full text Cite

Lithography Hotspot Detection via Heterogeneous Federated Learning with Local Adaptation

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 1, 2022 As technology scaling is approaching its physical limit, lithography hotspot detection has become an essential task in design for manufacturability. Although the deployment of machine learning in hotspot detection is found to save significant simulation ti ... Full text Cite

Improving Out-of-Distribution Detection by Learning from the Deployment Environment

Journal Article IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing · January 1, 2022 Recognition systems in the remote sensing domain often operate in 'open-world' environments, where they must be capable of accurately classifying data from the in-distribution categories while simultaneously detecting and rejecting anomalous/out-of-distrib ... Full text Cite

A 1.041-Mb/mm² 27.38-TOPS/W Signed-INT8 Dynamic-Logic-Based ADC-less SRAM Compute-in-Memory Macro in 28nm with Reconfigurable Bitwise Operation for AI and Embedded Applications

Conference Digest of Technical Papers - IEEE International Solid-State Circuits Conference · January 1, 2022 Advanced intelligent embedded systems perform cognitive tasks with highly-efficient vector-processing units for deep neural network (DNN) inference and other vector-based signal processing using limited power. SRAM-based compute-in-memory (CIM) achieves hi ... Full text Cite

Reinforcement Learning-based Black-Box Evasion Attacks to Link Prediction in Dynamic Graphs

Conference 2021 IEEE 23rd International Conference on High Performance Computing and Communications, 7th International Conference on Data Science and Systems, 19th International Conference on Smart City and 7th International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC-DSS-SmartCity-DependSys 2021 · January 1, 2022 Link prediction in dynamic graphs (LPDG) is an important research problem that has diverse applications such as online recommendations, studies on disease contagion, organizational studies, etc. Various LPDG methods based on graph embedding and graph neura ... Full text Cite

An Audio Frequency Unfolding Framework for Ultra-Low Sampling Rate Sensors

Conference Proceedings - International Symposium on Quality Electronic Design, ISQED · January 1, 2022 Recent audio super-resolution works have achieved significant success in promoting audio quality by improving a sensor's sampling rate, e.g., from 8 kHz to 48 kHz. However, these works fail to maintain the performance when the sampling rate at the sensor i ... Full text Cite

Editorial: Machine learning for computational neural modeling and data analyses.

Journal Article Frontiers in computational neuroscience · January 2022 Full text Cite

MOM: Microphone based 3D Orientation Measurement

Conference Proceedings - 21st ACM/IEEE International Conference on Information Processing in Sensor Networks, IPSN 2022 · January 1, 2022 While a tremendous amount of effort has been devoted to localization, the orientation of a device, especially in 3D space, is seldom explored. Although many sensor-based methods utilizing gyroscope, accelerometer, and magnetometer have been proposed to me ... Full text Cite

Privacy Leakage of Adversarial Training Models in Federated Learning Systems

Conference IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · January 1, 2022 Adversarial Training (AT) is crucial for obtaining deep neural networks that are robust to adversarial attacks, yet recent works found that it could also make models more vulnerable to privacy attacks. In this work, we further reveal this unsettling proper ... Full text Cite

FedCor: Correlation-Based Active Client Selection Strategy for Heterogeneous Federated Learning

Conference Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 2022 Client-wise data heterogeneity is one of the major issues that hinder effective training in federated learning (FL). Since the data distribution on each client may vary dramatically, the client selection strategy can significantly influence the convergence ... Full text Cite

Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer

Conference IEEE International Conference on Data Mining Workshops, ICDMW · January 1, 2022 Developing neural architectures that are capable of logical reasoning has become increasingly important for a wide range of applications (e.g., natural language processing). Towards this grand objective, we propose a symbolic reasoning architecture that ch ... Full text Cite

Security Threat to the Robustness of RRAM-based Neuromorphic Computing System

Conference Proceedings - 2022 IEEE International Symposium on Smart Electronic Systems, iSES 2022 · January 1, 2022 The RRAM-based neuromorphic computing system (NCS) has amassed explosive interest due to its superior data processing capability and energy efficiency compared with traditional architectures, and thus being widely adopted even in many safety-sensitive appl ... Full text Cite

GraphFL: A Federated Learning Framework for Semi-Supervised Node Classification on Graphs

Conference Proceedings - IEEE International Conference on Data Mining, ICDM · January 1, 2022 Graph-based semi-supervised node classification (GraphSSC) has wide applications, ranging from networking and security to data mining and machine learning, etc. However, existing centralized GraphSSC methods are impractical to solve many real-world graph-b ... Full text Cite

On Building Efficient and Robust Neural Network Designs

Conference Conference Record - Asilomar Conference on Signals, Systems and Computers · January 1, 2022 Neural network models have demonstrated outstanding performance in a variety of applications, from image classification to natural language processing. However, deploying the models to hardware raises efficiency and reliability issues. From the efficiency ... Full text Cite

Next Generation Federated Learning for Edge Devices: An Overview

Conference Proceedings - 2022 IEEE 8th International Conference on Collaboration and Internet Computing, CIC 2022 · January 1, 2022 Federated learning (FL) is a popular distributed machine learning paradigm involving numerous edge devices with enhanced privacy protection. Recently, an extensive literature has been developing on the research which aims at promoting the innovations of FL ... Full text Cite

Message from the Program Co-Chairs: SEC 2022

Conference Proceedings - 2022 IEEE/ACM 7th Symposium on Edge Computing, SEC 2022 · January 1, 2022 Full text Cite

ScaleNAS: Multi-Path One-Shot NAS for Scale-Aware High-Resolution Representation

Conference Proceedings of Machine Learning Research · January 1, 2022 Scale variance among different sizes of body parts and objects is a challenging problem for visual recognition tasks. Existing works usually design a dedicated backbone or apply Neural architecture Search (NAS) for each task to tackle this challenge. Howev ... Cite

Why do We Need Large Batchsizes in Contrastive Learning? A Gradient-Bias Perspective

Conference Advances in Neural Information Processing Systems · January 1, 2022 Contrastive learning (CL) has been the de facto technique for self-supervised representation learning (SSL), with impressive empirical success such as multi-modal representation learning. However, traditional CL loss only considers negative samples from a ... Cite

Editorial Special Issue for 50th Birthday of Memristor Theory and Application of Neuromorphic Computing Based on Memristor - Part II

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · December 1, 2021 Full text Cite

FedMask: Joint Computation and Communication-Efficient Personalized Federated Learning via Heterogeneous Masking

Conference SenSys 2021 - Proceedings of the 2021 19th ACM Conference on Embedded Networked Sensor Systems · November 15, 2021 Recent advancements in deep neural networks (DNN) enabled various mobile deep learning applications. However, it is technically challenging to locally train a DNN model due to limited data on devices like mobile phones. Federated learning (FL) is a distrib ... Full text Cite

Multilabel Image Classification via Feature/Label Co-Projection

Journal Article IEEE Transactions on Systems, Man, and Cybernetics: Systems · November 1, 2021 This article presents a simple and intuitive solution for multilabel image classification, which achieves the competitive performance on the popular COCO and PASCAL VOC benchmarks. The main idea is to capture how humans perform this task: We recognize both ... Full text Cite

Editorial Special Issue for 50th Birthday of Memristor Theory and Application of Neuromorphic Computing Based on Memristor - Part I

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · November 1, 2021 Full text Cite

ESCALATE: Boosting the efficiency of sparse CNN accelerator with kernel decomposition

Conference Proceedings of the Annual International Symposium on Microarchitecture, MICRO · October 18, 2021 The ever-growing parameter size and computation cost of Convolutional Neural Network (CNN) models hinder their deployment onto resource-constrained platforms. Network pruning techniques are proposed to remove the redundancy in CNN parameters and produce a ... Full text Cite

APOLLO: An automated power modeling framework for runtime power introspection in high-volume commercial microprocessors

Conference Proceedings of the Annual International Symposium on Microarchitecture, MICRO · October 18, 2021 Accurate power modeling is crucial for energy-efficient CPU design and runtime management. An ideal power modeling framework needs to be accurate yet fast, achieve high temporal resolution (ideally cycle-accurate) yet with low runtime computational overhea ... Full text Cite

Introduction to the Special Issue on Hardware and Algorithms for Efficient Machine Learning-Part 2

Journal Article ACM Journal on Emerging Technologies in Computing Systems · October 1, 2021 Full text Cite

Privacy-Preserving Representation Learning on Graphs: A Mutual Information Perspective

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 14, 2021 Learning with graphs has attracted significant attention recently. Existing representation learning methods on graphs have achieved state-of-the-art performance on various graph-related tasks such as node classification, link prediction, etc. However, we o ... Full text Cite

Defending against GAN-based DeepFake Attacks via Transformation-aware Adversarial Faces

Conference Proceedings of the International Joint Conference on Neural Networks · July 18, 2021 DeepFake represents a category of face-swapping attacks that leverage machine learning models such as autoencoders or generative adversarial networks. Although the concept of the face-swapping is not new, its recent technical advances make fake content (e ... Full text Cite

TPrune: Efficient Transformer Pruning for Mobile Devices

Journal Article ACM Transactions on Cyber-Physical Systems · July 1, 2021 The invention of Transformer model structure boosts the performance of Neural Machine Translation (NMT) tasks to an unprecedented level. Many previous works have been done to make the Transformer model more execution-friendly on resource-constrained platfo ... Full text Cite

Combining improved genetic algorithm and matrix semi-tensor product (STP) in color image encryption

Journal Article Signal Processing · June 1, 2021 Image encryption is one of the important methods for preservation of confidentiality and integrity of digital images. In this paper, a color image cryptosystem based on improved genetic algorithm and matrix semi-tensor product (STP) is introduced. The encr ... Full text Cite

End-to-End Detection-Segmentation System for Face Labeling

Journal Article IEEE Transactions on Emerging Topics in Computational Intelligence · June 1, 2021 In this paper, we propose an end-to-end detection-segmentation system to implement detailed face labeling. Fully convolutional networks (FCN) has become the mainstream algorithm in the field of semantic segmentation due to the state-of-the-art performance. ... Full text Cite

DeepObfuscator: Obfuscating Intermediate Representations with Privacy-Preserving Adversarial Learning on Smartphones

Conference IoTDI 2021 - Proceedings of the 2021 International Conference on Internet-of-Things Design and Implementation · May 18, 2021 Deep learning has been widely applied in many computer vision applications, with remarkable success. However, running deep learning models on mobile devices is generally challenging due to the limitation of computing resources. A popular alternative is to ... Full text Cite

An efficient approach for encrypting double color images into a visually meaningful cipher image using 2D compressive sensing

Journal Article Information Sciences · May 1, 2021 An efficient visually meaningful double color image encryption algorithm is proposed by combining 2D compressive sensing (CS) with an embedding technique. First, two color images are measured by measurement matrices in two directions to achieve simultaneou ... Full text Cite

An electroforming-free, analog interface-type memristor based on a SrFeOx epitaxial heterojunction for neuromorphic computing

Journal Article Materials Today Physics · May 1, 2021 Distinct from the conductive filament-type counterparts, the interface-type resistive switching (RS) devices are electroforming-free and exhibit bidirectionally continuous conductance changes, making them promising candidates as analog synapses. While the ... Full text Cite

Optical Generative Adversarial Network based on Programmable Phase-change Photonics

Conference 2021 Conference on Lasers and Electro-Optics, CLEO 2021 - Proceedings · May 1, 2021 We demonstrate photonic generative adversarial networks (GANs) based on a phase-change metasurface mode converter (PMMC) array and perform the handwritten-like number generation task. ... Cite

Introduction of Special Issue on Hardware and Algorithms for Efficient Machine Learning–Part 1

Journal Article ACM Journal on Emerging Technologies in Computing Systems · April 5, 2021 Full text Cite

Improving Multilevel Writes on Vertical 3-D Cross-Point Resistive Memory

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · April 1, 2021 Resistive memory is promising to be constructed as a high-density storage-class memory. Multilevel cell, access-transistor-free cross-point array structure, and 3-D array integration are three approaches to scale up the density of resistive memory. However ... Full text Cite

Improving Write Performance on Cross-Point RRAM Arrays by Leveraging Multidimensional Non-Uniformity of Cell Effective Voltage

Journal Article IEEE Transactions on Computers · April 1, 2021 Resistive cross-point memory arrays can be used to construct high-density storage-class memory. However, coupled IR drop and sneak currents cause multidimensional non-uniformity of cell effective voltage in cross-point arrays. The voltage non-uniformity si ... Full text Cite

Memristive LSTM Network for Sentiment Analysis

Journal Article IEEE Transactions on Systems, Man, and Cybernetics: Systems · March 1, 2021 This paper presents a complete solution for the hardware design of a memristor-based long short-term memory (MLSTM) network. Throughout the design process, we fully consider the external and internal structures of the long short-term memory (LSTM), both of ... Full text Cite

Sliding Mode Stabilization of Memristive Neural Networks With Leakage Delays and Control Disturbance.

Journal Article IEEE transactions on neural networks and learning systems · March 2021 In this article, we investigate a class of memristive neural networks (MNNs) with time-varying delays and leakage delays via sliding mode control (SMC) with and without control disturbance. SMC is used to ensure MNNs' stability. According to the characteri ... Full text Cite

Marvel: A Vertical Resistive Accelerator for Low-Power Deep Learning Inference in Monolithic 3D

Conference Proceedings - Design, Automation and Test in Europe, DATE · February 1, 2021 Resistive memory (ReRAM) based Deep Neural Network (DNN) accelerators have achieved state-of-the-art DNN inference throughput. However, the power efficiency of such resistive accelerators is greatly limited by their peripheral circuitry including analog-to ... Full text Cite

RAISE: A Resistive Accelerator for Subject-Independent EEG Signal Classification

Conference Proceedings - Design, Automation and Test in Europe, DATE · February 1, 2021 State-of-the-art deep neural networks (DNNs) for electroencephalography (EEG) signals classification focus on subject-related tasks, in which the test data and the training data needs to be collected from the same subject. In addition, due to limited compu ... Full text Cite

Efficient neural network using pointwise convolution kernels with linear phase constraint

Journal Article Neurocomputing · January 29, 2021 In current efficient convolutional neural networks, 1 × 1 convolution is widely used. However, the amount of computation and the number of parameters of 1 × 1 convolution layers account for a large part of these neural network models. In this paper, we pro ... Full text Cite

Net2: A Graph Attention Network Method Customized for Pre-Placement Net Length Estimation

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 18, 2021 Net length is a key proxy metric for optimizing timing and power across various stages of a standard digital design flow. However, the bulk of net length information is not available until cell placement, and hence it is a significant challenge to explicit ... Full text Cite

Exploring Applications of STT-RAM in GPU Architectures

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · January 1, 2021 Use of modern GPUs has been extended from traditional 3D graphic processing to computing acceleration of many scientific, engineering, and enterprise applications. In modern GPUs, on-chip memory capacity keeps increasing to support thousands of chip-reside ... Full text Cite

Explainability Metrics of Deep Convolutional Networks for Photoplethysmography Quality Assessment.

Journal Article IEEE access : practical innovations, open solutions · January 2021 Photoplethysmography (PPG) is a noninvasive way to monitor various aspects of the circulatory system, and is becoming more and more widespread in biomedical processing. Recently, deep learning methods for analyzing PPG have also become prevalent, achieving ... Full text Cite

2021: The Greatest Reset [From the Editor]

Journal Article IEEE Circuits and Systems Magazine · January 1, 2021 Full text Cite

VD-GAN: A Unified Framework for Joint Prototype and Representation Learning from Contaminated Single Sample per Person

Journal Article IEEE Transactions on Information Forensics and Security · January 1, 2021 Single sample per person (SSPP) face recognition with a contaminated biometric enrolment database (SSPP-ce FR) is an emerging practical FR problem, where the SSPP in the enrolment database is no longer standard but contaminated by nuisance facial ... Full text Cite

Bridging a Gap in SAR-ATR: Training on Fully Synthetic and Testing on Measured Data

Journal Article IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing · January 1, 2021 Obtaining measured synthetic aperture radar (SAR) data for training automatic target recognition (ATR) models can be too expensive (in terms of time and money) and complex of a process in many situations. In response, researchers have developed methods for ... Full text Cite

Training SAR-ATR Models for Reliable Operation in Open-World Environments

Journal Article IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing · January 1, 2021 Training deep learning-based synthetic aperture radar automatic target recognition (SAR-ATR) systems for use in an 'open-world' operating environment has, thus far proven difficult. Most SAR-ATR systems are designed to achieve maximum accuracy for a limite ... Full text Cite

NASGEM: Neural Architecture Search via Graph Embedding Method

Conference 35th AAAI Conference on Artificial Intelligence, AAAI 2021 · January 1, 2021 Neural Architecture Search (NAS) automates and prospers the design of neural networks. Estimator-based NAS has been proposed recently to model the relationship between architectures and their performance to enable scalable and flexible search. However, exi ... Cite

Optical generative adversarial network based on programmable phase-change photonics

Conference Optics InfoBase Conference Papers · January 1, 2021 We demonstrate photonic generative adversarial networks (GANs) based on a phase-change metasurface mode converter (PMMC) array and perform the handwritten-like number generation task. ... Cite

Hermes: Decentralized dynamic spectrum access system for massive devices deployment in 5g

Conference International Conference on Embedded Wireless Systems and Networks · January 1, 2021 With the incoming 5G network, the ubiquitous Internet of Things (IoT) devices can benefit our daily life, such as smart cameras, drones, etc. With the introduction of the millimeter-wave band and the thriving number of IoT devices, it is critical to design ... Cite

Soteria: Provable Defense against Privacy Leakage in Federated Learning from Representation Perspective

Conference Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 2021 Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data. However, recent works have demonstrated that sharing model updates makes FL vulnerable to inference attack. In this wo ... Full text Cite

REREC: In-ReRAM Acceleration with Access-Aware Mapping for Personalized Recommendation

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2021 Personalized recommendation systems are widely used in many Internet services. The sparse embedding lookup in recommendation models dominates the computational cost of inference due to its intensive irregular memory accesses. Applying resistive random acce ... Full text Cite

FedSwap: A Federated Learning based 5G Decentralized Dynamic Spectrum Access System

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2021 The era of 5G extends the available spectrum from the microwave band to the millimeter-wave band. The thriving Internet of Things (IoT) also enriches the user equipment (UEs) we used in our daily life, such as smart glasses, smart watches, and drones. With ... Full text Cite

Automatic Routability Predictor Development Using Neural Architecture Search

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2021 The rise of machine learning technology inspires a boom of its applications in electronic design automation (EDA) and helps improve the degree of automation in chip designs. However, manually crafted machine learning models require extensive human expertis ... Full text Cite

MORE2: Morphable Encryption and Encoding for Secure NVM

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2021 Memory encryption can enhance the security of Non-volatile memories (NVMs), but it significantly increases the data bits written to NVMs and leads to severe lifetime and performance degradation. Current encryption techniques aim to reduce the re-encryption ... Full text Cite

Can Targeted Adversarial Examples Transfer When the Source and Target Models Have No Label Space Overlap?

Conference Proceedings of the IEEE International Conference on Computer Vision · January 1, 2021 We design blackbox transfer-based targeted adversarial attacks for an environment where the attacker's source model and the target blackbox model may have disjoint label spaces and training datasets. This scenario significantly differs from the "standard" b ... Full text Cite

LotteryFL: Empower Edge Intelligence with Personalized and Communication-Efficient Federated Learning

Conference 6th ACM/IEEE Symposium on Edge Computing, SEC 2021 · January 1, 2021 With the proliferation of mobile computing and Internet of Things (IoT), massive mobile and IoT devices are connected to the Internet. These devices are generating a huge amount of data every second at the network edge. Many artificial intelligence applica ... Full text Cite

Improving Gradient Regularization using Complex-Valued Neural Networks

Conference Proceedings of Machine Learning Research · January 1, 2021 Gradient regularization is a neural network defense technique that requires no prior knowledge of an adversarial attack and that brings only limited increase in training computational complexity. A form of complex-valued neural network (CVNN) is proposed t ... Cite

AI-Powered IoT System at the Edge

Conference Proceedings - 2021 IEEE 3rd International Conference on Cognitive Machine Intelligence, CogMI 2021 · January 1, 2021 The proliferation of low-cost and low-power IoT devices are constantly generating gigabytes data at the network edge. Bridging AI with IoT is a natural option to unleash the data on devices. AI-powered IoT systems can boost many novel applications and serv ... Full text Cite

FL-WBC: Enhancing Robustness against Model Poisoning Attacks in Federated Learning from a Client Perspective

Conference Advances in Neural Information Processing Systems · January 1, 2021 Federated learning (FL) is a popular distributed learning framework that trains a global model through iterative communications between a central server and edge devices. Recent works have demonstrated that FL is vulnerable to model poisoning attacks. Seve ... Cite

DISENTANGLING PROTOTYPE AND VARIATION FOR SINGLE SAMPLE FACE RECOGNITION

Conference Proceedings - IEEE International Conference on Multimedia and Expo · January 1, 2021 Single sample per person face recognition (SSPP FR) is one of the most challenging problems in FR due to the extreme lack of enrolment data. State-of-the-art SSPP FR methods are based on the prototype plus variation (i.e., P+V) model. However, the classic ... Full text Cite

Preface

Book · January 1, 2021 Cite

Preface

Book · January 1, 2021 Cite

BSQ: EXPLORING BIT-LEVEL SPARSITY FOR MIXED-PRECISION NEURAL NETWORK QUANTIZATION

Conference ICLR 2021 - 9th International Conference on Learning Representations · January 1, 2021 Mixed-precision quantization can potentially achieve the optimal tradeoff between performance and compression rate of deep neural networks, and thus, have been widely investigated. However, it lacks a systematic method to determine the exact quantization s ... Cite

Hermes: An efficient federated learning framework for heterogeneous mobile clients

Conference Proceedings of the Annual International Conference on Mobile Computing and Networking, MOBICOM · January 1, 2021 Federated learning (FL) has been a popular method to achieve distributed machine learning among numerous devices without sharing their data to a cloud server. FL aims to learn a shared global model with the participation of massive devices under the orches ... Full text Cite

FCDM: A Methodology Based on Sensor Pattern Noise Fingerprinting for Fast Confidence Detection to Adversarial Attacks

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · December 1, 2020 Deep neural networks (DNNs) have shown phenomenal success in many real-world applications. However, a concerning weakness of DNNs is their vulnerability to adversarial attacks. Although there exist some methods to detect adversarial attacks, they often suf ... Full text Cite

HitM: High-Throughput ReRAM-based PIM for Multi-Modal Neural Networks

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 2, 2020 With the rapid progress of artificial intelligence (AI) algorithms, multi-modal deep neural networks (DNNs) have been applied to some challenging tasks, e.g., image and video description to process multi-modal information from vision and language. Resistiv ... Full text Cite

Fast IR Drop Estimation with Machine Learning: Invited Paper

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 2, 2020 IR drop constraint is a fundamental requirement enforced in almost all chip designs. However, its evaluation takes a long time, and mitigation techniques for fixing violations may require numerous iterations. As such, fast and accurate IR drop prediction b ... Full text Cite

MobiLattice: A Depth-wise DCNN Accelerator with Hybrid Digital/Analog Nonvolatile Processing-In-Memory Block

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 2, 2020 Nonvolatile Processing-In-Memory (NVPIM) architecture is a promising technology to enable energy-efficient inference of Deep Convolutional Neural Networks (DCNNs). One major advantage of NVPIM is that the vector dot-product operations can be completed effi ... Full text Cite

ReTransformer: ReRAM-based Processing-in-Memory Architecture for Transformer Acceleration

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 2, 2020 Transformer has emerged as a popular deep neural network (DNN) model for Neural Language Processing (NLP) applications and demonstrated excellent performance in neural machine translation, entity recognition, etc. However, its scaled dot-product attention ... Full text Cite

Routing-Free Crosstalk Prediction

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 2, 2020 Interconnect spacing is getting increasingly smaller in advanced technology nodes, which adversely increases the capacitive coupling of adjacent interconnect wires. It makes crosstalk a significant contributor to signal integrity and timing, and it is now ... Full text Cite

Color image compression and encryption scheme based on compressive sensing and double random encryption strategy

Journal Article Signal Processing · November 1, 2020 Based on compressive sensing and double random encryption strategy, a novel color image compression and encryption scheme is proposed in this paper. The architecture of compression, confusion and diffusion is adopted. Firstly, the red, green and blue compo ... Full text Cite

MVStylizer: An efficient edge-assisted video photorealistic style transfer system for mobile phones

Conference Proceedings of the International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc) · October 11, 2020 Recent research has made great progress in realizing neural style transfer of images, which denotes transforming an image to a desired style. Many users start to use their mobile phones to record their daily life, and then edit and share the captured image ... Full text Cite

Thwarting Replication Attack against Memristor-Based Neuromorphic Computing System

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · October 1, 2020 Neuromorphic architectures are widely used in many applications for advanced data processing and often implement proprietary algorithms. However, in an adversarial scenario, such systems may face elaborate security attacks including learning attack. In thi ... Full text Cite

Projective Synchroniztion of Neural Networks via Continuous/Periodic Event-Based Sampling Algorithms

Journal Article IEEE Transactions on Network Science and Engineering · October 1, 2020 This study concerns the projective synchronization problem of basic neural networks via continuous/periodic event-based sampling algorithms. Firstly, an event-triggering control scheme is proposed via continuous sampling. In addition, there exists a consis ... Full text Cite

A Low-Overhead Encoding Scheme to Extend the Lifetime of Nonvolatile Memories

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · October 1, 2020 Emerging nonvolatile memories (NVMs) are promising to replace DRAM as main memory. However, NVMs suffer from limited write endurance and high write energy. Encoding method reduces the bit flips of NVMs by exploiting additional tag bits to encode the data. ... Full text Cite

Fork Path: Batching ORAM Requests to Remove Redundant Memory Accesses

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · October 1, 2020 Outsourcing data to a third-party cloud provider has become quite common with the increasing use of cloud computing. This brings convenience, as well as the concern for data security and privacy. It is believed that data encryption alone is often not enoug ... Full text Cite

Revisiting memristor properties

Journal Article International Journal of Bifurcation and Chaos · September 30, 2020 Memristor is a natural synapse because of its nanoscale and memory property, which influences the performance of memristive artificial neural networks. A three-variable memristor model is simplified with 15 kinds of properties, including the learning exper ... Full text Cite

Designing pulse-coupled neural networks with spike-synchronization-dependent plasticity rule: image segmentation and memristor circuit application

Journal Article Neural Computing and Applications · September 1, 2020 Pulse-coupled neural network (PCNN) is a powerful unsupervised learning model with many parameters to be determined empirically. In particular, the weight matrix is invariable in the iterative process, which is inconsistent with the actual biological syste ... Full text Cite

An effective image compression–encryption scheme based on compressive sensing (CS) and game of life (GOL)

Journal Article Neural Computing and Applications · September 1, 2020 At present, information entropies of cipher images gotten by some CS-based image cryptosystems are lower than 7, which make them vulnerable to entropy attack. To cope with this problem, we propose a novel image compression–encryption method based on compre ... Full text Cite

Editorial for the special issue on disruptive computing technologies

Journal Article CCF Transactions on High Performance Computing · September 1, 2020 Full text Cite

TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 23, 2020 The success of deep learning partially benefits from the availability of various large-scale datasets. These datasets are often crowdsourced from individual users and contain private information like gender, age, etc. The emerging privacy concerns from use ... Full text Cite

AutoGrow: Automatic Layer Growing in Deep Convolutional Networks

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 23, 2020 Depth is a key component of Deep Neural Networks (DNNs), however, designing depth is heuristic and requires many human efforts. We propose AutoGrow to automate depth discovery in DNNs: starting from a shallow seed architecture, AutoGrow grows new layers if t ... Full text Cite

INOR—An Intelligent noise reduction method to defend against adversarial audio examples

Journal Article Neurocomputing · August 11, 2020 Recently, Automatic Speech Recognition (ASR) systems are seriously threatened by adversarial audio examples. The defense against adversarial audio examples has become an urgent issue. Different from adversarial image examples whose target is limited in the ... Full text Cite

Training memristor-based multilayer neuromorphic networks with SGD, momentum and adaptive learning rates.

Journal Article Neural networks : the official journal of the International Neural Network Society · August 2020 Neural networks implemented with traditional hardware face inherent limitation of memory latency. Specifically, the processing units like GPUs, FPGAs, and customized ASICs, must wait for inputs to read from memory and outputs to write back. This motivates ... Full text Cite

Neuromorphic Computing Systems with Emerging Nonvolatile Memories: A Circuits and Systems Perspective

Conference 2020 International Symposium on VLSI Technology, Systems and Applications, VLSI-TSA 2020 · August 1, 2020 The renaissance of artificial intelligence highlights the tremendous need for computational power as well as higher computing efficiency in both high performance computing and embedded applications. [1] To meet this demand, neuromorphic computing systems ( ... Full text Cite

Memristor-Based Design of Sparse Compact Convolutional Neural Network

Journal Article IEEE Transactions on Network Science and Engineering · July 1, 2020 Memristor has been widely studied for hardware implementation of neural networks due to the advantages of nanometer size, low power consumption, fast switching speed and functional similarity to biological synapse. However, it is difficult to realize memri ... Full text Cite

SparseTrain: Exploiting dataflow sparsity for efficient convolutional neural networks training

Conference Proceedings - Design Automation Conference · July 1, 2020 Training Convolutional Neural Networks (CNNs) usually requires a large number of computational resources. In this paper, SparseTrain is proposed to accelerate CNN training by fully exploiting the sparsity. It mainly involves three levels of innovations: ac ... Full text Cite

Lattice: An ADC/DAC-less ReRAM-based processing-in-memory architecture for accelerating deep convolution neural networks

Conference Proceedings - Design Automation Conference · July 1, 2020 Nonvolatile Processing-In-Memory (NVPIM) has demonstrated its great potential in accelerating Deep Convolution Neural Networks (DCNN). However, most of existing NVPIM designs require costly analog-digital conversions and often rely on excessive data copies ... Full text Cite

Hiding cipher-images generated by 2-D compressive sensing with a multi-embedding strategy

Journal Article Signal Processing · June 1, 2020 In terms of 2-D compressive sensing (CS), multi-embedding strategy and chaotic systems, a novel color image encryption scheme to generate visually meaningful cipher image is proposed in this paper. Firstly, red, green and blue components of color image are ... Full text Cite

Exploiting plaintext-related mechanism for secure color image encryption

Journal Article Neural Computing and Applications · June 1, 2020 Nowadays, many image cryptosystems have been cracked by chosen-plaintext attacks, for they are not highly sensitive to plain image. To solve this problem, we introduce a plaintext-related mechanism for secure color image encryption, and it is established i ... Full text Cite

Learning low-rank deep neural networks via singular vector orthogonality regularization and singular value sparsification

Conference IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · June 1, 2020 Modern deep neural networks (DNNs) often require high memory consumption and large computational loads. In order to deploy DNN algorithms efficiently on edge or mobile devices, a series of DNN compression algorithms have been explored, including factorizat ... Full text Cite

An efficient chaos-based image compression and encryption scheme using block compressive sensing and elementary cellular automata

Journal Article Neural Computing and Applications · May 1, 2020 In this paper, an efficient image compression and encryption scheme combining the parameter-varying chaotic system, elementary cellular automata (ECA) and block compressive sensing (BCS) is presented. The architecture of permutation, compression and re-per ... Full text Cite

Event-triggered distributed control for synchronization of multiple memristive neural networks under cyber-physical attacks

Journal Article Information Sciences · May 1, 2020 This paper investigates the synchronization of multiple memristive neural networks (MMNNs) under cyber-physical attacks through distributed event-triggered control. In the field of multi-agent dynamics, memristive neural network (MNN) is considered as a ki ... Full text Cite

Structural sparsification for far-field speaker recognition with Intel® GNA

Conference ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings · May 1, 2020 Recently, deep neural networks (DNN) have been widely used in speaker recognition area. In order to achieve fast response time and high accuracy, the requirements for hardware resources increase rapidly. However, as the speaker recognition application is o ... Full text Cite

GaaS-X: Graph Analytics Accelerator Supporting Sparse Data Representation using Crossbar Architectures

Conference Proceedings - International Symposium on Computer Architecture · May 1, 2020 Graph analytics applications are ubiquitous in this era of a connected world. These applications have very low compute to byte-transferred ratios and exhibit poor locality, which limits their computational efficiency on general purpose computing systems. C ... Full text Cite

SPINBIS: Spintronics-based Bayesian inference system with stochastic computing

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · April 1, 2020 Bayesian inference is an effective approach for solving statistical learning problems, especially with uncertainty and incompleteness. However, Bayesian inference is a computing-intensive task whose efficiency is physically limited by the bottlenecks of co ... Full text Cite

Advanced techniques for robust SAR ATR: Mitigating noise and phase errors

Conference 2020 IEEE International Radar Conference, RADAR 2020 · April 1, 2020 We present advanced Deep Learning (DL) techniques for robust Synthetic Aperture Radar (SAR) automatic target recognition (ATR) in the presence of noise and signal phase errors. Our research focuses on ensuring robust performance of SAR ATR algorithms under ... Full text Cite

A low-cost and high-speed hardware implementation of spiking neural network

Journal Article Neurocomputing · March 21, 2020 Spiking neural network (SNN) is a neuromorphic system based on the information process and store procedure of biological neurons. In this paper, a low-cost and high-speed implementation for a spiking neural network based on FPGA is proposed. The LIF (Leaky ... Full text Cite

Quantized synchronization of memristive neural networks with time-varying delays via super-twisting algorithm

Journal Article Neurocomputing · March 7, 2020 In this paper, we investigate quantized synchronization control problem of memristive neural networks (MNNs) with time-varying delays via super-twisting algorithm. A feedback controller is introduced with quantized method. To enormously reduce the computat ... Full text Cite

A Survey of Accelerator Architectures for Deep Neural Networks

Journal Article Engineering · March 1, 2020 Recently, due to the availability of big data and the rapid growth of computing power, artificial intelligence (AI) has regained tremendous attention and investment. Machine learning (ML) approaches have been successfully applied to solve many problems in ... Full text Cite

ReBoc: Accelerating Block-Circulant Neural Networks in ReRAM

Conference Proceedings of the 2020 Design, Automation and Test in Europe Conference and Exhibition, DATE 2020 · March 1, 2020 Deep neural networks (DNNs) emerge as a key component in various applications. However, the ever-growing DNN size hinders efficient processing on hardware. To tackle this problem, on the algorithmic side, compressed DNN models are explored, of which block- ... Full text Cite

AccPar: Tensor partitioning for heterogeneous deep learning accelerators

Conference Proceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020 · February 1, 2020 Deep neural network (DNN) accelerators as an example of domain-specific architecture have demonstrated great success in DNN inference. However, the architecture acceleration for equally important DNN training has not yet been fully studied. With data forwa ... Full text Cite

An efficient visually meaningful image compression and encryption scheme based on compressive sensing and dynamic LSB embedding

Journal Article Optics and Lasers in Engineering · January 1, 2020 In this paper, an efficient visually meaningful image compression and encryption (VMICE) scheme is proposed by combining compressive sensing (CS) and Least Significant Bit (LSB) embedding. First, the original image (Iorig) is compressed and encrypted into ... Full text Cite

Sliding mode control of neural networks via continuous or periodic sampling event-triggering algorithm.

Journal Article Neural networks : the official journal of the International Neural Network Society · January 2020 This paper presents the theoretical results on sliding mode control (SMC) of neural networks via continuous or periodic sampling event-triggered algorithm. Firstly, SMC with continuous sampling event-triggered scheme is developed and the practical sliding ... Full text Cite

FIST: A Feature-Importance Sampling and Tree-Based Method for Automatic Design Flow Parameter Tuning

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 1, 2020 Design flow parameters are of utmost importance to chip design quality and require a painfully long time to evaluate their effects. In reality, flow parameter tuning is usually performed manually based on designers' experience in an ad hoc manner. In this ... Full text Cite

PowerNet: Transferable Dynamic IR Drop Estimation via Maximum Convolutional Neural Network

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 1, 2020 IR drop is a fundamental constraint required by almost all chip designs. However, its evaluation usually takes a long time that hinders mitigation techniques for fixing its violations. In this work, we develop a fast dynamic IR drop estimation technique, n ... Full text Cite

PARC: A Processing-in-CAM Architecture for Genomic Long Read Pairwise Alignment using ReRAM

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 1, 2020 Technological advances in long read sequences have greatly facilitated the development of genomics. However, managing and analyzing the raw genomic data that outpaces Moore's Law requires extremely high computational efficiency. On the one hand, existing s ... Full text Cite

Enhancing Generalization of Wafer Defect Detection by Data Discrepancy-aware Preprocessing and Contrast-varied Augmentation

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 1, 2020 Wafer inspection locates defects at early fabrication stages and traditionally focuses on pixel-level defects. However, there are very few solutions that can effectively detect large-scale defects. In this work, we leverage Convolutional Neural Networks (CN ... Full text Cite

Parallelism in Deep Learning Accelerators

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 1, 2020 Deep learning is the core of artificial intelligence and it achieves state-of-the-art in a wide range of applications. The intensity of computation and data in deep learning processing poses significant challenges to the conventional computing platforms. T ... Full text Cite

2020: Looking Forward to the Next Decade [From the Editor]

Journal Article IEEE Circuits and Systems Magazine · January 1, 2020 Full text Cite

Neural Predictor for Neural Architecture Search

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2020 Neural Architecture Search methods are effective but often use complex algorithms to come up with the best architecture. We propose an approach with three basic steps that is conceptually much simpler. First we train N random architectures to generate N (a ... Full text Cite

Task-Agnostic Privacy-Preserving Representation Learning via Federated Learning

Chapter · January 1, 2020 The availability of various large-scale datasets benefits the advancement of deep learning. These datasets are often crowdsourced from individual users and contain private information like gender, age, etc. Due to rich private information embedded in the r ... Full text Cite

Snooping attacks on deep reinforcement learning

Conference Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS · January 1, 2020 Adversarial attacks have exposed a significant security vulnerability in state-of-the-art machine learning models. Among these models include deep reinforcement learning agents. The existing methods for attacking reinforcement learning agents assume the ad ... Cite

TRP: Trained rank pruning for efficient deep neural networks

Conference IJCAI International Joint Conference on Artificial Intelligence · January 1, 2020 To enable DNNs on edge devices like mobile phones, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low- ... Cite

AutoShrink: A topology-aware NAS for discovering efficient neural architecture

Conference AAAI 2020 - 34th AAAI Conference on Artificial Intelligence · January 1, 2020 Resource is an important constraint when deploying Deep Neural Networks (DNNs) on mobile and edge devices. Existing works commonly adopt the cell-based search approach, which limits the flexibility of network patterns in learned cell structures. Moreover, ... Cite

Perturbing across the feature hierarchy to improve standard and strict blackbox attack transferability

Conference Advances in Neural Information Processing Systems · January 1, 2020 We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers. Rather than focusing on crossing decision boundaries at the output layer of the source model, our method perturbs ... Cite

PENNI: Pruned kernel sharing for efficient cnn inference

Conference 37th International Conference on Machine Learning, ICML 2020 · January 1, 2020 Although state-of-the-art (SOTA) CNNs achieve outstanding performance on various tasks, their high computation demand and massive number of parameters make it difficult to deploy these SOTA CNNs onto resource-constrained devices. Previous works on CNN acce ... Cite

Accelerating CNN Training by Pruning Activation Gradients

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2020 Sparsification is an efficient approach to accelerate CNN inference, but it is challenging to take advantage of sparsity in training procedure because the involved gradients are dynamically changed. Actually, an important observation shows that most of the ... Full text Cite

Message from the Technical Program Committee

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 1, 2020 Full text Cite

Highly efficient neuromorphic computing systems with emerging nonvolatile memories

Conference Proceedings of SPIE - The International Society for Optical Engineering · January 1, 2020 Increased interest in artificial intelligence coupled with a surge in nonvolatile memory research and the inevitable hitting of the “memory wall” in von Neumann computing has set the stage for a new flavor of computing systems to flourish: neuromorphic compu ... Full text Cite

TRANSFERABLE PERTURBATIONS OF DEEP FEATURE DISTRIBUTIONS

Conference 8th International Conference on Learning Representations, ICLR 2020 · January 1, 2020 Almost all current adversarial attacks of CNN classifiers rely on information derived from the output layer of the network. This work presents a new adversarial attack based on the modeling and exploitation of class-wise and layer-wise deep feature distrib ... Cite

Preface.

Book · December 2019 Full text Link to item Cite

Thread batching for high-performance energy-efficient GPU memory design

Journal Article ACM Journal on Emerging Technologies in Computing Systems · December 1, 2019 Massive multi-threading in GPU imposes tremendous pressure on memory subsystems. Due to rapid growth in thread-level parallelism of GPU and slowly improved peak memory bandwidth, memory becomes a bottleneck of GPU’s performance and energy efficiency. In th ... Full text Cite

On Designing Efficient and Reliable Nonvolatile Memory-Based Computing-In-Memory Accelerators

Conference Technical Digest - International Electron Devices Meeting, IEDM · December 1, 2019 Nonvolatile memory (NVM)-based computing-in-memory (CIM) features nonvolatile storage, in-place computing and reduction in data traffic. However, the development of NVM-based CIM is hampered by immature fabrication processes and inevitable operational faul ... Full text Cite

Trained Rank Pruning for Efficient Deep Neural Networks

Conference Proceedings - 5th Workshop on Energy Efficient Machine Learning and Cognitive Computing, EMC2-NIPS 2019 · December 1, 2019 To accelerate DNNs inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; h ... Full text Cite

ReBNN: in-situ acceleration of binarized neural networks in ReRAM using complementary resistive cell

Journal Article CCF Transactions on High Performance Computing · December 1, 2019 Resistive random access memory (ReRAM) has been proven capable to efficiently perform in-situ matrix-vector computations in convolutional neural network (CNN) processing. The computations are often conducted on multi-level cell (MLC) that have limited prec ... Full text Cite

An efficient mobile-edge collaborative system for video photorealistic style transfer

Conference Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, SEC 2019 · November 7, 2019 Full text Cite

Poster abstract: An efficient edge-assisted mobile system for video photorealistic style transfer

Conference Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, SEC 2019 · November 7, 2019 In the past decade, convolutional neural networks (CNNs) have achieved great practical success in image transformation tasks, including style transfer, semantic segmentation, etc. CNN-based style transfer, which denotes transforming an image into a desired ... Full text Cite

How to obtain and run light and efficient deep learning networks

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 1, 2019 As the model size of deep neural networks (DNNs) grows for better performance, the increase in computational cost associated with training and testing makes it extremely difficult to deploy DNNs on end/edge devices with limited resources while also satisf ... Full text Cite

Hardware fault tolerance for binary RRAM crossbars

Conference Proceedings - International Test Conference · November 1, 2019 Resistive random-access memory (RRAM)-based computing systems (RCS) are being advocated for neural network acceleration. The memristor is the unit cell of an RCS and it is susceptible to process variations and manufacturing defects. Therefore, it is essent ... Full text Cite

A chaotic image encryption algorithm based on 3-D bit-plane permutation

Journal Article Neural Computing and Applications · November 1, 2019 There are two shortcomings existing in the current color image encryption. One is that high correlation between R, G, B components of the original image may be neglected, the other is that the encryption has little relationship with the plain image, and th ... Full text Cite

Efficiently Learning a Robust Self-Driving Model with Neuron Coverage Aware Adaptive Filter Reuse

Conference IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation · October 1, 2019 Human drivers learn driving skills from both regular (non-accidental) and accidental driving experiences, while most current self-driving research focuses on regular driving only. We argue that learning from accidental driving data is necessary for robu ... Full text Cite

MSNet: Structural wired neural architecture search for internet of things

Conference Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019 · October 1, 2019 The prosperity of Internet of Things (IoT) calls for efficient ways of designing extremely compact yet accurate DNN models. Both the cell-based neural architecture search methods and the recently proposed graph based methods fall short in finding high qual ... Full text Cite

Message from the General Chairs

Conference Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI · July 1, 2019 Full text Cite

Adaptive granularity encoding for energy-efficient non-volatile main memory

Conference Proceedings - Design Automation Conference · June 2, 2019 Data encoding methods have been proposed to alleviate the high write energy and limited write endurance disadvantages of Non-Volatile Memories (NVMs). Encoding methods are proved to be effective through theoretical analysis. Under the data patterns of wor ... Full text Cite

ESLAM: An energy-efficient accelerator for real-time ORB-SLAM on FPGA platform

Conference Proceedings - Design Automation Conference · June 2, 2019 Simultaneous Localization and Mapping (SLAM) is a critical task for autonomous navigation. However, due to the computational complexity of SLAM algorithms, it is very difficult to achieve realtime implementation on low-power platforms. We propose an energy ... Full text Cite

MobiEye: An efficient cloud-based video detection system for real-time mobile applications

Conference Proceedings - Design Automation Conference · June 2, 2019 In recent years, machine learning research has largely shifted focus from the cloud to the edge. While the resulting algorithm- and hardware-level optimizations have enabled local execution for the majority of deep neural networks (DNNs) on edge devices, t ... Full text Cite

Machine learning-based pre-routing timing prediction with reduced pessimism

Conference Proceedings - Design Automation Conference · June 2, 2019 Optimizations at placement stage need to be guided by timing estimation prior to routing. To handle timing uncertainty due to the lack of routing information, people tend to make very pessimistic predictions such that performance specification can be ensur ... Full text Cite

ZARA: A novel zero-free dataflow accelerator for generative adversarial networks in 3D ReRAM

Conference Proceedings - Design Automation Conference · June 2, 2019 Generative Adversarial Networks (GANs) recently demonstrated a great opportunity toward unsupervised learning with the intention to mitigate the massive human efforts on data labeling in supervised learning algorithms. GAN combines a generative model and a ... Full text Cite

Low-Power Computer Vision: Status, Challenges, and Opportunities

Journal Article IEEE Journal on Emerging and Selected Topics in Circuits and Systems · June 1, 2019 Computer vision has achieved impressive progress in recent years. Meanwhile, mobile phones have become the primary computing platforms for millions of people. In addition to mobile phones, many autonomous systems rely on visual data for making decisions, a ... Full text Cite

RRAM-based Spiking Nonvolatile Computing-In-Memory Processing Engine with Precision-Configurable in Situ Nonlinear Activation

Conference Digest of Technical Papers - Symposium on VLSI Technology · June 1, 2019 This work presents a hybrid CMOS-RRAM integration of spiking nonvolatile computing-in-memory (nvCIM) processing engine (PE) that includes a 64Kb RRAM macro and a novel in situ nonlinear activation (ISNA) module. We integrate the computing controller and no ... Full text Cite

Feature space perturbations yield more transferable adversarial examples

Conference Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · June 1, 2019 Many recent works have shown that deep learning models are vulnerable to quasi-imperceptible input perturbations, yet practitioners cannot fully explain this behavior. This work describes a transfer-based blackbox targeted adversarial attack of deep featur ... Full text Cite

Routability-Driven Macro Placement with Embedded CNN-Based Prediction Model

Conference Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019 · May 14, 2019 With the dramatic shrink of feature size and the advance of semiconductor technology nodes, numerous and complicated design rules need to be followed, and a chip design can only be taped-out after passing design rule check (DRC). The high design complexity ... Full text Cite

RED: A ReRAM-based Deconvolution Accelerator

Conference Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019 · May 14, 2019 Deconvolution has been widespread in neural networks. For example, it is essential for performing unsupervised learning in generative adversarial networks or constructing fully convolutional networks for semantic segmentation. Resistive RAM (ReRAM)-based p ... Full text Cite

Learning Efficient Sparse Structures in Speech Recognition

Conference ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings · May 1, 2019 Recurrent neural networks (RNNs), especially long short-term memories (LSTMs) have been widely used in speech recognition and natural language processing. As the sizes of RNN models grow for better performance, the computation cost and therefore the requir ... Full text Cite

HyPar: Towards hybrid parallelism for deep learning accelerator array

Conference Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019 · March 26, 2019 With the rise of artificial intelligence in recent years, Deep Neural Networks (DNNs) have been widely used in many domains. To achieve high performance and energy efficiency, hardware acceleration (especially inference) of DNNs is intensively studied both ... Full text Cite

Special Session: 2018 Low-Power Image Recognition Challenge and beyond

Conference Proceedings 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2019 · March 1, 2019 The IEEE Low-Power Image Recognition Challenge (LPIRC) is an annual competition started in 2015. The competition identifies the best technologies that can detect objects in images efficiently (short execution time and low energy consumption). This paper su ... Full text Cite

Exploration of Automatic Mixed-Precision Search for Deep Neural Networks

Conference Proceedings 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2019 · March 1, 2019 Neural networks have shown great performance in cognitive tasks. When deploying network models on mobile devices with limited computation and storage resources, the weight quantization technique has been widely adopted. In practice, 8-bit or 16-bit quantiz ... Full text Cite

RC-NVM: Dual-Addressing Non-Volatile Memory Architecture Supporting Both Row and Column Memory Accesses

Journal Article IEEE Transactions on Computers · February 1, 2019 Although emerging non-volatile memories (NVMs) have been comprehensively studied to design next-generation memory systems, the symmetry of the crossbar structure adopted by most NVMs has not been addressed. In this work, we argue that crossbar-based NVMs c ... Full text Cite

A color image cryptosystem based on dynamic DNA encryption and chaos

Journal Article Signal Processing · February 1, 2019 This paper presents a color image cryptosystem based on dynamic DNA encryption and chaos. The color plain image is firstly decomposed into red, green and blue components, and then a simultaneous intra-inter-component permutation mechanism dependent on the ... Full text Cite

AdverQuil: An efficient adversarial detection and alleviation technique for black-box neuromorphic computing systems

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 21, 2019 In recent years, neuromorphic computing systems (NCS) have gained popularity in accelerating neural network computation because of their high energy efficiency. The known vulnerability of neural networks to adversarial attack, however, raises a severe secu ... Full text Cite

NeuralHMC: An efficient HMC-based accelerator for deep neural networks

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · January 21, 2019 In Deep Neural Network (DNN) applications, energy consumption and performance cost of moving data between memory hierarchy and computational units are significantly higher than that of the computation itself. Process-in-memory (PIM) architecture such as Hy ... Full text Cite

A novel image encryption scheme based on DNA sequence operations and chaotic systems

Journal Article Neural Computing and Applications · January 18, 2019 In the paper, a novel image encryption algorithm based on DNA sequence operations and chaotic systems is proposed. The encryption architecture of permutation and diffusion is adopted. Firstly, 256-bit hash value of the plain image is gotten to calculate th ... Full text Cite

Exploiting spin-orbit torque devices as reconfigurable logic for circuit obfuscation

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · January 1, 2019 Circuit obfuscation is a frequently used approach to conceal logic functionalities in order to prevent reverse engineering attacks on fabricated chips. Efficient obfuscation implementations are expected with lower design complexity and overhead but higher ... Full text Cite

Markov Chain Based Efficient Defense Against Adversarial Examples in Computer Vision

Journal Article IEEE Access · January 1, 2019 Adversarial examples are the inputs to machine learning models that result in erroneous outputs, which are usually generated from normal inputs via subtle modification and seem to remain unchanged to human observers. They have severely threatened the appli ... Full text Cite

Bamboo: Ball-shape data augmentation against adversarial attacks from all directions

Conference CEUR Workshop Proceedings · January 1, 2019 The robustness of deep neural networks (DNNs) has been recently challenged by adversarial attacks. State-of-the-art defending algorithms improve DNNs’ robustness by paying high computational costs. Moreover, these approaches are usually designed against one ... Cite

DPatch: An adversarial patch attack on object detectors

Conference CEUR Workshop Proceedings · January 1, 2019 Object detectors have emerged as an indispensable module in modern computer vision systems. In this work, we propose DPATCH, a black-box adversarial-patch-based attack towards mainstream object detectors (i.e., Faster R-CNN and YOLO). Unlike the original ad ... Cite

Reshaping Future Computing Systems with Emerging Nonvolatile Memory Technologies

Journal Article IEEE Micro · January 1, 2019 Emerging nonvolatile memory (eNVM) technology has been an important sector in both academic research and the memory industry for more than a decade. Following the maturing of fabrication process, eNVMs started to demonstrate their unique properties in data storage and comput ... Full text Cite

Enhance the robustness to time dependent variability of ReRAM-based neuromorphic computing systems with regularization and 2R synapse

Conference Proceedings - IEEE International Symposium on Circuits and Systems · January 1, 2019 Time Dependent Variability (TDV) is one of the major concerns in implementing a Neuromorphic Computing System (NCS) with Resistive Random Access Memory (ReRAM). In this work, we propose a variation-distribution aware training algorithm to enhance the robus ... Full text Cite

Towards decentralized deep learning with differential privacy

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2019 In distributed machine learning, while a great deal of attention has been paid on centralized systems that include a central parameter server, decentralized systems have not been fully explored. Decentralized systems have great potentials in the future pra ... Full text Cite

Faster CNNs with direct sparse convolutions and guided pruning

Conference 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings · January 1, 2019 Phenomenally successful in practical inference problems, convolutional neural networks (CNN) are widely deployed in mobile devices, data centers, and even supercomputers. The number of parame ... Cite

Designing neuromorphic computing systems with memristor devices

Chapter · January 1, 2019 Neuromorphic computing systems are under heavy investigation as a potential substitute for the traditional von Neumann systems in high-speed low-power applications. One way to implement neuromorphic systems in hardware is to use the new emerging devices su ... Full text Cite

Neuromorphic computing systems: From CMOS to emerging nonvolatile memory

Journal Article IPSJ Transactions on System LSI Design Methodology · January 1, 2019 The end of Moore's Law and von Neumann bottleneck motivate researchers to seek alternative architectures that can fulfill the increasing demand for computation resources which cannot be easily achieved by traditional computing paradigm. As one important pr ... Full text Cite

Survey of low-power electric vehicles: A design automation perspective

Journal Article IEEE Design and Test · December 1, 2018 The survey of the guest editors provides a comprehensive introduction to the design process of electric vehicles. Relevant topics such as the modeling of efficiency of the propulsion engine, optimization of the propulsion engine, runtime driving management ... Full text Cite

A Scalable Pipelined Dataflow Accelerator for Object Region Proposals on FPGA Platform

Conference Proceedings - 2018 International Conference on Field-Programmable Technology, FPT 2018 · December 1, 2018 Region proposal is critical for object detection while it usually poses a bottleneck in improving the computation efficiency on traditional control-flow architectures. We have observed region proposal tasks are potentially suitable for performing pipelined ... Full text Cite

RouteNet: Routability prediction for mixed-size designs using convolutional neural network

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 5, 2018 Early routability prediction helps designers and tools perform preventive measures so that design rule violations can be avoided in a proactive manner. However, it is a huge challenge to have a predictor that is both accurate and fast. In this work, we stu ... Full text Cite

SPN dash: Fast detection of adversarial attacks on mobile via sensor pattern noise fingerprinting

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 5, 2018 A concerning weakness of deep neural networks is their susceptibility to adversarial attacks. While methods exist to detect these attacks, they incur significant drawbacks, ignoring external features which could aid in the task of attack detection. In this ... Full text Cite

TriZone: A Design of MLC STT-RAM Cache for Combined Performance, Energy, and Reliability Optimizations

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · October 1, 2018 Spin-transfer torque random access memory (STT-RAM) is a promising technology for future nonvolatile caches and memories. To increase the storage density, multilevel cell (MLC) technique was recently introduced to STT-RAM designs at the cost of degraded ac ... Full text Cite

MAT: A multi-strength adversarial training method to mitigate adversarial attacks

Conference Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI · August 7, 2018 Some recent work revealed that deep neural networks (DNNs) are vulnerable to so-called adversarial attacks where input examples are intentionally perturbed to fool DNNs. In this work, we revisit the DNN training process that includes adversarial examples i ... Full text Cite

An image encryption algorithm based on chaotic system and compressive sensing

Journal Article Signal Processing · July 1, 2018 In this paper, we propose an image encryption algorithm based on the memristive chaotic system, elementary cellular automata (ECA) and compressive sensing (CS). Firstly, the original image is transformed by discrete wavelet transform, and the sparse coeffi ... Full text Cite

Real-Time Cardiac Arrhythmia Classification Using Memristor Neuromorphic Computing System.

Conference Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference · July 2018 Cardiac arrhythmia is known to be one of the most common causes of death worldwide. Therefore, development of efficient arrhythmia detection techniques is essential to save patients' lives. In this paper, we introduce a new real-time cardiac arrhythmia cla ... Full text Cite

A Forgetting Memristive Spiking Neural Network for Pavlov Experiment

Journal Article International Journal of Bifurcation and Chaos · June 15, 2018 In this paper, we designed a memristive spiking neural network (MSNN) to perform a fully functional Pavlov experiment. A memristor with forgetting effect is adopted to implement synapses while Izhikevich neurons are used for generating tonic spiking and to ... Full text Cite

NV-TCAM: Alternative designs with NVM devices

Journal Article Integration · June 1, 2018 TCAM (ternary content addressable memory) is a special memory type that can compare input search data with stored data, and return location (sometimes, the associated content) of matched data. TCAM is widely used in microprocessor designs as well as communi ... Full text Cite

Challenges of memristor based neuromorphic computing system

Journal Article Science China Information Sciences · June 1, 2018 Full text Cite

Low-Power image recognition challenge

Journal Article AI Magazine · June 1, 2018 The Low-Power Image Recognition Challenge (LPIRC) has been held annually since 2015. This article summarizes the competition advancements made over the past three years. ... Full text Cite

Special session on reliability and vulnerability of neuromorphic computing systems

Conference Proceedings of the IEEE VLSI Test Symposium · May 29, 2018 This is the summary of the special session on reliability and vulnerability of neuromorphic computing systems. ... Full text Cite

Shift-Optimized Energy-Efficient Racetrack-Based Main Memory

Journal Article Journal of Circuits, Systems and Computers · May 1, 2018 Recently developed spin-based, racetrack memory (RM) shows great promise in enabling nonvolatile memory with unprecedented density and energy efficiency. RM-based technology will leverage the power and cost limit of main memory. However, main memory has ra ... Full text Cite

Pulse-Width Modulation based Dot-Product Engine for Neuromorphic Computing System using Memristor Crossbar Array

Conference Proceedings - IEEE International Symposium on Circuits and Systems · April 26, 2018 The Dot-Product Engine (DPE) is a critical circuit for implementing neural networks in hardware. The recent-developed memristor crossbar array technology, which is able to efficiently carry out dot-product multiplication and update its weights in real time ... Full text Cite

Design and Data Management for Magnetic Racetrack Memory

Conference Proceedings - IEEE International Symposium on Circuits and Systems · April 26, 2018 Benefiting from its ultra-high storage density, high energy efficiency, and non-volatility, racetrack memory demonstrates great potential in replacing conventional SRAM as large on-chip memory. Integrating the tape-like racetrack memory, however, faces uni ... Full text Cite

Exploring the opportunity of implementing neuromorphic computing systems with spintronic devices

Conference Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018 · April 19, 2018 Many cognitive algorithms such as neural networks cannot be efficiently executed by von Neumann architectures, the performance of which is constrained by the memory wall between microprocessor and memory hierarchy. Hence, researchers started to investigate ... Full text Cite

Recom: An efficient resistive accelerator for compressed deep neural networks

Conference Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018 · April 19, 2018 Deep Neural Networks (DNNs) play a key role in prevailing machine learning applications. Resistive random-access memory (ReRAM) is capable of both computation and storage, contributing to the acceleration on DNNs by processing in memory. Besides, a signifi ... Full text Cite

Three years of low-power image recognition challenge: Introduction to special session

Conference Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018 · April 19, 2018 Reducing power consumption has been one of the most important goals since the creation of electronic systems. Energy efficiency is increasingly important as battery-powered systems (such as smartphones, drones, and body cameras) are widely used. It is desi ... Full text Cite

ReRAM-based accelerator for deep learning

Conference Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018 · April 19, 2018 Big data computing applications such as deep learning and graph analytic usually incur a large amount of data movements. Deploying such applications on conventional von Neumann architecture that separates the processing units and memory components likely l ... Full text Cite

A compact model for selectors based on metal doped electrolyte

Journal Article Applied Physics A: Materials Science and Processing · April 1, 2018 A selector device that demonstrates high nonlinearity and low switching voltages was fabricated using HfOx as a solid electrolyte doped with Ag electrodes. The electronic conductance of the volatile conductive filaments responsible for the switching was st ... Full text Cite

RC-NVM: Enabling Symmetric Row and Column Memory Accesses for In-memory Databases

Conference Proceedings - International Symposium on High-Performance Computer Architecture · March 27, 2018 Ever increasing DRAM capacity has fostered the development of in-memory databases (IMDB). The massive performance improvements provided by IMDBs have enabled transactions and analytics on the same database. In other words, the integration of OLTP (on-line ... Full text Cite

GraphR: Accelerating Graph Processing Using ReRAM

Conference Proceedings - International Symposium on High-Performance Computer Architecture · March 27, 2018 Graph processing recently received intensive interests in light of a wide range of needs to understand relationships. It is well-known for the poor locality and high memory bandwidth requirement. In conventional architectures, they incur a significant amou ... Full text Cite

Neuromorphic computing's yesterday, today, and tomorrow – an evolutional view

Journal Article Integration · March 1, 2018 Neuromorphic computing was originally referred to as the hardware that mimics neuro-biological architectures to implement models of neural systems. The concept was then extended to the computing systems that can run bio-inspired computing models, e.g., neu ... Full text Cite

Modeling of biaxial magnetic tunneling junction for multi-level cell STT-RAM realization

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 In recent years, spin-transfer torque random access memory (STT-RAM) has been widely studied as a promising candidate to replace DRAM because of its fast access time, high endurance, and good CMOS compatibility. The improvement of tunneling magneto-resista ... Full text Cite

Neu-NoC: A high-efficient interconnection network for accelerated neuromorphic systems

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 A modern neuromorphic acceleration system could consist of hundreds of accelerators, which are often organized through a network-on-chip (NoC). Although the overall computing ability is greatly promoted by a large number of the accelerators, the power cons ... Full text Cite

Spintronics based stochastic computing for efficient Bayesian inference system

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 Bayesian inference is an effective approach for solving statistical learning problems especially with uncertainty and incompleteness. However, inference efficiencies are physically limited by the bottlenecks of conventional computing platforms. In this pap ... Full text Cite

Process variation aware data management for magnetic skyrmions racetrack memory

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 Skyrmions racetrack memory (SKM) has been identified as a promising candidate for future on-chip cache. Similar to many other nanoscale technologies, process variations also adversely impact the reliability and performance of SKM cache. In this work, we pr ... Full text Cite

Running sparse and low-precision neural network: When algorithm meets hardware

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 Deep Neural Networks (DNNs) are pervasively applied in many artificial intelligence (AI) applications. The high performance of DNNs comes at the cost of larger size and higher compute complexity. Recent studies show that DNNs have much redundancy, such as ... Full text Cite

ReGAN: A pipelined ReRAM-based accelerator for generative adversarial networks

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 20, 2018 Generative Adversarial Networks (GANs) have recently drawn tremendous attention in many artificial intelligence (AI) applications including computer vision, speech recognition, and natural language processing. While GANs deliver state-of-the-art performanc ... Full text Cite

Understanding the trade-offs of device, circuit and application in ReRAM-based neuromorphic computing systems

Conference Technical Digest - International Electron Devices Meeting, IEDM · January 23, 2018 Resistive memory (ReRAM) features nonvolatile storage, high resistance, dense structure, and analogy to the matrix-vector multiplication operation. These characteristics demonstrate the great potential of ReRAM in the development of neuromorphic computing ... Full text Cite

Generalized inverse optimization through online learning

Conference Advances in Neural Information Processing Systems · January 1, 2018 Inverse optimization is a powerful paradigm for learning preferences and restrictions that explain the behavior of a decision maker, based on a set of external signal and the corresponding decision pairs. However, most inverse optimization algorithms are d ... Cite

Learning intrinsic sparse structures within long short-term memory

Conference 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings · January 1, 2018 Model compression is significant for the wide adoption of Recurrent Neural Networks (RNNs) in both user devices possessing limited resources and business clusters requiring quick responses to large-scale service requests. This work aims to learn structural ... Cite

Coordinating Filters for Faster Deep Neural Networks

Conference Proceedings of the IEEE International Conference on Computer Vision · December 22, 2017 Very large-scale Deep Neural Networks (DNNs) have achieved remarkable successes in a large variety of computer vision tasks. However, the high computation intensity of DNNs makes it challenging to deploy these models on resource-limited systems. Some studi ... Full text Cite

MobiCore: An adaptive hybrid approach for power-efficient CPU management on Android devices

Conference International System on Chip Conference · December 18, 2017 Smartphones are becoming essential devices used for various types of applications in our daily life. To satisfy the ever-increasing performance requirement, the number of CPU cores in a phone keeps growing, which imposes a great impact on its power consump ... Full text Cite

Behaviors of multi-dimensional forgetting memristor models

Conference Proceedings IECON 2017 - 43rd Annual Conference of the IEEE Industrial Electronics Society · December 15, 2017 This letter discusses behaviors of multi-dimensional memristor models. A second dimensional memristor model is extracted from the third dimensional memristor model. Parameters of this memristor model are physically defined and analyzed. A comparison betwee ... Full text Cite

An ensemble approach to activity recognition based on binary sensor readings

Conference 2017 IEEE 19th International Conference on e-Health Networking, Applications and Services, Healthcom 2017 · December 14, 2017 Research on activity recognition provides a wide range of ubiquitous computing applications. Once activities are recognized, computers can use this information to provide people with suitable services. In the past decade, many classification algorithms hav ... Full text Cite

A closed-loop design to enhance weight stability of memristor based neural network chips

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 13, 2017 Compared with the algorithm optimizations, brain-inspired neural network chips aim to fundamentally change the computer architecture and therefore enhance the computation capability and performance in advanced data processing. In recent years, memristor te ... Full text Cite

VoCaM: Visualization oriented convolutional neural network acceleration on mobile system: Invited paper

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 13, 2017 Convolutional Neural Networks (CNNs) have been widely investigated as some of the most promising solutions for various computer vision tasks. However, CNNs introduce massive computing overhead due to their complex network computing flow, resulting in signif ... Full text Cite

MeDNN: A distributed mobile system with enhanced partition and deployment for large-scale DNNs

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 13, 2017 Deep Neural Networks (DNNs) are pervasively used in a significant number of applications and platforms. To enhance the execution efficiency of large-scale DNNs, previous attempts focus mainly on client-server paradigms, relying on powerful external infrast ... Full text Cite

AdaLearner: An adaptive distributed mobile learning system for neural networks

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 13, 2017 Neural networks hold a critical domain in machine learning algorithms because of their self-adaptiveness and state-of-the-art performance. Before the testing (inference) phases in practical use, sophisticated training (learning) phases are required, callin ... Full text Cite

A compact DNN: Approaching GoogLeNet-level accuracy of classification and domain adaptation

Conference Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017 · November 6, 2017 Recently, DNN model compression based on network architecture design, e.g., SqueezeNet, attracted a lot of attention. Compared to well-known models, these extremely compact networks don't show any accuracy drop on image classification. An emerging question ... Full text Cite

Improving write performance and extending endurance of object-based NAND flash devices

Journal Article ACM Transactions on Embedded Computing Systems · November 1, 2017 Write amplification is a major cause of performance and endurance degradations in NAND flash-based storage systems. In an object-based NAND flash device (ONFD), two causes of write amplification are onode partial update and cascading update. Here, onode is ... Full text Cite

A quantization-aware regularized learning method in multilevel memristor-based neuromorphic computing system

Conference NVMSA 2017 - 6th IEEE Non-Volatile Memory Systems and Applications Symposium · October 10, 2017 In this work, we propose a regularized learning method that is able to take into account the deviation of the memristor-mapped synaptic weights from the target values determined during the training process. Experimental results obtained when utilizing the ... Full text Cite

An Energy-Efficient GPGPU Register File Architecture Using Racetrack Memory

Journal Article IEEE Transactions on Computers · September 1, 2017 Extreme multi-threading and fast thread switching in modern GPGPU require a large, power-hungry register file (RF), which quickly becomes one of the major obstacles on the upscaling path of energy-efficient GPGPU computing. In this work, we propose to implemen ... Full text Cite

A Compact Memristor-Based Dynamic Synapse for Spiking Neural Networks

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · August 1, 2017 Recent advances in memristor technology lead to the feasibility of large-scale neuromorphic systems by leveraging the similarity between memristor devices and synapses. For instance, memristor cross-point arrays can realize dense synapse network among hund ... Full text Cite

FlexLevel NAND Flash Storage System Design to Reduce LDPC Latency

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · July 1, 2017 Aggressive technology scaling and adoption of multilevel-cell technique lead to progressive increase of bit error rate (BER) of NAND flash memory. Consequently, conventional error correction code is not adequate to guarantee system reliability. As an alter ... Full text Cite

Persistent and Nonpersistent Error Optimization for STT-RAM Cell Design

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · July 1, 2017 Rapidly increasing demands for memory capacity and severe technical scaling challenges of conventional memory technologies motivated recent investments on next-generation nonvolatile memory technologies. As a promising candidate, spin-transfer torque rando ... Full text Cite

An FPGA design framework for CNN sparsification and acceleration

Conference Proceedings - IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2017 · June 30, 2017 Convolutional neural networks (CNNs) have recently broken many performance records in image recognition and object detection problems. The success of CNNs, to a great extent, is enabled by the fast scaling-up of the networks that learn from a huge volume o ... Full text Cite

Hardware implementation of echo state networks using memristor double crossbar arrays

Conference Proceedings of the International Joint Conference on Neural Networks · June 30, 2017 Neuromorphic computing systems are inspired by human brains, where data are stored and processed at the same location. Contrary to von Neumann systems, neuromorphic computing systems offer excellent real-time processing for huge data sizes, at low costs a ... Full text Cite

A lightweight progress maximization scheduler for non-volatile processor under unstable energy harvesting

Conference ACM SIGPLAN Notices · June 21, 2017 Energy harvesting techniques become increasingly popular as power supplies for embedded systems. However, the harvested energy is intrinsically unstable. Thus, the program execution may be interrupted frequently. Although the development of non-volatile pr ... Full text Cite

Giant Spin-Hall assisted STT-RAM and logic design

Journal Article Integration, the VLSI Journal · June 1, 2017 In recent years, Spin-Transfer Torque Random Access Memory (STT-RAM) has attracted significant attentions from both industry and academia due to its attractive attributes such as small cell area and non-volatility. However, long switching time and large pr ... Full text Cite

Recent Technology Advances of Emerging Memories

Journal Article IEEE Design and Test · June 1, 2017 Phase change memory, spin-transfer torque random access memory, and resistive random access memory are three major emerging memory technologies that receive tremendous attentions from both academia and industry. In this survey article, the authors summariz ... Full text Cite

Neuromorphic Hardware Acceleration Enabled by Emerging Technologies

Chapter · May 15, 2017 This book describes the current state of the art in big-data analytics, from a technology and hardware architecture perspective. ... Cite

Hybrid spiking-based multi-layered self-learning neuromorphic system based on memristor crossbar arrays

Conference Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017 · May 11, 2017 Neuromorphic computing systems are under heavy investigation as a potential substitute for the traditional von Neumann systems in high-speed low-power applications. Recently, memristor crossbar arrays were utilized in realizing spiking-based neuromorphic s ... Full text Cite

Understanding the design of IBM neurosynaptic system and its tradeoffs: A user perspective

Conference Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017 · May 11, 2017 As a large-scale commercial spiking-based neuromorphic computing platform, IBM TrueNorth processor received tremendous attentions in society. However, one of the known issues in TrueNorth design is the limited precision of synaptic weights. The current wor ... Full text Cite

Accelerator-friendly neural-network training: Learning variations and defects in RRAM crossbar

Conference Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017 · May 11, 2017 RRAM crossbar consisting of memristor devices can naturally carry out the matrix-vector multiplication; it thereby has gained a great momentum as a highly energy-efficient accelerator for neuromorphic computing. The resistance variations and stuck-at fault ... Full text Cite

MoDNN: Local distributed mobile computing system for Deep Neural Network

Conference Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017 · May 11, 2017 Although Deep Neural Networks (DNN) are ubiquitously utilized in many applications, it is generally difficult to deploy DNNs on resource-constrained devices, e.g., mobile platforms. Some existing attempts mainly focus on client-server computing paradigm or ... Full text Cite

PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning

Conference Proceedings - International Symposium on High-Performance Computer Architecture · May 5, 2017 Convolution neural networks (CNNs) are the heart of deep learning applications. Recent works PRIME [1] and ISAAC [2] demonstrated the promise of using resistive random access memory (ReRAM) to perform neural computations in memory. We found that training c ... Full text Cite

A novel image encryption algorithm based on the chaotic system and DNA computing

Journal Article International Journal of Modern Physics C · May 1, 2017 A novel image encryption algorithm using the chaotic system and deoxyribonucleic acid (DNA) computing is presented. Different from the traditional encryption methods, the permutation and diffusion of our method are manipulated on the 3D DNA matrix. Firstly ... Full text Cite

Exploiting multiple write modes of Nonvolatile main memory in embedded systems

Journal Article ACM Transactions on Embedded Computing Systems · May 1, 2017 Existing Nonvolatile Memories (NVMs) have many attractive features to be the main memory of embedded systems. These features include low power, high density, and better scalability. Recently, Multilevel Cell (MLC) NVM has gained more and more popularity as ... Full text Cite

Energy-Aware Adaptive Restore Schemes for MLC STT-RAM Cache

Journal Article IEEE Transactions on Computers · May 1, 2017 For the sake of higher cell density while achieving near-zero standby power, recent research progress in Magnetic Tunneling Junction (MTJ) devices has leveraged Multi-Level Cell (MLC) configurations of Spin-Transfer Torque Random Access Memory (STT-RAM). H ... Full text Cite

A visually secure image encryption scheme based on compressive sensing

Journal Article Signal Processing · May 1, 2017 A novel visually secure image encryption scheme based on compressive sensing (CS) is proposed. Firstly, the plain image is transformed into wavelet coefficients, and then confused by a zigzag path and encrypted into a compressed cipher image using compress ... Full text Cite

Modeling STT-RAM fabrication cost and impacts in NVSim

Conference 2016 7th International Green and Sustainable Computing Conference, IGSC 2016 · April 4, 2017 Reducing power consumption of computational systems in the use-phase has become a significant focus to decrease thermal impacts and overall energy consumption of computing systems while having battery life benefits for increasingly mobile computing product ... Full text Cite

Data-Pattern-Aware Error Prevention Technique to Improve System Reliability

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · April 1, 2017 Program disturb, read disturb, and retention time noise are identified as three major contributors to multilevel cell (MLC) NAND flash memory bit errors. With program/erase cycling and technology scaling, bit error rate (BER) of MLC NAND flash memory rapid ... Full text Cite

An image encryption algorithm based on the memristive hyperchaotic system, cellular automata and DNA sequence operations

Journal Article Signal Processing: Image Communication · March 1, 2017 A novel image encryption scheme employing the memristive hyperchaotic system, cellular automata (CA) and DNA sequence operations is presented, which consists of diffusion process. SHA 256 hash function is used to give the secret key and compute the initial ... Full text Cite

Extending the lifetime of object-based NAND flash device with STT-RAM/DRAM hybrid buffer

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 16, 2017 A major limitation of NAND flash memory is erase-before-program characteristics. It incurs write amplification, severely degrading system performance and endurance. Previous works reveal that metadata update substantially contributes to write amplification ... Full text Cite

Low-power neuromorphic speech recognition engine with coarse-grain sparsity

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · February 16, 2017 In recent years, we have seen a surge of interest in neuromorphic computing and its hardware design for cognitive applications. In this work, we present new neuromorphic architecture, circuit, and device co-designs that enable spike-based classification fo ... Full text Cite

An image encryption scheme based on three-dimensional Brownian motion and chaotic system

Journal Article Chinese Physics B · February 1, 2017 At present, many chaos-based image encryption algorithms have proved to be unsafe, few encryption schemes permute the plain images as three-dimensional (3D) bit matrices, and thus bits cannot move to any position, the movement range of bits are limited, an ... Full text Cite

Forgetting memristor based neuromorphic system for pattern training and recognition

Journal Article Neurocomputing · January 26, 2017 This paper presents a neuromorphic system for mean variance based pattern training and recognition. The system contains a self-learning circuit, a training circuit and a recognition circuit. Memristor model with forgetting effect which has memory ability a ... Full text Cite

Nonvolatile memory design: Magnetic, resistive, and phase change

Book · January 1, 2017 The manufacture of flash memory, which is the dominant nonvolatile memory technology, is facing severe technical barriers. So much so, that some emerging technologies have been proposed as alternatives to flash memory in the nano-regime. Nonvolatile Memory ... Full text Cite

Looking Ahead for Resistive Memory Technology: A broad perspective on ReRAM technology for future storage and computing

Journal Article IEEE Consumer Electronics Magazine · January 1, 2017 Resistive random-access memory (ReRAM) is regarded as one of the most promising alternative nonvolatile memory technologies for its advantages in very-high-storage density, simple structure, low power consumption, and long endurance, as well as good compat ... Full text Cite

In-place logic obfuscation for emerging nonvolatile FPGAs

Chapter · January 1, 2017 To enhance system integrity of FPGA-based embedded systems on hardware design and data communication, we propose a hardware security scheme for nonvolatile resistive random access memory (RRAM) based FPGA, in which internal block RAM (BRAMs) are used for c ... Full text Cite

A novel chaos-based image encryption algorithm using DNA sequence operations

Journal Article Optics and Lasers in Engineering · January 1, 2017 An image encryption algorithm based on chaotic system and deoxyribonucleic acid (DNA) sequence operations is proposed in this paper. First, the plain image is encoded into a DNA matrix, and then a new wave-based permutation scheme is performed on it. The c ... Full text Cite

TernGrad: Ternary gradients to reduce communication in distributed deep learning

Conference Advances in Neural Information Processing Systems · January 1, 2017 High network communication cost for synchronizing gradients and parameters is the well-known bottleneck of distributed training. In this work, we propose TernGrad that uses ternary gradients to accelerate distributed deep learning in data parallelism. Our ... Cite

Nanoscale memory architectures for neuromorphic computing

Chapter · January 1, 2017 On one hand, machine learning has been widely used in data processing to help users understand the underlying property of the data [1]. As a popular type of machine learning model, neural network [2] processes input data by multiplying them with layers ... Full text Cite

Faster CNNs with direct sparse convolutions and guided pruning

Conference 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings · January 1, 2017 Phenomenally successful in practical inference problems, convolutional neural networks (CNN) are widely deployed in mobile devices, data centers, and even supercomputers. The number of parameters needed in CNNs, however, are often large and undesirable. Co ... Cite

RAM and TCAM designs by using STT-MRAM

Conference 2016 16th Non-Volatile Memory Technology Symposium, NVMTS 2016 · December 9, 2016 Spin-transfer torque magnetic random access memory (STT-MRAM) is a prospective candidate for cache and main memory designs. However, the reliable revision of magnetization using current requires high current density, which is hardly affordable in aggressiv ... Full text Cite

A Time, Energy, and Area Efficient Domain Wall Memory-Based SPM for Embedded Systems

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · December 1, 2016 Applications that run in the embedded systems normally should be finished within a timing constraint in energy-efficient fashion. Due to these two requirements, the embedded systems often employ software-controlled scratch pad memory (SPM) instead of hardw ... Full text Cite

Design techniques of eNVM-enabled neuromorphic computing systems

Conference Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016 · November 22, 2016 The recently emerged research on 'neuromorphic computing', which stands for hardware acceleration of brain-inspired computing, has become one of the most active research areas in computer engineering. In this invited paper, we start with a background intro ... Full text Cite

Neural processor design enabled by memristor technology

Conference 2016 IEEE International Conference on Rebooting Computing, ICRC 2016 - Conference Proceedings · November 8, 2016 Matrix-vector multiplication is a key computing operation in neural processor design and hence greatly affects the execution efficiency. Memristor crossbar is highly attractive for the implementation of matrix-vector multiplication for its analog storage s ... Full text Cite

Security of neuromorphic computing: Thwarting learning attacks using memristor's obsolescence effect

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 7, 2016 Neuromorphic architectures are widely used in many applications for advanced data processing, and often implement proprietary algorithms. In this work, we prevent an attacker with physical access from learning the proprietary algorithm implemented by the ... Full text Cite

Security challenges in smart surveillance systems and the solutions based on emerging nano-devices

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 7, 2016 Modern smart surveillance systems can not only record the monitored environment but also identify the targeted objects and detect anomaly activities. These advanced functions are often facilitated by deep neural networks, achieving very high accuracy and l ... Full text Cite

A data locality-aware design framework for reconfigurable sparse matrix-vector multiplication kernel

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 7, 2016 Sparse matrix-vector multiplication (SpMV) is an important computational kernel in many applications. For performance improvement, software libraries designated for SpMV computation have been introduced, e.g., MKL library for CPUs and cuSPARSE library for ... Full text Cite

Scope - Quality retaining display rendering workload scaling based on user-smartphone distance

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · November 7, 2016 Modern smartphone display systems come equipped with powerful GPUs capable of rendering advanced 2D and 3D graphics. These GPUs make up a significant portion of the system power profile due to the high resolution and framerate of smartphone display. These ... Full text Cite

Statistical Cache Bypassing for Non-Volatile Memory

Journal Article IEEE Transactions on Computers · November 1, 2016 With the increasing data throughput requirement, non-volatile memories, such as STT-RAM, PCM and RRAM, have become very competitive designs as on-chip caches in chip-multi-processors (CMPs). Since the write operations are more expensive in an asymmetric-ac ... Full text Cite

ApesNet: A pixel-wise efficient segmentation network

Conference Proceedings of the 14th ACM/IEEE Symposium on Embedded Systems for Real-Time Multimedia, ESTIMedia 2016 · October 1, 2016 Autonomous driving can effectively reduce traffic congestion and road accidents. Therefore, it is necessary to implement an efficient high-level, scene understanding model in an embedded device with limited power and sources. Toward this goal, we propose A ... Full text Cite

A design to reduce write amplification in object-based NAND flash devices

Conference Proceedings of the 11th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES 2016 · October 1, 2016 Write amplification is a major cause of performance and endurance degradations in NAND flash based storage systems. In an object-based NAND flash device, two causes of write amplification are onode partial update and cascading update. Updating one onode, a k ... Full text Cite

A novel color image encryption algorithm based on genetic recombination and the four-dimensional memristive hyperchaotic system

Journal Article Chinese Physics B · October 1, 2016 Recently, many image encryption algorithms based on chaos have been proposed. Most of the previous algorithms encrypt components R, G, and B of color images independently and neglect the high correlation between them. In the paper, a novel color image encr ... Full text Cite

Exploring the optimal learning technique for IBM TrueNorth platform to overcome quantization loss

Conference Proceedings of the 2016 IEEE/ACM International Symposium on Nanoscale Architectures, NANOARCH 2016 · September 14, 2016 As the first large-scale commercial spiking-based neuromorphic computing platform, IBM TrueNorth chip received tremendous attentions in society. However, one of the known issues in TrueNorth design is the limited precision of synaptic weights, each of whic ... Full text Cite

ObjNandSim: Object-based NAND flash device simulator

Conference 2016 5th Non-Volatile Memory Systems and Applications Symposium, NVMSA 2016 · August 17, 2016 An object-based NAND flash storage system (ONFS) is proposed to overcome the architectural limitation of the existing block-based storage system. The ONFS can improve system performance by removing redundant software layers and reducing garbage collection ... Full text Cite

Design and Implementation of a 4Kb STT-MRAM with Innovative 200nm Nano-ring Shaped MTJ

Conference Proceedings of the International Symposium on Low Power Electronics and Design · August 8, 2016 Programmability is a severe challenge in the development of spin-transfer torque magnetic random access memory (STT-MRAM). Theoretical analyses have indicated that nano-ring shaped magnetic tunneling junction (NR-MTJ) can achieve lower write current and hig ... Full text Cite

Dictionary learning for sparse representation and classification of neural spikes.

Conference Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference · August 2016 Spike sorting is the problem of identifying and clustering neurons' spiking activity from recorded extracellular electrophysiological data. This is important for experimental neuroscience. Existing approaches to solve this problem consist of three steps: s ... Full text Cite

A neuromorphic ASIC design using one-selector-one-memristor crossbar

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 29, 2016 The applications of memristors in neuromorphic computing have been extensively studied for its analogy to synapse. To overcome sneak path issue, nonlinear resistive selectors have been introduced to the design of memristor crossbar, enabling a high integra ... Full text Cite

Heterogeneous systems with reconfigurable neuromorphic computing accelerators

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 29, 2016 Developing heterogeneous system with hardware accelerator is a promising solution to implement high performance applications where explicitly programmed, rule-based algorithms are either infeasible or inefficient. However, mapping a neural network model to ... Full text Cite

Security of neuromorphic systems: Challenges and solutions

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 29, 2016 With the rapid growth of big-data applications, advanced data processing technologies, e.g., machine learning, are widely adopted in many industry fields. Although these technologies demonstrate powerful data analyzing and processing capability, there exis ... Full text Cite

Built-in selectors self-assembled into memristors

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 29, 2016 We demonstrate an approach to build a selector into ReRAM (memristors) using engineered materials. In this approach, a segment(s) of nonlinear material is self-assembled into the conduction channel (s) (filament) of a memristor. The nonlinear material exhi ... Full text Cite

Cyclical sensing integrate-and-fire circuit for memristor array based neuromorphic computing

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 29, 2016 The brain-inspired, spike-based neuromorphic system is highly anticipated in the artificial intelligence community due to its high computational efficiency. The recently developed memristor-crossbar-array technology, which is able to efficiently emulate th ... Full text Cite

Adaptive refreshing and read voltage control scheme for FeDRAM

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 29, 2016 Ferroelectric Field Effect Transistor (FeFET) is a promising nonvolatile device which provides high integration density, fast programming speed, and excellent CMOS compatibility. In general, the non-volatility of FeFET is impacted by its physical structure ... Full text Cite

Recent progresses of STT memory design and applications

Conference Proceedings - 2015 IEEE 11th International Conference on ASIC, ASICON 2015 · July 21, 2016 Spin-transfer torque (STT) memory is a promising new memory technology that aims to replace SRAM and DRAM in embedded and stand-alone applications. The data is normally stored in a magnetic device, e.g., magnetic tunnelling junction (MTJ), and represented b ... Full text Cite

Practical power consumption analysis with current smartphones

Conference International System on Chip Conference · July 2, 2016 In this paper, we analyzed the power consumption of all Samsung Galaxy smartphones to explore modern smartphones' power consumption characteristics. With dedicated measurement and analysis, we found that some previously emphasized power-hungry consumers, like ... Full text Cite

A new learning method for inference accuracy, core occupation, and performance co-optimization on TrueNorth chip

Conference Proceedings - Design Automation Conference · June 5, 2016 IBM TrueNorth chip uses digital spikes to perform neuromorphic computing and achieves ultrahigh execution parallelism and power efficiency. However, in TrueNorth chip, low quantization resolution of the synaptic weights and spikes significantly limits the ... Full text Cite

TEMP: Thread batch enabled memory partitioning for GPU

Conference Proceedings - Design Automation Conference · June 5, 2016 As massive multi-threading in GPU imposes tremendous pressure on memory subsystems, efficient bandwidth utilization becomes a key factor affecting the GPU throughput. In this work, we propose thread batch enabled memory partitioning (TEMP), to improve GPU ... Full text Cite

NVSim-VXs: An improved NVSim for variation aware STT-RAM simulation

Conference Proceedings - Design Automation Conference · June 5, 2016 Spin-transfer torque random access memory (STT-RAM) recently received significant attentions for its promising characteristics in cache and memory applications. As an early-stage modeling tool, NVSim has been widely adopted for simulations of emerging nonv ... Full text Cite

MORPh: Mobile OLED-friendly recording and playback system for low power video streaming

Conference Proceedings - Design Automation Conference · June 5, 2016 Even with the adoption of the latest OLED technology, the display panel remains one of the most power-hungry components in smartphones. Existing attempts for OLED power optimization have mainly focused on modifying the content that is shown on the display ... Full text Cite

AOS: Adaptive overwrite scheme for energy-efficient MLC STT-RAM cache

Conference Proceedings - Design Automation Conference · June 5, 2016 Spin-Transfer Torque Random Access Memory (STT-RAM) has been identified as an advantageous candidate for on-chip memory technology due to its high density and ultra low leakage power. Recent research progress in Magnetic Tunneling Junction (MTJ) devices ha ... Full text Cite

Spin-hall assisted STT-RAM design and discussion

Conference Proceedings of the 18th ACM/IEEE System Level Interconnect Prediction 2016 Workshop, SLIP 2016 · June 4, 2016 In recent years, Spin-Transfer Torque Random Access Memory (STT-RAM) has attracted significant attentions from both industry and academia due to its attractive attributes such as small cell area and non-volatility. However, long switching time and large pr ... Full text Cite

Spintronic Memristor as Interface between DNA and Solid State Devices

Journal Article IEEE Journal on Emerging and Selected Topics in Circuits and Systems · June 1, 2016 Recently biomolecular computing platforms have been widely investigated with great potentials in both biomedical research and practices, such as using molecular structures of DNA to present the data bits and to operate the logic. Emerging CMOS/molecular hy ... Full text Cite

The applications of NVM technology in hardware security

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · May 18, 2016 The emerging nonvolatile memory (NVM) technologies have demonstrated great potentials in revolutionizing modern memory hierarchy because of their many promising properties: nanosecond read/write time, small cell area, non-volatility, and easy CMOS integrat ... Full text Cite

Harmonica: A Framework of Heterogeneous Computing Systems with Memristor-Based Neuromorphic Computing Accelerators

Journal Article IEEE Transactions on Circuits and Systems I: Regular Papers · May 1, 2016 Following technology scaling, on-chip heterogeneous architecture emerges as a promising solution to combat the power wall of microprocessors. This work presents Harmonica - a framework of heterogeneous computing system enhanced by memristor-based neuromorph ... Full text Cite

A holistic tri-region MLC STT-RAM design with combined performance, energy, and reliability optimizations

Conference Proceedings of the 2016 Design, Automation and Test in Europe Conference and Exhibition, DATE 2016 · April 25, 2016 Multi-level cell spin-transfer torque random access memory (MLC STT-RAM) demonstrates great potentials in on-chip cache design for its high storage density and non-volatility but also suffers from the degraded access time, reliability and energy efficiency. ... Full text Cite

Sliding Basket: An adaptive ECC scheme for runtime write failure suppression of STT-RAM cache

Conference Proceedings of the 2016 Design, Automation and Test in Europe Conference and Exhibition, DATE 2016 · April 25, 2016 Write reliability is one of the major challenges in design of spin-transfer torque random access memory (STT-RAM) caches. To ensure design quality, error correction code (ECC) scheme is usually adopted in STT-RAM caches. However, it incurs significant hard ... Full text Cite

Hardware acceleration for neuromorphic computing: An evolving view

Conference 2015 15th Non-Volatile Memory Technology Symposium, NVMTS 2015 · April 20, 2016 The rapid growth of computing capacity of modern microprocessors enables the wide adoption of machine learning and neural network models. The ever-increasing demand for performance, combining with the concern on power budget, motivated the recent research ... Full text Cite

A novel PUF based on cell error rate distribution of STT-RAM

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 7, 2016 Physical Unclonable Functions (PUFs) have been widely proposed as security primitives to provide device identification and authentication. Recently, PUFs based on Non-volatile Memory (NVM) are widely proposed since the promise of NVMs' wide application. In ... Full text Cite

Footfall-GPS polling scheduler for power saving on wearable devices

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 7, 2016 Wrist-worn wearable fitness devices, such as FitBit and Apple Watch, have become popular in recent years. Runners can use the GPS embedded in these wearable devices to log the route taken during their exercise, providing vital feedback on pace and distance ... Full text Cite

Thermal optimization for memristor-based hybrid neuromorphic computing systems

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 7, 2016 Neuromorphic computing is used for accelerating the computation of neural network which can simulate the brain of animal and composed by neurons and synapses. However, the neuromorphic computing with the traditional computer architecture leads to serious v ... Full text Cite

SlowMo-enhancing mobile gesture-based authentication schemes via sampling rate optimization

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 7, 2016 In the era of network service, the user authentication becomes more indispensable but also vulnerable. Traditional user verification approaches such as PIN or pattern lock suffer from easy hacking and replica, motivating the research on many new approaches ... Full text Cite

Radiation-induced soft error analysis of STT-MRAM: A device to circuit approach

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · March 1, 2016 Spin-transfer torque magnetic random access memory (STT-MRAM) is a promising emerging memory technology due to its various advantageous features such as scalability, nonvolatility, density, endurance, and fast speed. However, the reliability of STT-MRAM is ... Full text Cite

Learning structured sparsity in deep neural networks

Conference Advances in Neural Information Processing Systems · January 1, 2016 High demand for computation resources severely hinders deployment of large-scale Deep Neural Networks (DNN) in resource constrained devices. In this work, we propose a Structured Sparsity Learning (SSL) method to regularize the structures (i.e., filters, c ... Cite

MORPh: Mobile OLED power friendly camera system

Conference Proceedings of the 2016 27th International Symposium on Rapid System Prototyping: Shortening the Path from Specification to Prototype, RSP 2016 · January 1, 2016 With superior advantages of better display quality and power efficiency, the latest OLED technology has achieved unprecedented popularity in the display screen market. However, the OLED remains one of the most power-hungry components in mobile devices. Var ... Full text Cite

The bipolar and unipolar reversible behavior on the forgetting memristor model

Journal Article Neurocomputing · January 1, 2016 In the further study of our previous forgetting model in Chen et al. (2013) [27], we found that the three dimensional (3D) model can be a more general mathematical model compared with the one dimensional (1D) model. It can describe not only the behaviors o ... Full text Cite

Fork path: Improving efficiency of ORAM by removing redundant memory accesses

Conference Proceedings of the Annual International Symposium on Microarchitecture, MICRO · December 5, 2015 Oblivious RAM (ORAM) is a cryptographic primitive that can prevent information leakage in the access trace to untrusted external memory. It has become an important component in modern secure processors. However, the major obstacle of adopting an ORAM desig ... Full text Cite

RRAM-Based Analog Approximate Computing

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · December 1, 2015 Approximate computing is a promising design paradigm for better performance and power efficiency. In this paper, we propose a power efficient framework for analog approximate computing with the emerging metal-oxide resistive switching random-Access memory ... Full text Cite

Spiking-based matrix computation by leveraging memristor crossbar array

Conference 2015 IEEE Symposium on Computational Intelligence for Security and Defense Applications, CISDA 2015 - Proceedings · August 17, 2015 As process technology continues scaling down, the memory barrier becomes more severe. Thus, spiking neuromorphic computing that can significantly enhance computing and communication efficiencies has been widely studied. Both conventional CMOS technology an ... Full text Cite

Compiler-assisted refresh minimization for volatile STT-RAM cache

Journal Article IEEE Transactions on Computers · August 1, 2015 Spin-transfer torque RAM (STT-RAM) has been proposed to build on-chip caches because of its attractive features such as high storage density and ultra low leakage power. However, long write latency and high write energy are the two challenges for STT-RAM. ... Full text Cite

The applications of memristor devices in next-generation cortical processor designs

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 27, 2015 Discovery of memristor opened a new era of the research on universal memory thanks to many attractive properties demonstrated by this emerging device. In this paper, we switch our research focus to neuromorphic computing, which, same as memory technology, ... Full text Cite

A new self-reference sensing scheme for TLC MRAM

Conference Proceedings - IEEE International Symposium on Circuits and Systems · July 27, 2015 Density is one of the major design factors of magnetic random access memory (MRAM). Very recently, a tri-level cell (TLC) structure was proposed to enhance the storage density of MRAM. In this work, we propose a new self-reference sensing scheme for the TL ... Full text Cite

Cloning your mind: Security challenges in cognitive system designs and their solutions

Conference Proceedings - Design Automation Conference · July 24, 2015 With the booming of big-data applications, cognitive information processing systems that leverage advanced data processing technologies, e.g., machine learning and data mining, are widely used in many industry fields. Although these technologies demonstrat ... Full text Cite

VWS: A versatile warp scheduler for exploring diverse cache localities of GPGPU applications

Conference Proceedings - Design Automation Conference · July 24, 2015 Massive multi-threading of GPGPU demands for efficient usage of caches with limited capacity. In this work, we propose a versatile warp scheduler (VWS) to reduce the cache miss rate in GPGPU. VWS retains the intra-warp cache locality using an efficient per ... Full text Cite

RENO: A high-efficient reconfigurable neuromorphic computing accelerator design

Conference Proceedings - Design Automation Conference · July 24, 2015 Neuromorphic computing is recently gaining significant attention as a promising candidate to conquer the well-known von Neumann bottleneck. In this work, we propose RENO - an efficient reconfigurable neuromorphic computing accelerator. RENO leverages the ex ... Full text Cite

FlexLevel: A novel NAND flash storage system design for LDPC latency reduction

Conference Proceedings - Design Automation Conference · July 24, 2015 LDPC code is introduced in NAND flash memory to handle high BER (bit error rate) incurred by technology scaling. Despite strong error correction capability, LDPC decoding induces long NAND flash read latency. In this work, we propose FlexLevel - a robust N ... Full text Cite

Vortex: Variation-aware training for memristor X-bar

Conference Proceedings - Design Automation Conference · July 24, 2015 Recent advances in development of memristor devices and cross-bar integration allow us to implement a low-power on-chip neuromorphic computing system (NCS) with small footprint. Training methods have been proposed to program the memristors in a crossbar by ... Full text Cite

A spiking neuromorphic design with resistive crossbar

Conference Proceedings - Design Automation Conference · July 24, 2015 Neuromorphic systems recently gained increasing attention for their high computation efficiency. Many designs have been proposed and realized with traditional CMOS technology or emerging devices. In this work, we proposed a spiking neuromorphic design buil ... Full text Cite

An EDA framework for large scale hybrid neuromorphic computing systems

Conference Proceedings - Design Automation Conference · July 24, 2015 In implementations of neuromorphic computing systems (NCS), memristor and its crossbar topology have been widely used to realize fully connected neural networks. However, many neural networks utilized in real applications often have a sparse connectivity, ... Full text Cite

DaTuM: Dynamic tone mapping technique for OLED display power saving based on video classification

Conference Proceedings - Design Automation Conference · July 24, 2015 The adoption of the latest OLED (organic light emitting diode) technology does not change the fact that screen is still one of the most energy-consuming modules in modern smartphones. In this work, we found that video streams from the same video category s ... Full text Cite

Spin-hall assisted STT-RAM design and discussion

Conference 2015 IEEE International Magnetics Conference, INTERMAG 2015 · July 14, 2015 Conventional spin-transfer torque random access memory (STT-RAM) is a promising technology due to its non-volatility and dense cell structure. However, the long switching time of magnetic tunneling junction (MTJ) limits the write speed of the STT-RAM. In o ... Full text Cite

Area and performance co-optimization for domain wall memory in application-specific embedded systems

Conference Proceedings - Design Automation Conference · June 7, 2015 Domain Wall Memory (DWM), a recently developed spin-based non-volatile memory technology, inherently offers unprecedented benefits in density by storing multiple bits in the domains of a ferromagnetic nanowire, which logically resembles a bit-serial tape. ... Full text Cite

Read performance: The newest barrier in scaled STT-RAM

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · June 1, 2015 Spin-torque transfer RAM (STT-RAM), a promising alternative to static RAM (SRAM) for reducing leakage power consumption, has been widely studied to mitigate the impact of its asymmetrically long write latency. However, physical effects of technology scalin ... Full text Cite

Compact model of subvolume MTJ and its design application at nanoscale technology nodes

Journal Article IEEE Transactions on Electron Devices · June 1, 2015 The current-induced perpendicular magnetic anisotropy magnetic tunnel junctions (p-MTJs) offer a number of advantages, such as high density and high speed. As p-MTJs downscale to ∼ 40 nm, further performance enhancements can be realized thanks to high spin ... Full text Cite

Multi-bit soft error tolerable L1 data cache based on characteristic of data value

Journal Article Journal of Central South University · May 26, 2015 Due to continuous decreasing feature size and increasing device density, on-chip caches have been becoming susceptible to single event upsets, which will result in multi-bit soft errors. The increasing rate of multi-bit errors could result in high risk of ... Full text Cite

EDA challenges for memristor-crossbar based neuromorphic computing

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · May 20, 2015 The increasing gap between the high data processing capability of modern computing systems and the limited memory bandwidth motivated the recent significant research on neuromorphic computing systems (NCS), which are inspired from the working mechanism of ... Full text Cite

Giant spin hall effect (GSHE) logic design for low power application

Conference Proceedings - Design, Automation and Test in Europe, DATE · April 22, 2015 Conventional CMOS transistors will reach its power wall, a huge leakage power consumption limits the performance growth when technology scales down, especially beyond 45nm technology nodes. Spin based devices are one of the alternative computing technologi ... Full text Cite

Spiking neural network with RRAM: Can we use it for real-world application?

Conference Proceedings - Design, Automation and Test in Europe, DATE · April 22, 2015 The spiking neural network (SNN) provides a promising solution to drastically promote the performance and efficiency of computing systems. Previous work of SNN mainly focus on increasing the scalability and level of realism in a neural simulation, while fe ... Full text Cite

Low voltage two-state-variable memristor model of vacancy-drift resistive switches

Journal Article Applied Physics A: Materials Science and Processing · April 1, 2015 We illustrate a heuristic two-state-variable memristor model of charged O vacancy-drift resistive switches that include the effects of internal Joule heating on both the electronic transport and the drift velocity (i.e., switching speed) of vacancies in th ... Full text Cite

Reconfigurable Neuromorphic Computing System with Memristor-Based Synapse Design

Journal Article Neural Processing Letters · April 1, 2015 Conventional CMOS technology is slowly approaching its physical limitations and researchers are increasingly utilizing nanotechnology to both extend CMOS capabilities and to explore potential replacements. Novel memristive systems continue to attract growi ... Full text Cite

An efficient STT-RAM-based register file in GPU architectures

Conference 20th Asia and South Pacific Design Automation Conference, ASP-DAC 2015 · March 11, 2015 Modern GPGPUs employ a large register file (RF) to efficiently process heavily parallel threads in single instruction multiple thread (SIMT) fashion. The up-scaling of RF capacity, however, is greatly constrained by large cell area and high leakage power c ... Full text Cite

Checkpoint-aware instruction scheduling for nonvolatile processor with multiple functional units

Conference 20th Asia and South Pacific Design Automation Conference, ASP-DAC 2015 · March 11, 2015 Embedded systems powered with harvested energy experience frequent execution interruption due to unstable energy source. Nonvolatile (NV) register based processor is proposed to realize fast resume after power failure. The states in the volatile registers ... Full text Cite

Circuit design and exponential stabilization of memristive neural networks.

Journal Article Neural networks : the official journal of the International Neural Network Society · March 2015 This paper addresses the problem of circuit design and global exponential stabilization of memristive neural networks with time-varying delays and general activation functions. Based on the Lyapunov-Krasovskii functional method and free weighting matrix te ... Full text Cite

Neuromorphic hardware acceleration enabled by emerging technologies (Invited paper)

Conference Proceedings of the 14th International Symposium on Integrated Circuits, ISIC 2014 · February 2, 2015 The explosion of big data applications imposes severe challenges of data processing speed and scalability on traditional computer systems. However, the performance of the von Neumann machine is greatly hindered by the increasing performance gap between CPU ... Full text Cite

Reduction and IR-drop compensations techniques for reliable neuromorphic computing systems

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 5, 2015 Neuromorphic computing system (NCS) is a promising architecture to combat the well-known memory bottleneck in Von Neumann architecture. The recent breakthrough on memristor devices made an important step toward realizing a low-power, small-footprint NCS on ... Full text Cite

Memristor crossbar array for image storing

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2015 This letter uses image overlay technique on memristor crossbar array (MCA) structure for image storing. Different programming circuits with time slot techniques are designed for the MCA consisting of the nonlinear HP memristor (HPMCA) and the MCA composed ... Full text Cite

A thermal and process variation aware MTJ switching model and its applications in soft error analysis

Chapter · January 1, 2015 Spin-transfer torque random access memory (STT-RAM) has recently gained increased attention from circuit design and architecture societies. Although STT-RAM offers a good combination of small cell size, nanosecond access time and non-volatility for embedde ... Full text Cite

Statistical reliability/energy characterization in STT-RAM cell designs

Chapter · January 1, 2015 Spin-transfer torque random access memory (STT-RAM) is a very promising candidate to replace the SRAM and DRAM based traditional memory systems. However, the development of STT-RAM is facing two major technical challenges—poor write reliability and high wr ... Full text Cite

A novel self-reference technique for STT-RAM read and write reliability enhancement

Journal Article IEEE Transactions on Magnetics · November 1, 2014 Spin-transfer torque random access memory (STT-RAM) has demonstrated great potential in embedded and stand-alone applications. However, process variations and thermal fluctuations greatly influence the operation reliability of STT-RAM and limit its scalabi ... Full text Cite

CPU-GPU system designs for high performance cloud computing

Chapter · November 1, 2014 Improvement of parallel computing capability will greatly increase the efficiency of high performance cloud computing. By combining the powerful scalar processing on CPU with the efficient parallel processing on GPU, CPU-GPU systems provide a hybrid comput ... Full text Cite

PS3-RAM: A fast portable and scalable statistical STT-RAM reliability/energy analysis method

Journal Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · November 1, 2014 The development of emerging spin-transfer torque random access memory (STT-RAM) is facing two major technical challenges-poor write reliability and high write energy, both of which are severely impacted by process variations and thermal fluctuations. The e ... Full text Cite

NV-TCAM: Alternative interests and practices in NVM designs

Conference 2014 IEEE Non-Volatile Memory Systems and Applications Symposium, NVMSA 2014 · October 16, 2014 TCAM (ternary content addressable memory) is a special memory type that can compare input search data with stored data, and return location (sometime, the associated content) of matched data. TCAM is widely used in microprocessor designs as well as communi ... Full text Cite

3M-PCM: Exploiting multiple write modes MLC phase change main memory in embedded systems

Conference 2014 International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2014 · October 12, 2014 Multi-level Cell (MLC) Phase Change Memory (PCM) has many attractive features to be used as main memory for embedded systems. These features include low power, high density, and better scalability. However, there are also two drawbacks in MLC PCM, namely, ... Full text Cite

Memristor crossbar-based neuromorphic computing system: a case study.

Journal Article IEEE transactions on neural networks and learning systems · October 2014 By mimicking the highly parallel biological systems, neuromorphic hardware provides the capability of information processing within a compact and energy-efficient platform. However, traditional Von Neumann architecture and the limited signal connections ha ... Full text Cite

An adjustable memristor model and its application in small-world neural networks

Conference Proceedings of the International Joint Conference on Neural Networks · September 3, 2014 This paper presents a novel mathematical model for the TiO2 thin-film memristor device discovered by Hewlett-Packard (HP) labs. Our proposed model considers the boundary conditions and the nonlinear ionic drift effects by using a piecewise linear window fu ... Full text Cite

STDP learning rule based on memristor with STDP property

Conference Proceedings of the International Joint Conference on Neural Networks · September 3, 2014 Spike-timing-dependent plasticity (STDP) learning ability has been observed in physical memristors, but whether the STDP is caused by the neuron or the memristor is unclear. In this paper, we proved the STDP property in the model for both symmetric and asy ... Full text Cite

The Prospect of STT-RAM Scaling

Chapter · August 4, 2014 Featuring contributions from well-known and respected industrial and academic experts, this cutting-edge work not only presents the latest research and developments but also: Describes spintronic applications in current and future magnetic ... ... Cite

Spintronic memristor as interface between DNA and solid state devices

Chapter · August 1, 2014 Magnetic sensing is widely used in various modern bio-medical devices since many physiological functions (e.g., nerve impulses) generate electrical currents that create magnetic field [24]. Monitoring such signals by detecting magnetic field is less invasi ... Full text Cite

Memristor crossbar-based unsupervised image learning

Journal Article Neural Computing and Applications · August 1, 2014 This letter presents a new memristor crossbar array system and demonstrates its applications in image learning. The controlled pulse and image overlay technique are introduced for the programming of memristor crossbars and promising a better performance fo ... Full text Cite

The stochastic modeling of TiO2 memristor and its usage in neuromorphic system design

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 27, 2014 Memristor, the fourth basic circuit element, has shown great potential in neuromorphic circuit design for its unique synapse-like feature. However, though the continuous resistance state of memristor has been expected, obtaining and maintaining an arbitrar ... Full text Cite

Prefetching techniques for STT-RAM based last-level cache in CMP systems

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 27, 2014 Prefetching is widely used in modern computer systems to mitigate the impact of long memory access latency by paying extra cost in memory and cache accesses. However, the efficacy of prefetching significantly degrades in the memory hierarchy using the emer ... Full text Cite

DPA: A data pattern aware error prevention technique for NAND flash lifetime extension

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 27, 2014 The recent research reveals that the bit error rate of a NAND flash cell is highly dependent on the stored data patterns. In this work, we propose Data Pattern Aware (DPA) error protection technique to extend the lifespan of NAND flash based storage system ... Full text Cite

STD-TLB: A STT-RAM-based dynamically-configurable translation lookaside buffer for GPU architectures

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 27, 2014 Translation lookaside buffer (TLB) was recently introduced into modern graphics processing unit (GPU) architectures to support virtual memory addressing. Compared to CPUs, the performance of GPUs is more sensitive to the capacity of TLBs because of heavier ... Full text Cite

Training itself: Mixed-signal training acceleration for memristor-based neural network

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 27, 2014 The artificial neural network (ANN) is among the most widely used methods in data processing applications. The memristor-based neural network further demonstrates a power efficient hardware realization of ANN. Training phase is the critical operation of me ... Full text Cite

A heterogeneous computing system with memristor-based neuromorphic accelerators

Conference 2014 IEEE High Performance Extreme Computing Conference, HPEC 2014 · February 11, 2014 As technology scales, on-chip heterogeneous architecture emerges as a promising solution to combat the power wall of microprocessors. In this work, we propose a heterogeneous computing system with memristor-based neuromorphic computing accelerators (NCAs). ... Full text Cite

Bio-inspired computing with resistive memories - Models, architectures and applications

Conference Proceedings - IEEE International Symposium on Circuits and Systems · January 1, 2014 The traditional Von Neumann architecture has constrained the potential for applying massively parallel architecture to embedded high performance computing where we must optimize the size, weight and power of the system. Inspired by highly parallel biologic ... Full text Cite

ICE: Inline calibration for memristor crossbar-based computing engine

Conference Proceedings - Design, Automation and Test in Europe, DATE · January 1, 2014 The emerging neuromorphic computation provides a revolutionary solution to the alternative computing architecture and effectively extends Moore's Law. The discovery of the memristor presents a promising hardware realization of neuromorphic systems with inc ... Full text Cite

Exploration of GPGPU register file architecture using domain-wall-shift-write based racetrack memory

Conference Proceedings - Design Automation Conference · January 1, 2014 SRAM based register file (RF) is one of the major factors limiting the scaling of GPGPU. In this work, we propose to use the emerging nonvolatile domain-wall-shift-write based racetrack memory (DWSW-RM) to implement a power-efficient GPGPU RF, of which the ... Full text Cite

A new field-assisted access scheme of STT-RAM with self-reference capability

Conference Proceedings - Design Automation Conference · January 1, 2014 Spin-transfer torque random access memory (STT-RAM) has demonstrated great potentials in embedded and stand-alone applications. However, process variations and thermal fluctuations greatly influence the operation reliability of STT-RAM and limit its scalab ... Full text Cite

A phenomenological memristor model for short-term/long-term memory

Journal Article Physics Letters, Section A: General, Atomic and Solid State Physics · January 1, 2014 Memristor is considered to be a natural electrical synapse because of its distinct memory property and nanoscale. In recent years, more and more similar behaviors are observed between memristors and biological synapse, e.g., short-term memory (STM) and lon ... Full text Cite

SBAC: A statistics based cache bypassing method for asymmetric-access caches

Conference Proceedings of the International Symposium on Low Power Electronics and Design · January 1, 2014 Asymmetric-access caches with emerging technologies, such as STT-RAM and RRAM, have become very competitive designs recently. Since the write operations consume more time and energy than read ones, data should bypass an asymmetric-access cache unless the l ... Full text Cite

A hybrid solid-state storage architecture for the performance, energy consumption, and lifetime improvement

Chapter · January 1, 2014 In recent years, many systems have employed NAND flash memory as storage devices because of its advantages of high I/O performance, increasing capacity, and falling cost. On the other hand, the performance of NAND flash memory is limited by its erase-befor ... Full text Cite

An energy-efficient 3D stacked STT-RAM cache architecture for CMPs

Chapter · January 1, 2014 In this chapter, we introduce how to adopt spin-transfer torque random access memory (STT-RAM) as on-chip L2 caches to achieve better performance and lower energy consumption, compared to traditional L2 cache designs. STT-RAM is a promising memory technolo ... Full text Cite

User classification and authentication for mobile device based on gesture recognition

Journal Article Advances in Information Security · January 1, 2014 Full text Cite

Memristive radial basis function neural network for parameters adjustment of PID controller

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2014 Radial basis function (RBF) based-identification proportional–integral–derivative (PID) can automatically adjust the parameters of PID controller with strong self-organization, self-learning and self-adaptive ability. However, the compound controller has ... Full text Cite

STT-RAM reliability enhancement through ECC and access scheme optimization

Conference Proceedings of the International Symposium on Consumer Electronics, ISCE · January 1, 2014 Multi-level cell Spin-Transfer Torque RAM (MLC STT-RAM) greatly suffers from the significantly degraded operation reliability and high programming cost. In this paper, a novel MLC design, namely ternary-state MLC (TS-MLC STT-RAM), is proposed for high-reli ... Full text Cite

NAND flash service lifetime estimate with recovery effect and retention time relaxation

Journal Article Journal of Central South University · January 1, 2014 A service life model of NAND flash and threshold voltage shift process is proposed to calculate the service life and endurance. The relationships among achievable program/erase (P/E) cycles, recovery time, bad block rate and storage time are analyzed. The ... Full text Cite

Energy efficient neural networks for big data analytics

Conference Proceedings - Design, Automation and Test in Europe, DATE · January 1, 2014 The world is experiencing a data revolution to discover knowledge in big data. Large scale neural networks are one of the mainstream tools of big data analytics. Processing big data with large scale neural networks includes two phases: the training phase a ... Full text Cite

State-restrict MLC STT-RAM designs for high-reliable high-performance memory system

Conference Proceedings - Design Automation Conference · January 1, 2014 Multi-level Cell Spin-Transfer Torque Random Access Memory (MLC STT-RAM) is a promising nonvolatile memory technology for high-capacity and high-performance applications. However, the reliability concerns and the complicated access mechanism greatly hinder t ... Full text Cite

Demystifying energy usage in smartphones

Conference Proceedings - Design Automation Conference · January 1, 2014 In this paper, we presented our recent characterization and analysis on the power consumption of smartphone radio components, including Wi-Fi, GPS and cellular (3G/4G) modules. Different from previous research that focused on the properties of single modul ... Full text Cite

Ebutton: A wearable computer for health monitoring and personal assistance

Conference Proceedings - Design Automation Conference · January 1, 2014 Recent advances in mobile devices have made profound changes in people's daily lives. In particular, the impact of easy access of information by the smartphone has been tremendous. However, the impact of mobile devices on healthcare has been limited. Diagn ... Full text Cite

Reduction of data prevention cost and improvement of reliability in MLC NAND flash storage system

Conference 2014 International Conference on Computing, Networking and Communications, ICNC 2014 · January 1, 2014 In recent years, multi-level-cell (MLC) NAND flash technologies are prevailingly employed in both enterprise and consumer storage systems due to the advantages on power consumption and fabrication cost. However, short endurance and long write access time o ... Full text Cite

Mobile GPU power consumption reduction via dynamic resolution and frame rate scaling

Conference 6th Workshop on Power-Aware Computing and Systems, HotPower 2014 · January 1, 2014 The emerging industry trend of ever-increasing display density on mobile devices has dramatically increased workload placed on a mobile GPU's. Because mobile GPU power consumption increases almost linearly with workload, increasing the display density dire ... Cite

FingerShadow: An OLED power optimization based on smartphone touch interactions

Conference 6th Workshop on Power-Aware Computing and Systems, HotPower 2014 · January 1, 2014 Despite that OLED screen has been increasingly adopted in smartphones to save power; screen is still one of the most energy-consuming modules in smartphones. Techniques such as local dimming are proposed to further reduce the power consumption of OLED scre ... Cite

A synapse memristor model with forgetting effect

Journal Article Physics Letters, Section A: General, Atomic and Solid State Physics · December 17, 2013 In this Letter we improved the ion diffusion term proposed in literature [13] and redesigned the previous model as a dynamical model with two more internal state variables 'forgetting rate' and 'retention' besides the original variable 'conductance'. The n ... Full text Cite

Memristor-based approximated computation

Conference Proceedings of the International Symposium on Low Power Electronics and Design · December 11, 2013 The cessation of Moore's Law has limited further improvements in power efficiency. In recent years, the physical realization of the memristor has demonstrated a promising solution to ultra-integrated hardware realization of neural networks, which can be le ... Full text Cite

ADAMS: Asymmetric differential STT-RAM cell structure for reliable and high-performance applications

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2013 Spin-transfer torque random access memory (STT-RAM) is an emerging non-volatile memory technology offering many attractive characteristics like high integration density, nanosecond access time, and good CMOS compatibility. However, the performance and reli ... Full text Cite

C1C: A configurable, compiler-guided STT-RAM L1 cache

Journal Article Transactions on Architecture and Code Optimization · December 1, 2013 Spin-Transfer Torque RAM (STT-RAM), a promising alternative to SRAM for reducing leakage power consumption, has been widely studied to mitigate the impact of its asymmetrically long write latency. Recently, STT-RAM has been proposed for L1 caches by relaxi ... Full text Cite

Considering fabrication in sustainable computing

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2013 The term green computing has become effectively synonymous with low-power/energy computing. However, for computing to be truly sustainable, all phases of the system life-cycle must be considered. In contrast to the considerable effort that has been applied ... Full text Cite

CD-ECC: Content-dependent error correction codes for combating asymmetric nonvolatile memory operation errors

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2013 The write operation asymmetry of many memory technologies causes different write failure rates at 0 → 1 and 1 → 0 bit-flippings. Conventional error correction codes (ECCs) spend the same efforts on both bit-flipping directions, leading to very unbalanced w ... Full text Cite

Low-energy volatile STT-RAM cache design using cache-coherence-enabled adaptive refresh

Journal Article ACM Transactions on Design Automation of Electronic Systems · December 1, 2013 Spin-Torque Transfer RAM (STT-RAM) is a promising candidate for SRAM replacement because of its excellent features, such as fast read access, high density, low leakage power, and CMOS technology compatibility. However, wide adoption of STT-RAM as cache mem ... Full text Cite

Global exponential synchronization of memristor-based recurrent neural networks with time-varying delays.

Journal Article Neural networks : the official journal of the International Neural Network Society · December 2013 This paper deals with the problem of global exponential synchronization of a class of memristor-based recurrent neural networks with time-varying delays based on the fuzzy theory and Lyapunov method. First, a memristor-based recurrent neural network is des ... Full text Cite

Fuzzy modeling and synchronization of different memristor-based chaotic circuits

Journal Article Physics Letters, Section A: General, Atomic and Solid State Physics · November 1, 2013 This Letter is concerned with the problem of fuzzy modeling and synchronization of memristor-based Lorenz circuits with memristor-based Chua's circuits. In this Letter, a memristor-based Lorenz circuit is set up, and illustrated by phase portraits and Lyap ... Full text Cite

On-chip caches built on multilevel spin-transfer torque RAM cells and its optimizations

Journal Article ACM Journal on Emerging Technologies in Computing Systems · October 21, 2013 It has been predicted that a processor's caches could occupy as much as 90% of chip area a few technology nodes from the current ones. In this article, we investigate the use of multilevel spin-transfer torque RAM (STT-RAM) cells in the design of processor ... Full text Cite

BSB training scheme implementation on memristor-based circuit

Conference Proceedings of the 2013 IEEE Symposium on Computational Intelligence for Security and Defense Applications, CISDA 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013 · October 9, 2013 In this work, we propose a hardware realization of the Brain-State-in-a-Box (BSB) neural network model training algorithm. This method can be implemented as an analog/digital mixed-signal circuit to train memristor crossbar arrays within BSB circuits. The ... Full text Cite

Common-source-line array: An area efficient memory architecture for bipolar nonvolatile devices

Journal Article ACM Transactions on Design Automation of Electronic Systems · October 1, 2013 Traditional array organization of bipolar nonvolatile memories such as STT-MRAM and memristor utilizes two bitlines for cell manipulations. With technology scaling, such bitline pair will soon become the bottleneck for further density improvement. In this a ... Full text Cite

Passivity analysis of memristor-based recurrent neural networks with time-varying delays

Journal Article Journal of the Franklin Institute · October 1, 2013 This paper investigates the delay-dependent exponential passivity problem of the memristor-based recurrent neural networks (RNNs). Based on the knowledge of memristor and recurrent neural network, the model of the memristor-based RNNs is established. Takin ... Full text Cite

MLC STT-RAM design considering probabilistic and asymmetric MTJ switching

Conference Proceedings - IEEE International Symposium on Circuits and Systems · September 9, 2013 Spin-transfer torque random access memory (STT-RAM) is widely believed to be a promising candidate for the post-silicon nonvolatile memory technology. In many recent researches, STT-RAM has demonstrated many attractive characteristics, such as nanosecond acc ... Full text Cite

Coordinating prefetching and STT-RAM based last-level cache management for multicore systems

Conference Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI · May 30, 2013 Data prefetching is a common mechanism to mitigate the bottleneck of off-chip memory bandwidth in modern computing systems. Unfortunately, the side effects of prefetching are an additional burden on off-chip communication and increased cache write operatio ... Full text Cite

Mobile user classification and authorization based on gesture usage recognition

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · May 20, 2013 Intelligent mobile devices have been widely serving in almost all aspects of everyday life, spanning from communication, web surfing, entertainment, to daily organizer. A large amount of sensitive and private information is stored on the mobile device, lea ... Full text Cite

Loadsa: A yield-driven top-down design method for STT-RAM array

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · May 20, 2013 As an emerging nonvolatile memory technology, spin-transfer torque random access memory (STT-RAM) faces great design challenges. The large device variations and the thermal-induced switching randomness of the magnetic tunneling junction (MTJ) introduce the ... Full text Cite

Compiler-assisted refresh minimization for volatile STT-RAM cache

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · May 20, 2013 Spin-Transfer Torque RAM (STT-RAM) has been proposed to build on-chip caches because of its attractive features: high storage density and negligible leakage power. Recently, researchers propose to improve the write performance of STT-RAM by relaxing its no ... Full text Cite

A compact modeling of TiO2-TiO2-x memristor

Journal Article Applied Physics Letters · April 15, 2013 We developed a SPICE-compatible compact model of TiO2-TiO2-x memristors based on classic ion transportation theory. Our model is shown to simulate important dynamic memristive properties like real-time memristance switching, which are critical in memristo ... Full text Cite

How is energy consumed in smartphone display applications?

Conference ACM HotMobile 2013: The 14th Workshop on Mobile Computing Systems and Applications · April 3, 2013 Smartphones have emerged as a popular and frequently used platform for the consumption of multimedia. New display technologies, such as AMOLED, have been recently introduced to smartphones to fulfill the requirements of these multimedia applications. Howev ... Full text Cite

DA-RAID-5: A disturb aware data protection technique for NAND flash storage systems

Conference Proceedings - Design, Automation and Test in Europe, DATE · January 1, 2013 Program disturb, read disturb and retention time limit are three major reasons accounting for the bit errors in NAND flash memory. The adoption of multi-level cell (MLC) technology and technology scaling further aggravates this reliability issue by narrowi ... Full text Cite

Digital-assisted noise-eliminating training for memristor crossbar-based analog neuromorphic computing engine

Conference Proceedings - Design Automation Conference · January 1, 2013 The invention of neuromorphic computing architecture is inspired by the working mechanism of human-brain. Memristor technology revitalized neuromorphic computing system design by efficiently executing the analog Matrix-Vector multiplication on the memristo ... Full text Cite

C1C: A Configurable, Compiler-Guided STT-RAM L1 Cache

Journal Article ACM Transactions on Architecture and Code Optimization · January 1, 2013 Spin-Transfer Torque RAM (STT-RAM), a promising alternative to SRAM for reducing leakage power consumption, has been widely studied to mitigate the impact of its asymmetrically long write latency. Recently, STT-RAM has been proposed for L1 caches by relaxi ... Full text Cite

Online OLED dynamic voltage scaling for video streaming applications on mobile devices

Conference 2013 International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2013 · January 1, 2013 While OLED is replacing LCD and becoming the display of choice for mobile devices, display still consumes a large portion of total mobile device's power. Reducing OLED display power is of paramount importance for battery-powered mobile devices. With the ex ... Full text Cite

Cache coherence enabled adaptive refresh for volatile STT-RAM

Conference Proceedings - Design, Automation and Test in Europe, DATE · January 1, 2013 Spin-Transfer Torque RAM (STT-RAM) has been extensively studied in recent years. Recent work proposed to improve the write performance of STT-RAM through relaxing the retention time of STT-RAM cell, magnetic tunnel junction (MTJ). Unfortunately, frequent refresh ... Full text Cite

Low cost power failure protection for MLC NAND flash storage systems with PRAM/DRAM hybrid buffer

Conference Proceedings - Design, Automation and Test in Europe, DATE · January 1, 2013 In the latest PRAM/DRAM hybrid MLC NAND flash storage systems (NFSS), DRAM is used to temporarily store file system data for system response time reduction. To ensure data integrity, super-capacitors are deployed to supply the backup power for moving the d ... Full text Cite

Multi-level cell STT-RAM: Is it realistic or just a dream?

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2012 Spin-transfer torque random access memory (STT-RAM) is a promising nonvolatile memory technology aiming on-chip or embedded applications. In recent years, many researches have been conducted to improve the storage density and enhance the scalability of STT ... Cite

Mobile devices user-The subscriber and also the publisher of real-time OLED display power management plan

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2012 OLED (Organic Light Emitting Diode) technology has already been adopted in many modern smart mobile devices, including cellphones, tablets, laptop etc. However, the power dissipation of displays in some applications like real-time video streaming, signific ... Cite

Combating write penalties using software dispatch for on-chip MRAM integration

Journal Article IEEE Embedded Systems Letters · December 1, 2012 Recent advances in the emerging memory technology magnetic RAM (MRAM) enrich the opportunities to build high density and low power embedded systems. One common way of utilizing MRAM is integrating it with conventional memories and distributing data to the ... Full text Cite

The circuit realization of a neuromorphic computing system with memristor-based synapse design

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · November 19, 2012 Conventional CMOS technology is slowly approaching its physical limitations and researchers are increasingly utilizing nanotechnology to both extend CMOS capabilities and to explore potential replacements. Novel memristive systems continue to attract growi ... Full text Cite

The prospect of STT-RAM scaling from readability perspective

Journal Article IEEE Transactions on Magnetics · October 29, 2012 Due to its fast access time, high integration density, nonvolatility and good CMOS compatibility, Spin-transfer torque random access memory (STT-RAM) becomes one promising technology for the memory hierarchy of the next-generation computing systems. In rec ... Full text Cite

Utilizing PCM for energy optimization in embedded systems

Conference Proceedings - 2012 IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2012 · October 29, 2012 Due to its high density, bit alterability, and low standby power, phase change memory (PCM) is considered as a promising DRAM alternative. In embedded systems, especially battery-driven mobile devices, energy is one of the most important performance metric ... Full text Cite

Improving energy efficiency of write-asymmetric memories by log style write

Conference Proceedings of the International Symposium on Low Power Electronics and Design · September 4, 2012 The significant scaling challenges of conventional memories, i.e., SRAM and DRAM, motivated the research on emerging memory technologies. Many promising memory technology candidates, however, suffer from a common issue in their write operations: the switch ... Full text Cite

A software approach for combating asymmetries of non-volatile memories

Conference Proceedings of the International Symposium on Low Power Electronics and Design · September 4, 2012 The recent advances in non-volatile memory technologies promise the delivery of future high performance and low power computing systems. While these technologies provide attractive features, they exhibit different degrees of asymmetric read/write behavior, ... Full text Cite

STT-RAM cell design considering MTJ asymmetric switching

Journal Article SPIN · September 1, 2012 As one promising candidate for next-generation nonvolatile memory technologies, spin-transfer torque random access memory (STT-RAM) has demonstrated many attractive features, such as nanosecond access time, high integration density, nonvolatility, and good ... Full text Cite

Memristor crossbar based hardware realization of BSB recall function

Conference Proceedings of the International Joint Conference on Neural Networks · August 22, 2012 The Brain-State-in-a-Box (BSB) model is an auto-associative neural network that has been widely used in optical character recognition and image processing. Traditionally, the BSB model was realized at software level and carried out on high-performance comp ... Full text Cite

Statistical memristor modeling and case study in neuromorphic computing

Conference Proceedings - Design Automation Conference · July 11, 2012 Memristor, the fourth passive circuit element, has attracted increased attention since it was rediscovered by HP Lab in 2008. Its distinctive characteristic to record the historic profile of the voltage/current creates a great potential for future neuromor ... Full text Cite

PS3-RAM: A fast portable and scalable statistical STT-RAM reliability analysis method

Conference Proceedings - Design Automation Conference · July 11, 2012 Process variations and thermal fluctuations significantly affect the write reliability of spin-transfer torque random access memory (STT-RAM). Traditionally, modeling the impacts of these variations on STT-RAM designs requires expensive Monte-Carlo runs wi ... Full text Cite

Quality-retaining OLED dynamic voltage scaling for video streaming applications on mobile devices

Conference Proceedings - Design Automation Conference · July 11, 2012 This paper developed a dynamic voltage scaling (DVS) technique for the power management of the OLED display on mobile devices in video streaming applications. An optimal voltage control scheme is proposed under input constraints. Fine-grained DVS technique ... Full text Cite

Nonvolatile memories as the data storage system for implantable ECG recorder

Journal Article ACM Journal on Emerging Technologies in Computing Systems · June 1, 2012 In this article, we propose a data storage system with the emerging nonvolatile memory technologies used for the implantable electrocardiography (ECG) recorder. The proposed storage system can record the digitalized real-time ECG waveforms continuously insid ... Full text Cite

Spintronic memristor based temperature sensor design with CMOS current reference

Conference Proceedings - Design, Automation and Test in Europe, DATE · May 24, 2012 As the technology scales down, the increased power density brings in significant system reliability issues. Therefore, the temperature monitoring and the induced power management become more and more critical. The thermal fluctuation effects of the recentl ... Cite

Architecting a common-source-line array for bipolar non-volatile memory devices

Conference Proceedings - Design, Automation and Test in Europe, DATE · May 24, 2012 Traditional array organization of bipolar non-volatile memories such as STT-MRAM and memristor utilizes two bitlines for cell manipulations. With technology scaling, such bitline pair will soon become the bottleneck of density improvement. In this paper we ... Cite

Asymmetry of MTJ switching and its implication to STT-RAM designs

Conference Proceedings - Design, Automation and Test in Europe, DATE · May 24, 2012 As one promising candidate for next-generation nonvolatile memory technologies, spin-transfer torque random access memory (STT-RAM) has demonstrated many attractive features, such as nanosecond access time, high integration density, non-volatility, and goo ... Cite

Fine-grained dynamic voltage scaling on OLED display

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · April 26, 2012 Organic Light Emitting Diode (OLED) has emerged as the new generation display technique for mobile multimedia devices. Compared to existing technologies OLEDs are thinner, brighter, lighter, and cheaper. However, OLED panels are still the biggest contribut ... Full text Cite

Probabilistic design in spintronic memory and logic circuit

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · April 26, 2012 Spin-transfer torque random access memory (STT-RAM) is a promising candidate for next-generation non-volatile memory technologies. It combines many attractive attributes such as nanosecond access time, high integration density, non-volatility, and good CMOS ... Full text Cite

A 130 nm 1.2 V/3.3 V 16 Kb spin-transfer torque random access memory with nondestructive self-reference sensing scheme

Journal Article IEEE Journal of Solid-State Circuits · February 1, 2012 Among all the emerging memories, Spin-Transfer Torque Random Access Memory (STT-RAM) has demonstrated many promising features such as fast access speed, nonvolatility, excellent scalability, and compatibility to CMOS process. However, the large process var ... Full text Cite

Voltage driven nondestructive self-reference sensing scheme of spin-transfer torque memory

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · January 1, 2012 Spin-transfer torque random access memory (STT-RAM) has demonstrated great potentials as a universal memory for its fast access speed, zero standby power, excellent scalability, and simplicity of cell structure. However, large process variations of both ma ... Full text Cite

A thermal and process variation aware MTJ switching model and its applications in soft error analysis

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2012 Spin-transfer torque random access memory (STT-RAM) has recently gained increased attentions from circuit design and architecture societies. Although STT-RAM offers a good combination of small cell size, nanosecond access time and non-volatility for embedd ... Full text Cite

Active compensation technique for the thin-film transistor variations and OLED aging of mobile device displays

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2012 OLED is becoming the main stream display for mobile devices. The process variations of thin-film transistors (TFT) and the aging degradation of OLED devices severely impact the display quality and the user experience on mobile devices throughout lifetime. ... Full text Cite

MRAC: A memristor-based reconfigurable framework for adaptive cache replacement

Conference Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT · December 1, 2011 Memristor, a long postulated yet missing circuit element, has recently emerged as a promising device in non-volatile memory technologies. However, beyond its use as memory cell, it is challenging to integrate memristor in modern architectures for general p ... Full text Cite

STT-RAM cell design optimization for persistent and non-persistent error rate reduction: A statistical design view

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · December 1, 2011 The rapidly increased demands for memory in electronic industry and the significant technical scaling challenges of all conventional memory technologies motivated the researches on the next generation memory technology. As one promising candidate, spin-tra ... Full text Cite

Emerging non-volatile memories: Opportunities and challenges

Conference Embedded Systems Week 2011, ESWEEK 2011 - Proceedings of the 9th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS'11 · November 22, 2011 In recent years, non-volatile memory (NVM) technologies have emerged as candidates for future universal memory. NVMs generally have advantages such as low leakage power, high density, and fast read speed. At the same time, NVMs also have disadvantages. For ... Full text Cite

A 1.0V 45nm nonvolatile magnetic latch design and its robustness analysis

Conference Proceedings of the Custom Integrated Circuits Conference · November 9, 2011 A new nonvolatile latch design is proposed based on the magnetic tunneling junction (MTJ) devices. In the standby mode, the latched data can be retained in the MTJs without consuming any power. Two types of operation errors, namely, persistent and non-pers ... Full text Cite

Processor caches built using multi-level spin-transfer torque RAM cells

Conference Proceedings of the International Symposium on Low Power Electronics and Design · September 19, 2011 It has been predicted that a processor's caches could occupy as much as 90% of chip area for technology nodes from the current. In this paper, we study the use of multi-level spin-transfer torque RAM (STT-RAM) cells in the design of processor caches. Compa ... Full text Cite

3D-ICML: A 3D bipolar ReRAM design with interleaved complementary memory layers

Conference Proceedings - Design, Automation and Test in Europe, DATE · May 31, 2011 Resistive random access memory (ReRAM) has been demonstrated as a promising non-volatile memory technology with features such as high density, low power, good scalability, easy fabrication and compatibility to the existing CMOS technology. The conventional ... Cite

Stacking magnetic random access memory atop microprocessors: An architecture-level evaluation

Journal Article IET Computers and Digital Techniques · May 1, 2011 Magnetic random access memory (MRAM) has been considered as a promising memory technology because of its attractive properties such as non-volatility, fast access, zero standby leakage and high density. Although integrating MRAM with complementary metal-ox ... Full text Cite

Geometry variations analysis of TiO2 thin-film and spintronic memristors

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 28, 2011 The fourth passive circuit element, memristor, has attracted increased attentions since the first real device was discovered by HP Lab in 2008. Its distinctive characteristic to record the historic profile of the voltage/current through itself creates grea ... Full text Cite

Emerging sensing techniques for emerging memories

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · March 28, 2011 Among all emerging memories, Spin-Transfer Torque Random Access Memory (STT-RAM) has shown many promising features such as fast access speed, nonvolatility, compatibility to CMOS process and excellent scalability. However, large process variations of both ... Full text Cite

Design of last-level on-chip cache using spin-torque transfer RAM (STT RAM)

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · March 1, 2011 Because of its high storage density with superior scalability, low integration cost and reasonably high access speed, spin-torque transfer random access memory (STT RAM) appears to have a promising potential to replace SRAM as last-level on-chip cache (e.g ... Full text Cite

Current switching in MgO-based magnetic tunneling junctions

Journal Article IEEE Transactions on Magnetics · January 1, 2011 Spin-transfer induced magnetization switching in a MgO-based magnetic tunneling junction (MTJ) has been measured over a wide time range. It was found that the switching current response is asymmetric going from the high resistance state to the low resistan ... Full text Cite

Spintronic memristor: Compact model and statistical analysis

Journal Article Journal of Low Power Electronics · January 1, 2011 The fourth fundamental passive circuit element - memristor, has received the increased attentions after a real device was demonstrated by HP Lab in 2008. The distinctive characteristic of a memristor to record the historical profile of the voltage/current ... Full text Cite

Nonpersistent errors optimization in spin-MOS logic and storage circuitry

Journal Article IEEE Transactions on Magnetics · January 1, 2011 By combining the flexibility of MOS logic and the nonvolatility of spintronic devices, Spin-MOS logic and storage circuitries offer a promising approach to implement a highly integrated, power-efficient, and nonvolatile computing and storage systems. Besid ... Full text Cite

STT-RAM cell optimization considering MTJ and CMOS variations

Journal Article IEEE Transactions on Magnetics · January 1, 2011 Spin-transfer torque random access memory (STT-RAM) becomes a promising technology for future computing systems for its fast access time, high density, nonvolatility, and small write current. However, like all the other nanotechnologies, STT-RAM suffers fr ... Full text Cite

Performance, power, and reliability tradeoffs of STT-RAM cell subject to architecture-level requirement

Journal Article IEEE Transactions on Magnetics · January 1, 2011 Large switching current and long switching time have significantly limited the adoption of spin-transfer torque random access memory (STT-RAM). Technology scaling, moreover, makes it very challenging to reduce the switching current while maintaining the re ... Full text Cite

Asymmetry in STT-RAM cell operations

Chapter · January 1, 2011 Spin-transfer torque random access memory (STT-RAM) has emerged as a promising technology to replace SRAM and DRAM in embedded memory applications. In STT-RAM, the data are stored in a magnetic device (magnetic tunneling junction or MTJ) as different resis ... Full text Cite

Design margin exploration of spin-transfer torque RAM (STT-RAM) in scaled technologies

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · December 1, 2010 We propose a magnetic and electric level spin-transfer torque random access memory (STT-RAM) cell model to simulate the write operation of an STT-RAM. The model of a magnetic tunneling junction (MTJ) is modified to take into account the electrical response ... Full text Cite

Applications of TMR devices in solid state circuits and systems

Conference 2010 International SoC Design Conference, ISOCC 2010 · December 1, 2010 Spintronic devices have recently attracted significant attentions in solid state circuit society as a promising device in the applications of nonvolatile memory and emerging circuit design, i.e., memristor-based system. In this paper, we introduce Tunnelin ... Full text Cite

Spintronic devices: From memory to memristor

Conference 2010 International Conference on Communications, Circuits and Systems, ICCCAS 2010 - Proceedings · November 19, 2010 In 1971, Professor Leon Chua in UC Berkeley predicted the fourth fundamental passive circuit element - memristor, based on the conceptual completeness of circuit theory. 37 years later, a team at HP Labs led by Dr. Stanley Williams announced the developmen ... Full text Cite

Combined magnetic- and circuit-level enhancements for the nondestructive self-reference scheme of STT-RAM

Conference Proceedings of the International Symposium on Low Power Electronics and Design · October 21, 2010 A nondestructive self-reference read scheme (NSRS) was recently proposed to overcome the bit-to-bit variation in Spin-Transfer Torque Random Access Memory (STT-RAM). In this work, we introduced three magnetic- and circuit-level techniques, including 1) R-I ... Full text Cite

Low-power dual-element memristor based memory design

Conference Proceedings of the International Symposium on Low Power Electronics and Design · October 21, 2010 Recently, the emerging memristor device technology has attracted significant research interests due to its distinctive hysteresis characteristic, which potentially can enable novel circuit designs for future VLSI circuits. In particular, characteristics su ... Full text Cite

Emerging non-volatile memory technologies: From materials, to device, circuit, and architecture

Conference Midwest Symposium on Circuits and Systems · September 20, 2010 The emerging nonvolatile memory technologies are gaining significant attentions from semiconductor in recent years. Multiple promising candidates, such as phase change memory, magnetic memory, resistive memory, and memristor, have gained substantial attent ... Full text Cite

Access scheme of multi-level cell spin-transfer torque random access memory and its optimization

Conference Midwest Symposium on Circuits and Systems · September 20, 2010 In this work, we study the access (read and write) scheme of the newly proposed Multi-Level Cell Spin-Transfer Torque Random Access Memory (MLC STT-RAM) from both the circuit design and architectural perspectives. Based on the physical principles of the re ... Full text Cite

The application of spintronic devices in magnetic bio-sensing

Conference Proceedings of the 2nd Asia Symposium on Quality Electronic Design, ASQED 2010 · September 17, 2010 Recently integrated magnetic/spintronic device microarrays have demonstrated great potentials in both biomedical research and practices. In this work, we discuss the physical mechanisms of three types of spintronic devices for magnetic signal sensing, incl ... Full text Cite

Impact of process variations on emerging memristor

Conference Proceedings - Design Automation Conference · September 7, 2010 The memristor, known as the fourth basic two-terminal circuit element, has attracted many research interests since the first real device was developed by HP labs in 2008. The nano-scale memristive device has the potential to construct some novel computing ... Full text Cite

PCMO device with high switching stability

Journal Article IEEE Electron Device Letters · August 1, 2010 We studied the relationship between the resistive-switching properties of the Pr0.7Ca0.3MnO3 (PCMO) thin-film elements and their geometry dimensions below submicrometers. Our electrical test results of a series of PCMO-based resistive-switching devices wit ... Full text Cite

Patents relevant to cross-point memory array

Journal Article Recent Patents on Electrical Engineering · June 25, 2010 Patents relevant to cross-point memory array structure are reviewed. These patents are selected from the categories of cross-point, crossbar, memory array and emerging memory. These patents address the questions of how to build a cross-point memory, includ ... Full text Cite

A nondestructive self-reference scheme for spin-transfer torque random access memory (STT-RAM)

Conference Proceedings - Design, Automation and Test in Europe, DATE · June 9, 2010 We proposed a novel self-reference sensing scheme for Spin-Transfer Torque Random Access Memory (STT-RAM) to overcome the large bit-to-bit variation of Magnetic Tunneling Junction (MTJ) resistance. Different from all the existing schemes, our solution is n ... Cite

Spintronic memristor devices and application

Conference Proceedings - Design, Automation and Test in Europe, DATE · June 9, 2010 Spintronic memristor devices based upon spin torque induced magnetization motion are presented and potential application examples are given. The structure and material of these proposed spin torque memristors are based upon existing (and/or commercialized) ... Cite

Scalability of PCMO-based resistive switch device in DSM technologies

Conference Proceedings of the 11th International Symposium on Quality Electronic Design, ISQED 2010 · May 28, 2010 This work systematically explores the relationship between the resistive switching properties of Pr0.7Ca0.3MnO3 (PCMO) thin film element and its geometry dimensions in deep submicron (DSM) technologies. A series of PCMO-based resistive switch devices (RSDs ... Full text Cite

Spin transfer torque memory with thermal assist mechanism: A case study

Journal Article IEEE Transactions on Magnetics · March 1, 2010 We have investigated spin transfer torque random access memory (STT-RAM) with a thermal-assist programming scheme using finite-element thermal simulation. We conducted the study on a specific memory element design to analyze the thermal dynamics and therma ... Full text Cite

Spintronic memristor temperature sensor

Journal Article IEEE Electron Device Letters · January 1, 2010 Thermal fluctuation effects on the electric behavior of a spintronic memristor based upon the spin-torque-induced domain-wall motion are explored. Depending upon material, geometry, and electric excitation strength, the device electric behavior can be eith ... Full text Cite

Variable-Latency Adder (VL-Adder) Designs for Low Power and NBTI Tolerance

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · January 1, 2010 In this paper, we proposed a new adder design called variable-latency adder (VL-adder). This technique allows the adder to work at a lower supply voltage than that required by a conventional adder while maintaining the same throughput. The VL-adder design ... Full text Cite

Variation tolerant sensing scheme of spin-transfer torque memory for yield improvement

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2010 Spin-Transfer Torque Random Access Memory (STTRAM) demonstrated great potentials as an universal memory for its fast access speed, zero standby power, excellent scalability and simplicity of cell structure. However, large process variations of both magneti ... Full text Cite

Design of spin-torque transfer magnetoresistive RAM and CAM/TCAM with high sensing and search Speed

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · January 1, 2010 With a great scalability potential, nonvolatile magnetoresistive memory with spin-torque transfer (STT) programming has become a topic of great current interest. This paper addresses cell structure design for STT magnetoresistive RAM, content addressable m ... Full text Cite

A hybrid solid-state storage architecture for the performance, energy consumption, and lifetime improvement

Conference Proceedings - International Symposium on High-Performance Computer Architecture · January 1, 2010 In recent years, many systems have employed NAND flash memory as storage devices because of its advantages of higher performance (compared to the traditional hard disk drive), high-density, random-access, increasing capacity, and falling cost. On the other ... Full text Cite

Patents relevant to spintronic memristor

Journal Article Recent Patents on Electrical Engineering · January 1, 2010 Patents relevant to spintronic memristor devices are reviewed. These patents are selected from the categories of memristor, spintronic device, magnetic tunneling junction and magnetic domain wall devices. These patents address the questions of how to buil ... Full text Cite

Magnetization Switching in Spin Torque Random Access Memory: Challenges and Opportunities

Chapter · January 1, 2010 Dynamic thermal magnetization switching and magnetization switching variability determine nano-scale magnetic device performance. As a magnetic device scales down, achieving fast nanosecond time scale magnetization switching and maintaining thermal stabili ... Full text Cite

Gated decap: Gate leakage control of on-chip decoupling capacitors in scaled technologies

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · December 1, 2009 To minimize the leakage power dissipation of present-day on-chip Decaps, we propose a gated decoupling capacitor (GDecap) technique that deactivates a Decap when it is not needed. The application of the proposed GDecap technique on an eight-way clock-gated ... Full text Cite

The salvage cache: A fault-tolerant cache architecture for next-generation memory technologies

Conference Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors · December 1, 2009 There has been much work on the next generation of memory technologies such as MRAM, RRAM and PRAM. Most of these are non-volatile in nature, and compared to SRAM, they are often denser, just as fast, and have much lower energy consumption. Using 3-D stack ... Full text Cite

Compact modeling and corner analysis of spintronic memristor

Conference 2009 IEEE/ACM International Symposium on Nanoscale Architectures, NANOARCH 2009 · November 11, 2009 The 4th fundamental circuit elements - Memristor received significant attentions after a real device was recently demonstrated for the first time. Besides the solid-state thin film memristive device, spintronic memristor was also invented based on the magn ... Full text Cite

Thermal-assisted spin transfer torque memory (STT-RAM) cell design exploration

Conference Proceedings of the 2009 IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2009 · October 5, 2009 Thermal-assisted spin-transfer torque random access memory (STT-RAM) has been considered as a promising candidate of next-generation nonvolatile memory technology. We conducted finite element simulation on thermal dynamics in the programming process of the ... Full text Cite

Tolerating process variations in large, set-associative caches: The buddy cache

Journal Article Transactions on Architecture and Code Optimization · June 1, 2009 One important trend in today's microprocessor architectures is the increase in size of the processor caches. These caches also tend to be set associative. As technology scales, process variations are expected to increase the fault rates of the SRAM cells t ... Full text Cite

Ordering of magnetic nanoparticles in bilayer structures

Journal Article Journal of Physics D: Applied Physics · April 8, 2009 In this study, we predict crystalline ordering of magnetic nanoparticles in a bilayer structure where only magnetic dipole interaction is taken into account. Estimates show that the two-dimensional lattice structure can be observed in the liquid nitrogen t ... Full text Cite

Spintronic memristor through spin-torque-induced magnetization motion

Journal Article IEEE Electron Device Letters · February 12, 2009 Existence of spintronic memristor in nanoscale is demonstrated based upon spin-torque-induced magnetization switching and magnetic-domain-wall motion. Our examples show that memristive effects are quite universal for spin-torque spintronic device at the ti ... Full text Cite

A novel architecture of the 3D stacked MRAM L2 Cache for CMPs

Conference Proceedings - International Symposium on High-Performance Computer Architecture · January 1, 2009 Magnetic random access memory (MRAM) is a promising memory technology, which has fast read access, high density, and non-volatility. Using 3D heterogeneous integrations, it becomes feasible and cost-efficient to stack MRAM atop conventional chip multiproce ... Full text Cite

Improving STT MRAM storage density through smaller-than-worst-case transistor sizing

Conference Proceedings - Design Automation Conference · January 1, 2009 This paper presents a technique to improve the storage density of spin-torque transfer (STT) magnetoresistive random access memory (MRAM) in the presence of significant magnetic tunneling junction (MTJ) write current threshold variability. In conventional ... Full text Cite

Spin-transfer torque magnetoresistive content addressable memory (CAM) cell structure design with enhanced search noise margin

Conference Proceedings - IEEE International Symposium on Circuits and Systems · September 19, 2008 This paper presents a new memory cell structure for content addressable memory (CAM) based on magnetic tunneling junction (MTJ). Each CAM cell uses a pair of differential MTJs as basic storage element and incorporates transistors to greatly improve the cel ... Full text Cite

Circuit and microarchitecture evaluation of 3D stacking magnetic RAM (MRAM) as a universal memory replacement

Conference Proceedings - Design Automation Conference · September 17, 2008 Magnetic Random Access Memory (MRAM) has been considered as a promising memory technology due to many attractive properties. Integrating MRAM with CMOS logic may incur extra manufacture cost, due to its hybrid magnetic-CMOS fabrication process. Stacking MR ... Full text Cite

Design margin exploration of Spin-Torque Transfer RAM (SPRAM)

Conference Proceedings of the 9th International Symposium on Quality Electronic Design, ISQED 2008 · August 25, 2008 We proposed a combined magnetic and circuit level technique to explore the design methodology of Spin-Torque Transfer RAM (SPRAM). A dynamic magnetic model of magnetic tunneling junction (MTJ), which is based upon measured spin torque induced magnetization ... Full text Cite

Spin torque random access memory down to 22 nm technology

Journal Article IEEE Transactions on Magnetics · January 1, 2008 Spin torque random access memory (ST-MRAM) design spaces down to CMOS 22 nm technology node are explored using a dynamic magnetic tunneling junction (MTJ)-CMOS model. The coupled dynamics of MTJ and CMOS is modeled by a combination of MTJ micromagnetic sim ... Full text Cite

Variable-latency adder (VL-adder): New arithmetic circuit design practice to overcome NBTI

Conference Proceedings of the International Symposium on Low Power Electronics and Design · December 17, 2007 Negative bias temperature instability (NBTI) has become a dominant reliability concern for nanoscale PMOS transistors. In this paper, we propose variable-latency adder (VL-adder) technique for NBTI tolerance. By detecting the circuit failure on-the-fly, th ... Full text Cite

VOSCH: Voltage scaled cache hierarchies

Conference 2007 IEEE International Conference on Computer Design, ICCD 2007 · December 1, 2007 The cache hierarchy of state-of-the-art - especially multicore - microprocessors consumes a significant amount of area and energy. A significant amount of research has been devoted especially to reducing the latter. One of the most important microarchitect ... Full text Cite

Statistical timing analysis considering spatial correlations

Conference Proceedings - Eighth International Symposium on Quality Electronic Design, ISQED 2007 · August 28, 2007 In this paper, we present an efficient algorithm to predict the probability distribution of the circuit delay while accounting for spatial correlations. We exploit the structure of the covariance matrix to decouple the correlated variables to independent o ... Full text Cite

SAVS: A self-adaptive variable supply-voltage technique for process-tolerant and power-efficient multi-issue superscalar processor design

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · September 19, 2006 Technology scaling and sub-wavelength optical lithography is associated with significant process variations. We propose a self-adaptive variable supply-voltage scaling (SAVS) technique for multi-issue out-of-order pipeline to improve parametric yield with ... Cite

Cascaded carry-select adder (C2SA): A new structure for low-power CSA design

Conference Proceedings of the International Symposium on Low Power Electronics and Design · December 12, 2005 In this paper we propose a novel low-power Carry-Select Adder (CSA) design called Cascaded CSA (C2SA). Based on the prediction of the critical path delay of current operation, C2SA can automatically work with one or two clock-cycle latency and a scaled sup ... Cite

Power supply noise-aware scheduling and allocation for DSP synthesis

Conference Proceedings - International Symposium on Quality Electronic Design, ISQED · December 1, 2005 As technology scales down, power supply noise is becoming a performance and reliability bottleneck in modern VLSI. We propose a power supply noise-aware design methodology for high-level synthesis. By evaluating power supply noise in the early design stage ... Full text Cite

Statistical based link insertion for robust clock network design

Conference IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD · January 1, 2005 We present a statistical based non-tree clock distribution construction algorithm that starts with a tree and incrementally insert cross links, such that the skew variation of the final clock network is within a certain confidence interval under variations ... Full text Cite

Gated Decap: Gate leakage control of on-chip decoupling capacitors in scaled technologies

Conference Proceedings of the Custom Integrated Circuits Conference · January 1, 2005 A novel on-chip Decoupling Capacitor (Decap) design - Gated Decoupling Capacitor (GDecap) - is proposed to minimize the leakage power dissipation associated with present-day on-chip decoupling capacitors. Experiments on the application of GDecap in an 8-wa ... Full text Cite

Current demand balancing: A technique for minimization of current surge in high performance clock-gated microprocessors

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · January 1, 2005 In this paper, we propose an integrated architectural and physical planning approach to minimize the current surge in high-performance clock-gated microprocessors. In our approach, we use priority assignment optimization (PAO) and dynamic functional unit ( ... Full text Cite

Priority assignment optimization for minimization of current surge in high performance power efficient clock-gated microprocessor

Conference Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC · June 1, 2004 We propose an integrated architectural/physical-planning approach named priority assignment optimization to minimize the current surge in high performance power efficient clock-gated microprocessors. The proposed approach balances the current demands acros ... Cite

DCG: Deterministic Clock-Gating for Low-Power Microprocessor Design

Journal Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems · March 1, 2004 With the scaling of technology and the need for higher performance and more functionality, power dissipation is becoming a major bottleneck for microprocessor designs. Because clock power can be significant in high-performance processors, we propose a dete ... Full text Cite

Deterministic clock gating for microprocessor power reduction

Conference Proceedings - International Symposium on High-Performance Computer Architecture · January 1, 2003 With the scaling of technology and the need for higher performance and more functionality, power dissipation is becoming a major bottleneck for microprocessor designs. Pipeline balancing (PLB), a previous technique, is essentially a methodology to clock-ga ... Full text Cite

Integrated architectural/physical planning approach for minimization of current surge in high performance clock-gated microprocessors

Conference Proceedings of the International Symposium on Low Power Electronics and Design · January 1, 2003 We propose an integrated architectural/physical planning approach to reduce the power supply noise due to current surge in high performance, general-purpose, clock-gated microprocessors. The proposed approach combines dynamic selection of functional units ... Full text Cite

Model reduction in the time-domain using Laguerre polynomials and Krylov methods

Conference Proceedings - Design, Automation and Test in Europe, DATE · December 1, 2002 Presents a new passive model reduction algorithm based on the Laguerre expansion of the time response of interconnect networks. We derive expressions for the Laguerre coefficient matrices that minimize a weighted square of the approximation error, and show ... Full text Cite