Department of Electrical and Computer Engineering Technical Reports

Optimal AoI for Systems With Queueing Delay in Both Forward and Backward Directions

Chih-Chun Wang — Thu, 25 Jul 2024 13:44:07 PDT

Age-of-Information (AoI) is a metric that focuses directly on application-layer objectives, and a canonical AoI minimization problem is the update-through-queues model. Existing results in this direction fall into two categories: the open-loop setting, in which the sender is oblivious to the packet departure time, and the closed-loop setting, in which the decision is based on an instantaneous acknowledgment (ACK). Neither setting perfectly reflects modern networked systems, which almost always rely on feedback that experiences some delay. Motivated by this observation, this work subjects the ACK traffic to a second queue so that the closed-loop decision is made based on delayed feedback. Near-optimal schedulers are devised, which transition smoothly from the instantaneous-ACK scheme to the open-loop scheme as the feedback delay grows. The results quantify the benefits of delayed feedback for AoI minimization in update-through-queues systems.
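As a rough illustration of the metric itself (not of the schedulers in the report), the sketch below computes the time-average AoI of a sawtooth age curve from packet timestamps. The function name `average_aoi` and the input format are assumptions made for this example; it also assumes every delivery happens no later than `t_end`.

```python
def average_aoi(updates, t_end):
    """Time-average AoI over [first delivery, t_end].

    updates: list of (generation_time, delivery_time) pairs (hypothetical format).
    The age at time t is t minus the generation time of the freshest
    packet delivered so far, so it grows with slope 1 between deliveries.
    """
    # Process deliveries in chronological order.
    events = sorted(updates, key=lambda u: u[1])
    t_prev = events[0][1]      # measurement starts at the first delivery
    freshest = events[0][0]    # generation time of the freshest delivered packet
    area = 0.0
    for gen, dlv in events[1:] + [(None, t_end)]:
        dt = dlv - t_prev
        age_start = t_prev - freshest
        # Area under a unit-slope segment: rectangle plus triangle.
        area += age_start * dt + 0.5 * dt * dt
        if gen is not None:
            freshest = max(freshest, gen)  # a stale packet never increases freshness
        t_prev = dlv
    return area / (t_end - events[0][1])
```

With packets generated at times 0, 2, 4 and each delivered one time unit later, the age oscillates between 1 and 3, so the average over the window is 2.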

Sparse Ensemble Networks for Hyperspectral Image Classification

Rakesh Kumar Iyer et al. — Mon, 29 Apr 2024 09:55:36 PDT

We explore the efficacy of sparsity and ensemble models in the classification of hyperspectral images, a pivotal task in remote sensing applications. While Convolutional Neural Networks (CNNs) and Transformer models have shown promise in this domain, each exhibits distinct limitations: CNNs excel at capturing spatial/local features but struggle to capture spectral features, whereas Transformers capture spectral features at the expense of spatial features. Furthermore, training several independent CNN and Transformer networks is computationally expensive. To address these limitations, we propose a novel ensemble framework comprising pruned CNNs and Transformers, optimizing both spatial and spectral feature utilization while curbing computational costs. By integrating sparsity through model pruning, our approach reduces redundancy and computational complexity without compromising accuracy. Through extensive experimentation, we find that our method achieves accuracy comparable to its non-sparse counterparts while decreasing the computational cost. Our contribution enhances remote sensing analytics by demonstrating the potential of sparse and ensemble models in improving the precision and computational efficiency of hyperspectral image classification.
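The two ingredients named above, pruning and ensembling, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the abstract does not say which pruning criterion or combination rule is used, so global magnitude pruning and averaging of per-class probabilities are assumed here, and both function names are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Assumed criterion: plain magnitude pruning on a flat weight list.
    """
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices sorted by absolute value; the first k are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def ensemble_predict(prob_lists):
    """Average per-class probabilities across models; return (label, averages).

    Assumed combination rule: unweighted mean of each model's class probabilities.
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg
```

In this sketch a pruned CNN and a pruned Transformer would each contribute one probability vector per pixel to `ensemble_predict`.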

Natural Language Processing for Novel Writing

Leqing Qu et al. — Fri, 23 Sep 2022 21:40:04 PDT

Synthetic Aperture Radar Imaging

Srivatsan Ravichandran et al. — Fri, 23 Sep 2022 21:39:59 PDT

Simulation programs are used to locate the positions of the input target points and generate a 2D SAR image with the Range Migration Algorithm. Using the same methodology, we can create a scene geometry from a point cloud and run the simulation program to generate raw SAR data.

Outdated Measurements Are Still Useful For Multi-Sensor Linear Control Systems With Random Communication Delays

Jia Zhang et al. — Wed, 07 Sep 2022 06:38:00 PDT

Linear systems are a widely used model for the control tasks of modern cyber-physical systems around their stationary state(s), e.g., smart grids, remote health applications, and autonomous driving systems. Specifically, each sensor first compresses its own measurement and then sends it to the controller. Due to the inevitable random communication delay, the controller needs to decide how to fuse the received information to compute the desired control action. Suppose a fusion center has received several measurements over time. One common belief is that the control decision should be made solely based on the latest measurement of each sensor while ignoring the older/stale measurements from the same sensor. This work shows that while such a strategy is optimal in a single-sensor environment, it can be strictly suboptimal for a multi-sensor system. Namely, if one properly fuses both the latest and the outdated measurements from each of the sensors, one can strictly improve the underlying control system performance. The numerical evaluation shows that even at a very low communication rate of 8 bits per measurement per sensor, the proposed scheme achieves a state variance only 5% away from the best achievable L2 norm. It is 15% better than the MMSE fusion scheme that uses exclusively the freshest measurements (while discarding outdated ones).

Distribution-oblivious Online Algorithms for Age-of-Information Penalty Minimization

Cho-Hsin Tsai et al. — Thu, 10 Mar 2022 14:42:40 PST

The ever-increasing needs of supporting real-time applications have spurred new studies on minimizing Age-of-Information (AoI), a novel metric characterizing the data freshness of the system. This work studies the single-queue information update system and strengthens the seminal results of Sun et al. on the following fronts: (i) When designing the optimal offline schemes with full knowledge of the delay distributions, a new fixed-point-based method is proposed with quadratic convergence rate, an order-of-magnitude improvement over the state-of-the-art; (ii) When the distributional knowledge is unavailable (which is the norm in practice), two new low-complexity online algorithms are proposed, which provably attain the optimal average AoI penalty; and (iii) the online schemes also admit a modular architecture, which allows the designer to upgrade certain components to handle additional practical challenges. Two such upgrades are proposed for the situations: (iii.1) The AoI penalty function is also unknown and must be estimated on the fly, and (iii.2) the unknown delay distribution is Markovian instead of i.i.d. The performance of our schemes is either provably optimal or within 3% of the omniscient optimal offline solutions in all simulation scenarios.

Unifying AoI Minimization and Remote Estimation — Optimal Sensor/Controller Coordination with Random Two-way Delay

Cho-Hsin Tsai et al. — Mon, 03 Aug 2020 13:35:42 PDT

The ubiquitous usage of communication networks in modern sensing and control applications has kindled new interest in the timing coordination between sensors and controllers, i.e., how to use the "waiting time" judiciously to improve the system performance. Contrary to the common belief that a zero-wait policy is optimal, Sun et al. showed that a controller can strictly improve the data freshness, the so-called Age-of-Information (AoI), by postponing transmission in order to lengthen the duration of staying in a good state. The optimal waiting policy for the sensor side was later characterized in the context of remote estimation. Instead of focusing on the sensor and controller sides separately, this work develops the jointly optimal sensor/controller waiting policy in a Wiener-process system. This work generalizes the above two important results in the sense that not only do we consider joint sensor/controller designs (as opposed to sensor-only or controller-only schemes), but we also assume random delay in both the forward and feedback directions (as opposed to random delay in only one direction). In addition to provable optimality, extensive simulation is used to verify the performance of the proposed scheme.
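The claim that waiting can beat a zero-wait policy is easy to check numerically in the simpler one-way setting with instantaneous feedback. The sketch below is an illustration under assumptions, not the joint sensor/controller policy of this report: after a packet with delay `y` is delivered, the sender waits `max(0, beta - y)` before sending the next one, and the threshold `beta` and the function name are hypothetical.

```python
def average_aoi_threshold(delays, beta):
    """Average AoI when the sender waits max(0, beta - y) after a delivery.

    delays: sequence of transmission delays Y_1, Y_2, ... (one per packet).
    Right after packet i is delivered the age equals Y_i; it then grows
    linearly for wait_i + Y_{i+1} until the next delivery.
    """
    area = 0.0
    total_time = 0.0
    for y_prev, y_next in zip(delays, delays[1:]):
        wait = max(0.0, beta - y_prev)
        span = wait + y_next
        # Rectangle of height y_prev plus a unit-slope triangle.
        area += y_prev * span + 0.5 * span * span
        total_time += span
    return area / total_time
```

For the delay sequence `[0, 0, 2, 2, 0]`, which covers each consecutive pair of the values 0 and 2 once, `beta = 0` (zero-wait) gives an average age of 2, while `beta = 1` gives 11/6, so waiting after a fast delivery helps in this toy case.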

Parallel Multistage Wide Neural Network

Jiangbo Xi et al. — Wed, 29 Jul 2020 11:32:32 PDT

Deep learning networks have achieved great success in many areas, such as large-scale image processing. They usually need large computing resources and time, and they process easy and hard samples inefficiently in the same way. Another undesirable problem is that the network generally needs to be retrained to learn new incoming data. Efforts have been made to reduce the computing resources and realize incremental learning by adjusting architectures, such as scalable effort classifiers, multi-grained cascade forest (gc forest), conditional deep learning (CDL), tree CNN, decision tree structure with knowledge transfer (ERDK), and forest of decision trees with RBF networks and knowledge transfer (FDRK). In this paper, a parallel multistage wide neural network (PMWNN) is presented. It is composed of multiple stages to classify different parts of the data. First, a wide radial basis function (WRBF) network is designed to learn features efficiently in the wide direction. It can work on both vector and image instances, and it can be trained quickly in one epoch using subsampling and least squares (LS). Second, successive stages of WRBF networks are combined to make up the PMWNN. Each stage focuses on the misclassified samples of the previous stage. The network can stop growing at an early stage, and a stage can be added incrementally when new training data is acquired. Finally, the stages of the PMWNN can be tested in parallel, thus speeding up the testing process. To sum up, the proposed PMWNN has the advantages of (1) fast training, (2) optimized computing resources, (3) incremental learning, and (4) parallel testing with stages.
Experimental results on MNIST, several large hyperspectral remote sensing datasets, CVL single digits, SVHN, and audio signal datasets show that the WRBF and PMWNN achieve accuracy competitive with learning models such as stacked autoencoders, deep belief nets, SVM, MLP, LeNet-5, RBF networks, and the recently proposed CDL, broad learning, and gc forest. In fact, the PMWNN often has the best classification performance.
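The idea of training an RBF stage in one pass with subsampling and least squares can be sketched as follows. This is a generic RBF classifier written under assumptions, not the WRBF design from the report: centers are taken by striding through the training set, the Gaussian width `gamma` and the small `ridge` term are hypothetical parameters, and the multistage logic (routing misclassified samples to the next stage) is omitted.

```python
import math


def rbf_features(X, centers, gamma):
    """Gaussian RBF activations: one column per center."""
    return [[math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(x, ctr)))
             for ctr in centers] for x in X]


def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def fit_output_weights(Phi, Y, ridge=1e-6):
    """Least-squares output weights via the (ridge-regularized) normal equations.

    Returns one weight vector per class, each with one entry per center.
    """
    m = len(Phi[0])
    gram = [[sum(row[i] * row[j] for row in Phi) + (ridge if i == j else 0.0)
             for j in range(m)] for i in range(m)]
    weights = []
    for k in range(len(Y[0])):
        rhs = [sum(row[i] * y[k] for row, y in zip(Phi, Y)) for i in range(m)]
        weights.append(solve(gram, rhs))
    return weights


def predict(Phi, weights):
    """Pick the class whose linear output score is largest."""
    scores = [[sum(p * w for p, w in zip(row, col)) for col in weights]
              for row in Phi]
    return [max(range(len(s)), key=lambda k: s[k]) for s in scores]
```

A later stage in this spirit would be fit the same way, using only the samples the current stage misclassifies.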

Composable, Sound Transformations of Nested Recursion and Loops

Kirshanthan Sundararajah et al. — Mon, 15 Apr 2019 11:46:52 PDT

Scheduling transformations reorder a program’s operations to improve locality and/or parallelism. The polyhedral model is a general framework for composing and applying instancewise scheduling transformations for loop-based programs, but there is no analogous framework for recursive programs. This paper presents an approach for composing and applying scheduling transformations—like inlining, interchange, and code motion—to nested recursive programs. This paper describes the phases of the approach—representing dynamic instances, composing and applying transformations, reasoning about correctness—and shows that these techniques can verify the soundness of composed transformations.
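To make the notion of a scheduling transformation on nested recursion concrete, the sketch below shows an interchange on a toy pair of tree traversals. It is only an illustration under assumptions: trees are encoded as `(label, left, right)` tuples, the function names are hypothetical, and the interchange is only safe here because the visited pairs are independent of one another. The report's framework for representing instances and checking soundness is not reproduced.

```python
def outer(o, i, visit):
    """Original schedule: for each outer node, traverse the whole inner tree."""
    if o is None:
        return
    inner(o, i, visit)
    outer(o[1], i, visit)
    outer(o[2], i, visit)


def inner(o, i, visit):
    if i is None:
        return
    visit(o[0], i[0])
    inner(o, i[1], visit)
    inner(o, i[2], visit)


def outer_swapped(o, i, visit):
    """Interchanged schedule: for each inner node, traverse the whole outer tree."""
    if i is None:
        return
    inner_swapped(o, i, visit)
    outer_swapped(o, i[1], visit)
    outer_swapped(o, i[2], visit)


def inner_swapped(o, i, visit):
    if o is None:
        return
    visit(o[0], i[0])
    inner_swapped(o[1], i, visit)
    inner_swapped(o[2], i, visit)


def trace(schedule, tree_a, tree_b):
    """Record the order in which (outer label, inner label) pairs are visited."""
    log = []
    schedule(tree_a, tree_b, lambda a, b: log.append((a, b)))
    return log
```

Both schedules execute the same set of dynamic instances in a different order, which is the property a soundness check must establish when the instances do have dependences.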

Ladder Networks for Semi-Supervised Hyperspectral Image Classification

Okan Ersoy et al. — Mon, 03 Dec 2018 05:27:01 PST

We discuss the ladder network to perform hyperspectral image classification in a semi-supervised setting. The ladder network distinguishes itself from other semi-supervised methods by jointly optimizing a supervised and unsupervised cost. In many settings this has proven to be more successful than other semi-supervised techniques, such as pretraining using unlabeled data. We furthermore show that the convolutional ladder network outperforms most of the current techniques used in hyperspectral image classification and achieves new state-of-the-art performance on the Pavia University dataset given only 5 labeled data points per class.

DISC: A Method for Dynamic Intelligent Scheduling and Control of Reconfigurable Parallel Architectures

F. J. Weil et al. — Fri, 30 Nov 2018 10:13:20 PST

This work studies the use of intelligence-guided control of reconfigurable parallel processing systems. A reconfigurable architecture is one that can be partitioned into several independent virtual parallel machines operating in either SIMD or MIMD mode. Reconfigurable systems, while allowing great flexibility, present many scheduling and control problems. Scheduling tasks on such a system is an exponential-time problem. Therefore, in an effort to achieve reduced task execution time without incurring unacceptable scheduling costs, an expert system is used to apply heuristics to approximate an optimal schedule. When the execution time of a task is not known a priori, conventional scheduling methods that produce optimal or near-optimal schedules cannot be used effectively. A dynamic controller, however, is not locked into a static schedule and can reconfigure the machine and process subtasks based on the current state of the parallel processing system. The scheduling system attempts to achieve decreased execution time by balancing the overall processing scenario of the task with the needs of the individual routines that make up the task. Repartitioning is done when either the processor’s resources need to be split among the subtasks or the processor’s resources have become fragmented and need to be merged into larger partitions. The scheduler keeps track of which subtasks are potentially executable and chooses the best candidate by considering the relative importance of quickly finishing the subtask and the match between partition data contents and subtask data needs.

Wide and Deep Neural Networks in Remote Sensing: A Review

Okan Ersoy — Fri, 30 Nov 2018 09:10:07 PST

Wide and deep neural networks in multispectral and hyperspectral image classification are discussed. Wide versus deep networks have always been a topic of intense interest. Deep networks have a large number of layers in the depth direction. Wide networks can be defined as networks growing in the vertical direction. Wide and deep networks are then networks that grow in both the vertical and horizontal directions. In this report, several approaches to achieving such networks are described. We first review a methodology called Parallel, Self-Organizing, Hierarchical Neural Networks (PSHNN’s), which have stages growing in the vertical direction, and each stage can be a deep network as well. In turn, each layer of a deep network can be a PSHNN. The second methodology involves making each layer of a deep network wide, and this has been discussed especially with deep residual networks. The third methodology is wide and deep residual neural networks, which grow in both the horizontal and vertical directions and include residual learning principles for improving learning. The fourth methodology is wide and deep neural networks in parallel. Here the wide and deep networks are two parallel branches, with the wide network specializing in memorization and the deep network specializing in generalization. In leading to these methods, we also review various types of PSHNN’s and deep neural networks, including convolutional neural networks, autoencoders, and residual learning. Partially due to the moderate sizes of current multispectral and hyperspectral image sets, the design and implementation of wide and deep neural networks hold the potential to yield the most effective solutions. These conclusions are expected to be valid in other areas with similar data structures as well.

Minimizing Quotient Space Norms Using Penalty Functions

Stefen Hui et al. — Tue, 13 Nov 2018 11:09:07 PST

A penalty function approach is proposed to solve the general problem of quotient space norm minimization. A new class of penalty functions is introduced that allows one to transform constrained quotient space norm minimization problems into unconstrained optimization problems. A sharp bound is given on the weight parameter for which the constrained and unconstrained problems are equivalent. A computationally efficient bound on the weight parameter is also given. Numerical examples and computer simulations illustrate the results obtained.

Generalized 'Probe' Model for Arbitrary Dephasing Mechanisms

Supriyo Datta — Tue, 13 Nov 2018 11:09:00 PST

It has been shown earlier that in the linear response regime, dephasing by point scatterers (within the self-consistent Born approximation) can be visualized in terms of point voltage probes attached to each space and energy coordinate (r,E). In this paper we derive a generalized linear response equation starting from the non-equilibrium Green function formalism that can be used to describe any dephasing process in any approximation. The dephasing is characterized by a ‘reservoir function’ which can be evaluated from the self-energy. The linear response equation can be visualized in terms of voltage probes but with individual probes connected to each pair of spatial coordinates and to each energy (r,r',E). Unlike point scatterers, this generalized ‘probe’ model allows us to introduce phase relaxation without necessarily introducing momentum relaxation. We obtain explicit expressions for the transmission Tij from terminal ‘j’ to terminal ‘i’ by eliminating the ‘floating probes’ inside the device. These expressions for Tij clearly show the role of the exclusion principle in determining the transmission. Proof of reciprocity in multiterminal conductors is provided. We also present a simple illustrative example calculating Tij for a short single-moded electron waveguide with electron-phonon interactions. An important difference between the present formulation and usual linear response theory is that the electrochemical potential difference is treated as the driving force; however, we do not neglect the self-consistent fields that appear in an interacting system when a small bias is applied.

Neural Networks for Constrained Optimization Problems

Walter E. Lillo et al. — Tue, 13 Nov 2018 11:08:51 PST

This paper is concerned with utilizing neural networks and analog circuits to solve constrained optimization problems. A novel neural network architecture is proposed for solving a class of nonlinear programming problems. The proposed neural network, or more precisely a physically realizable approximation, is then used to solve minimum norm problems subject to linear constraints. Minimum norm problems have many applications in various areas, but we focus on their applications to the control of discrete dynamic processes. The applicability of the proposed neural network is demonstrated on numerical examples.
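For readers who want a reference answer to compare a network's output against, the Euclidean special case has a closed form: the minimizer of ||x||_2 subject to Ax = b is x = A^T (A A^T)^{-1} b when A has full row rank. The sketch below implements that formula directly; it is a conventional baseline, not the neural network of the paper, and the choice of the Euclidean norm and the function names are assumptions.

```python
def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def min_norm_solution(A, b):
    """Minimum Euclidean-norm x with A x = b, assuming A has full row rank."""
    m, n = len(A), len(A[0])
    # Gram matrix A A^T (m x m).
    gram = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(m)]
            for i in range(m)]
    lam = solve(gram, b)  # multipliers solving (A A^T) lam = b
    # x = A^T lam
    return [sum(A[i][k] * lam[i] for i in range(m)) for k in range(n)]
```

For the single constraint x1 + x2 + x3 = 3, the formula spreads the load evenly and returns (1, 1, 1).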

Data Structures and Sparsity in a Digital Simulation of an HVDC Link

Mark Timothy Luehmann — Tue, 13 Nov 2018 11:08:45 PST

An earlier algorithm for the detailed simulation of the High Voltage Direct Current (HVDC) link using the tensor approach was modified to improve the runtime speed. The runtime, approximately eleven percent that of the original algorithm, was improved by employing sparsity techniques, restructuring the state matrix, and condensing subroutines. The readability of the code was improved by a modular programming approach. This research was done for a two-terminal simulation, but could be expanded and modified to include multiterminal simulations.

Methodologies for Voltage Contingency Ranking

James Matthew Kulaga — Tue, 13 Nov 2018 11:08:38 PST

Contingency studies in interconnected electric power systems are performed to assess the capability of a system to withstand disturbances caused by equipment outages and other factors, and are relevant to the area of power system security. Due to the large size and complexity of modern power systems, accurate techniques for measuring the effect of line contingencies are often time consuming and impose a heavy computational burden. Thus, ranking methods are used to define a subset of the most severe contingencies to be studied in full detail. Contingencies are classified into voltage contingencies and power contingencies; the term voltage contingency refers to cases of bus voltage magnitude out of range, and the term power contingency refers to the case of line power flow out of range (overloads). The purpose of this research is to examine the singular value decomposition of a voltage sensitivity matrix derived from the Jacobian matrix of the Newton-Raphson power flow equation and its relationship to voltage contingency ranking. Possible ways of using the singular values in a bus voltage ranking performance index will also be studied. Additionally, methods of assessing the accuracy of contingency ranking performance indices will be considered. Proposed enhancements to these strategies and recommendations for further research will be discussed.

Toward a Robust Minimum Variance Beamformer for Multi-Rank Signal Via Mini-Max Processing Final Technical Report

Michael D. Zoltowski et al. — Tue, 13 Nov 2018 11:08:30 PST

A variation of Minimum Variance Distortionless Response (MVDR) based Matched Field Processing (MFP), referred to as Semi-coherent MVDR MFP, has been developed. Initial simulation results presented here indicate that Semi-coherent MVDR MFP is both relatively robust to mismatch, with respect to relative amplitudes and phases amongst multipath arrivals in an isovelocity ocean, and comparable in performance to Full-coherent MVDR MFP under no-mismatch conditions. Full-coherent MVDR MFP assumes complete and perfect characterization of the underwater propagation channel and is extremely sensitive to mismatch between assumed model parameters and actual environmental parameters. Three main conclusions may be drawn from the simulation results and deduced by analysis as well. (1) Full-coherent MVDR MFP is extremely sensitive to mismatch: a 2 m error in the assumed ocean depth caused a drop of between 15 and 25 dB in the peak of the Full-coherent MVDR MFP ambiguity surface at the true source location relative to that obtained in the no-mismatch case. (2) The performance of the "Incoherent" MVDR MFP scheme proposed by Krolik et al. [1] is substantially degraded relative to that of Full-coherent MVDR MFP in the no-mismatch case: the peak of the "Incoherent" MVDR MFP ambiguity surface with no mismatch was between 10 and 15 dB less than that of the corresponding Full-coherent MVDR MFP ambiguity surface. (3) Semi-coherent MVDR MFP is both relatively robust to mismatch, with respect to error in the assumed ocean depth, and comparable in performance to Full-coherent MVDR MFP under no-mismatch conditions. The three versions of MVDR MFP, Full-coherent, "Incoherent", and Semi-coherent, are described in detail in Sections 2 and 3 of this report.
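As background for the three variants, the textbook MVDR weight vector for a replica (steering) vector a and covariance matrix R is w = R^{-1} a / (a^H R^{-1} a), which minimizes output power subject to the distortionless constraint w^H a = 1. The sketch below implements only that standard formula; the Semi-coherent and "Incoherent" schemes of the report are not reproduced, and the function names and the toy inputs in the usage note are assumptions.

```python
def solve(M, rhs):
    """Solve M z = rhs (complex entries allowed) by Gaussian elimination."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def mvdr_weights(R, a):
    """Standard MVDR weights w = R^{-1} a / (a^H R^{-1} a).

    R: Hermitian positive-definite covariance matrix (list of rows).
    a: replica vector for the hypothesized source location.
    """
    u = solve(R, a)                                           # u = R^{-1} a
    denom = sum(complex(ai).conjugate() * ui for ai, ui in zip(a, u))
    return [ui / denom for ui in u]


def response(w, a):
    """Beamformer response w^H a; equals 1 for the look direction."""
    return sum(complex(wi).conjugate() * ai for wi, ai in zip(w, a))
```

With the toy inputs `R = [[2, 1j], [-1j, 2]]` and `a = [1, 1]`, `response(mvdr_weights(R, a), a)` is 1 up to rounding, which is the distortionless property. Sensitivity to mismatch arises when the assumed `a` differs from the true replica vector.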

Blockwise Transform Image Coding Enhancement and Edge Detection

Sabzali Aghagolzadeh — Tue, 13 Nov 2018 11:08:24 PST

The goal of this thesis is high-quality image coding, enhancement, and edge detection. A unified approach using novel fast transforms is developed to achieve all three objectives. Requirements are low bit rate, low complexity of implementation, and parallel processing. The last requirement is achieved by processing the image in small blocks such that all blocks can be processed simultaneously. This is similar to biological vision. A major issue is to minimize the resulting block effects. This is done by using proper transforms and possibly an overlap-save technique. The bit rate in image coding is minimized by developing new results in optimal adaptive multistage transform coding. Newly developed fast trigonometric transforms are also utilized and compared for transform coding, image enhancement, and edge detection. Both image enhancement and edge detection involve generalized bandpass filtering with fast transforms. The algorithms have been developed with special attention to the properties of biological vision systems.
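Blockwise processing can be illustrated with a familiar stand-in transform. The sketch below applies an orthonormal 2-D DCT-II independently to each non-overlapping block of an image; the DCT is used here only because it is widely known, not because it is one of the thesis's new transforms, and the function names are hypothetical. Image dimensions are assumed to be multiples of the block size.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix C (rows are basis vectors), so C C^T = I."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(r) for r in zip(*A)]


def block_dct(image, n):
    """Apply an n x n 2-D DCT to each non-overlapping block of `image`.

    Every block is transformed independently of the others, which is what
    allows all blocks to be processed simultaneously.
    """
    C = dct_matrix(n)
    Ct = transpose(C)
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(0, h, n):
        for c in range(0, w, n):
            block = [row[c:c + n] for row in image[r:r + n]]
            coeff = matmul(matmul(C, block), Ct)  # separable 2-D transform
            for i in range(n):
                out[r + i][c:c + n] = coeff[i]
    return out
```

A flat block concentrates all of its energy in the single DC coefficient, which is why smooth regions code at a low bit rate; block effects appear because neighboring blocks are quantized without reference to each other.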

Exploiting Fine-Grain Concurrency: Analytical Insights in Superscalar Processor Design

Pradeep K. Dubey et al. — Tue, 13 Nov 2018 11:08:17 PST

This dissertation develops analytical models to provide insight into various design issues associated with superscalar-type processors, i.e., the processors capable of executing multiple instructions per cycle. A survey of the existing machines and literature has been completed with a proposed classification of various approaches for exploiting fine-grain concurrency. Optimization of a single pipeline is discussed based on an analytical model. The model-predicted performance curves are found to be in close proximity to published results using simulation techniques. A model is also developed for comparing different branch strategies for single-pipeline processors in terms of their effectiveness in reducing branch delay. The additional instruction fetch traffic generated by certain branch strategies is also studied and is shown to be a useful criterion for choosing between equally well performing strategies. Next, processors with multiple pipelines are modelled to study the tradeoffs associated with deeper pipelines versus multiple pipelines. The model developed can reveal the cause of performance bottleneck: insufficient resources to exploit discovered parallelism, insufficient instruction stream parallelism, or insufficient scope of concurrency detection. The cost associated with speculative (i.e., beyond basic block) execution is examined via probability distributions that characterize the inherent parallelism in the instruction stream. The throughput prediction of the analytic model is shown, using a variety of benchmarks, to be close to the measured static throughput of the compiler output, under resource and scope constraints. Further experiments provide misprediction delay estimates for these benchmarks under scope constraints, assuming beyond-basic-block, out-of-order execution and run-time scheduling. These results were derived using traces generated by the Multiflow TRACE SCHEDULING™(*) compacting C and FORTRAN 77 compilers. 
A simplified extension of the model to include multiprocessors is also proposed. The extended model is used to analyze combined systems, such as superpipelined multiprocessors and superscalar multiprocessors, both with shared memory. It is shown that the number of pipelines (or processors) at which the maximum throughput is obtained is increasingly sensitive to the ratio of memory access time to network access delay as memory access time increases. Further, as a function of inter-iteration dependency distance, optimum throughput is shown to vary nonlinearly, whereas the corresponding optimum number of processors varies linearly. The predictions from the analytical model agree with published results based on simulations. (*)TRACE SCHEDULING is a trademark of Multiflow Computer, Inc.