Lightweight Programming Abstractions for Increased Safety and Performance
Abstract
High-level programming languages free programmers from certain implementation details, which compilers or interpreters handle behind the scenes. However, this often comes at a cost: with compilers, the inability to communicate the programmer’s intentions precisely compromises safety; with interpreters, the extra work done at runtime costs performance. Modern languages address this problem in different ways, though rarely without serious limitations. In this dissertation, we develop lightweight programming abstractions whose implementation is practical in multi-paradigm high-level languages such as Scala and C++. The main idea of this work is to exploit the type system to guide both code generation (for performance) and type checking (for safety), so that more efficient specialized code is produced or more compiler errors are raised, respectively. This is done by encoding properties of the data as well as the data layout, and by employing metaprogramming techniques such as staging and template instantiation. We make five main scientific contributions. First, we formalize second-class values with stack-bounded lifetimes as an extension of the simply-typed λ-calculus, and generalize this formalization to polymorphic type systems such as F, and to calculi with path-dependent types described in the Dependent Object Types (DOT) family; we further generalize the binary first- vs. second-class distinction to an arbitrary type lattice (or, more generally, a privilege lattice), then show that abstract type members naturally enable privilege parametricity. Second, we propose a model of checked exceptions based on second-class values, which, unlike monads, does not suffer from the well-established shortcoming of requiring users to rewrite their code in monadic style throughout.
Third, we develop a memory model with data views, which decouple the presentation/interface of a data structure from its layout/storage, and offer not only performance gains through code specialization but also increased safety due to finer-grained control of references to the underlying storage (similar to ownership type systems). Fourth, we design lexically scoped borrowed references with Rust’s semantics, including no mutable aliasing, but in a flow-insensitive setting using second-class values. Fifth, we empirically show within a realistic subset of Scala (MiniScala) that the performance gains enabled by stack allocation in place of heap allocation, which may be significant according to previous studies, can be guaranteed via second-class values; in fact, the usage of the more expensive heap is reduced to O(1) in the majority of the benchmarks ported from Scala Native and the Computer Language Benchmarks Game. Finally, all of these findings are backed by artifacts: an extension of the Scala language with type-checking rules for second-class values and multiple case studies; data views as a library-based framework in C++/Scala, along with an evaluation pipeline involving microbenchmarks; an implementation of Rust-like borrowed references as a Scala library; and a modified MiniScala type checker and memory allocation scheme, together with accordingly ported and annotated benchmarks.
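As a rough illustration of the data-view idea (one interface over interchangeable storage layouts, with specialized code produced by template instantiation), the following C++ sketch may help. It is not taken from the dissertation's framework: the names `AoSStorage`, `SoAStorage`, `PointView`, and `sum_x` are hypothetical and chosen only for this example, and the actual library may be organized quite differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example types; none of these names come from the dissertation.
struct Point { double x; double y; };

// Layout 1: array of structs (each point stored contiguously).
struct AoSStorage {
    std::vector<Point> data;
};

// Layout 2: struct of arrays (each coordinate stored in its own array).
struct SoAStorage {
    std::vector<double> xs;
    std::vector<double> ys;
};

// The view presents one interface; the template argument selects the layout.
template <typename Storage> class PointView;

template <> class PointView<AoSStorage> {
    const AoSStorage& s_;  // read-only reference to the underlying storage
public:
    explicit PointView(const AoSStorage& s) : s_(s) {}
    std::size_t size() const { return s_.data.size(); }
    double x(std::size_t i) const { return s_.data[i].x; }
    double y(std::size_t i) const { return s_.data[i].y; }
};

template <> class PointView<SoAStorage> {
    const SoAStorage& s_;  // read-only reference to the underlying storage
public:
    explicit PointView(const SoAStorage& s) : s_(s) {}
    std::size_t size() const { return s_.xs.size(); }
    double x(std::size_t i) const { return s_.xs[i]; }
    double y(std::size_t i) const { return s_.ys[i]; }
};

// Written once against the view interface; the compiler instantiates
// a separate, layout-specific version for each storage type.
template <typename Storage>
double sum_x(const PointView<Storage>& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v.x(i);
    }
    return total;
}
```

Calling `sum_x(PointView<AoSStorage>(aos))` and `sum_x(PointView<SoAStorage>(soa))` on storages holding the same points yields the same sum, while each call compiles to code that accesses its own layout directly. Because the view holds only a `const` reference, client code in this sketch cannot mutate the storage through it, which loosely mirrors the controlled access to underlying storage described above.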
Degree
Ph.D.
Advisors
Rompf, Purdue University.
Subject Area
Computer science