Operating Systems Support for NVM
Byte-addressable non-volatile memory (NVM) changes the way programs can operate on data. We are exploring how to design an operating system centered on non-volatile memory support, from consistency guarantees for applications and the kernel in the face of power failures to programming models that let applications make better use of non-volatile storage. For research focusing on non-volatile or storage class memories themselves, see Storage Class Memories. The work here focuses on operating system design and on support for applications that use non-volatile memory.
We are developing a new operating system called Twizzler. Its design takes inspiration from prior systems such as MULTICS and Opal, including features such as a large single address space with symbolic address references. We are currently developing a prototype in FreeBSD, which lets us evaluate our design without first investing in building a kernel from scratch.
To best use NVM, programs must be able to operate on data directly, without OS involvement and without ephemeral copies. Programs operate in a data-centric way, not a process-centric way. Persistent data is accessed by multiple processes and shared between them, so data structures must be accessible invariantly, from anywhere in any virtual address space. We cannot afford to explicitly serialize data, both because serialization imposes an unacceptable performance overhead and because these explicit operations obscure program logic and prevent simple, high-performance applications from computing directly on persistent memory, something that was not possible before. The separation between the persistent and volatile memory domains is no longer reasonable, especially as device latency drops below OS operation latency.
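The idea of data that can be accessed invariantly from any address space can be illustrated with a small sketch. It is not Twizzler code: the node layout, the header format, and every function name below are assumptions made for illustration. A linked list is stored in a file-backed memory region using offsets instead of addresses, so a second, independent mapping of the same region can walk the list in place without a separate serialized format. The `struct` calls stand in for the plain loads and stores a systems language would issue.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout (an assumption, not Twizzler's format):
# header = (offset of list head, next free offset); node = (value, next offset).
# Offsets are relative to the start of the region; 0 means "no node".
HEADER = struct.Struct("<QQ")
NODE = struct.Struct("<qQ")


def init_region(buf):
    """Format an empty region: no head, first free byte right after the header."""
    HEADER.pack_into(buf, 0, 0, HEADER.size)


def push_front(buf, value):
    """Allocate a node inside the region and link it in by offset."""
    head, free = HEADER.unpack_from(buf, 0)
    if free + NODE.size > len(buf):
        raise MemoryError("region full")
    NODE.pack_into(buf, free, value, head)
    HEADER.pack_into(buf, 0, free, free + NODE.size)


def values(buf):
    """Walk the list in place, following offsets rather than addresses."""
    out = []
    off, _ = HEADER.unpack_from(buf, 0)
    while off != 0:
        value, off = NODE.unpack_from(buf, off)
        out.append(value)
    return out


def demo():
    """Write through one mapping, then read through a fresh one."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, 4096)
        # The first mapping plays the role of the writing program.
        with mmap.mmap(fd, 4096) as region:
            init_region(region)
            for v in (1, 2, 3):
                push_front(region, v)
            region.flush()
        # A second mapping plays the role of another process.
        with mmap.mmap(fd, 4096) as region:
            return values(region)
    finally:
        os.close(fd)
        os.remove(path)
```

Because every link is an offset, the structure stays valid wherever the region happens to be mapped. A list built from raw virtual addresses would break as soon as a second process mapped the region at a different address.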
Operating System Support
We are building Twizzler to support this memory-access-centric model. Twizzler provides programs with cross-object pointers that can access data in other objects. As in MULTICS, a pointer refers not to an ephemeral virtual address but to a piece of data in an object. This requires OS support, so that application developers do not each have to implement this vital feature in an application-dependent manner.
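This page does not specify how cross-object pointers are encoded, so the following is only a sketch under stated assumptions. It assumes each object carries a table naming the other objects it refers to, and that a pointer is a pair of a table slot and an offset. `Obj`, `Ptr`, `make_ptr`, `resolve`, and `load_byte` are hypothetical names, and the dictionary `store` stands in for whatever the OS uses to locate objects by ID.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Hypothetical persistent object: an ID, raw bytes, and a table of
    the IDs of other objects that its pointers may refer to."""
    oid: str
    data: bytearray
    foreign: list = field(default_factory=list)


@dataclass(frozen=True)
class Ptr:
    """Assumed encoding: slot 0 means "this object";
    slot n > 0 means "the object named by foreign[n - 1]"."""
    slot: int
    offset: int


def make_ptr(src, target, offset):
    """Build a pointer, to be stored in src, that refers to offset in target."""
    if target.oid == src.oid:
        return Ptr(0, offset)
    if target.oid not in src.foreign:
        src.foreign.append(target.oid)
    return Ptr(src.foreign.index(target.oid) + 1, offset)


def resolve(store, src, ptr):
    """Translate a pointer found in src into an (object, offset) pair."""
    target = src if ptr.slot == 0 else store[src.foreign[ptr.slot - 1]]
    if not 0 <= ptr.offset < len(target.data):
        raise IndexError("pointer outside target object")
    return target, ptr.offset


def load_byte(store, src, ptr):
    """Read the byte a pointer refers to."""
    target, off = resolve(store, src, ptr)
    return target.data[off]
```

The pointer names data in an object rather than a virtual address, so it means the same thing to every program that opens the source object. In this sketch the lookup through `store` is the part that would need OS support, since applications would otherwise each invent their own incompatible scheme.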
The operating system is built around memory access. This leads to two ideas: kernel state should itself be memory, and security should be enforced by the MMU. By building the kernel around the idea that memory access is the primary operation of programs, we can build a simpler, more secure, and higher-performance operating system that supports composable programs that are themselves high performance. The direct-access nature of cross-object pointers and the direct support for this primitive allow programs to be written in a way that naturally expresses computation, rather than hiding it behind a serialization layer.
We are building a Twizzler prototype by augmenting FreeBSD with the necessary kernel support. We have a working prototype userspace with example applications, and have run microbenchmarks of our pointer operation overhead. We are currently finishing the implementation of the security model, and plan to run experiments on it soon.
A New Kernel
We plan to implement a new kernel to explore the gains in simplicity and security our model allows. We expect the new kernel to be vastly simpler than a UNIX kernel, and we plan to evaluate the gains in performance that may be possible.
We plan to explore the security implications of a memory-access-centric model, including improved implementations of security contexts, capability-based systems, and ways to restrict access control enforcement to the MMU in many cases. We have designed a security model, and plan to implement test programs to explore its use shortly.
By building a POSIX layer atop Twizzler, we can give existing applications an easy way to run on Twizzler with no modification. We plan to build this layer and study its performance impact. Performance may even improve, since I/O can be translated to memory accesses in many cases.
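As a rough illustration of how file-style I/O could reduce to memory accesses, the sketch below wraps an in-memory object in a handle with POSIX-like `read`, `write`, and `seek` operations. `MemFile` is a hypothetical name and this is not the planned POSIX layer. It only shows that each call becomes a bounded copy to or from the object's bytes, with no device I/O path involved.

```python
class MemFile:
    """Hypothetical POSIX-style file handle backed by an in-memory object."""

    def __init__(self, obj):
        self._view = memoryview(obj)  # direct window onto the object's bytes
        self._pos = 0

    def read(self, n):
        """Copy up to n bytes out of the object, like read(2)."""
        chunk = bytes(self._view[self._pos:self._pos + n])
        self._pos += len(chunk)
        return chunk

    def write(self, data):
        """Store bytes directly into the object, like write(2)."""
        end = self._pos + len(data)
        if end > len(self._view):
            raise OSError("write past end of object")
        self._view[self._pos:end] = data
        self._pos = end
        return len(data)

    def seek(self, pos):
        """Set the absolute position, like lseek(2) with SEEK_SET."""
        if not 0 <= pos <= len(self._view):
            raise ValueError("position outside object")
        self._pos = pos
        return pos
```

An unmodified application would keep calling `read` and `write`, while the layer underneath turns each call into a memory copy.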
We have a number of publications lined up, and are currently working on publishing an initial paper describing the overall system design.