FSL Sponsor Abstract: IBM Faculty Award

We propose to enable the operating system to use encrypted data as often as possible, thereby shrinking the window of vulnerability for an attacker to extract information from the machine. Although using encrypted storage on a persistent store may help secure data, the data is still vulnerable while it is in cleartext form in memory. Using network storage further increases this risk because we have to worry about network security, as well as the security of clients and servers. In this setting, we plan on creating end-to-end security for the data. We can ensure privacy at all points along the data path: the disks on the server side, in all layers of the server's OS, over the network, and in the layers of the client's OS until the data reaches the user process.

Moreover, the level of security must match the specific needs of the data; not all data is created equal. Encrypting all data at the highest possible strength may be overkill for short-lived data that is not highly confidential. Long-lived, persistent, top-secret data needs strong cryptography that could last years or even decades, whereas short-lived data could use weaker but much more efficient forms of encryption without compromising security, because that data is ephemeral. We propose to build policies that allow CSOs, administrators, and users to set the level of encryption to match their needs. This would provide flexible mechanisms to trade off performance, convenience, and the desired level of security. For example, .c files may be more important to protect than shorter-lived intermediate .o files built during a compile process, and so .o files can use weaker encryption and hence be encrypted much faster.
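
As a rough illustration of such a policy, the sketch below maps file-name patterns to an encryption level, with the first matching rule winning. The level names ("light", "strong"), the rule table, and the function name are hypothetical and chosen only for this example; a real policy would likely also consider ownership, lifetime, and site-wide settings.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    pattern: str  # glob pattern matched against the file name
    level: str    # hypothetical encryption level name


# Hypothetical policy table: rules are checked in order, first match wins.
POLICY = [
    Rule("*.o", "light"),   # short-lived build intermediates
    Rule("*.c", "strong"),  # source files
    Rule("*.h", "strong"),
]

# Assumption: anything not covered by a rule gets the strongest level.
DEFAULT_LEVEL = "strong"


def encryption_level(filename, policy=POLICY, default=DEFAULT_LEVEL):
    """Return the encryption level that the policy assigns to filename."""
    for rule in policy:
        if fnmatchcase(filename, rule.pattern):
            return rule.level
    return default
```

Defaulting to the strongest level is a deliberate choice in this sketch: a file the policy does not mention is then never protected less than intended.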

For securing the data on disk, we would like to experiment with disks that have built-in encryption (IEEE P1619); if these are not available initially, we will use encryption layers inside the OS: at the device-driver level, in the loop driver, or in a file system wrapper.

To secure data end-to-end, we will ensure that server-side caches keep only ciphertext, while client-side caches will be split into two parts: a small cleartext sub-cache for the "hottest" data (to avoid repeated decryption), and a larger ciphertext cache (to avoid refetching data from the server). We plan to leverage the extensibility of NFSv4 to provide additional security (NFSv4 already supports RPC-layer security, so packets can be protected while in transit). Because the client uses a split-cache model, we will compress the ciphertext cache so that more pages fit in memory, which can reduce I/O rates. Compression trades CPU cycles (which appear to be plentiful these days) for significant savings in network and disk I/O.
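
The split-cache idea can be sketched in user space as two LRU maps: a small one holding cleartext pages and a larger one holding compressed, encrypted pages. This is only an illustration under several assumptions: the class and method names are hypothetical, pages are compressed before they are encrypted (ciphertext itself generally does not compress), and toy_cipher is an insecure placeholder standing in for a real cipher such as one provided by the kernel's CryptoAPI.

```python
import hashlib
import zlib
from collections import OrderedDict


def toy_cipher(key, page_id, data):
    """PLACEHOLDER, NOT SECURE: XOR data with a SHA-256-derived keystream.

    Applying it twice with the same key and page_id restores the input.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        seed = key + page_id.to_bytes(8, "big") + counter.to_bytes(8, "big")
        stream.extend(hashlib.sha256(seed).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class SplitCache:
    """Small cleartext LRU in front of a larger compressed-ciphertext LRU."""

    def __init__(self, key, hot_pages, cold_pages):
        self.key = key
        self.hot_pages = hot_pages
        self.cold_pages = cold_pages
        self.hot = OrderedDict()   # page_id -> cleartext bytes
        self.cold = OrderedDict()  # page_id -> compressed, then encrypted bytes

    def _seal(self, page_id, clear):
        return toy_cipher(self.key, page_id, zlib.compress(clear))

    def _open(self, page_id, sealed):
        return zlib.decompress(toy_cipher(self.key, page_id, sealed))

    def put(self, page_id, clear):
        self.cold.pop(page_id, None)  # drop any stale sealed copy
        self.hot[page_id] = clear
        self.hot.move_to_end(page_id)
        # Demote the least recently used cleartext pages to the sealed cache.
        while len(self.hot) > self.hot_pages:
            old_id, old_clear = self.hot.popitem(last=False)
            self.cold[old_id] = self._seal(old_id, old_clear)
            while len(self.cold) > self.cold_pages:
                self.cold.popitem(last=False)  # evict the oldest sealed page

    def get(self, page_id):
        if page_id in self.hot:
            self.hot.move_to_end(page_id)
            return self.hot[page_id]
        sealed = self.cold.pop(page_id, None)
        if sealed is None:
            return None  # miss: a real client would fetch from the server
        clear = self._open(page_id, sealed)
        self.put(page_id, clear)  # promote back into the cleartext cache
        return clear
```

For example, with hot_pages=1, putting page 1 and then page 2 leaves page 2 in cleartext and page 1 sealed; a later get(1) unseals page 1, promotes it, and demotes page 2 in turn.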

We will investigate key management issues by leveraging existing techniques and technologies. Keys could be stored on servers or on clients, depending on the site's needs. In the simple case, the client will hold all the keys securely. But if the server has secure hardware (such as the IBM 4758 or crypto-disks), then keys may reside on the server instead. Keys may have to be exchanged when files are shared: if a server keeps the keys, we will investigate NFSv4's delegation model to provide the keys to other clients as needed; alternatively, a client can send those keys to other trusted clients directly via the NFSv4 callback channel (the CB_* operations).

Furthermore, we will evaluate the overheads associated with this added security, and create optimizations that will allow users to comfortably use the added security while keeping overheads to a minimum. We will also keep usability in mind: ease of use for administrators and end-users alike. These efforts are important because users generally will not use a product that offers increased security if performance or usability suffers too much.

For this project, we will use the latest Linux distributions and kernels, and utilize the kernel's own CryptoAPI and LSM. (We have always worked hard to minimize changes to existing kernels.) We expect the project to proceed in three one-year phases:

  1. Develop client-side policies to determine the level of encryption on a per-file basis. Split the client's page cache into ciphertext and cleartext components and revise the page-flushing algorithms accordingly. Transmit ciphertext to NFSv4 servers and store it directly on the servers. Support keys stored on a single server or a single client.
  2. Extend the NFSv4 protocol to understand new file attributes "compressed" and "encrypted", and enhance ext2/3 on the server side to use the corresponding inode bits. Develop a unified client-side split cache that can mix compressed and/or encrypted pages, which effectively increases the size of the cache. Compressed data will be sent over the wire and stored in compressed form on the server (using an efficient indexing scheme to support random reads and writes inside compressed files). Adding compression should significantly reduce network and disk I/O, improving overall performance and scalability.
  3. Support client-to-client key exchanges via NFSv4 callbacks. Having built two examples (encryption and compression), we will have enough experience to generalize these security extensions and create a uniform API for them. To demonstrate the practicality of the extensions API, we will port the encryption and compression components to it, and we will also develop a new integrity-checking module.
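
One possible shape for the indexing scheme mentioned in phase 2 is to compress a file in fixed-size logical chunks and record where each compressed chunk lands, so that a random read decompresses only the chunks it overlaps. The sketch below is an assumption-laden illustration, not the proposed on-disk format: the 4096-byte chunk size, the in-memory index layout, and the function names are all hypothetical.

```python
import zlib

CHUNK = 4096  # assumed logical chunk size in bytes


def compress_chunks(data, chunk=CHUNK):
    """Compress data chunk by chunk.

    Returns (blob, index), where index[i] is the (offset, length) of
    compressed chunk i inside blob.
    """
    blob = bytearray()
    index = []
    for start in range(0, len(data), chunk):
        comp = zlib.compress(data[start:start + chunk])
        index.append((len(blob), len(comp)))
        blob.extend(comp)
    return bytes(blob), index


def read_range(blob, index, offset, length, chunk=CHUNK):
    """Read length logical bytes at offset, touching only overlapping chunks."""
    if length <= 0:
        return b""
    first = offset // chunk
    last = (offset + length - 1) // chunk
    out = bytearray()
    # Clamp to the last existing chunk so reads past end-of-file are truncated.
    for i in range(first, min(last, len(index) - 1) + 1):
        pos, size = index[i]
        out.extend(zlib.decompress(blob[pos:pos + size]))
    skip = offset - first * chunk
    return bytes(out[skip:skip + length])
```

A random write could follow the same pattern by recompressing only the affected chunks and updating their index entries, though a changed compressed length then forces the scheme to relocate or pad chunks; that part is not shown here.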