Moreover, the level of security must match the specific needs. Not all data is created equal: encrypting all data at the highest possible strength may be overkill for short-lived data that is not highly confidential. Long-lived, persistent, top-secret data needs strong cryptography that could last years or even decades, whereas short-lived data could use weaker but much more efficient forms of encryption without compromising security, because of the ephemeral nature of that data. We propose to build policies that allow CSOs, administrators, and users to set the level of encryption to match their needs. This would provide flexible mechanisms to trade off performance, convenience, and the desired level of security. For example, .c files may be more important to protect than the shorter-lived intermediate .o files built during compilation, so .o files can use weaker encryption and hence be encrypted much faster.
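As a rough illustration of such a policy, the sketch below maps file-name patterns to encryption levels, with the first matching rule winning. It is only a minimal sketch: the level names, the key sizes, the rule table, and the function `level_for` are all hypothetical and chosen for illustration, not part of the proposed design.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class CipherLevel:
    """Hypothetical encryption level; names and key sizes are illustrative only."""
    name: str
    key_bits: int


STRONG = CipherLevel("strong", 256)  # long-lived, highly confidential data
LIGHT = CipherLevel("light", 128)    # short-lived data, cheaper to encrypt

# Ordered rules: the first pattern that matches a file name wins.
POLICY = [
    ("*.c", STRONG),  # source files: protect at full strength
    ("*.h", STRONG),
    ("*.o", LIGHT),   # intermediate build products: favor speed
]

# Files that match no rule fall back to the strongest level.
DEFAULT = STRONG


def level_for(filename, policy=POLICY, default=DEFAULT):
    """Return the encryption level that the policy assigns to filename."""
    for pattern, level in policy:
        if fnmatchcase(filename, pattern):
            return level
    return default


if __name__ == "__main__":
    for name in ("main.c", "main.o", "notes.txt"):
        print(name, "->", level_for(name).name)
```

Defaulting unmatched files to the strongest level is one possible design choice: a site would then relax protection only for data it has explicitly classified as short-lived.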
For securing the data on disk, we would like to experiment with disks that have built-in encryption (IEEE P1619); if these are not available initially, we will use encryption layers inside the OS: at the device-driver level, in the loop driver, or in a file system wrapper.
To secure data end-to-end, we will ensure that the server-side caches keep only ciphertext data, while the client-side caches will be split into two parts: a small cleartext sub-cache for the "hottest" data (to improve efficiency), and a larger ciphertext cache for performance. We plan to leverage the extensibility provided by NFSv4 to provide additional security (NFSv4 already supports RPC-layer security, so we can be assured that packets are secure while in transit). Since we have a split-cache model on the client side, we will leverage compression on the ciphertext cache: if data pages are compressed, then effectively more of them can be cached in the same amount of memory, which can reduce I/O rates. Compression trades CPU cycles (which appear to be plentiful these days) for significant savings in network and disk I/O.
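The split client-side cache can be sketched as two small LRU maps: a cleartext tier for the hottest pages and a larger tier that holds pages only in compressed, encrypted form. This is a minimal sketch under stated assumptions, not the proposed implementation: the class `SplitCache`, its method names, and the tier sizes are hypothetical; `toy_xor` is a stand-in for a real cipher and offers no security; and pages are assumed to be compressed before they are encrypted, since well-encrypted data does not itself compress.

```python
import zlib
from collections import OrderedDict


def toy_xor(data: bytes, key: bytes) -> bytes:
    """Placeholder for a real cipher. NOT secure; XOR is its own inverse."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class SplitCache:
    """Hypothetical two-tier page cache: small cleartext tier, larger sealed tier."""

    def __init__(self, key: bytes, hot_pages: int = 2, cold_pages: int = 8):
        self.key = key
        self.hot = OrderedDict()   # page id -> cleartext page (LRU order)
        self.cold = OrderedDict()  # page id -> encrypted, compressed page
        self.hot_pages = hot_pages
        self.cold_pages = cold_pages

    def _seal(self, cleartext: bytes) -> bytes:
        # Assumption: compress first, then encrypt.
        return toy_xor(zlib.compress(cleartext), self.key)

    def _unseal(self, sealed: bytes) -> bytes:
        return zlib.decompress(toy_xor(sealed, self.key))

    def _promote(self, page_id, cleartext: bytes) -> None:
        self.hot[page_id] = cleartext
        self.hot.move_to_end(page_id)
        while len(self.hot) > self.hot_pages:
            # The evicted page still exists in the sealed tier.
            self.hot.popitem(last=False)

    def put(self, page_id, cleartext: bytes) -> None:
        self.cold[page_id] = self._seal(cleartext)
        self.cold.move_to_end(page_id)
        while len(self.cold) > self.cold_pages:
            self.cold.popitem(last=False)
        self._promote(page_id, cleartext)

    def get(self, page_id):
        if page_id in self.hot:            # fast path: no decryption needed
            self.hot.move_to_end(page_id)
            return self.hot[page_id]
        sealed = self.cold.get(page_id)
        if sealed is None:
            return None                    # miss: caller would go to the server
        self.cold.move_to_end(page_id)
        cleartext = self._unseal(sealed)
        self._promote(page_id, cleartext)
        return cleartext
```

A hit in the cleartext tier costs no cryptographic work, while a hit in the sealed tier costs one decryption and one decompression but still avoids a network round trip; how large each tier should be is exactly the kind of trade-off the evaluation would need to measure.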
We will investigate key management issues by leveraging existing techniques and technologies. Keys could be stored on servers or on clients, depending on the site's needs. In the simple case, the client will hold all the keys securely. But if the server has any secure hardware (à la the IBM 4758 or crypto-disks), then keys may reside on the server instead. Keys may have to be exchanged when files are shared: if a server keeps the keys, then we will investigate using NFSv4's delegation model to provide the keys to other clients as needed; alternatively, the client can send those keys to other trusted clients directly via the NFSv4 callback channel (the CB_* operations).
Furthermore, we will evaluate the overheads associated with this added security, and create optimizations that will allow users to comfortably utilize the added security while keeping overheads to a minimum. We will also keep usability in mind: ease of use for administrators and end users alike. These efforts are important since users generally will not use a product that offers increased security if performance or usability suffers too much.
For this project, we will use the latest Linux distributions and kernels, and utilize the kernel's own CryptoAPI and LSM. (We have always worked hard to minimize changes to existing kernels.) We expect the project to proceed in three one-year phases: