{{Header}}
{{title|title= Confidential Computing }}
{{#seo:
|description=Confidential computing is an advanced security technology that protects data while it's in use, complementing existing protections for data at rest and in transit. The goal is to isolate sensitive data from unauthorized access, even from the cloud provider or system administrators.
|image=Confidential_computing.jpeg
}}
{{boot_firmware}}
[[File:Confidential_computing.jpeg|thumb]]
{{intro|
Confidential computing is an advanced security technology that protects data while it's in use, complementing existing protections for data at rest and in transit. The goal is to isolate sensitive data from unauthorized access, even from the cloud provider or system administrators.
}}
= Introduction =
Confidential computing is an advanced security technology that protects data while it's in use, complementing existing protections for data at rest and in transit. The goal is to isolate sensitive data from unauthorized access, even from the cloud provider or system administrators.

One use case of confidential computing is virtual machines that can be trusted to remain uncompromised even if the cloud provider is malicious.

Confidential computing is difficult because, in most current operating systems, a malicious host machine is usually able to control the entire environment, including all of its VMs.

Confidential computing must defend against two primary threats: compromise of the host operating system/hypervisor, and compromise of the host hardware. Three separate methods must be used in combination to defend against these threats:
* Encryption of virtual machine RAM and both physical and virtual machine disk. This is required to prevent data compromise in the event of an attacker gaining physical access to the hardware.
* Isolation of host and guest kernels.
A properly configured host must '''NOT''' be able to freely inspect virtual machine RAM or inject malicious commands into virtual machines.
* Remote attestation. A user must be able to verify that the host is properly configured without having to trust the host.

The following related projects provide useful features towards confidential computing:

= Cloud Server Permission Goals =
The cloud server administrator should have the capability:
* to commission access for new customers
* to remove access for customers (such as on non-payment)

The cloud server administrator should '''NOT''' have the capability to:
* read hard drive or RAM contents of the customer.
* observe or manipulate software execution of the customer.

= Threat Model =
* 1) A random cloud employee performing cold boot attacks against random machines -> RamCrypt's 97% coverage is still better than nothing.
* 2) A sophisticated, direct attack against a specific server -> RamCrypt's 97% coverage is insufficient.
* An intentional backdoor in Intel/AMD CPUs is not considered, because in that case the CPU should not be used at all.
** With this in mind, Intel TME-MK might be more worthwhile than RamCrypt, unless we assume that Intel TME-MK might be buggy. But then RamCrypt is buggy for sure due to the 3% unencrypted issue.

= Implementation Goals =
* If at all possible, the host OS should be remotely attestable. When the host OS can be freely controlled by an attacker, advanced attacks such as MemBuster can be used to carry out side-channel attacks against transparently encrypted memory. (https://www.usenix.org/conference/usenixsecurity20/presentation/lee-dayeol)
* If at all possible, the hardware state of the host machine should be remotely attestable. Otherwise a malicious PCIe or other hardware device could be used to attack the system in various ways.
It may be difficult to tell for sure if a device is what it says it is, but if that hurdle can be overcome, something such as System Transparency could be used to compare the system's hardware state to a known-good state before permitting the host OS to boot.
* Individual guest operating systems must be remotely attestable.
* The RAM of virtual machines should be transparently encrypted by the host's kernel, using a key that is not saved to RAM.
** If this is impossible, enabling the secure encryption of the data of individual processes may be an acceptable workaround.
* The disk contents of virtual machines must be transparently encrypted.
* Ideally, everything should be software-based, without relying on hardware-based RAM encryption and/or a TPM, if that is at all possible. TODO: research

= Components =
== Full Disk Encryption ==
* [[Full Disk Encryption]]
* todo

== Full Disk Encryption Key Entry ==
* todo
* Options:
** A) manual password entry
** B) [[Full_Disk_Encryption#TPM_Transparent_Encryption|TPM Transparent Encryption]]
** C) IP-KVM such as [https://pikvm.org/ PiKVM - Open and inexpensive IP-KVM on Raspberry Pi]

== Full Disk Encryption Key Protection from Cold Boot Attacks ==
=== TRESOR ===
* https://www.cs1.tf.fau.de/research/system-security-group/tresor-trevisor-armored/
* Provides full disk encryption capabilities using an encryption key that is stored only within CPU registers. Uses x86-64 debug registers.
* TRESOR is actually more performant than standard AES encryption.
* When used on the host machine, it makes it impossible for a cold-boot attack to compromise disk encryption keys unless the attacker can compromise the CPU's memory itself.
* Similar advantages are conferred on guest VMs if their CPU registers are encrypted when saved to RAM.
* Not yet present in the Linux kernel.

=== Loop-Amnesia ===
* https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=e5f940decaa589f3b2030429f48739281839e4d8
* Similar to TRESOR, provides full disk encryption.
* Only saves disk encryption keys in RAM in encrypted form. A "master key" is used to decrypt the encryption keys in CPU registers and never allows them to touch RAM unencrypted. Uses performance counter MSR registers.
* Does not use CPU-accelerated AES instructions.
* Slows down disk I/O performance by a factor of 2.
* Apparently will destroy any filesystems it protects if the system is put into suspend and then resumed. See warnings at https://patricksimmons.us/amnesia.html. TRESOR avoids this by re-prompting for an FDE passphrase on resume.
* Not yet present in the Linux kernel.
* Mentions non-maskable interrupts (NMIs) as a potential attack vector to get disk decryption state flushed to RAM via a carefully timed interrupt injection. This is a real danger in a cloud provider scenario. The paper mentions that the NMI handler for Linux could be modified to wipe all general purpose register contents from RAM after being invoked in order to reduce the likelihood of this causing a problem. This wouldn't entirely mitigate the attack, however, and a DDR interposer could be used to render the subsequent wiping of RAM futile. This is probably an effective attack against TRESOR too.

== RAM Encryption ==
=== Intel ===
==== Intel TME-MK ====
Total Memory Encryption - Multi-Key (TME-MK)

{{quotation
|quote=Intel has introduced Intel® Total Memory Encryption - Multi-Key (Intel® TME-MK), a new memory encryption technology, on 3rd generation Intel® Xeon® Scalable processors (formerly code named Ice Lake).
|context=https://www.intel.com/content/www/us/en/developer/articles/news/runtime-encryption-of-memory-with-intel-tme-mk.html
}}

* Provides granular, user-controllable full memory encryption. Multiple keys can be used at once, allowing a hypervisor to use one key for the host and different keys for each guest. The performance impact of using TME-MK is negligible (less than 2%).
** Allows the use of tenant-provided encryption keys, which theoretically should prevent an attacker with access to Intel from compromising the keys.
** Does not attempt to enforce guest/hypervisor isolation; the hypervisor must be trusted.
** Only supports 128-bit AES encryption. Does not support 256-bit AES encryption.
* GNOME shows status of RAM encryption: https://www.reddit.com/r/gnome/comments/12lc5l4/how_do_i_fix_the_encrypted_ram_status_in_device/

Check whether TME is available and enabled:

{{CodeSelect|code=
grep -i tme /proc/cpuinfo
}}

{{CodeSelect|code=
sudo dmesg {{!}} grep -i x86/tme
}}

Example output in dmesg might be: https://forums.anandtech.com/threads/what-are-the-requirements-for-intel%C2%AE-total-memory-encryption-on-13900k.2611844/
 x86/tme: enabled by BIOS
 x86/tme: Unknown policy is active: 0x2
 x86/mktme: No known encryption algorithm is supported: 0x4
 x86/mktme: enabled by BIOS
 x86/mktme: 15 KeyIDs available

{{CodeSelect|code=
sudo fwupdmgr security
}}

* availability:
** according to https://www.notebookcheck.net/Intel-Tiger-Lake-H-Alder-Lake-P-and-Alder-Lake-S-detailed-Alder-Lake-to-offer-up-to-8C-16T-configs-with-Xe-LP-DDR5-4400-RAM-Wi-Fi-6E-and-PCIe-Gen5-support.496197.0.html the following CPUs come with TME-MK
*** Tiger Lake-H
*** Alder Lake-P
*** Alder Lake-S BGA

{{quotation
|quote=Intel® Hardware Shield on the Intel vPro® platform as they pertain to helping to protect system memory. It covers both software and hardware security capabilities. Specifically, this document provides in-depth information Intel® TME.
|context=https://www.intel.com/content/www/us/en/architecture-and-technology/vpro/hardware-shield/total-memory-encrpytion.html
}}

Intel TME-MK might only be available in combination with Intel vPro. However, Intel vPro often (or always?) comes packaged with Intel ME and Intel AMT, which many users would rather not have inside their CPU. See [[Out-of-band Management Technology]].

* https://github.com/Dasharo/dasharo-issues/issues/463
* https://github.com/Dasharo/dasharo-issues/issues/464
* https://github.com/Dasharo/dasharo-issues/issues/175

Search term:

{{CodeSelect|code=
site:ark.intel.com "Intel® Total Memory Encryption – Multi Key"
}}

==== Intel TDX ====
* https://cdrdv2.intel.com/v1/dl/getContent/690419
** Provides a full suite of confidential computing technologies in one solution, including memory encryption, host/guest isolation, remote attestation, and both physical and software based attack mitigations.
** The TDX module is FOSS under the MIT license and can be built, but only builds signed by Intel can be loaded by a TDX-supporting CPU, thus it is impossible to modify the TDX module and run it yourself.
One can only verify that the binaries shipped by Intel correspond to their source code.
** Confidential VMs running under TDX are known as Trust Domains (abbreviated TDs).
** Considers the VMM emulator (such as QEMU) untrusted, thus device drivers in TDs are an attack surface.
** Only allows remote attestation of the VMs, not the host machine.
** Only supports 128-bit AES encryption for memory.
** Relies on encryption keys generated by the CPU for most memory encryption tasks and for remote attestation. Thus, potentially compromisable by Intel.

==== Intel SGX ====
* TODO: Find the whitepaper and link it here. Intel no longer provides the SGX product brief, and Archive.org was down when last checked.
** Allows running limited security-sensitive code in an encrypted, remotely attestable enclave.
** Code running in enclaves is very limited in capabilities, though this can be circumvented to some degree with the help of Gramine.
** Cannot be used to encrypt an entire VM.
** Relies on encryption keys controlled by Intel, thus potentially vulnerable to an attacker with access to Intel.

=== AMD ===
==== AMD SEV technologies ====
* https://www.amd.com/content/dam/amd/en/documents/epyc-business-docs/white-papers/memory-encryption-white-paper.pdf
* https://www.amd.com/content/dam/amd/en/documents/epyc-business-docs/white-papers/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf
** Encrypts VM memory to make VMs immune to memory-based attacks.
*** Attempts to provide host/guest isolation; however, this isolation is not bulletproof as the hypervisor's device emulation is free to provide malicious data to guest device drivers (same problem mentioned in https://x86.lol/generic/2023/06/28/intel-tdx-2.html, "With TDX, the attack surface includes all device drivers... All devices are fair game from the attacker’s perspective. The malicious VMM can craft problematic responses from any device, such as the PCI Configuration Space or VirtIO.").
Thus, it remains critical that the hypervisor be trusted.
*** Only supports 128-bit AES encryption.
*** Makes use of keys generated by AMD SEV firmware and does not appear to allow the hypervisor to specify the key to use. Thus, potentially compromisable by AMD.

==== AMD SME ====
* https://www.amd.com/content/dam/amd/en/documents/epyc-business-docs/white-papers/memory-encryption-white-paper.pdf
** Somewhat similar to Intel TME-MK, but does not allow the use of user-provided encryption keys and appears to only use one key at a time.
*** Only supports 128-bit AES encryption.
*** Makes use of keys generated by AMD firmware and does not appear to allow the hypervisor to specify the key to use. Thus, can potentially be compromised by AMD.
* Troublesome: [https://www.phoronix.com/news/Linux-SME-No-Default-Use Linux To No Longer Enable AMD SME Usage By Default Due To Problems With Some Hardware]
** Better to use [[#AMD TSME]].

==== AMD SMEE ====
* TODO: https://blog.cloudflare.com/securing-memory-at-epyc-scale/

==== AMD TSME ====
{{quotation
|quote=Transparent SME (TSME) as the name implies is a stricter subset of SME that requires no software intervention. Under TSME, all memory pages are encrypted regardless of the C-bit value. TSME is designed for legacy OS and hypervisor software that cannot be modified. Note that when TSME is enabled, standard SME as well as SEV are still available. TSME and SME share a memory encryption key.
|context=https://en.wikichip.org/wiki/x86/sme#Transparent_SME
}}

* https://www.amd.com/content/dam/amd/en/documents/epyc-business-docs/white-papers/memory-encryption-white-paper.pdf
** Encrypts a machine's RAM contents transparently. It generates an encryption key at boot time known only to the CPU, then uses that key to encrypt anything written to RAM and decrypt anything read from RAM.
*** Available on some consumer systems (where it is branded as "Memory Guard").
*** Can be effective at stopping cold-boot attacks, thus it is recommended to enable TSME or Memory Guard if your hardware supports it.
*** TSME is not visible to the OS or applications running on the hardware, and thus does not provide any additional isolation between applications or system components.
*** Makes use of keys generated by AMD firmware and does not allow the user to specify the key to use. Thus, can potentially be compromised by AMD.

How to check if TSME is enabled?

{{CodeSelect|code=
sudo dmesg {{!}} grep SME
}}

Expected output:
 AMD Secure Memory Encryption (SME) active

* Check BIOS -> security -> TSME
* https://github.com/AMDESE/mem-encryption-tests
* https://unix.stackexchange.com/questions/627453/how-to-know-if-a-amd-cpus-sme-feature-is-enabled

==== AMD Infinity Guard ====
{{quotation
|quote= GD-183A: AMD Infinity Guard features vary by EPYC™ Processor generations and/or series. Infinity Guard security features must be enabled by server OEMs and/or Cloud Service Providers to operate. Check with your OEM or provider to confirm support of these features. Learn more about Infinity Guard at http://www.amd.com/en/products/processors/server/epyc/infinity-guard.html.
|context=footnote 1 on https://www.amd.com/en/products/processors/server/epyc/infinity-guard.html
}}

* Infinity Guard is a brand name for AMD's collection of hypervisor security features, including Secure Encrypted Virtualization technologies.
* AMD Infinity Guard and AMD TSME (Transparent Secure Memory Encryption) are not the same, but [[#AMD TSME]] is a component of AMD Infinity Guard.
* availability:
** Best to confirm the feature status with the server provider before purchasing.
** Some or all AMD EPYC CPUs have the TSME feature. Best to check the availability of the feature before purchasing the CPU or renting the server.
** https://www.hetzner.com/news/new-amd-ryzen-7950-server/
*** AMD EPYC available from root server provider hetzner for 236 EUR / month
**** TODO: Does this encrypt the RAM and protect it from a hypothetical compromise of hetzner?
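The <code>dmesg</code> checks from the Intel TME-MK and AMD TSME sections above can be combined into one small helper. The following is only a sketch: the function name <code>memory_encryption_status</code> and its result labels are made up for illustration, and it matches only the literal kernel messages quoted on this page. Other kernel versions may word these messages differently, so a result of <code>none-detected</code> does not prove that memory encryption is off.

```shell
# memory_encryption_status (hypothetical helper name)
# Reads kernel log text on stdin and prints one label describing which of
# the memory encryption messages quoted on this page was found.
# Assumption: only these exact strings are matched.
memory_encryption_status() (
    log="$(cat)"
    if printf '%s\n' "$log" | grep -Fq 'x86/mktme: enabled by BIOS'; then
        echo "intel-tme-mk"
    elif printf '%s\n' "$log" | grep -Fq 'x86/tme: enabled by BIOS'; then
        echo "intel-tme"
    elif printf '%s\n' "$log" | grep -Fq 'AMD Secure Memory Encryption (SME) active'; then
        echo "amd-sme"
    else
        echo "none-detected"
    fi
)
```

Typical usage would be <code>sudo dmesg | memory_encryption_status</code>. Because TSME is not visible to the OS, the BIOS setting remains the more reliable place to confirm it.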
=== RamCrypt ===
* https://faui1-files.cs.fau.de/filepool/projects/ramcrypt/ramcrypt.pdf
** Provides memory encryption for individual processes running under a Linux kernel. Fully software-based, uses TRESOR for encryption of RAM. Due to being software-based, RamCrypt must store process memory in a decrypted form when that memory is being actively used. Otherwise, the software using that memory would be unable to understand the memory contents. This means that sensitive data is occasionally exposed in cleartext in system memory, though it is later re-encrypted when no longer in active use. RamCrypt therefore ''mitigates'' cold-boot attacks, but does not ''prevent'' them. RamCrypt also comes with a very substantial performance hit (at least 25%, sometimes much more).
*** It may be possible to use no-fill cache mode (a.k.a. cache-as-RAM) to mitigate RamCrypt's shortcomings in this area. It is not yet known if this is possible, and it is very likely that doing so would come with an even greater performance hit. See https://frozencache.blogspot.com/
**** PrivateCore vCage somehow manages to control CPU cache in a manner that doesn't harm performance too badly and does achieve the desired goal, so this may very well be possible. No-fill cache mode on Intel CPUs has an effect on L3 cache, so in theory if one managed the cache in a fully manual manner it could be possible to keep decrypted data present only in CPU cache.
*** Could be made to support 256-bit AES? The default implementation only supports 128-bit AES.
* The "3% problem": 3% of memory remains unencrypted.
** At this level of threat model, where we are considering a cold boot attack by the cloud server admin, 97% encrypted and 3% unencrypted unfortunately still means a 100% loss.
*** The way to break this is by measuring the electricity to the RAM banks. There was a similar attack against the Ledger hardware wallet by researchers presenting at CCC (wallet.fail).
*** Or by using a malicious RAM bank that acts as a hardware proxy and logs everything that goes into it. It is unknown whether that exists yet, but it seems considerably easier to develop than breaking into the CPU's cache with a cold boot attack.
=== PrivateCore vCage ===
* Provides similar security to Intel TME / AMD SME, but appears to use a software implementation rather than hardware CPU features.
* Somehow keeps an entire Linux kernel and KVM hypervisor resident in L3 cache
* Company was acquired by Facebook/Meta in 2014; the website has appeared "dead" ever since.
* Technology was based on Linux and KVM, thus their source code modifications to Linux may be something that can be retrieved if it is possible to contact them?
** No source code appears to have been published
** No contact information was visible on the website
** Facebook acquired PrivateCore and its products have never been made available to the general public.
=== OpenPOWER ===
* TODO: easier to implement RAM encryption with POWER9/11?
* It's a different CPU architecture. POWER9 is different from AMD64 (which includes Intel). Software needs to be available for POWER9. With Debian ("the universal operating system") this mostly isn't an issue as it supports most platforms including POWER9. In edge cases such as with Tor Browser, no POWER9 binaries are available and building from source code is difficult.
* Demo: [https://www.whonix.org/wiki/Dev/Porting#Existing_Ports_of_Whonix Whonix reported to be running on POWER9 (OpenPOWER), Raptor Talos II using distro-morphing and KVM.]
* There are Debian cloud (server) images (qcow2; raw) for "64-bit Little Endian PowerPC", which are probably compatible with POWER9.
* Raptor is selling desktop computers and servers here: https://www.raptorcs.com/TALOSII/
* There are no notebooks based on POWER9 as of October 2024.
* Raptor is selling cloud servers: IntegriCloud - https://www.integricloud.com/
** [[#IntegriCloud_raptorengineering|Some "weirdness" about IntegriCloud.]]
=== RISC-V ===
* Good enough performance to consider using?
== Kernel isolation ==
=== Use Case ===
Kernel isolation is only needed if intending to use VMs.
In the case of "root servers" / dedicated servers with only 1 customer, there is no need for pKVM, Xen or similar.
=== pKVM ===
* https://source.android.com/docs/core/virtualization/architecture
* https://lwn.net/Articles/836693/
** Provides separation of host and guest operating systems on arm64 platforms (specifically Android). Uses ARM virtualization features to install a portion of the KVM hypervisor in a higher-privileged area than the rest of the kernel, thus preventing the kernel from accessing the memory of, or interfering with the execution of, VMs on the same physical device. Additionally, provides strong local and remote attestation features both within pKVM itself and with Verified Boot.
*** Guest operating systems are not assumed to be locked down; arbitrary code can be run in guests with pKVM. See https://source.android.com/docs/core/virtualization/security
*** Local attestation is provided by pKVM itself to prevent malicious VM tampering.
*** pKVM's security features are not guaranteed to work correctly if the device pKVM runs on is unlocked. Verified Boot can be used for remote attestation of the device to detect when this is the case (this does not use SafetyNet/Play Integrity).
*** Currently ARM64-specific; while some of the code is in the upstream Linux kernel, it is likely that some of it may be Android-specific and thus hard to use outside of the context of Android.
** KVM private memory (https://lwn.net/Articles/890224/) Isolates guest memory from a malicious KVM hypervisor.
=== Xen ===
* https://xenproject.org
* A true type-1 hypervisor. Much smaller than the Linux kernel, currently at around 370K lines of code, thus has a dramatically lower attack surface and is much easier to trust than a full Linux kernel. While a privileged virtual machine (dom0) can ''control'' the hypervisor, it cannot inspect VM memory. Xen also provides guest (domU) protection from dom0 to some degree (this needs to be researched further). Additionally, it provides a hardware TPM-backed vTPM implementation. Used in Qubes OS.
=== KVM private memory ===
* https://lwn.net/Articles/890224/
* Doesn't appear to be related to pKVM.
=== Gramine ===
* https://github.com/gramineproject/gramine
** Allows running unmodified Linux applications within SGX enclaves.
** Works at the application level, not the VM level.
** Cannot be used to remotely attest the host machine or the guest running the enclave; instead, it attempts to keep the application safe from the host and guest.
** Used by Enclaive (https://github.com/enclaive) to provide SGX-protected versions of various apps and programming languages.
== Verified Boot ==
* See [[Verified Boot]].
* Documentation for replacing default Secure Boot keys with one's own keys: https://wiki.archlinux.org/title/Unified_Extensible_Firmware_Interface/Secure_Boot#Using_your_own_keys
== non-root enforcement ==
Non-root enforcement limits the privileges and access rights of non-root users or processes, ensuring they can't make system-level changes or access sensitive parts of the system without proper authorization.
Nobody, not even the cloud server administrator, should have access to root (administrative rights). This is because with root rights, the cloud server administrator could get access to sensitive data on the hard drive and/or RAM contents.
related: [[Dev/user-sysmaint-split]]
== Remote Attestation ==
=== What is Remote Attestation ===
* '''Purpose:''' Remote attestation is used to prove to a remote party (usually a server or another device) that the software and hardware configuration of a system is in a known and trusted state. It provides an external entity with proof that the system is running trusted software, based on cryptographic measurements (such as hashes or signatures).
* '''Process:''' A trusted platform module (TPM) or similar hardware records cryptographic measurements during the boot process. These measurements (hashes of boot components) are sent to a remote party, which compares them to a list of trusted values. If the measurements match the expected trusted state, the remote party can trust the system's integrity.
* '''Examples:'''
** In cloud computing, remote attestation is used to verify that a virtual machine (VM) or physical server is in a trusted state before allowing it to interact with sensitive data or perform specific operations.
** On Android, the operating system or application ... see [[Miscellaneous_Threats_to_User_Freedom#Device_Attestation_such_as_SafetyNet|Device Attestation such as SafetyNet]].
In an ideal situation, it should be possible for a known-trusted entity to set the keys used for remote attestation and to cycle them out without requiring hardware replacement. Raptor Engineering's FlexVer is capable of this. Unfortunately, many remote attestation technologies used in amd64-family CPUs (Intel TDX, Intel SGX, AMD SEV) rely on a key that is hardcoded into the CPU and ultimately controlled by the CPU vendor (Intel or AMD). TPM chips also often rely on the security of a key written to them by the manufacturer. Should one of these keys ever become compromised, remote attestation with these technologies will no longer be trustworthy, and replacing the compromised key with a good one will require that every single affected CPU or TPM device be replaced with a new one containing a new key.
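The comparison step described under "Process" above can be illustrated with a small sketch. It covers only the verifier's side (checking reported hashes against a list of trusted values). It does not cover how a TPM signs the measurements, which is what makes them trustworthy in practice. The function name <code>verify_measurements</code> and both file formats are assumptions made for this example.

```shell
# verify_measurements REPORTED TRUSTED   (hypothetical helper name)
# REPORTED: one "<sha256-hex>  <component>" line per boot component, in
#           the format printed by sha256sum (an assumption of this sketch).
# TRUSTED:  one known-good sha256 hex digest per line.
# Exits 0 only if at least one measurement was reported and every
# reported digest appears in the trusted list.
verify_measurements() (
    reported="$1"
    trusted="$2"
    status=0
    seen=0
    while read -r digest component; do
        [ -n "$digest" ] || continue
        seen=$((seen + 1))
        if grep -Fxq "$digest" "$trusted"; then
            echo "trusted:   $component"
        else
            echo "UNTRUSTED: $component"
            status=1
        fi
    done <"$reported"
    # An empty report proves nothing, so treat it as a failure.
    [ "$seen" -gt 0 ] || status=1
    exit "$status"
)
```

A real verifier would additionally check a signature over the report and a fresh nonce to rule out replayed measurements.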
=== Dynamic Root of Trust for Measurement ===
* https://trustedcomputinggroup.org/wp-content/uploads/TCG_D-RTM_Architecture_v1-0_Published_06172013.pdf
* https://security.stackexchange.com/questions/274214/how-does-measured-boot-work-using-tpm
** Uses TPM-based features to ensure that a booted OS is trusted and uncompromised at load time.
** Intel TXT (Trusted eXecution Technology) is one implementation of this.
*** https://www.intel.com/content/www/us/en/support/articles/000025873/processors.html
*** https://invisiblethingslab.com/resources/bh09dc/Attacking%20Intel%20TXT%20-%20paper.pdf
*** tboot (https://sourceforge.net/p/tboot/code/ci/default/tree/) Uses Intel TXT to securely boot Xen or Linux.
** AMD also has its own implementation.
*** https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/user-guides/58453.pdf
=== Raptor FlexVer ===
* https://www.raptorengineering.com/TALOS/documentation/integrimon_intro.pdf
* https://www.raptorengineering.com/TALOS/security_features.php
* https://www.crowdsupply.com/raptor-computing-systems/talos-secure-workstation/updates/talos-fpga-functions-and-responsibilities-part-2
* Present on Raptor Engineering POWER9 workstations and servers. Unclear whether there are any other systems that ship with similar technology.
** Allows one to provision a system with firmware and needed data for remote attestation on-site, then remotely attest that system's state thereafter even if transferred to a physically insecure location.
** Contains tripwires that erase sensitive data and make the machine fail attestation if tampering is detected.
** Usable to implement verified boot.
** Fully open-source with reproducible builds.
*** Source is most likely available somewhere under https://gitlab.raptorengineering.com/explore/groups
** Requires a substantial investment of effort for one to gain the needed trust in a FlexVer provisioned system, as it requires one to either physically audit or trust someone else to physically audit the machine being used.
=== Trusted Platform Module remote attestation ===
* https://safeboot.dev/attestation/ provides a good overview of how this works.
* Allows a remote user to ensure they trust a machine that they will eventually run a VM on.
* Typically requires one to trust the TPM manufacturer, though Raptor FlexVer avoids this.
* At least some TPMs may be reprogrammable by the end-user, potentially allowing one to generate and install their own keys for use in remote attestation. https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Cyber-Sicherheit/SiSyPHus/Workpackage5b_TPM_Provisioning.pdf?__blob=publicationFile&v=1 Assuming proper key management on the user's part, this would be much safer than trusting the TPM manufacturer.
=== swtpm ===
* https://github.com/stefanberger/swtpm
* Provides a virtual TPM for use in virtual machines. If used on a remotely attested, trusted host, allows remote attestation of individual virtual machines.
** Xen has its own vTPM subsystem which makes this unnecessary for a Xen-based deployment.
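As a rough sketch of how swtpm is typically attached to a QEMU/KVM guest: one function starts a TPM 2.0 emulator on a UNIX socket, and another prints the QEMU options that connect a guest to it. The function names and the state directory are hypothetical. The <code>tpm-tis</code> device is the usual choice on x86 guests and is an assumption here, since other platforms use different TPM devices.

```shell
# start_vtpm STATE_DIR   (hypothetical helper name)
# Launches a TPM 2.0 emulator whose state lives in STATE_DIR and which
# listens on a UNIX control socket inside that directory.
start_vtpm() {
    mkdir -p "$1"
    swtpm socket --tpm2 \
        --tpmstate dir="$1" \
        --ctrl type=unixio,path="$1/swtpm-sock" \
        --log level=20
}

# qemu_tpm_args STATE_DIR   (hypothetical helper name)
# Prints, one per line, the QEMU options that attach a guest to the
# emulator started by start_vtpm for the same STATE_DIR.
qemu_tpm_args() {
    printf '%s\n' \
        "-chardev socket,id=chrtpm,path=$1/swtpm-sock" \
        "-tpmdev emulator,id=tpm0,chardev=chrtpm" \
        "-device tpm-tis,tpmdev=tpm0"
}
```

The vTPM is only as trustworthy as the host that runs swtpm, which is why the host itself has to be remotely attested first.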
=== Keylime ===
* https://keylime.dev/
* https://youtube.com/watch?v=Qhr_aVBCZPw
* Leverages TPM features to provide remote attestation and other security features. Investigate further.
* https://www.cvedetails.com/vendor/23068/Keylime.html
=== System Transparency ===
* https://www.system-transparency.org/
* https://mullvad.net/en/blog/2019/6/3/system-transparency-future
* https://docs.system-transparency.org/st-1.1.0/docs/reference/stboot-system/
** Uses a Linux Unified Kernel Image (UKI) as a bootloader for another OS.
** Carefully checks the integrity of the OS being booted before allowing it to be loaded and run.
** Tools are included for generating the bootloader executable and configuring it properly.
** Potentially usable for verifying the hardware state of the host machine. Intel TXT could be used to load the stboot bootloader in a verified fashion, then stboot could verify that the system's hardware matches a known-good state before loading the real OS.
** Does not seem to have remote attestation tools built in yet? Is the TPM supposed to handle this?
** Requires that the bootloader itself be uncompromised at boot - TPM attestation or Intel TXT might be useful for this.
== Tamper protection ==
Instantly shutting down and wiping part or all of a system in the event of physical intrusion can be effective to prevent hardware-based attacks.
=== ORWL ===
* https://www.crowdsupply.com/design-shift/orwl
** No longer in production.
** Used a very involved tamper protection mechanism to immediately wipe a disk encryption key if the system was opened or otherwise tampered with.
** Only wiped a disk encryption key, not RAM contents.
== Miscellaneous useful technologies ==
=== Raptor LPC Guard ===
* Mitigates physical LPC bus hijacking attacks that could be used to compromise TPM-stored secrets.
** Example of such a TPM secret compromise attack: https://www.youtube.com/watch?v=wTl4vEednkQ
=== Microsoft SEAL ===
* https://www.microsoft.com/en-us/research/project/microsoft-seal/
* General-purpose homomorphic encryption library
* Carries many warnings that it is easy to misuse in ways that undermine security
* Not directly related to confidential computing but might be useful for some part of it?
=== Constellation ===
* https://docs.edgeless.systems/constellation/
** Runs entire Kubernetes clusters inside confidential VMs, leveraging Intel TDX or AMD SEV along with remote attestation to keep the cluster secure
** Orchestrates a number of other components critical to confidential computing such as networking and drive encryption, key management, etc. Basically takes modern confidential computing functionality and puts it together into a single solution
** Open-source code under GNU Affero General Public License, prebuilt binaries are offered under a freemium licensing model
** Relies heavily on hardware-based confidential computing features requiring silicon vendor trust
** Depending on the cloud provider used, the cloud provider's hypervisor is considered trusted, greatly reducing security unless one can audit and remotely attest the hypervisor
=== Contrast ===
* https://docs.edgeless.systems/contrast/
** Runs individual Kubernetes containers inside a cluster managed and hosted by a third party.
** Orchestrates a number of other components critical to confidential computing such as networking and drive encryption, key management, etc. Basically takes modern confidential computing functionality and puts it together into a single solution
** Open-source code under GNU Affero General Public License; the license of the binaries is unclear
** Relies heavily on hardware-based confidential computing features requiring silicon vendor trust
=== Continuum ===
* https://docs.edgeless.systems/constellation/overview/confidential-kubernetes
** Runs AI code in a confidential VM, using sandboxing to make it more difficult for an AI code author to violate the privacy of the user interacting with the AI model.
** Does not use homomorphic encryption, instead relying on sandboxing to prevent the AI code from doing anything that it's not supposed to.
** Allows the user to remotely attest the confidential VMs to ensure they are trustworthy.
= Missing Pieces =
* How does one ensure that the VM they uploaded to the cloud vendor is really the one they want to run? One possible idea is to upload pre-installed, fully encrypted VMs to the cloud vendor, which can be booted only by passing an encryption key to the VM at boot time using an end-to-end encrypted channel (SSH login to initramfs). The VM then would contain sha512 hashes for every static file in it, with those hashes protected by a digital signature. On first boot, the VM would run a self-check using these hashes and signature to ensure it had not been tampered with. If this check passed, the VM would then consider itself trusted and would set itself up for vTPM-powered remote attestation. Investigate to see if a technology that performs the same functions already exists or is even needed; consider creating it if it doesn't exist and is needed.
* Does Xen support Intel TME-MK?
* Is there a way to remotely attest more than just the OS?
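The first-boot self-check idea from the first item above could look roughly like the following sketch. It assumes a manifest mapping relative file paths to SHA-512 hex digests. The signature check is passed in as a callback (<code>signature_ok</code>, a hypothetical name) because the signature scheme is not decided here. None of these names refer to an existing tool.

```python
import hashlib
from pathlib import Path


def sha512_file(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha512()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-512 digest."""
    return {
        p.relative_to(root).as_posix(): sha512_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def self_check(root: Path, manifest: dict, signature_ok) -> list:
    """Return a list of problems; an empty list means the check passed.

    signature_ok is a caller-supplied function (assumption) that verifies
    the digital signature over the manifest and returns True or False.
    """
    if not signature_ok(manifest):
        return ["manifest signature invalid"]
    problems = []
    for name, expected in sorted(manifest.items()):
        path = root / name
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif sha512_file(path) != expected:
            problems.append(f"modified: {name}")
    return problems
```

Only files listed in the manifest are checked, since the idea above covers static files only. Files that legitimately change at runtime would need to be excluded from the manifest when it is built.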
= Putting it all together =
* A cloud vendor uses Xen and a minimal dom0 to run cloud servers. tboot is used to ensure the hypervisor is secure.
* The cloud vendor uses TRESOR for full disk encryption, storing the encryption key on a hardware authentication device, such that in order to boot the device, a trusted staff member must insert the key into the server at boot time. (With the use of Intel TME-MK, it might theoretically be possible to implement cold-boot-proof disk encryption ''without'' needing TRESOR; however, this requires further investigation.)
* The cloud vendor uses Intel TME-MK to encrypt the RAM of the host, preventing cold-boot attacks by a third party. At boot, the host generates an encryption key of its own and re-encrypts its memory with that new key, so as not to trust the processor-generated key used by default.
* Remote users can remotely attest the cloud vendor's physical machines at will, ensuring they trust the Xen and dom0 code running on them.
* Verified Boot and non-root enforcement attempt to mitigate tampering with the running system.
* Users upload pre-installed, encrypted VM images to the cloud vendor, which are kept tamper-proof using a mechanism that is yet to be determined (see the "Missing Pieces" section above). Upon passing a self-check, the VM sets itself up for remote attestation using a hypervisor-provided vTPM.
* Xen uses Intel TME-MK to encrypt the RAM of the running VM with a user-provided encryption key, preventing cold-boot attacks by either the vendor or a third party.
* Users remotely attest running VMs to ensure they are trusted.
* At this point, the user has a confidential, cold-boot-proof VM. The cloud vendor must be trusted to keep their Xen and dom0 software up-to-date so as to prevent vulnerabilities from allowing VM compromise. The cloud customer can verify remote cloud server software version numbers using remote attestation.
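The remote attestation steps above rely on the verifier being able to tie a reported TPM measurement back to known software versions. A minimal sketch of that check follows, under these assumptions: a single SHA-256 PCR that starts at all zeros, an event log of <code>(component name, digest)</code> pairs, and a <code>known_good</code> table maintained by the verifier. Verifying the TPM's signature over the quote is omitted entirely, and all names are hypothetical.

```python
import hashlib

PCR_INIT = bytes(32)  # assumption: the PCR starts as 32 zero bytes


def extend(pcr: bytes, digest: bytes) -> bytes:
    """TPM-style extend: new PCR value = SHA-256(old PCR value || digest)."""
    return hashlib.sha256(pcr + digest).digest()


def replay(event_log) -> bytes:
    """Recompute the final PCR value from an ordered event log."""
    pcr = PCR_INIT
    for _name, digest in event_log:
        pcr = extend(pcr, digest)
    return pcr


def verify(event_log, quoted_pcr: bytes, known_good: dict) -> bool:
    """Accept only if the log matches the quoted PCR and every
    measured component has a digest the verifier already trusts."""
    if replay(event_log) != quoted_pcr:
        return False  # log was altered or does not match the TPM's report
    return all(known_good.get(name) == digest for name, digest in event_log)


if __name__ == "__main__":
    def measure(blob: bytes) -> bytes:
        return hashlib.sha256(blob).digest()

    # Placeholder blobs standing in for the real hypervisor and dom0 images.
    log = [("xen", measure(b"xen-build-A")), ("dom0", measure(b"dom0-build-A"))]
    trusted = dict(log)
    print(verify(log, replay(log), trusted))
```

Because each extend hashes the previous value, the final PCR value depends on both the content and the order of all measurements, so a host cannot swap one component without changing the result. Checking software version numbers, as mentioned in the last item, amounts to keeping <code>known_good</code> up to date.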
Rough time estimate to get TRESOR working in Linux: possibly around 80 hours of actual time invested (four hours a day, five days a week, for one month). This is a '''very''' rough estimate since actual time will depend on how receptive devs are to the ideas and how much effort is required to polish their implementation to a usable state.
= Technologies investigated but not useful =
* Microsoft Pluton: Basically a fancy hardware TPM integrated directly into the CPU. Under Linux, it only provides TPM features. It is implemented as another "secure processor" similar to Intel ME. Beyond TPM-like functionality, Pluton also provides features somewhat similar to Google's SafetyNet, but only on Windows. https://gabrielsieben.tech/2022/07/25/the-power-of-microsoft-pluton-2/
* HashiCorp
* Azure Key Vault: Cloud-based, HSM-backed secret management system for applications. Claims to be unable to see customer-provided secrets but whether this is enforced by hardware or by policy is unclear. Not particularly useful for confidential computing.
* Google Cloud Key Management: Very similar to Azure Key Vault but from Google rather than from Microsoft.
* HashiCorp Vault: Server-based centralized application secret management system. Not directly related to confidential computing; its utility in a confidential computing scenario is questionable. Would probably be useful for individuals running cloud services on confidential VMs.
* Thales cloud security: Very vague umbrella term for a number of different "security" technologies provided by Thales. Does not appear directly related to confidential computing.
* AWS Cryptographic Computing: Homomorphic encryption based technologies for processing data without decrypting it. Mostly used for machine learning purposes.
* AWS Clean Rooms: Used for allowing companies to share data analysis information with each other without exposing sensitive info directly. Unrelated to confidential computing. Appears privacy-invasive.
* Key Management Interoperability Protocol: Standardized protocol (OASIS) for communicating with key management servers; such servers are conceptually similar in some ways to HashiCorp Vault. Not directly related to confidential computing; its utility in a confidential computing scenario is questionable. Would probably be useful for individuals running cloud services on confidential VMs.
= Discussions =
* [https://lore.kernel.org/lkml/20241003194147.2566a393@kf-ir16/T/ Investigating practicality of process memory encryption techniques using frozen cache and TRESOR/RamCrypt]
* [https://forums.whonix.org/t/enable-secure-memory-encryption-sme-kernel-parameter-mem-encrypt-by-default/10393 Enable Secure Memory Encryption (SME) - kernel parameter mem_encrypt - by default?]
= Cloud Vendors =
'''NOTE:''' This is not an endorsement or recommendation of any vendor or technology. Some of the vendors listed here have security concerns associated with them.
== Microsoft ==
* https://azure.microsoft.com/en-us/solutions/confidential-compute/#solution-architectures
** Azure Confidential Computing: A commercial offering powered by confidential computing technologies. No mentions of any special technologies in particular.
== Google ==
* https://cloud.google.com/confidential-computing
** Google Confidential Computing: A commercial offering powered by confidential computing technologies. No mentions of any special technologies in particular.
== IntegriCloud raptorengineering ==
* https://www.integricloud.com/
** Appears to be Raptor Engineering's cloud computing division.
** Only provides firmware source code to "active users of the IntegriCloud™ platform, auditors for active users of the IntegriCloud™ platform, and security researchers."
** Unclear whether physical auditing of the cloud servers is permitted. This is slightly strange since, as far as the papers indicate, [[Dev/confidential_computing#Raptor_FlexVer|Raptor FlexVer]] relies on the user, or someone the user trusts, auditing the servers.
{{quotation
|quote=NOTE: The source packages required to build reproducible versions of our node firmware and software stack are only available to active users of the IntegriCloud™ platform, auditors for active users of the IntegriCloud™ platform, and security researchers.
Please contact us at support@integricloud.com with your IntegriCloud™ username to request access to the source packages and build instructions.
|context=https://www.integricloud.com/content/base/software.html
}}
* Asked by e-mail about this in October 2024 but had not received a reply at time of writing.
Hello,

During our research on confidential computing [1], we came across IntegriCloud. From an initial look, it seems your approach aligns well with our values in the security community, particularly around principles like security, Freedom Software, and transparency.

However, one statement on your website struck us as somewhat inconsistent with these values:

> NOTE: The source packages required to build reproducible versions of our node firmware and software stack are only available to active users of the IntegriCloud™ platform, auditors for active users of the IntegriCloud™ platform, and security researchers.

Why the restriction? Wouldn't it make more sense to host the source code publicly on a source control platform, such as Git, to ensure greater openness and collaboration?

Looking forward to your thoughts. I would also like to post your reply in full, without redactions, on our website for transparency and to satisfy the curiosity of our readers.

Best regards,

Patrick

[1] https://www.kicksecure.com/wiki/Dev/confidential_computing

[2] https://www.integricloud.com/content/base/software.html

* reply received. TODO: update
== Enclaive ==
* https://www.enclaive.io/
* https://github.com/enclaive
** Provides software for managing confidential computing resources on other clouds.
** Packages several applications with Gramine and Docker to allow them to run confidentially on otherwise untrusted VMs with Intel SGX.
** Many of their open-source projects appear to not have seen much maintenance in the recent past (many repos have a last commit date of one year ago)
= Homomorphic Encryption =
* https://en.wikipedia.org/wiki/Homomorphic_encryption
* Microsoft
** https://learn.microsoft.com/en-us/azure/architecture/solution-ideas/articles/homomorphic-encryption-seal
* AWS
** https://aws.amazon.com/security/cryptographic-computing/
= Full process memory encryption research =
This is research that has been done to implement a RamCrypt-based memory encryption strategy that keeps decrypted information in the CPU cache, allowing important data to only touch system RAM in an encrypted form.
== No-fill cache mode ==
This is a CPU cache mode present on Intel CPUs that (among other things) prevents cache contents from being replaced when a read or write misses the cache and accesses system memory instead. If a memory range is in the cache, the memory range's cache mode is set to writeback, and then the cache is placed into no-fill mode, it will essentially "freeze" that location of memory in the system's cache, ensuring that read accesses to that area of memory only access the cache, and that write accesses only update the cache.

Note however it is possible to explicitly instruct the CPU to flush cache contents to system RAM, and that the instruction that does this (wbinvd) '''must not''' be executed while sensitive data is present in the cache.
== Memory paging ==
See https://wiki.osdev.org/Paging.

In all 32-bit and 64-bit Intel-based systems (and really most non-embedded systems in general), memory accesses are virtualized and go through a memory management unit (MMU) that translates virtual addresses to physical addresses. This is used for a lot of things, including allowing the OS to implement a swapfile, but notably for our use one can mark a particular page of memory as being "not present" for basically any reason.
RamCrypt takes advantage of this to mark an encrypted memory page as being "not present", so that any attempt to access that page triggers a ''page fault''. This instructs the kernel to figure out why the page is marked as "not present" and fix it if possible. The kernel can then see that the page is encrypted, decrypt it, mark it as present, and then have the program attempt the memory access again.

If there is already code executing in the page fault handler to decrypt the RAM page, it should be possible to relocate it at the same time. In theory, it may be possible to freeze a relatively large portion of system RAM into cache using no-fill cache mode, then have the RamCrypt page fault handler decrypt RAM contents into the cached area. Then the page fault handler would change the address the page is located at to point to the decrypted page in cache, mark it as present, and then return control to the application (such as QEMU).
== Cache as Instruction RAM ==
https://pete.akeo.ie/2011/08/ubrx-l2-cache-as-instruction-ram.html

Data loaded into cache will probably end up loaded into the L1 data cache first, which is problematic as the CPU cannot execute code in the L1 data cache. It has to be moved to the L1 instruction cache first. The UBRX project worked around this by loading executable code into the L1 data cache, then reading enough of some sort of data to eventually push the code out of L1 cache and into L2 cache. Then the CPU could be instructed to execute it.

For memory pages containing executable code, this may be necessary to allow them to run after decryption. (It's unclear if this is actually still necessary - the data in the L1 cache should just be at an address, like any other form of data in memory, so it seems like the CPU should be able to figure out that it's dealing with "self-modifying" code and move the data to the L1 instruction cache by itself. Experimentation will be needed to determine if this is the case or not.)
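The RamCrypt page fault flow described under "Memory paging" can be illustrated with a toy user-space model. This is only a simulation of the idea, not kernel code. The cipher is a stand-in (XOR with a SHA-256-derived keystream), where a real implementation would use a proper cipher such as AES. The small window of plaintext pages is an assumption made for this sketch. All class and function names are hypothetical.

```python
import hashlib
from collections import OrderedDict

PAGE_SIZE = 4096


def keystream_xor(key: bytes, page_no: int, data: bytes) -> bytes:
    """Stand-in cipher for illustration only: XOR with a hash-derived
    keystream. Applying it twice with the same inputs restores the data."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = key + page_no.to_bytes(8, "little") + counter.to_bytes(8, "little")
        stream += hashlib.sha256(block).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyAddressSpace:
    """Pages are stored encrypted and 'not present' until first touched."""

    def __init__(self, key: bytes, num_pages: int, window: int = 2):
        self.key = key
        self.window = window  # max number of plaintext pages at once
        self.ram = [keystream_xor(key, n, bytes(PAGE_SIZE)) for n in range(num_pages)]
        self.present = OrderedDict()  # page numbers in plaintext, oldest first
        self.faults = 0

    def _page_fault(self, n: int) -> None:
        """Model of the fault handler: make room, decrypt, mark present."""
        self.faults += 1
        if len(self.present) >= self.window:
            victim, _ = self.present.popitem(last=False)
            # Re-encrypt the oldest plaintext page and mark it not present.
            self.ram[victim] = keystream_xor(self.key, victim, self.ram[victim])
        self.ram[n] = keystream_xor(self.key, n, self.ram[n])
        self.present[n] = True

    def read(self, n: int, offset: int, length: int) -> bytes:
        if n not in self.present:  # access to a "not present" page faults
            self._page_fault(n)
        return self.ram[n][offset:offset + length]

    def write(self, n: int, offset: int, data: bytes) -> None:
        if n not in self.present:
            self._page_fault(n)
        page = bytearray(self.ram[n])
        page[offset:offset + len(data)] = data
        self.ram[n] = bytes(page)
```

In this model the plaintext pages still sit in ordinary "RAM". The frozen-cache idea above would instead place the decrypted copies in a cached region, so that only ciphertext ever reaches system memory.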
== BareMetal ==
https://github.com/ReturnInfinity/BareMetal

Very small operating system written mostly in assembly language for 64-bit Intel and AMD CPUs. It's small and relatively simple, thus would be useful for prototyping and benchmarking transparent RAM encryption with cache used as RAM. If a working implementation can be made here, then it is likely possible to implement the same sort of technology in Xen or Linux.
== BitVisor ==
https://github.com/matsu/bitvisor

Small hypervisor meant to provide security features (such as transparent encryption) to a single guest OS. Could also potentially be used to prototype a RAM encryption + frozen cache implementation. Might be usable for a production implementation although this would require the use of nested virtualization, further slowing down the system.
= Footnotes =