
From WireTap to TEE.fail: Recent Advances and Technical Analysis in TEE Security

By Max He

Recently, a research team from Georgia Tech, Purdue University, and van Schaik LLC published an attack named TEE.fail, building on their earlier work known as WireTap[1][2][3][4]. While both attacks are based on the same core idea, they differ in implementation details and platform coverage. 

For simplicity, we will refer to them collectively as TEE.fail in this article. The research demonstrates that under the threat model of physical access, the deterministic memory encryption relied upon by Trusted Execution Environments (TEEs) is not unbreakable. By inserting a hardware interposer on the bus between the CPU and DRAM, the researchers were able to capture and analyze encrypted memory traffic, successfully recovering the nonce used in an ECDSA signature within a TEE and subsequently deriving the private attestation key associated with remote attestation.

This discovery presents a new challenge to systems that rely on TEE as their root of trust. TEEs provide an encrypted, isolated, and verifiable execution space for sensitive code and data through processor-level isolation and remote attestation, and have been widely deployed in cloud computing and Web3. Typical implementations include Intel SGX, Intel TDX, AMD SEV-SNP, and NVIDIA TEE. However, under this extreme threat model, which combines physical hardware tampering with OS kernel module modification, the experiment shows that certain memory-encryption implementations can still exhibit information-leakage pathways.

According to the researchers, the TEE.fail method applies to platforms including Intel SGX, Intel TDX, AMD SEV-SNP, and NVIDIA TEE. For clarity, this article will use Intel SGX as the example to explain the core principles, scope, and mitigation strategies of TEE.fail.

Remote Attestation 

ECDSA Signature Algorithm

ECDSA is a widely used elliptic curve digital signature algorithm[5], which can be briefly described as follows.

Given an elliptic curve $E/\mathbb{F}_p$, a base point $G$ of order $n$, and a hash function $H$, the attestation (private) key is $d \in [1,n-1]$, with the corresponding public key $Q = dG$.

The signing process (for message $m$):

  1. Select a one-time nonce $k \in [1, n-1]$ (which must be unique and kept secret). 
  2. Calculate $R = kG = (x_R, y_R)$. Let $r = x_R \bmod n$. If $r = 0$, choose a new $k$.
  3. Calculate $e = H(m)$. 
  4. Calculate $s = k^{-1}(e + d \cdot r) \bmod n$. If $s = 0$, choose a new $k$.
  5. The output is the signature pair $(r, s)$.
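
To make these steps concrete, here is a minimal Python sketch of ECDSA signing over a small textbook curve ($y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$, base point $G = (5,1)$ of order $n = 19$). It is illustrative only: the parameters are far too small for real use, and production implementations such as the Quoting Enclave's use NIST P-256 with constant-time code.

```python
# Toy ECDSA signing over y^2 = x^3 + 2x + 2 mod 17 (order n = 19).
# Illustrative parameters only; real attestation uses NIST P-256.
import hashlib
import secrets

p, a, b = 17, 2, 2            # curve parameters
G, n = (5, 1), 19             # base point and its prime order

def point_add(P, Q):
    """Affine point addition; None represents the point at infinity."""
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def point_mul(k, P):
    """Double-and-add scalar multiplication (not constant time)."""
    R = None
    while k:
        if k & 1:
            R = point_add(R, P)
        P = point_add(P, P)
        k >>= 1
    return R

def ecdsa_sign(m: bytes, d: int):
    e = int.from_bytes(hashlib.sha256(m).digest(), "big") % n  # step 3 (reduced mod n for the toy curve)
    while True:
        k = secrets.randbelow(n - 1) + 1                       # step 1: one-time nonce
        r = point_mul(k, G)[0] % n                             # step 2
        if r == 0:
            continue
        s = pow(k, -1, n) * (e + d * r) % n                    # step 4
        if s != 0:
            return r, s                                        # step 5

d = 7                       # private (attestation) key
Q = point_mul(d, G)         # public key Q = dG
print(ecdsa_sign(b"quote", d), Q)
```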

Certificate Hierarchy

Figure 1: SGX Certificate Hierarchy

Mainstream TEEs establish trust through “remote attestation,” where a protected environment generates a quote that is signed by a certificate chain. A client verifies this chain and the platform’s state to confirm its trustworthiness.

Figure 1 illustrates the core structure of the Intel DCAP certificate hierarchy[6], covering the certificate chain from the Intel Root CA down to the platform. The entire chain of trust originates from the Intel SGX Root CA, which signs certificates for a Processor CA and a Platform CA. These, in turn, issue the PCK (Provisioning Certification Key) certificates used for remote attestation. The PCK certificates carry the identity public key of the SGX platform or processor and are signed by Intel's private key chain, ensuring that the signature in an enclave's attestation report (quote) can be verified externally. The entire system is ultimately anchored in the Intel Root CA, which forms the root of trust for SGX remote attestation.

Remote Attestation Process

Figure 2: On-Platform Certificate Chain

Figure 2 shows the on-platform certificate chain relationship for Intel DCAP[7]. A user enclave initiates the DCAP attestation process by sending a report to the Quote Enclave (QE). The QE, after local validation, endorses the report by signing it with its attestation key. The QE’s certificate is issued by the Provisioning Certification Enclave (PCE), which uses the private key corresponding to the Intel SGX PCK Certificate (shown in Figure 1). This makes the entire signature chain traceable back to the Intel SGX Root CA. A third party can verify this complete chain to confirm the identity and trustworthiness of the user Enclave.

Preparing the Attack

Hardware Interposition

A key hardware breakthrough of this research was making high-speed bus snooping a low-cost, reproducible engineering feat. The researchers first downclocked the DDR4/5 frequency from its factory speed to 1333 MT/s by modifying the DIMM’s SPD (Serial Presence Detect). This slowed down the entire system, drastically reducing the sampling bandwidth required and allowing them to use older, less expensive logic analyzers to capture the memory bus signals[1][2].

Figure 3: Hardware Interposer Setup

For the physical connection, they started with a standard memory riser card and, inspired by Keysight’s SoftTouch probes, soldered the probe’s RLC isolation network components directly onto the riser. This created a custom-built DRAM interposer that both protected the CPU drivers and delivered a stable signal to the logic analyzer, ensuring the system could boot and operate without errors.

Figure 4: Data Acquisition

To minimize costs, they used second-hand logic analyzer modules and a chassis, keeping the total cost under $1,000. This setup allowed them to parse the DDR4/5 command, address, and data streams at the reduced clock speed, capturing the ciphertext fragments and their corresponding physical addresses for subsequent analysis.

Analysis of the Memory Encryption Mechanism

Figure 5: AES-XTS Encryption/Decryption Algorithm

Total Memory Encryption (TME)[8] is a hardware-level memory encryption mechanism widely adopted in mainstream TEEs (though the naming varies across vendors and technical documents). It transparently encrypts all of physical memory without requiring changes to the OS or applications. TME uses a key generated by a hardware random number generator within the SoC to encrypt all data on the external memory bus and DIMMs with the NIST-standard AES-XTS algorithm[9] (shown above), thereby mitigating physical snooping and cold-boot attacks. The TME key is inaccessible to software and external interfaces, and the encryption incurs minimal performance overhead.

Figure 6: Comparison of encryption behavior for the same data at the same (left) vs. different (right) physical addresses

The researchers verified SGX's memory encryption behavior by targeting specific EPC physical addresses on a DIMM connected to their logic analyzer. Inside an enclave, they repeatedly performed read, modify, and restore operations on the same memory location. The results showed that when the same data was written, the output ciphertext was identical, proving that SGX on server platforms employs deterministic encryption. Further tests revealed that the virtual address did not affect the ciphertext: different virtual addresses mapping to the same physical page produced the same ciphertext. However, when the physical address changed, the ciphertext also changed, even for the same plaintext. This indicates that the encryption uses only the physical address as the tweak, without incorporating any randomness or version counters.
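
The behavior in Figure 6 can be reproduced conceptually in software. The sketch below uses the pyca/cryptography package to encrypt the same 64-byte block with AES-XTS while modeling the physical address as the tweak; the key, addresses, and block size are arbitrary stand-ins rather than the actual TME parameters.

```python
# Conceptual model of deterministic AES-XTS memory encryption: with a fixed
# key, the ciphertext depends only on (plaintext, tweak). The tweak stands in
# for the physical address; the key and addresses below are arbitrary examples.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(64)                      # AES-256-XTS uses a 512-bit key

def encrypt_at(addr: int, plaintext: bytes) -> bytes:
    tweak = addr.to_bytes(16, "little")   # model: tweak derived from the address
    enc = Cipher(algorithms.AES(key), modes.XTS(tweak)).encryptor()
    return enc.update(plaintext) + enc.finalize()

block = b"A" * 64                         # the same data written each time
c1 = encrypt_at(0x1000, block)
c2 = encrypt_at(0x1000, block)            # same data, same address
c3 = encrypt_at(0x2000, block)            # same data, different address

assert c1 == c2    # deterministic: identical ciphertext appears on the bus
assert c1 != c3    # changing the address (tweak) changes the ciphertext
print(c1.hex()[:32], c3.hex()[:32])
```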

This deterministic nature of the memory encryption is the cornerstone of this security research.

Controlling the Enclave’s Memory Layout

A critical challenge remained: how to ensure the specific memory traffic of the target enclave is captured by the logic analyzer? To achieve this, the attacker must precisely control the target enclave’s placement and access patterns in physical memory.

First, by modifying a kernel module, they reverse-engineered the mapping from physical addresses to DRAM topology (controller/channel/bank/row) using the BIOS/ACPI ADXL interface. This allowed them to identify which physical pages would reside on the monitored DIMM.

Next, they modified the SGX driver to add a “page pinning” mechanism, which prioritized allocating the victim enclave’s virtual pages to these targeted physical pages. During execution, the researchers used controlled-channel techniques, employing mprotect and ptrace to pause and intervene in the enclave’s execution at the page level, triggering kernel intervention at critical moments. To force the data onto the memory bus, they ran a cache-flushing program on an adjacent core that iterated over a memory region larger than the Last-Level Cache (LLC) to evict cache lines. They combined this with Intel’s Cache Allocation Technology (CAT) to limit the victim’s cache usage and reduce noise.

Finally, the logic analyzer was configured to trigger sampling based on the physical address, precisely capturing the ciphertext data from the DRAM bus.

Launching the Attack

The Strategy: Deriving the Attestation Key from Nonce $k$

It is nearly impossible for an attacker to directly snoop on the Attestation Key. This key is typically locked in a protected key slot or secure element and is never transmitted in plaintext over the bus between the CPU and DRAM. Accessing it would require highly invasive physical methods like decapsulation, micro-probing, or laser fault injection—techniques often referred to as “lab attacks” due to their high cost and complexity.

A more practical approach is to target the nonce $k$. According to the ECDSA algorithm, if an attacker can learn the nonce $k$ used in a signature, they can directly recover the private key $d$ from the publicly known signature components $(r,s)$ and the message digest $z$:

$$s \equiv k^{-1}(z + r d)\pmod{n} \quad \Longrightarrow \quad d \equiv r^{-1}(k s - z)\pmod{n}$$

Compared to the private key, the nonce $k$ is involved in more intermediate computations during the signing process, providing a larger observation surface for side-channel attacks. An attacker can gather partial information about $k$ from these computations and reconstruct the full nonce through statistical and cryptanalytic methods. This makes the nonce $k$ a much more practical attack target than the Attestation Key itself.
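
The recovery algebra is easy to check numerically. The sketch below plugs made-up values of $d$, $k$, $z$, and $r$ into a small prime group order $n$ and verifies that a single leaked nonce is enough to recover the private key; the computation is identical for the NIST P-256 order used by DCAP quotes.

```python
# Numerical check of d = r^{-1} * (k*s - z) mod n for a prime group order n.
# The small n below is illustrative; the algebra is unchanged for NIST P-256.
n = 19
d = 7     # "unknown" attestation private key
k = 11    # nonce, assumed leaked via the ciphertext dictionary
z = 13    # hash of the signed report, reduced mod n
r = 5     # first signature component (taken as given here)

s = pow(k, -1, n) * (z + r * d) % n            # what the signer publishes
d_recovered = pow(r, -1, n) * (k * s - z) % n  # the attacker's computation
assert d_recovered == d
print("recovered private key d =", d_recovered)
```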

Building a Ciphertext Dictionary

The researchers focused on the Intel IPP library, specifically the scalar-by-point multiplication implementation used for SGX DCAP attestation. In this implementation, the scalar $k$ is recoded into a sequence of signed digits $\{k_0,\dots,k_n\}$, where each $k_i$ falls within a small range (in the paper, $w=5$, so the range is $[-16,16]$). This allows efficient computation using a pre-computed table of points $iG$ for $i = 0,\dots,2^{w-1}$ and a constant number of point additions/subtractions: a typical constant-time algorithm designed to thwart common side-channel attacks targeting memory access patterns and control flow.

Figure 7: Elliptic Curve Point Multiplication Algorithm

Looking more closely at this scalar multiplication implementation, note the temporary variable $B = k_i \cdot G$. Since $k_i \in [-16, 16]$, the value of $B$ comes from a finite, predetermined set $\{ m \cdot G \mid m \in \{-16, \dots, 16\} \}$. During the computation, $B$ appears in memory in ciphertext form. If the ciphertext of $B$ can be captured, there is naturally an opportunity to recover $k_i$ through a lookup table, and subsequently recover $k$.
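
To see why $B$ is confined to so few values, consider a generic signed fixed-window recoding, sketched below. The exact Booth recoding in Intel IPP differs in low-level details, but the consequence is the same: every digit $k_i$ is small, so $B = k_i \cdot G$ is drawn from a set of at most 33 precomputable points.

```python
# Generic signed fixed-window recoding of a 256-bit nonce with w = 5.
# This approximates the IPP Booth recoding: each digit is small, and the
# original scalar is recoverable as k = sum(k_i * 2^(w*i)).
import secrets

def signed_window_recode(k: int, w: int = 5):
    digits = []
    while k:
        d = k % (1 << w)            # next w-bit window
        if d >= 1 << (w - 1):       # map the upper half to negative digits
            d -= 1 << w
        digits.append(d)
        k = (k - d) >> w
    return digits

k = secrets.randbits(256)
digits = signed_window_recode(k)

assert all(-16 <= d <= 16 for d in digits)                     # small digit range
assert sum(d << (5 * i) for i, d in enumerate(digits)) == k    # exact recoding
print(digits[:8], "...")
```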

This approach relies on the deterministic nature of the memory encryption mode (AES-XTS): with a fixed key and physical address, the same plaintext is mapped to the same ciphertext. Therefore, if we anchor writes to the same physical address and can observe the encrypted ciphertext stream generated at that address in DRAM, then this set of ciphertexts can be viewed as a ‘dictionary’ from plaintext to ciphertext.

The specific procedure is: create a specialized enclave for building the dictionary, and sequentially write all 33 points from the candidate set to the same physical address. Since this address is mapped to the monitored DIMM, capturing the CA/DQ activity on the bus with the logic analyzer yields a series of ciphertexts. This constitutes the mapping table (i.e., dictionary) $m \mapsto \mathrm{Enc}(m \cdot G)$.

Recovering the Nonce via Dictionary Lookup

After building the ciphertext dictionary, the researchers launched the target machine’s Quoting Enclave (QE) and triggered a standard DCAP remote attestation process in production mode. During this process, the QE uses ECDSA to sign the attestation report.

To capture the ciphertext of the intermediate variable $B = k_i \cdot G$ during the signature computation, the attacker pins this variable to the same physical address previously used to generate the ciphertext dictionary. Then, at the end of each iteration of the elliptic curve scalar multiplication main loop, the QE's execution is briefly interrupted via operating system interrupts or page faults (triggering an AEX), and the corresponding cache line is flushed (e.g., with CLFLUSH) during the interruption. This ensures that the next read of $B$ triggers a DRAM read operation.

Since the DIMM holding $B$ was instrumented with the logic analyzer, the attackers could capture the ciphertext data appearing on the bus in each round. Leveraging the fact that all enclaves on the same SGX platform share the same TME memory encryption key, these captured ciphertexts could be matched one-to-one with the entries in their pre-built dictionary. This allowed them to recover the Booth recoding digit $k_i$ for each round.

By concatenating all the $\{k_0, \cdots, k_n\}$, the attackers were able to completely reconstruct the nonce $k$ used for the ECDSA-based attestation signature.
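
Putting the pieces together, each loop iteration reduces to a dictionary lookup. The simulation below models the deterministic encryption at the pinned physical address with an HMAC-based stand-in (not the real AES-XTS/TME pipeline) and represents the point $m \cdot G$ by a canonical encoding of $m$; under those simplifications it builds the 33-entry dictionary, “captures” one ciphertext per round, and reassembles the nonce.

```python
# End-to-end simulation of the per-round dictionary lookup. Deterministic
# encryption at the fixed monitored address is modeled by HMAC-SHA256 under a
# shared key (a stand-in for AES-XTS under the platform-wide TME key), and the
# point m*G is represented by a canonical encoding of m. Illustrative only.
import hashlib
import hmac
import secrets

tme_key = secrets.token_bytes(32)

def enc_at_monitored_addr(plaintext: bytes) -> bytes:
    """Deterministic 'ciphertext' for data written to the pinned address."""
    return hmac.new(tme_key, plaintext, hashlib.sha256).digest()

def encode_point(m: int) -> bytes:
    """Stand-in for the in-memory representation of the point m*G."""
    return m.to_bytes(2, "big", signed=True)

def recode(k: int, w: int = 5):
    """Signed fixed-window digits of k (same sketch as above)."""
    digits = []
    while k:
        d = k % (1 << w)
        if d >= 1 << (w - 1):
            d -= 1 << w
        digits.append(d)
        k = (k - d) >> w
    return digits

# Attacker, offline: build the 33-entry ciphertext dictionary m -> Enc(m*G).
dictionary = {enc_at_monitored_addr(encode_point(m)): m for m in range(-16, 17)}

# Victim: each loop iteration writes B = k_i*G (encrypted) to the pinned address.
k_secret = secrets.randbits(256)
captured = [enc_at_monitored_addr(encode_point(d)) for d in recode(k_secret)]

# Attacker, online: look up each captured ciphertext and rebuild the nonce.
digits = [dictionary[c] for c in captured]
k_recovered = sum(d << (5 * i) for i, d in enumerate(digits))
assert k_recovered == k_secret
print("nonce recovered:", hex(k_recovered))
```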

Cracking the Attestation Private Key

The final step is straightforward. Using the recovered nonce $k$, the report digest $z$, and the signature $(r,s)$, they calculated:

$$d \equiv r^{-1}(k s - z)\pmod{n}$$

This yielded the Attestation Key, and the attack was successful.

It is worth noting that the server-side SGX DCAP remote attestation flow differs from earlier client-side implementations: in the DCAP flow, the generated quote is not encrypted before being sent to the attestation server. Therefore, the transmitted quote contains the signature components $(r,s)$ in plaintext, readily available to an observer.

Impact and Implications

Once the platform's local QE Attestation Key is recovered, an attacker can forge signatures and attestations without drawing the victim's attention, generating fake quotes or forged attestation chains. As a result, confidential data that should have been protected may be illegally distributed, and malicious nodes may replace or tamper with supposedly attested content, fundamentally undermining the TEE trust model built on remote attestation.

TEE.fail affects multiple TEE architectures, including Intel SGX, Intel TDX, AMD SEV-SNP, and NVIDIA TEE, and covers both DDR4 and DDR5 memory. This conclusion has been officially confirmed[10][11]. It systematically validates the exploitability of deterministic memory encryption under physically observable conditions and demonstrates the feasibility of recovering remote attestation private keys under realistic hardware budgets.

TEE.fail has a significant impact on public network systems that rely on remote attestation as a single trust anchor, such as Phala Network and Secret Network. These systems depend on SGX’s remote attestation to establish node identity and trust chains; once an attacker with physical access recovers the Attestation private key, they can forge quotes, impersonate trusted nodes, and subsequently access or tamper with confidential contract data, severely weakening the network’s privacy and integrity guarantees.

In contrast, for private TEE services built on top of major cloud providers, the practical impact of TEE.fail is relatively limited. First, large cloud providers' data centers impose high barriers to physical access, with separation-of-duties approvals, comprehensive auditing, and fully controlled hardware lifecycles. Second, at the architecture level, the TEE is typically not the sole trust anchor; parallel protections such as MPC and HSMs reduce single-point-of-failure risk. Third, systems and business logic are tightly integrated, with heterogeneous architectures and network isolation in place, making generalized attacks and lateral movement difficult.

Mitigation Strategies

Defenses against the TEE.fail attack can be considered from three perspectives:

For TEE chips, to fundamentally solve the problem, deterministic memory encryption schemes should be avoided. Current mainstream TEE implementations still employ the “AES-XTS” deterministic scheme (fixed address + fixed key), which is susceptible to the construction of ‘ciphertext dictionaries’. A more robust direction is to introduce integrity verification and randomized encryption, cutting off the ciphertext reuse pathway at the fundamental level. However, at the current stage, randomized encryption schemes inevitably increase chip design complexity and incur performance overhead. Therefore, in actual design and deployment, chip manufacturers need to make trade-offs between security and performance.

Cloud providers and TEE silicon vendors should collaborate to provide support for location verification and CPU whitelisting. By incorporating location or cloud service checks into the remote attestation process, key provisioning could be restricted to CPU instances registered in secure data centers, preventing keys from being issued to unknown hardware. Ideally, cloud providers should offer an “In-Cloud Proof” feature, enabling verifiers to confirm that an attestation truly originates from a trusted cloud environment, not from an external or forged TEE.

Users and developers employing TEE technology should limit permissions and distribute trust by design. For instance, no single attested node should be granted direct access to all critical permissions. Another approach is to use Secure Multi-Party Computation (MPC) to split keys among multiple independent parties, preventing a single point of failure from compromising the entire system.
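
For example, a signing key can be additively split so that no single node, and no single compromised TEE, ever holds it in full. The sketch below shows only the share-splitting and recombination arithmetic over an assumed modulus; it is not a full MPC or threshold-signature protocol.

```python
# Minimal additive secret sharing of a key: each party holds one random share,
# and only the sum of all shares (mod n) reveals the key. This is the splitting
# arithmetic only, not a complete MPC or threshold-signing protocol.
import secrets

n = 2**256   # assumed modulus; a real deployment would use the curve's group order

def split(secret: int, parties: int):
    shares = [secrets.randbelow(n) for _ in range(parties - 1)]
    shares.append((secret - sum(shares)) % n)     # last share completes the sum
    return shares

def combine(shares):
    return sum(shares) % n

key = secrets.randbelow(n)
shares = split(key, 3)
assert combine(shares) == key       # all three parties together recover the key
print("any strict subset of shares is uniformly random and reveals nothing")
```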

Conclusion

The TEE.fail attack is another reminder that security is not a one-time achievement but a continuous arms race. Every new piece of research not only reveals potential vulnerabilities but also drives the evolution of defensive technologies. Whether it’s hardware encryption, trusted execution environments, or system-level protection mechanisms, all must advance through continuous validation and improvement. True security lies not in being “solved,” but in being “continuously researched and strengthened.” 

Safeheron will continue to follow the developments of TEE.fail and related research, and we will share our analysis and responses with the community as they emerge.

References

[1] Paper: WireTap: Breaking Server SGX via DRAM Bus Interposition 

[2] Paper: TEE.fail: Breaking Trusted Execution Environments via DDR5 Memory Bus Interposition

[3] WireTap.Fail

[4] TEE.Fail

[5] Elliptic Curve Digital Signature Algorithm

[6] Intel® SGX PCK Certificate and Certificate Revocation List Profile Specification

[7] Supporting Third Party Attestation for Intel® SGX with Intel® Data Center Attestation Primitives

[8] Intel® Architecture Memory Encryption Technologies

[9] [IEEE P1619] IEEE Standard for Cryptographic Protection of Data on Block-Oriented Storage Devices. April 2008

[10] Intel: More Information on Encrypted Memory Frameworks for Intel Confidential Computing

[11] AMD: Compromising Trusted Execution Environments through DDR5 Memory Bus Interposition
