**Tools**

**Reference Traces**

REASSURE produced a series of datasets obtained by measuring power traces on various physical platforms. These datasets are freely available, and can be used as a basis to test attacks, leakage detection methods, or for any other purpose. Our only requirement is to acknowledge the sources when communicating about subsequent activities (see license files in repositories).

- AES traces on ATmega processor for Deep Learning Testing
- AES traces on ChipWhisperer board
- AES traces on FPGA board

**Software tools**

REASSURE developed various tools to help develop or assess side-channel-resistant implementations. These tools are released under public licenses (see license files in repositories). Here too, our main requirement is to acknowledge the sources when communicating about subsequent activities.

**Publications**

**Fast Side-Channel Security Evaluation of ECC Implementations: Shortcut Formulas for Horizontal Side-Channel Attacks Against ECSM with the Montgomery Ladder**

### By Melissa Azouaoui, Romain Poussier, and François-Xavier Standaert, presented at COSADE 2019

Horizontal attacks are a suitable tool to evaluate the (nearly) worst-case side-channel security level of ECC implementations, due to the fact that they allow extracting a large amount of information from physical observations. Motivated by the difficulty of mounting such attacks and inspired by evaluation strategies for the security of symmetric cryptography implementations, we derive shortcut formulas to estimate the success rate of horizontal differential power analysis attacks against ECSM implementations, for efficient side-channel security evaluations. We then discuss the additional leakage assumptions that we exploit for this purpose, and provide experimental confirmation that the proposed tools lead to good predictions of the attacks’ success.

**Reducing a Masked Implementation’s Effective Security Order with Setup Manipulations And an Explanation Based on Externally-Amplified Couplings**

### By Itamar Levi, Davide Bellizia and François-Xavier Standaert, published in IACR Transactions on Cryptographic Hardware and Embedded Systems, 2019(2)

Couplings are a type of physical default that can violate the independence assumption needed for the secure implementation of the masking countermeasure. Two recent works by De Cnudde et al. put forward qualitatively that couplings can cause information leakages of lower order than theoretically expected. However, the (quantitative) amplitude of these lower-order leakages (e.g., measured as the amplitude of a detection metric such as Welch’s T statistic) was usually lower than the one of the (theoretically expected) d-th order leakages. So the actual security level of these implementations remained unaffected. In addition, in order to make the couplings visible, the authors sometimes needed to amplify them internally (e.g., by tweaking the placement and routing or iterating linear operations on the shares). In this paper, we first show that the amplitude of low-order leakages in masked implementations can be amplified externally, by tweaking side-channel measurement setups in a way that is under control of a power analysis adversary.

**Study of Deep Learning Techniques for Side-Channel Analysis and Introduction to ASCAD Database**

### By Ryad Benadjila, Emmanuel Prouff, Rémi Strullu, Eleonora Cagli and Cécile Dumas, published in IACR Cryptology eprint archive 2018

To provide insurance on the resistance of a system against side-channel analysis, several national or private schemes are today promoting an evaluation strategy, common in classical cryptography, which is focussing on the most powerful adversary who may train to learn about the dependency between the device behaviour and the sensitive data values. Several works have shown that this kind of analysis, known as Template Attacks in the side-channel domain, can be rephrased as a classical Machine Learning classification problem with learning phase. Following the current trend in the latter area, recent works have demonstrated that deep learning algorithms were very efficient to conduct security evaluations of embedded systems and had many advantages compared to the other methods. Unfortunately, their hyper-parametrization has often been kept secret by the authors who only discussed on the main design principles and on the attack efficiencies.

**Masking Proofs Are Tight and How to Exploit it in Security Evaluations**

### By Vincent Grosso and François-Xavier Standaert, presented at Advances in Cryptology – EUROCRYPT 2018

Evaluating the security level of a leaking implementation against side-channel attacks is a challenging task. This is especially true when countermeasures such as masking are implemented since in this case: (i) the amount of measurements to perform a key recovery may become prohibitive for certification laboratories, and (ii) applying optimal (multivariate) attacks may be computationally intensive and technically challenging. In this paper, we show that by taking advantage of the tightness of masking security proofs, we can significantly simplify this evaluation task in a very general manner.

**Leakage Detection with the χ²-Test**

### By Amir Moradi, Bastian Richter, Tobias Schneider and François-Xavier Standaert, published in IACR Transactions on Cryptographic Hardware and Embedded Systems, 2018(1)

We describe how Pearson’s χ² test can be used as a natural complement to Welch’s t-test for black box leakage detection. In particular, we show that by using these two tests in combination, we can mitigate some of the limitations due to the moment-based nature of existing detection techniques based on Welch’s t-test (e.g., for the evaluation of higher-order masked implementations with insufficient noise). We also show that Pearson’s χ² test is naturally suited to analyze threshold implementations with information lying in multiple statistical moments, and can be easily extended to a distinguisher for key recovery attacks. As a result, we believe the proposed test and methodology are interesting complementary ingredients of the side-channel evaluation toolbox, for black box leakage detection and non-profiled attacks, and as a preliminary before more demanding advanced analyses.
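As a rough illustration of why the two tests complement each other, the Python sketch below compares two classes of toy leakage samples whose means coincide but whose distributions differ. The function names (`welch_t`, `pearson_chi2`), the toy data, and the choice of one histogram bin per distinct sample value are assumptions made for this example; they are not taken from the paper.

```python
import math
from collections import Counter


def welch_t(a, b):
    """Welch's t statistic for two samples (sensitive to the means only)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


def pearson_chi2(a, b):
    """Pearson's chi-squared statistic and degrees of freedom for the
    2 x k contingency table built from the histograms of a and b."""
    hist_a, hist_b = Counter(a), Counter(b)
    bins = sorted(set(hist_a) | set(hist_b))
    total = len(a) + len(b)
    chi2 = 0.0
    for row, row_total in ((hist_a, len(a)), (hist_b, len(b))):
        for value in bins:
            col_total = hist_a[value] + hist_b[value]
            expected = row_total * col_total / total
            chi2 += (row[value] - expected) ** 2 / expected
    dof = len(bins) - 1  # (rows - 1) * (columns - 1) with two rows
    return chi2, dof


# Toy samples (assumed): identical mean of 5, different spread.
fixed_class = [4, 5, 6] * 100
random_class = [3, 5, 7] * 100

t_stat = welch_t(fixed_class, random_class)
chi2_stat, dof = pearson_chi2(fixed_class, random_class)
print(f"Welch t = {t_stat:.3f}")
print(f"chi2 = {chi2_stat:.1f} with {dof} degrees of freedom")
```

In this toy case the t statistic is zero because the class means are identical, while the χ² statistic is large because the histograms differ. Turning the statistic into a p-value requires the χ² distribution with `dof` degrees of freedom, which is left out of the sketch.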

**Start Simple and then Refine: Bias-Variance Decomposition as a Diagnosis Tool for Leakage Profiling**

### By Liran Lerman, Nikita Veshchikov, Olivier Markowitch, François-Xavier Standaert, published in IEEE Transactions on Computers 2018

Evaluating the resistance of cryptosystems to side-channel attacks is an important research challenge. Profiled attacks reveal the degree of resilience of a cryptographic device when an adversary examines its physical characteristics. So far, evaluation laboratories launch several physical attacks (based on engineering intuitions) in order to find one strategy that eventually extracts secret information (such as a secret cryptographic key). The certification step represents a complex task because in practice the evaluators have tight memory and time constraints. In this paper, we propose a principled way of guiding the design of the most successful evaluation strategies thanks to the (bias-variance) decomposition of a security metric of profiled attacks. Our results show that we can successfully apply our framework on unprotected and protected algorithms implemented in software and hardware.

**Towards Sound and Optimal Leakage Detection Procedure**

### By Liwei Zhang, A. Adam Ding, François Durvaux, François-Xavier Standaert, and Yunsi Fei, presented at CARDIS 2017

Evaluation of side channel leakage for the embedded crypto systems requires sound leakage detection procedures. We relate the test vector leakage assessment (TVLA) procedure to the statistical minimum p-value (mini-p) procedure, and propose a sound method of deciding leakage existence in the statistical hypothesis setting. To improve detection, an advanced statistical procedure Higher Criticism (HC) is applied. The detection of leakage existence and the identification of exploitable leakage are separated when there are multiple leakage points.
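The difference between a per-point threshold and a minimum p-value decision can be sketched as follows. This is a simplified illustration rather than the procedure from the paper: it assumes a normal approximation for the t statistics, the commonly used TVLA threshold of 4.5, independent trace points, and a Šidák correction; the function names and example values are hypothetical.

```python
import math


def two_sided_p_value(t_stat):
    """Two-sided p-value of a t statistic under a standard normal
    approximation (reasonable when many traces are available)."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


def tvla_decision(t_stats, threshold=4.5):
    """TVLA-style rule: flag leakage if any |t| exceeds the threshold."""
    return any(abs(t) > threshold for t in t_stats)


def mini_p_decision(t_stats, alpha=0.01):
    """Mini-p rule: compare the smallest p-value with a Sidak-corrected
    per-point level, so that the overall false-positive rate is alpha
    when the points are independent."""
    per_point_alpha = 1.0 - (1.0 - alpha) ** (1.0 / len(t_stats))
    min_p = min(two_sided_p_value(t) for t in t_stats)
    return min_p < per_point_alpha, min_p, per_point_alpha


# Hypothetical t statistics, one per trace point.
leaky_points = [0.5, -1.2, 4.7, 2.0]
quiet_points = [0.5, -1.2, 2.0, 1.1]

for name, stats in (("leaky", leaky_points), ("quiet", quiet_points)):
    detected, min_p, level = mini_p_decision(stats)
    print(name, "TVLA:", tvla_decision(stats),
          "mini-p:", detected, f"(min p = {min_p:.2e}, level = {level:.2e})")
```

Unlike a fixed threshold, the corrected level shrinks as the number of trace points grows, which keeps the overall false-positive rate tied to the chosen `alpha`.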

**Very High Order Masking: Efficient Implementation and Security Evaluation**

### By Anthony Journault, François-Xavier Standaert, presented at CHES 2017

In this paper, we study the performances and security of recent masking algorithms specialized to parallel implementations in a 32-bit embedded software platform, for the standard AES Rijndael and the bitslice cipher Fantomas. By exploiting the excellent features of these algorithms for bitslice implementations, we first extend the recent speed records of Goudarzi and Rivain (presented at Eurocrypt 2017) and report realistic timings for masked implementations with 32 shares. We then observe that the security level provided by such implementations is uneasy to quantify with current evaluation tools.

**Private Multiplication over Finite Fields**

### By Sonia Belaïd, Fabrice Benhamouda, Alain Passelègue, Emmanuel Prouff, Adrian Thillard, and Damien Vergnaud, presented at CRYPTO 2017

The notion of privacy in the probing model, introduced by Ishai, Sahai, and Wagner in 2003, is nowadays frequently involved to assess the security of circuits manipulating sensitive information. However, provable security in this model still comes at the cost of a significant overhead both in terms of arithmetic complexity and randomness complexity. In this paper, we deal with this issue for circuits processing multiplication over finite fields. Our contributions are manifold. Extending the work of Belaïd, Benhamouda, Passelègue, Prouff, Thillard, and Vergnaud at Eurocrypt 2016, we introduce an algebraic characterization of the privacy for multiplication in any finite field and we propose a novel algebraic characterization for non-interference (a stronger security notion in this setting). Then, we present two generic constructions of multiplication circuits in finite fields that achieve non-interference in the probing model.

**Getting the Most Out of Leakage Detection Statistical Tools and Measurement Setups Hand in Hand**

### By Santos Merino del Pozo and François-Xavier Standaert, presented at COSADE 2017

In this work, we provide a concrete investigation of the gains that can be obtained by combining good measurement setups and efficient leakage detection tests to speed up evaluation times. For this purpose, we first analyze the quality of various measurement setups. Then, we highlight the positive impact of a recent proposal for efficient leakage detection, based on the analysis of a (few) pair(s) of plaintexts. Finally, we show that the combination of our best setups and detection tools allows detecting leakages for a noisy threshold implementation of the block cipher PRESENT after an intensive measurement phase, while either worse setups or less efficient detection tests would not succeed in detecting these leakages. Overall, our results show that a combination of good setups and fast leakage detection can turn security evaluation times from days to hours (for first-order secure implementations) and even from weeks to days (for higher-order secure implementations).

**Applying Horizontal Clustering Side-Channel Attacks on Embedded ECC Implementations**

### By Erick Nascimento and Łukasz Chmielewski, presented at CARDIS 2017

Side-channel attacks are a threat to cryptographic algorithms running on embedded devices. Public-key cryptosystems, including elliptic curve cryptography (ECC), are particularly vulnerable because their private keys are usually long-term. Well known countermeasures like regularity, projective coordinates and scalar randomization, among others, are used to harden implementations against common side-channel attacks like DPA.

Horizontal clustering attacks can theoretically overcome these countermeasures by attacking individual side-channel traces. In practice horizontal attacks have been applied to overcome protected ECC implementations on FPGAs. However, it has not been known yet whether such attacks can be applied to protected implementations working on embedded devices, especially in a non-profiled setting.

Our best attack has success rates of 97.64% and 99.60% for cswap-arith and cswap-pointer, respectively. This means that at most 6 and 2 bits are incorrectly recovered, and therefore, a subsequent brute-force can fix them in reasonable time. Furthermore, our horizontal clustering framework used for the aforementioned attacks can be applied against other protected implementations.

**Connecting and Improving Direct Sum Masking and Inner Product Masking**

### By Romain Poussier, Qian Guo, François-Xavier Standaert, Sylvain Guilley, Claude Carlet, presented at CARDIS 2017

Direct Sum Masking (DSM) and Inner Product (IP) masking are two types of countermeasures that have been introduced as alternatives to simpler (e.g., additive) masking schemes to protect cryptographic implementations against side-channel analysis. In this paper, we first show that IP masking can be written as a particular case of DSM. We then analyze the improved security properties that these (more complex) encodings can provide over Boolean masking. For this purpose, we introduce a slight variation of the probing model, which allows us to provide a simple explanation to the “security order amplification” for such masking schemes that was put forward at CARDIS 2016. We then use our model to search for new instances of masking schemes that optimize this security order amplification. We finally discuss the relevance of this security order amplification (and its underlying assumption of linear leakages) based on an experimental case study.

**A Systematic Approach to the Side-Channel Analysis of ECC Implementations with Worst-Case Horizontal Attacks**

### By Romain Poussier, Yuanyuan Zhou, François-Xavier Standaert, presented at CHES 2017

The wide number and variety of side-channel attacks against scalar multiplication algorithms makes their security evaluations complex, in particular in case of time constraints making exhaustive analyses impossible. In this paper, we present a systematic way to evaluate the security of such implementations against horizontal attacks. As horizontal attacks allow extracting most of the information in the leakage traces of scalar multiplications, they are suitable to avoid risks of overestimated security levels. For this purpose, we additionally propose to use linear regression in order to accurately characterize the leakage function and therefore approach worst-case security evaluations. We then show how to apply our tools in the contexts of ECDSA and ECDH implementations, and validate them against two targets: a Cortex-M4 and a Cortex-A8 micro-controllers.

**Convolutional Neural Networks with Data Augmentation against Jitter-Based Countermeasures – Profiling Attacks without Pre-Processing**

### By Eleonora Cagli, Cécile Dumas, Emmanuel Prouff, presented at CHES 2017

In the context of the security evaluation of cryptographic implementations, profiling attacks (aka Template Attacks) play a fundamental role. Nowadays the most popular Template Attack strategy consists in approximating the information leakages by Gaussian distributions. Nevertheless this approach suffers from the difficulty to deal with both the traces misalignment and the high dimensionality of the data. This forces the attacker to perform critical preprocessing phases, such as the selection of the points of interest and the realignment of measurements. Some software and hardware countermeasures have been conceived exactly to create such a misalignment.

**Towards Practical Tools for Side Channel Aware Software Engineering: ‘Grey Box’ Modelling for Instruction Leakages**

### By David McCann, Elisabeth Oswald, Carolyn Whitnall, presented at USENIX Security Symposium 2017

Power (along with EM, cache and timing) leaks are of considerable concern for developers who have to deal with cryptographic components as part of their overall software implementation, in particular in the context of embedded devices. Whilst there exist some compiler tools to detect timing leaks, similar progress towards pinpointing power and EM leaks has been hampered by limits on the amount of information available about the physical components from which such leaks originate.

**Categorising and Comparing Cluster-Based DPA Distinguishers**

### By Xinping Zhou, Carolyn Whitnall, Elisabeth Oswald, Degang Sun, Zhu Wang, presented at SAC 2017

Side-channel distinguishers play an important role in differential power analysis, where real world leakage information is compared against hypothetical predictions in order to guess at the underlying secret key. A class of distinguishers which can be described as ‘cluster-based’ have the advantage that they are able to exploit multi-dimensional leakage samples in scenarios where only loose, ‘semi-profiled’ approximations of the true leakage forms are available. This is by contrast with univariate distinguishers exploiting only single points (e.g. correlation), and Template Attacks requiring concise fitted models which can be overly sensitive to mismatch between the profiling and attack acquisitions.

**Deliverables**

**Deliverable D1.2: Interim Portfolio of Best Methods and Improved Evaluation Techniques**

This is the second deliverable produced within work package 1 (WP1) of the project. The goal of this report is to act as a portfolio of improved evaluation techniques and their figures of merit.

In the context of our previously introduced detect-map-exploit core framework (see Deliverable 1.1), we start off by discussing the advantages and limitations of the widely-deployed Test Vector Leakage Assessment (TVLA) methodology. Properly redefining the objectives of leakage detection enables us to identify and address existing open questions and take steps towards a better and faster leakage detection. Performing leakage detection efficiently is a central question for side-channel leakage evaluation. Therefore, we compare and contrast two methods of computing moments for detection tests, namely an offline method and an online (histogram-based) method.

memory systems. We conclude that the histograms method is not deemed usable in the current state for realistic examples, as the storage of the histograms using dense arrays uses a huge amount of memory. A benchmark run on a real-world example shows that the scalability of the offline method could still be improved.
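To make the memory trade-off concrete, the sketch below contrasts a histogram-based accumulation of moments with a two-pass computation over stored traces. It is only an illustration under stated assumptions: traces are lists of integer samples in `range(num_bins)`, histograms are stored as dense lists, and all function names are hypothetical rather than taken from the deliverable.

```python
def accumulate_histograms(traces, num_bins=256):
    """Build one dense histogram per time sample; each trace is a list
    of integer sample values in range(num_bins)."""
    num_points = len(traces[0])
    hists = [[0] * num_bins for _ in range(num_points)]
    for trace in traces:
        for point, value in enumerate(trace):
            hists[point][value] += 1
    return hists


def moments_from_histogram(hist):
    """Mean and unbiased sample variance recovered from one histogram."""
    n = sum(hist)
    mean = sum(v * c for v, c in enumerate(hist)) / n
    var = sum(c * (v - mean) ** 2 for v, c in enumerate(hist)) / (n - 1)
    return mean, var


def moments_offline(traces, point):
    """Reference: two-pass computation over the stored traces."""
    column = [trace[point] for trace in traces]
    mean = sum(column) / len(column)
    var = sum((x - mean) ** 2 for x in column) / (len(column) - 1)
    return mean, var


# Tiny toy data set (assumed): three traces with two samples each.
traces = [[1, 2], [3, 2], [5, 2]]
hists = accumulate_histograms(traces)
for point in range(2):
    print(point, moments_from_histogram(hists[point]),
          moments_offline(traces, point))
```

With dense storage, the histogram approach keeps `num_bins` counters for every time sample regardless of how many traces are processed, whereas the offline approach needs the traces themselves to be available for each pass.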

Next, we discuss Deep Learning (DL) techniques as black box solvers that are proven to be effective for side-channel evaluation. We provide an overview of the methodology that is commonly applied to side-channel attacks when neural networks are considered as the model for the key recovery. We present two case studies using convolutional neural networks (CNN): the first on a protected AES implementation (DPA Contest V4), and the second on an ECC implementation protected with misalignment. We study practical issues and limitations that arise with DL-based techniques and outline directions to improve or solve these obstacles by tweaking the different parameters of the analysis. Finally, we propose a novel idea for DL-based SCA in section 6.4 based on a multi-channel approach.

**Deliverable D2.1: Shortcut Formulas for Side Channel Evaluation**

This deliverable surveys a number of approaches to shortcut the effort required to assess implementations and devices with respect to their susceptibility to side channel attacks. Our approach aligns with the divide and conquer nature of most side channel attacks and hence we touch on shortcuts that apply to the distinguisher statistics (the divide step) and the key rank (the conquer step).

Within REASSURE we set ourselves the challenge to assess how such shortcut approaches could nevertheless be leveraged to improve evaluations. This deliverable hence is the foundational step from which we will continue in two directions. Firstly, within the next 6 months, industrial partners will provide some insight into at what stage shortcuts are particularly appropriate. Secondly, some shortcuts will be implemented within WP3.2 in the context of the REASSURE simulation tool.

**Deliverable D2.2: Interim report on automation**

This document represents interim progress of the REASSURE consortium towards the goal of flexible evaluation methods that are usable by non-domain experts.

We begin by clarifying what we mean by automation: the facility to perform a task with a minimum of user input and interaction (expert or otherwise). This implies that the necessary parameters for a sound, fair and reproducible analysis are derived or estimated from the available data as far as possible. It also requires that the assumptions and limitations of the analysis be made transparent in the output, in order to guide (potentially non-expert) users towards sound conclusions.

On the basis that detection and mapping tasks are the most promising for automation, we then provide a more detailed account of statistical hypothesis testing and of how to perform the multiple comparison corrections and statistical power analyses required to carry out such procedures fairly and transparently. In the case of leakage detection strategies based on simple assumptions, such as those relying on *t*-tests, it is possible to arrive at analytical formulae for the latter; for complex methods, such as those estimating information theoretic quantities, the only options currently available (as far as we are aware) involve referring to empirically-derived indicators from (hopefully) representative scenarios. A number of currently used methods (in particular, those based on ANOVA-like tests, and those based on correlation) sit in a middle ground, where the side-channel literature has not yet fully explored the solutions already proposed in the statistics literature. This represents an avenue for future work, which could potentially make it possible to safely automate a wider range of tests, as well as providing insights into the trade-offs between them.
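For tests of means, the kind of analytical formula mentioned above can be illustrated with the textbook sample-size expression for a two-sided, two-sample test. The sketch below is not taken from the deliverable: it assumes a normal approximation, equal class sizes and variances, and a Bonferroni correction over the trace points, and the function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def traces_per_class(effect_size, alpha=0.05, power=0.8, num_tests=1):
    """Approximate number of traces per class needed to detect a
    standardized mean difference `effect_size` with the requested power,
    using n = 2 * (z_{1 - alpha'/2} + z_{power})^2 / effect_size^2,
    where alpha' = alpha / num_tests (Bonferroni correction)."""
    corrected_alpha = alpha / num_tests
    z_alpha = NormalDist().inv_cdf(1.0 - corrected_alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2.0 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# A single test versus a correction over 1000 trace points (assumed values).
print(traces_per_class(0.5))
print(traces_per_class(0.5, num_tests=1000))
print(traces_per_class(0.05, num_tests=1000))
```

The required number of traces grows with the inverse square of the effect size and increases further once the significance level is corrected for the number of points tested.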

We discuss the particular challenges of extending known results for univariate, ‘first order’ leakages to the sorts of higher-order univariate and multivariate leakages arising from protected schemes. Tricks to ‘shift’ information into distribution means via pre-processing inevitably violate the assumptions which make the statistical power analysis of *t*-tests straightforward. Meanwhile, unless evaluators have control over (or at least access to) any randomness, as well as sufficient details about the implementation specification, the task of searching for jointly leaking tuples can quickly become infeasible as the leakage order increases. We consider that more work needs to be done from a basic methodological perspective before higher-order detection can realistically be considered a candidate for automation.

Finally, we propose that there exists a need to devise measures of ‘coverage’, inspired by the notion of code coverage in software testing. We offer some suggestions of what this might look like: input-based scores indicating the proportion of the total population sampled for the experiment (under different assumptions about the form of the leakage); intermediate value-based scores indicating the fraction of total sensitive intermediates targeted by the tests; or even strategies whereby profiling information is used to design and test specific corner cases.
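One very simple reading of the first two suggestions is sketched below. Since the deliverable leaves the formal definition of coverage metrics open, these scores are purely illustrative assumptions, and the function names, input values and intermediate labels are hypothetical.

```python
def input_coverage(sampled_inputs, population_size):
    """Fraction of the input population that appears at least once
    among the inputs used in the experiment."""
    return len(set(sampled_inputs)) / population_size


def intermediate_coverage(targeted, sensitive):
    """Fraction of the sensitive intermediates that are targeted by
    at least one test."""
    sensitive = set(sensitive)
    return len(sensitive & set(targeted)) / len(sensitive)


# Hypothetical experiment: one plaintext byte (256 possible values) and
# four sensitive intermediates, of which two are targeted by tests.
plaintext_bytes = [0x00, 0x0F, 0xF0, 0xFF, 0x0F]
sensitive_values = ["sbox_out_0", "sbox_out_1", "sbox_out_2", "sbox_out_3"]
targeted_values = ["sbox_out_0", "sbox_out_1"]

print(input_coverage(plaintext_bytes, 256))
print(intermediate_coverage(targeted_values, sensitive_values))
```

Scores of this kind only count what was exercised; they say nothing about whether the exercised inputs or intermediates actually leak.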

We intend to make the formal definition of coverage metrics an avenue for exploration during the remainder of the project, along with other outstanding questions identified in this document. A future deliverable (D2.5) will report on progress made to that end, and finalise our findings and recommendations.

**Deliverable D2.3: Shortcut Formulas for Side-Channel Evaluation**

This deliverable surveys a number of approaches to shortcut the effort required to assess implementations and devices with respect to their susceptibility to side channel attacks. Our approach aligns with the divide and conquer nature of most side channel attacks and hence we touch on shortcuts that apply to the distinguisher statistics (the divide step) and the key rank (the conquer step).

Within REASSURE we set ourselves the challenge to assess how such shortcut approaches could nevertheless be leveraged to improve evaluations. This deliverable attempts to highlight at what stage shortcuts are particularly appropriate. In addition, some shortcuts will be implemented within WP3.2 in the context of the REASSURE simulation tool.

**Deliverable D2.4: Report on Instruction Level Profiling**

This document reports on interim progress of the REASSURE consortium towards methodologies to create representative instruction-level profiles of microprocessors of at least medium complexity.

Next we summarise a range of existing tools from industry and academia, including one (ELMO) which we argue best sets the precedent that we plan to follow for REASSURE. We explain the modelling procedure used in the construction of ELMO and report on the results of our own model-building endeavour using data from an alternative implementation of the same board (an ARM Cortex M0). In particular, we show that there are broad similarities between the implementations, but some differences that will require attention if models are expected to port from one to another.

The next steps for REASSURE will be to explore improvements to the model building procedure, as well as its extension to more complex devices, in order to eventually release our own simulator tools to the wider side channel research and evaluation community.

**Deliverable D2.5: Final report on automation**

This document is an update to deliverable 2.2 detailing further progress of the REASSURE consortium towards the goal of flexible evaluation methods that are usable by non-domain experts.

We begin by clarifying what we mean by automation: the facility to perform a task with a minimum of user input and interaction (expert or otherwise). This implies that the necessary parameters for a sound, fair and reproducible analysis are derived or estimated from the available data as far as possible. It also requires that the assumptions and limitations of the analysis be made transparent in the output, in order to guide (potentially non-expert) users towards sound conclusions.

On the basis that detection and mapping tasks are the most promising for automation, we then provide a more detailed account of statistical hypothesis testing and of how to perform the statistical power analyses and multiple comparison corrections required to carry out such procedures fairly and transparently. In the case of leakage detection strategies based on simple assumptions, such as those relying on t-tests, it is possible to arrive at analytical formulae for the latter; for complex methods, such as those estimating information theoretic quantities, the only options currently available (as far as we are aware) involve referring to empirically-derived indicators from (hopefully) representative scenarios.

We discuss the particular challenges of extending known results for univariate, ‘first order’ leakages to the sorts of higher-order univariate and multivariate leakages arising from protected schemes. Tricks to ‘shift’ information into distribution means via pre-processing inevitably violate the assumptions which make the statistical power analysis of t-tests straightforward. We consider that more work needs to be done from a basic methodological perspective before higher-order detection can realistically be considered a candidate for automation.

Inspired by the notion of code coverage in software testing, we suggest the need for measures of leakage evaluation coverage. We offer some suggestions of what these might look like: input-based scores indicating the proportion of the total population sampled for the experiment (under different assumptions about the form of the leakage); intermediate value-based scores indicating the fraction of total sensitive intermediates targeted by the tests; or even strategies whereby profiling information is used to design and test specific corner cases.

With a non-domain expert end user in mind we then consider how to apply recommendations from this deliverable to the tools produced in work package 3. In particular, we explain how to set the default test parameters for the automated simulation and analysis of traces so as to require minimal input from the developer in an iterative design process.

We lastly consider the automation potential of deep learning as a tool for side-channel analysis. WP1 (see, in particular, D1.2) has addressed deep learning from an efficiency perspective, but its ability to extract information from large amounts of raw data with minimal input on the part of the user makes it highly relevant to this work package also. We describe our efforts towards making best use of its capabilities, including via generative adversarial networks, artificial noise to build more portable neural network models, and a proposal whereby collections of traces are jointly classified as having been generated under a particular key guess or not. We also discuss some of the obstacles to automation, such as the opaque nature of deep learning models, which may make it difficult to draw practically applicable conclusions about the form and location of leakage after an evaluation.

**Deliverable D5.2: Data management plan**

This document represents the first version of the Data Management Plan (DMP) of the REASSURE project. It is a living document that will be updated throughout the project. The document focuses on identifying the type of data that will be openly shared, namely leakage traces acquired and processed during a side-channel analysis of an embedded security device, as well as the intended target audience and the data format.