**Tools**

**Reference Traces **

REASSURE produced a series of datasets obtained by measuring power traces on various physical platforms. These datasets are freely available, and can be used as a basis to test attacks, leakage detection methods, or for any other purpose. Our only requirement is to acknowledge the sources when communicating about subsequent activities (see license files in repositories).

- AES traces on ATmega processor for Deep Learning Testing
- AES traces on ChipWhisperer board
- AES traces on FPGA board
- ECC traces on ARM core

**Software tools **

REASSURE developed various tools to help develop or assess side-channel-resistant implementations. These tools are released under public licenses (see license files in repositories). Here too, our main requirement is to acknowledge the sources when communicating about subsequent activities:

- Inspector Cloud, a cloud-based hardware security solution for Side Channel Analysis
- A secure Implementations of AES-128 for ATmega
- JuliaSCA, a toolbox for side-channel attacks
- ELMO, a side-channel emulator
- Shortcut formulas: a Matlab implementation of the convolution method

**Online trainings**

**Side Channel Analysis (SCA) for IoT developers – A practical introduction**

Click here to start the training

Are you looking for an easy introduction to side channel analysis? Developed by Riscure and powered by the Reassure project, we bring you an online training which is freely and openly available to anyone who is interested in learning more about SCA.

This course is for you if any of the following statements are true:

- You heard about a magic technique of extracting keys from devices simply by looking at them, and want to know how this works
- You are an IoT developer who wants to add crypto to their product
- You listened to talks on side channel analysis, found them confusing, and would like to understand the details
- You are a CtF player who wants to prepare for SCA challenges.

The end goal of this training is to enable you to **protect your devices and applications against basic side-channel analysis attacks**. Your journey will first take you through the theoretical foundations: you will learn what a side channel is, get familiar with practical examples and understand the typical flow of an attack.

**Cardis tutorial’s scripts**

A tutorial devoted to leakage detection was held in November 2018. The material presented at the tutorial, including python scripts, can still be downloaded by following this link.

**Understanding leakage detection**

Click here to start the training

The aim of this course is to help the audience grasp the intuition behind leakage detection methodologies and achieve a sound technical appreciation of how and why they work. In the course, we motivate and describe the current popular practice, including correlation-based tests, and expose some of the limitations, with a special focus on ISO standard 17825. The learning goal of this advanced course is to equip evaluators to carry out leakage detection tests sensibly and interpret the outcomes responsibly.

This course is structured as follows. We start off by introducing the problem, namely the presence of data-dependencies in side-channel measurements, and the most common strategy to exploit such information: Differential Power Analysis (DPA).We then build a case for why statistical methods are necessary and develop the particular rationale behind the t-test before describing it more formally. Finally, we show how the t-test is being applied within the TVLA framework and discuss some of the issues affecting its usefulness.

The online training is available on-demand and free of charge on the project website. The approximate duration of this course is 6 hours.

**Publications**

**Learning when to stop: a mutual information approach to prevent overfitting in profiled side-channel analysis**

### By Guilherme Perin, Ileana Buhan and Stjepan Picek , published in IACR Cryptology ePrint Archive

Today, deep neural networks are a common choice for conducting the profiled side-channel analysis. Such techniques commonly do not require pre-processing, and yet, they can break targets protected with countermeasures. Unfortunately, it is not trivial to find neural network hyper-parameters that would result in such top-performing attacks. The hyper-parameter leading the training process is the number of epochs during which the training happens. If the training is too short, the network does not reach its full capacity, while if the training is too long, the network overfits, and is not able to generalize to unseen examples. Finding the right moment to stop the training process is particularly difficult for side-channel analysis as there are no clear connections between machine learning and side-channel metrics that govern the training and attack phases, respectively.

**Strength in Numbers: Improving Generalization with Ensembles in Profiled Side-channel Analysis**

### By Guilherme Perin, Lukasz Chmielewski and Stjepan Picek , published in IACR Cryptology ePrint Archive

The adoption of deep neural networks for profiled side-channel attacks provides powerful options for leakage detection and key retrieval of secure products. When training a neural network for side-channel analysis, it is expected that the trained model can implement an approximation function that can detect leaking side-channel samples and, at the same time, be insensible to noisy (or non-leaking) samples. This outlines a generalization situation where the model can identify the main representations learned from the training set in a separate test set. In this paper, we first discuss how output class probabilities represent a strong metric when conducting the side-channel analysis. Further, we observe that these output probabilities are sensitive to small changes, like the selection of specific test traces or weight initialization for a neural network.

**Leakage Certification Revisited: Bounding Model Errors in Side-Channel Security Evaluations**

### By Olivier Bronchain, Julien M. Hendrickx, Clément Massart, Alex Olshevsky, François-Xavier Standaert, presented at CRYPTO 2019

Leakage certification aims at guaranteeing that the statistical models used in side-channel security evaluations are close to the true statistical distribution of the leakages, hence can be used to approximate a worst-case security level. Previous works in this direction were only qualitative: for a given amount of measurements available to an evaluation laboratory, they rated a model as “good enough” if the model assumption errors (i.e., the errors due to an incorrect choice of model family) were small with respect to the model estimation errors. We revisit this problem by providing the first quantitative tools for leakage certification. For this purpose, we provide bounds for the (unknown) Mutual Information metric that corresponds to the true statistical distribution of the leakages based on two easy-to-compute information theoretic quantities: the Perceived Information, which is the amount of information that can be extracted from a leaking device thanks to an estimated statistical model, possibly biased due to estimation and assumption errors, and the Hypothetical Information, which is the amount of information that would be extracted from an hypothetical device exactly following the model distribution.

**Provable Order Amplification for Code-based Masking: How to Avoid Non-linear Leakages due to Masked Operations**

### By Weijia Wang, Yu Yu, François-Xavier Standaert, published in IEEE Transactions on Information Forensics and Security

Code-based masking schemes have been shown to provide higher theoretical security guarantees than the Boolean masking. In particular, one interesting feature put forward at CARDIS 2016 and then analyzed at CARDIS 2017 is the so-called security order amplification: under the assumption that the leakage function is linear, it guarantees that an implementation performing only linear operations will have a security order in the bounded moment leakage model larger than d−1, where d is the number of shares.

The main question regarding this feature is its practical relevance. First of all, concrete block ciphers do not only perform linear operations. Second, it may be that actual leakage functions are not perfectly linear (raising questions regarding what happens when one deviates from such assumptions).

**Multi-Tuple Leakage Detection and the Dependent Signal Issue**

### By Olivier Bronchain, Tobias Schneider and François-Xavier Standaert, published in TCHES 2019

Leakage detection is a common tool to quickly assess the security of a cryptographic implementation against side-channel attacks. The Test Vector Leakage Assessment (TVLA) methodology using Welch’s t-test, proposed by Cryptography Research, is currently the most popular example of such tools, thanks to its simplicity and good detection speed compared to attack-based evaluations. However, as any statistical test, it is based on certain assumptions about the processed samples and its detection performances strongly depend on parameters like the measurement’s Signal-to-Noise Ratio (SNR), their degree of dependency, and their density, i.e., the ratio between the amount of informative and non-informative points in the traces.

In this paper, we argue that the correct interpretation of leakage detection results requires knowledge of these parameters which are a priori unknown to the evaluator,and, therefore, poses a non-trivial challenge to evaluators (especially if restricted to only one test).

**Beyond Algorithmic Noise or How to Shuffle Parallel Implementations?**

### By Itamar Levi, Davide Bellizia, François-Xavier Standaert, published in International Journal of Circuit Theory and Applications,

Noise is an important ingredient for side-channel-analysis countermeasures security. However, physical noise is in most cases not sufficient to achieve high security levels. As an outcome, designers traditionally aim to emulate noise by harnessing shuffling in the time-domain and algorithmic noise in the amplitude-domain. On one hand,harnessing algorithmic-noise is limited in architectures/devices which have a limited data-path width. On the other hand, the performance degradation due to shuffling is considerable.

**Gradient Visualization for General Characterization in Profiling Attacks**

### By Loïc Masure, Cécile Dumas and Emmanuel Prouff, presented at COSADE 2019

In Side-Channel Analysis (SCA), several papers have shown that neural networks could be trained to efficiently extract sensitive information from implementations running on embedded devices. This paper introduces a new tool called Gradient Visualization that aims to proceed a post-mortem information leakage characterization after the successful training of a neural network. It relies on the computation of the gradient of the loss function used during the training. The gradient is no longer computed with respect to the model parameters, but with respect to the input trace components. Thus, it can accurately highlight temporal moments where sensitive information leaks. We theoretically show that this method, based on Sensitivity Analysis, may be used to efficiently localize points of interest in the SCA context.

**A Comprehensive Study of Deep Learning for Side-Channel Analysis**

### By Loïc Masure, Cécile Dumas and Emmanuel Prouff, published in TCHES 2019

Recently, several studies have been published on the application of deep learning to enhance Side-Channel Attacks (SCA). These seminal works have practically validated the soundness of the approach, especially against implementations protected by masking or by jittering. Concurrently, important open issues have emerged. Among them, the relevance of machine (and thereby deep) learning based SCA has been questioned in several papers based on the lack of relation between the accuracy , a typical performance metric used in machine learning, and common SCA metrics like the Guessing entropy or the key-discrimination success rate . Also, the impact of the classical side-channel counter-measures on the efficiency of deep learning has been questioned, in particular by the semi-conductor industry. Both questions enlighten the importance of studying the theoretical soundness of deep learning in the context of side-channel and of developing means to quantify its efficiency, especially with respect to the optimality bounds published so far in the literature for side-channel leakage exploitation.

**Share-slicing: Friend or Foe?**

### By Si Gao, Ben Marshall, Dan Page and Elisabeth Oswald, published in TCHES 2019

Masking is a well loved and widely deployed countermeasure against side channel attacks, in particular in software. Under certain assumptions (w.r.t. independence and noise level), masking provably prevents attacks up to a certain security order and leads to a predictable increase in the number of required leakages for successful attacks beyond this order. The noise level in typical processors where software masking is used may not be very high, thus low masking orders are not sufficient for real world security. Higher order masking however comes at a great cost, and therefore a number techniques have been published over the years that make such implementations more efficient via parallelisation in the form of bit or share slicing.

**Key Enumeration from the Adversarial Viewpoint When to Stop Measuring and Start Enumerating?**

### By Melissa Azouaoui, Romain Poussier, François-Xavier Standaert, Vincent Verneuil, presented at CARDIS 2019

In this work, we formulate and investigate a pragmatic question related to practical side-channel attacks complemented with key enumeration. In a real attack scenario, after an attacker has extracted side-channel information, it is possible that despite the entropy of the key has been significantly reduced, she cannot yet achieve a direct key recovery. If the correct key lies within a sufficiently small set of most probable keys, it can then be recovered with a plaintext and the corresponding ciphertext, by performing enumeration.

**Neural Network Model Assessment for Side-Channel Analysis**

### By Guilherme Perin, Baris Ege and Lukasz Chmielewski, published in IACR Cryptology ePrint Archive

Leakage assessment of cryptographic implementations with side-channel analysis relies on two important assumptions: leakage model and the number of side-channel traces. In the context of profiled side-channel attacks, having these assumptions correctly defined is a sufficient first step to evaluate the security of a crypto implementation with template attacks. This method assumes that the features (leakages or points of interest) follow a univariate or multi-variate Gaussian distribution for the estimation of the probability density function. When trained machine learning or neural network models are employed as classifiers for profiled attacks, a third assumption must be taken into account that it the correctness of the trained model or learning parameters. It was already proved that convolutional neural networks have advantages for side-channel analysis like bypassing trace misalignments and defeating first-order masking countermeasures in software implementations.

**A Critical Analysis of ISO 17825 (‘Testing methods for the mitigation of non-invasive attack classes against cryptographic modules’)**

### By Carolyn Whitnall and Elisabeth Oswald, presented at Asiacrypt 2019

The ISO standardisation of `Testing methods for the mitigation of non-invasive attack classes against cryptographic modules’ (ISO/IEC 17825:2016) specifies the use of the Test Vector Leakage Assessment (TVLA) framework as the sole measure to assess whether or not an implementation of (symmetric) cryptography is vulnerable to differential side-channel attacks. It is the only publicly available standard of this kind, and the first side-channel assessment regime to exclusively rely on a TVLA instantiation. TVLA essentially specifies statistical leakage detection tests with the aim of removing the burden of having to test against an ever increasing number of attack vectors. It offers the tantalising prospect of `conformance testing’: if a device passes TVLA, then, one is led to hope, the device would be secure against all (first-order) differential side-channel attacks.

**Fast Side-Channel Security Evaluation of ECC Implementations: Shortcut Formulas for Horizontal Side-Channel Attacks Against ECSM with the Montgomery Ladder**

### By Melissa Azouaoui, Romain Poussier, and François-Xavier Standaert, presented at COSADE 2019

Horizontal attacks are a suitable tool to evaluate the (nearly) worst-case side-channel security level of ECC implementations, due to the fact that they allow extracting a large amount of information from physical observations. Motivated by the difficulty of mounting such attacks and inspired by evaluation strategies for the security of symmetric cryptography implementations, we derive shortcut formulas to estimate the success rate of horizontal differential power analysis attacks against ECSM implementations, for efficient side-channel security evaluations. We then discuss the additional leakage assumptions that we exploit for this purpose, and provide experimental confirmation that the proposed tools lead to good predictions of the attacks’ success.

**Reducing a Masked Implementation’s Effective Security Order with Setup Manipulations And an Explanation Based on Externally-Amplified Couplings**

### By Itamar Levi, Davide Bellizia and François-Xavier Standaert, published in IACR Transactions on Cryptographic Hardware and Embedded Systems, 2019(2)

Couplings are a type of physical default that can violate the independence assumption needed for the secure implementation of the masking countermeasure. Two recent works by De Cnudde et al. put forward qualitatively that couplings can cause information leakages of lower order than theoretically expected. However, the (quantitative) amplitude of these lower-order leakages (e.g., measured as the amplitude of a detection metric such as Welch’s T statistic) was usually lower than the one of the (theoretically expected) d th order leakages. So the actual security level of these implementations remained unaffected. In addition, in order to make the couplings visible, the authors sometimes needed to amplify them internally ( e.g., by tweaking the placement and routing or iterating linear operations on the shares). In this paper, we first show that the amplitude of low-order leakages in masked implementations can be amplified externally, by tweaking side-channel measurement setups in a way that is under control of a power analysis adversary.

**Study of Deep Learning Techniques for Side-Channel Analysis and Introduction to ASCAD Database**

### By Ryad Benadjila, Emmanuel Prouff, Rémi Strullu, Eleonora Cagli and Cécile Dumas, published in IACR Cryptology eprint archive 2018

To provide insurance on the resistance of a system against side-channel analysis, several national or private schemes are today promoting an evaluation strategy, common in classical cryptography, which is focussing on the most powerful adversary who may train to learn about the dependency between the device behaviour and the sensitive data values. Several works have shown that this kind of analysis, known as Template Attacks in the side-channel domain, can be rephrased as a classical Machine Learning classification problem with learning phase. Following the current trend in the latter area, recent works have demonstrated that deep learning algorithms were very efficient to conduct security evaluations of embedded systems and had many advantages compared to the other methods. Unfortunately, their hyper-parametrization has often been kept secret by the authors who only discussed on the main design principles and on the attack efficiencies.

**Study of Deep Learning Techniques for Side-Channel Analysis and Introduction to ASCAD Database (Long Paper)**

### By Ryad Benadjila, Emmanuel Prouff, Rémi Strullu, Eleonora Cagli and Cécile Dumas, published in IACR Cryptology ePrint Archive

To provide insurance on the resistance of a system against side-channel analysis, several national or private schemes are today promoting an evaluation strategy, common in classical cryptography, which is focussing on the most powerful adversary who may train to learn about the dependency between the device behaviour and the sensitive data values. Several works have shown that this kind of analysis, known as Template Attacks in the side-channel domain, can be rephrased as a classical Machine Learning classification problem with learning phase. Following the current trend in the latter area, recent works have demonstrated that deep learning algorithms were very efficient to conduct security evaluations of embedded systems and had many advantages compared to the other methods. Unfortunately, their hyper-parametrization has often been kept secret by the authors who only discussed on the main design principles and on the attack efficiencies.

**Masking Proofs Are Tight and How to Exploit it in Security Evaluations**

### By Vincent Grosso and François-Xavier Standaert, presented at Advances in Cryptology – EUROCRYPT 2018

Evaluating the security level of a leaking implementation against side-channel attacks is a challenging task. This is especially true when countermeasures such as masking are implemented since in this case: (i) the amount of measurements to perform a key recovery may become prohibitive for certification laboratories, and (ii) applying optimal (multivariate) attacks may be computationally intensive and technically challenging. In this paper, we show that by taking advantage of the tightness of masking security proofs, we can significantly simplify this evaluation task in a very general manner.

**How (not) to Use Welch’s T-test in Side-Channel Security Evaluations**

### By François-Xavier Standaert, presented at CARDIS 2018

The Test Vector Leakage Assessment (TVLA) methodology is a qualitative tool relying on Welch’s T-test to assess the security of cryptographic implementations against side-channel attacks. Despite known limitations (e.g., risks of false negatives and positives), it is sometimes considered as a pass-fail test to determine whether such implementations are “safe” or not (without clear definition of what is “safe”). In this note, we clarify the limited quantitative meaning of this test when used as a standalone tool. For this purpose, we first show that the straightforward application of this approach to assess the security of a masked implementation is not sufficient. More precisely, we show that even in a simple (more precisely, univariate) case study that seems best suited for the TVLA methodology, detection (or lack thereof) with Welch’s T-test can be totally disconnected from the actual security level of an implementation.

**Leakage Detection with the χ 2 -Test**

### By Amir Moradi , Bastian Richter , Tobias Schneider and François-Xavier Standaert , published in IACR Transactions on Cryptographic Hardware and Embedded Systems, 2018(1)

We describe how Pearson’s χ 2 test can be used as a natural complement to Welch’s t-test for black box leakage detection. In particular, we show that by using these two tests in combination, we can mitigate some of the limitations due to the moment-based nature of existing detection techniques based on Welch’s t-test (e.g., for the evaluation of higher-order masked implementations with insufficient noise). We also show that Pearson’s χ 2 test is naturally suited to analyze threshold implementations with information lying in multiple statistical moments, and can be easily extended to a distinguisher for key recovery attacks. As a result, we believe the proposed test and methodology are interesting complementary ingredients of the side-channel evaluation toolbox, for black box leakage detection and non-profiled attacks, and as a preliminary before more demanding advanced analyses.

**Lowering the bar: deep learning for side-channel analysis**

### By Guilherme Perin, Baris Ege, Jasper van Woudenberg, white paper

Deep learning can help automate the signal analysis process in power side channel analysis. So far, power side channel analysis relies on the combination of cryptanalytic science, and the art of signal processing. Deep learning is essentially a classification algorithm, which can also be trained to recognize different leakages in a chip. Even more so, we do this such that typical signal processing problems such as noise reduction and re-alignment are automatically solved by the deep learning network. We show we can break a lightly protected AES, an AES implementation with masking countermeasures and a protected ECC implementation. These experiments show that where previously side channel analysis had a large dependency on the skills of the human, first steps are being developed that bring down the attacker skill required for such attacks. This paper is targeted at a technical audience that is interested in the latest developments on the intersection of deep learning, side channel analysis and security.

**Start Simple and then Refine: Bias-Variance Decomposition as a Diagnosis Tool for Leakage Profiling**

### By Liran Lerman, Nikita Veshchikov, Olivier Markowitch, François-Xavier Standaert, published in IEEE Transactions on Computers 2018

Evaluating the resistance of cryptosystems to side-channel attacks is an important research challenge. Profiled attacks reveal the degree of resilience of a cryptographic device when an adversary examines its physical characteristics. So far, evaluation laboratories launch several physical attacks (based on engineering intuitions) in order to find one strategy that eventually extracts secret information (such as a secret cryptographic key). The certification step represents a complex task because in practice the evaluators have tight memory and time constraints. In this paper, we propose a principled way of guiding the design of the most successful evaluation strategies thanks to the (bias-variance) decomposition of a security metric of profiled attacks. Our results show that we can successfully apply our framework on unprotected and protected algorithms implemented in software and hardware.

**Towards Sound and Optimal Leakage Detection Procedure**

### By Liwei Zhang , A. Adam Ding , Francois Durvaux , Francois-Xavier Standaert , and Yunsi Fei, presented at CARDIS 2017

Evaluation of side channel leakage for the embedded crypto systems requires sound leakage detection procedures. We relate the test vector leakage assessment (TVLA) procedure to the statistical minimum p-value (mini-p) procedure, and propose a sound method of deciding leakage existence in the statistical hypothesis setting. To improve detection, an advanced statistical procedure Higher Criticism (HC) is applied. The detection of leakage existence and the identification of exploitable leakage are separated when there are multiple leakage points.

**Very High Order Masking: Efficient Implementation and Security Evaluation**

### By Anthony Journault, François-Xavier Standaert, presented at CHES 2017

In this paper, we study the performances and security of recent masking algorithms specialized to parallel implementations in a 32-bit embedded software platform, for the standard AES Rijndael and the bitslice cipher Fantomas. By exploiting the excellent features of these algorithms for bitslice implementations, we first extend the recent speed records of Goudarzi and Rivain (presented at Eurocrypt 2017) and report realistic timings for masked implementations with 32 shares. We then observe that the security level provided by such implementations is uneasy to quantify with current evaluation tools.

**Private Multiplication over Finite Fields**

### By Sonia Belaïd, Fabrice Benhamouda, Alain Passelègue, Emmanuel Prouff,, Adrian Thillard, and Damien Vergnaud,, presented at CRYPTO 2017

The notion of privacy in the probing model, introduced by Ishai, Sahai, and Wagner in 2003, is nowadays frequently involved to assess the security of circuits manipulating sensitive information. However, provable security in this model still comes at the cost of a significant overhead both in terms of arithmetic complexity and randomness complexity. In this paper, we deal with this issue for circuits processing multiplication over finite fields. Our contributions are manifold. Extending the work of Belaïd, Benhamouda, Passelègue, Prouff, Thillard, and Vergnaud at Eurocrypt 2016, we introduce an algebraic characterization of the privacy for multiplication in any finite field and we propose a novel algebraic characterization for non-interference (a stronger security notion in this setting). Then, we present two generic constructions of multiplication circuits in finite fields that achieve non-interference in the probing model.

**Getting the Most Out of Leakage Detection Statistical Tools and Measurement Setups Hand in Hand**

### By Santos Merino del Pozo and François-Xavier Standaert, presented at COSADE 2017

In this work, we provide a concrete investigation of the gains that can be obtained by combining good measurement setups and efficient leakage detection tests to speed up evaluation times. For this purpose, we first analyze the quality of various measurement setups. Then, we highlight the positive impact of a recent proposal for efficient leakage detection, based on the analysis of a (few) pair(s) of plaintexts. Finally, we show that the combination of our best setups and detection tools allows detecting leakages for a noisy threshold implementation of the block cipher PRESENT after an intensive measurement phase, while either worse setups or less efficient detection tests would not succeed in detecting these leakages. Overall, our results show that a combination of good setups and fast leakage detection can turn security evaluation times from days to hours (for first-order secure implementations) and even from weeks to days (for higher-order secure implementations).

**Applying Horizontal Clustering Side-Channel Attacks on Embedded ECC Implementations**

### By Erick Nascimento and Łukasz Chmielewski, presented at CARDIS 2017

Side-channel attacks are a threat to cryptographic algorithms running on embedded devices. Public-key cryptosystems, including elliptic curve cryptography (ECC), are particularly vulnerable because their private keys are usually long-term. Well known countermeasures like regularity, projective coordinates and scalar randomization, among others, are used to harden implementations against common side-channel attacks like DPA.

Horizontal clustering attacks can theoretically overcome these countermeasures by attacking individual side-channel traces. In practice horizontal attacks have been applied to overcome protected ECC implementations on FPGAs. However, it has not been known yet whether such attacks can be applied to protected implementations working on embedded devices, especially in a non-profiled setting.

Our best attack has success rates of 97.64% and 99.60% for cswap-arith and cswap-pointer, respectively. This means that at most 6 and 2 bits are incorrectly recovered, and therefore, a subsequent brute-force can fix them in reasonable time. Furthermore, our horizontal clustering framework used for the aforementioned attacks can be applied against other

protected implementations.

**Connecting and Improving Direct Sum Masking and Inner Product Masking**

### By Romain Poussier, Qian Guo, François-Xavier Standaert, Sylvain Guilley, Claude Carlet, presented at CARDIS 2017

Direct Sum Masking (DSM) and Inner Product (IP) masking are two types of countermeasures that have been introduced as alternatives to simpler (e.g., additive) masking schemes to protect cryptographic implementations against side-channel analysis. In this paper, we first show that IP masking can be written as a particular case of DSM. We then analyze the improved security properties that these (more complex) encodings can provide over Boolean masking. For this purpose, we introduce a slight variation of the probing model, which allows us to provide a simple explanation to the “security order amplification” for such masking schemes that was put forward at CARDIS 2016. We then use our model to search for new instances of masking schemes that optimize this security order amplification. We finally discuss the relevance of this security order amplification (and its underlying assumption of linear leakages) based on an experimental case study.

**A Systematic Approach to the Side-Channel Analysis of ECC Implementations with Worst-Case Horizontal Attacks**

### By Romain Poussier, Yuanyuan Zhou, François-Xavier Standaert, presented at CHES 2017

The wide number and variety of side-channel attacks against scalar multiplication algorithms makes their security evaluations complex, in particular in case of time constraints making exhaustive analyses impossible. In this paper, we present a systematic way to evaluate the security of such implementations against horizontal attacks. As horizontal attacks allow extracting most of the information in the leakage traces of scalar multiplications, they are suitable to avoid risks of overestimated security levels. For this purpose, we additionally propose to use linear regression in order to accurately characterize the leakage function and therefore approach worst-case security evaluations. We then show how to apply our tools in the contexts of ECDSA and ECDH implementations, and validate them against two targets: a Cortex-M4 and a Cortex-A8 micro-controllers.

**Ridge-based DPA: Improvement of Differential Power Analysis For Nanoscale Chips**

### By Weijia Wang, Yu Yu, François-Xavier Standaert, Junrong Liu, Zheng Guo, Dawu Gu, published in IEEE Transactions on Information Forensics & Security

Differential power analysis (DPA), as a very practical type of side-channel attacks, has been widely studied and used for the security analysis of cryptographic implementations. However, as the development of chip industry leads to smaller technologies, the leakage of cryptographic implementations in nanoscale devices tends to be nonlinear (i.e., leakages of intermediate bits are no longer independent) and unpredictable. These phenomena make some existing side-channel attacks not perfectly suitable, i.e., decreasing their performance and making some common used prior power models (e.g., Hamming weight) to be much less respected in practice. To solve above issues, we introduce the regularization process from statistical learning to the area of side-channel attack and propose the ridge-based DPA.

**Convolutional Neural Networks with Data Augmentation against Jitter-Based Countermeasures – Profiling Attacks without Pre-Processing**

### By Eleonora Cagli, Cécile Dumas, Emmanuel Prouff, presented at CHES 2017

In the context of the security evaluation of cryptographic implementations, profiling attacks (aka Template Attacks) play a fundamental role. Nowadays the most popular Template Attack strategy consists in approximating the information leakages by Gaussian distributions. Nevertheless this approach suffers from the difficulty to deal with both the traces misalignment and the high dimensionality of the data. This forces the attacker to perform critical preprocessing phases, such as the selection of the points of interest and the realignment of measurements. Some software and hardware countermeasures have been conceived exactly to create such a misalignment.

**Towards Practical Tools for Side Channel Aware Software Engineering: ‘Grey Box’ Modelling for Instruction Leakages**

### By David McCann, Elisabeth Oswald, Carolyn Whitnall, presented at USENIX Security Symposium 2017

Power (along with EM, cache and timing) leaks are of considerable concern for developers who have to deal with cryptographic components as part of their overall software implementation, in particular in the context of embedded devices. Whilst there exist some compiler tools to detect timing leaks, similar progress towards pinpointing power and EM leaks has been hampered by limits on the amount of information available about the physical components from which such leaks originate.

**Categorising and Comparing Cluster-Based DPA Distinguishers**

### By Xinping Zhou, Carolyn Whitnall, Elisabeth Oswald, Degang Sun, Zhu Wang, presented at SAC 2017

Side-channel distinguishers play an important role in differential power analysis, where real world leakage information is compared against hypothetical predictions in order to guess at the underlying secret key. A class of distinguishers which can be described as `cluster-based’ have the advantage that they are able to exploit multi-dimensional leakage samples in scenarios where only loose, `semi-profiled’ approximations of the true leakage forms are available. This is by contrast with univariate distinguishers exploiting only single points (e.g. correlation), and Template Attacks requiring concise fitted models which can be overly sensitive to mismatch between the profiling and attack acquisitions.

**Deliverables**

**Deliverable D1.2: Interim Portfolio of Best Methods and Improved Evaluation Techniques **

This is the second deliverable produced within work package 1 (WP1) of the project. The goal of this report is to act as a portfolio of improved evaluation techniques and their figures of merit.

In the context of our previously introduced detect-map-exploit core framework (see Deliverable 1.1), we start off by discussing the advantages and limitations of the widely-deployed Test Vector Leakage Assessment (TVLA) methodology. Properly redefining the objectives of leakage detection enable us to identify and address existing open questions and take steps towards a better and faster leakage detection. Performing leakage detection efficiently is a central question for side-channel leakage evaluation. Therefore, we compare and contrast two methods of computing moments for detection tests, namely an off-line method and an online (histograms based) method.

memory systems. We conclude that the histograms method is not deemed usable in the current state for realistic examples, as the storage of the histograms using dense arrays uses a huge amount of memory. A benchmark run on a real-world example shows that the scalability of the offline method could still be improved.

Next, we discuss Deep Learning (DL) techniques as black box solvers that are proven to be effective for side-channel evaluation. We provide an overview of the methodology that is commonly applied to side-channel attacks when neural networks are considered as the model for the key recovery. We present

two case studies using convolutional neural networks (CNN): the first on a protected AES implementation (DPA Contest V4), and the second on an ECC implementation protected with misalignment. We study practical issues and limitations that arise with DL-based techniques and outline directions to improve or solve these obstacles by tweaking the different parameters of the analysis. Finally, we propose a novel idea for DL -based SCA in section 6.4 based on a multi-channel approach.

**Deliverable D1.3: **White Paper on Evaluation Strategies for AES and ECC

This deliverable is a white paper describing side-channel evaluation strategies of cryptographic implementations, particularly for AES and ECC. Through this paper, the REASSURE consortium provides general guidance and directions to improve how cryptographic implementations are evaluated both practically and soundly.

We illustrate the proposed evaluation strategy with three concrete examples: an AES implementation protected with combined affine masking and shuffled execution, an unprotected ECC implementation and the ECC point randomization countermeasure.

We connect our approach to current evaluation strategies such as the BSI ECC evaluation guidelines and the CC scheme and describe how it can be used as a refinement of the latter to yield more efficient and representative evaluations. We finally discuss how our approach adresses the gap between concrete worst-case security and current evaluation processes.

**Deliverable D1.4: **Final portfolio of best methods and improved evaluation techniques

This document describes the best techniques and methods created or improved by the REASSURE consortium and aimed at optimizing high-security vulnerability assessment. This deliverable describes the steps for which interesting positive results were found. The techniques are organized according to the steps of our evaluation framework, which is the core of our approach to optimize and streamline evaluation methodology.

Section 3 assesses whether low-cost/low-power processors can potentially level classical infrastructures from a cost-efficiency point of view. The considered distributed leakage detection method is based on the central moments method. The two setups that are compared in this case are a cluster of low-power ARM CPUs on one hand, and a High-Performance Computing (HPC) infrastructure on the other. Experiments indicate that the former setup can leverage a parallel architecture in an industrial environment. From a practical point of view, the speed-up can be more than 100 x compared to a fully serial implementation, or around 10 x compared to a multi-threaded implementation.

Section 4 is dedicated to the application of neural networks for side channel analysis and describes tools that give an insight into what the neural network is learning, improving explainability and as a happy consequence generalization. The later can be achieved by accurate hyperparameter selection and we show that using the mutual information transferred to the last layer we can improve attack accuracy by a factor of up to 30%. A second improvement to generalization is achieved by the usage of ensembles, which can lead to up to a 40% improvement of attack performance compared to using the best model alone.

Section 5 explores multi-channel classifiers which prove useful in cases where the leakage model is unknown and it shows evidence of improvements of one order of magnitude.

Section 6 compares the efficiency of profiled linear regression attacks (LRA) with Scatter, a new technique which aims to improve attacks in the case of misaligned traces. Overall, LRA always reaches a success rate close to 1, while Scatter often fails, even with a high number of traces. When both reach a success rate close to 1, Scatter requires at least twice the number of traces.

**Deliverable D1.5: **White Paper: Assurance in Security Evaluations

Security evaluations are complex, and two distinct approaches exist today: conformance style testing as in FIPS 140 and attack driven testing as in CC. Within the REASSURE project we studied attack vectors with regards to their optimality and potential for automation with the aim to improve existing evaluation regimes. By optimality we mean that we can prove that any practical instantiation of an attack reaches its theoretical limit; by potential for automation we mean that it can be executed with minimal user interaction. We comment on which steps offer some potential for automation in this white paper.

Our research points towards the fact that any optimality can only ever be achieved when considering worst-case adversaries. These are adversaries that get full access to implementation details, can select secret parameters, and thereby control countermeasures during an initial profiling phase. The reason for this is that it is only in this setting that we can utilise statistical (or machine/deep learning) tools which are well understood and for which we can assess/argue their optimality. Any attack vector which requires dealing with higher order or multivariate data leads to a loss of theoretical guarantees.

Within the REASSURE proposal for a so-called “backwards” approach for evaluations, we postulate that any evaluation should attempt first to instantiate a worst-case adversary (even if this requires open samples and/or samples with known secrets, or even the developers in their own environment to execute these attacks). In this setting we argue that we have the strongest guarantees for optimality. If necessary an evaluation should then also instantiate a more “practical” attack by relaxing some assumptions. The difference in effort required between the worst case and the best practical attack then gives an indication of the (potential) security gap.

We recommend that any reporting that is based on a points-based system like [46] should make the ratings explicit for the worst-case adversary (separating out the points for identification/profiling and the exploitation) and the “best practical adversary” (again making explicit how points are awarded for the different phases), and it should also be reported how likely it is that the results from identification/profiling translate to other devices.

Finally we observe that the role of formal verification in the context of side channel evaluations is perhaps different from the role that it plays in other contexts. Formal methods prove properties of implementations on an abstract level: this requires assumption of a leakage model. In practice however, devices show multiple, context-dependent leakage characteristics. Therefore formal verification in the context of leakage evaluations only shows that the necessary conditions are fulfilled, but they do not provide evidence about the sufficiency of these conditions. These sufficient conditions can only be ascertained via testing. Consequently, formal verification is suitable for low assurance levels. By contrast, high assurance essentially relates to the risk that (e.g., backwards) security evaluations, even when the evaluator is provided with worst-case capabilities, may be sub-optimal due to inherently heuristic and hard to analyze theoretically attack steps.

**Deliverable D2.1: Shortcut Formulas for Side Channel Evaluation **

This deliverable surveys a number of approaches to shortcut the effort required to assess implementations and devices with respect to their susceptibility regarding side channel attacks. Our approach aligns with the divide and conquer nature of most side channel attacks and hence we touch on shortcuts that apply the distinguisher statistics (the divide step) and the key rank (the conquer step).

Within REASSURE we set ourselves the challenge to assess how such shortcut approaches could nevertheless be leveraged to improve evaluations. This deliverable hence is the foundational step from which we will continue in two directions. Firstly, within the next 6 months, industrial partners will provide some insight into at what stage shortcuts are particularly appropriate. Secondly, some shortcuts will be implemented within WP3.2 in the context of the REASSURE simulation tool.

**Deliverable D2.2: Interim report on automation **

This document represents interim progress of the REASSURE consortium towards the goal of flexible evaluation methods that are usable by non-domain experts.

We begin by clarifying what we mean by automation: the facility to perform a task with a minimum of user input and interaction (expert or otherwise). This implies that the necessary parameters for a sound, fair and reproducible analysis are derived or estimated from the available data as far as possible. It also requires that the assumptions and limitations of the analysis be made transparent in the output, in order to guide (potentially non-expert) users towards sound conclusions.

On the basis that detection and mapping tasks are the most promising for automation, we then provide a more detailed account of statistical hypothesis testing and of how to perform the multiple comparison corrections and statistical power analyses required to carry out such procedures fairly and transparently. In the case of leakage detection strategies based on simple assumptions, such as those relying on *t*-tests, it is possible to arrive at analytical formulae for the latter; for complex methods, such as those estimating information theoretic quantities, the only options currently available (as far as we are aware) involve referring to empirically-derived indicators from (hopefully) representative scenarios. A number of currently used methods (in particular, those based on ANOVA-like tests, and those based on correlation) sit in a middle ground, where the side-channel literature has not yet fully explored the solutions already proposed in the statistics literature. This represents an avenue for future work, which could potentially make it possible to safely automate a wider range of tests, as well as providing insights into the trade-offs between them.

We discuss the particular challenges of extending known results for univariate, `first order’ leakages to the sorts of higher-order univariate and multivariate leakages arising from protected schemes. Tricks to `shift’ information into distribution means via pre-processing inevitably violate the assumptions which make the statistical power analysis of *t*-tests straightforward. Meanwhile, unless evaluators have control over (or at least access to) any randomness, as well as sufficient details about the implementation specification, the task of searching for jointly leaking tuples can quickly become infeasible as the leakage order increases. We consider that more work needs to be done from a basic methodological perspective before higher-order detection can realistically be considered a candidate for automation.

Finally, we propose that there exists a need to devise measures of `coverage’, inspired by the notion of code coverage in software testing. We offer some suggestions of what this might look like: input-based scores indicating the proportion of the total population sampled for the experiment (under different assumptions about the form of the leakage); intermediate value-based scores indicating the fraction of total sensitive intermediates targeted by the tests; or even strategies whereby profiling information is used to design and test specific corner cases.

We intend to make the formal definition of coverage metrics an avenue for exploration during the remainder of the project, along with other outstanding questions identified in this document. A future deliverable (D2.5) will report on progress made to that end, and finalise our findings and recommendations.

**Deliverable D2.3: Shortcut Formulas for Side-Channel Evaluation**

This deliverable surveys a number of approaches to shortcut the effort required to assess implementations and devices with respect to their susceptibility to side channel attacks. Our approach aligns with the divide and conquer nature of most side channel attacks and hence we touch on shortcuts that apply to the distinguisher statistics (the divide step) and the key rank (the conquer step).

Within REASSURE we set ourselves the challenge to assess how such shortcut approaches could nevertheless be leveraged to improve evaluations. This deliverable attempts to highlight at what stage shortcuts are particularly appropriate. Secondly, some shortcuts will be implemented within WP3.2 in the context of the REASSURE simulation tool.

**Deliverable D 2.4: Report on Instruction Level Profiling**

This document reports on interim progress of the REASSURE consortium towards methodologies to create representative instruction-level profiles of microprocessors of at least medium complexity.

Next we summarise a range of existing tools from industry and academia, including one (ELMO) which we argue best sets the precedent that we plan to follow for REASSURE. We explain the modelling procedure used in the construction of ELMO and report on the results of our own model-building endeavour using data from an alternative implementation of the same board (an ARM Cortex M0). In particular, we show that there are broad similarities between the implementations, but some differences that will require attention if models are expected to port from one to another.

The next steps for REASSURE will be to explore improvements to the model building procedure, as well as its extension to more complex devices, in order to eventually release our own simulator tools to the wider side channel research and evaluation community.

**Deliverable D2.6: White Paper: Security Testing Via Simulator**

This white paper guides users of the trace simulator (ELMO) released by the REASSURE consortium in the particular task of security testing code during the design phase. It is intended for software developers who need to write or integrate cryptography on ARM CORTEX-M devices, in particular the M0. We assume familiarity with symmetric encryption (AES), an awareness of side channel analysis (in particular power analysis and masking as a countermeasure), and ARM Thumb assembly.

We next show how to set up and configure ELMO – for simulation generally, and in the context of ‘fixed-versus-random’ leakage detection. We provide some non-technical intuition for choosing the desired error rates of the tests and setting an appropriate sample size.

Finally, we give some case-study examples that show ELMO’s leakage detection capabilities in practice and illustrate the types of problems it is able to flag up. We suggest ways to address the particular problems in question, thus demonstrating a workflow of testing, adjusting, and retesting that we recommend to developers in the code development phase of the design process.

The appendix of this whitepaper lists and reviews a number of existing leakage simulators.

**Deliverable D2.7: Final report on automation**

This document is an update to deliverable 2.5 detailing further progress of the REASSURE consortium towards the goal of flexible evaluation methods that are usable by non-domain experts.

We begin by clarifying what we mean by automation: the facility to perform a task with a minimum of user input and interaction (expert or otherwise). This implies that the necessary parameters for a sound, fair and reproducible analysis are derived or estimated from the available data as far as possible. It also requires that the assumptions and limitations of the analysis be made transparent in the output, in order to guide (potentially non-expert) users towards sound conclusions.

On the basis that detection and mapping tasks are the most promising for automation, we then provide a more detailed account of statistical hypothesis testing and of how to perform the statistical power analyses and multiple comparison corrections required to carry out such procedures fairly and transparently. In the case of leakage detection strategies based on simple assumptions, such as those relying on t-tests, it is possible to arrive at analytical formulae for the latter; for complex methods, such as those estimating information theoretic quantities, the only options currently available (as far as we are aware) involve referring to empirically-derived indicators from (hopefully) representative scenarios.

We discuss the particular challenges of extending known results for univariate, ‘first order’ leakages to the sorts of higher-order univariate and multivariate leakages arising from protected schemes. Tricks to ‘shift’ information into distribution means via pre-processing inevitably violate the assumptions which make the statistical power analysis of t-tests straightforward. We consider that more work needs to be done from a basic methodological perspective before higher-order detection can realistically be considered a candidate for automation.

Inspired by the notion of code coverage in software testing, we suggest the need for measures of leakage evaluation coverage. We offer some suggestions of what these might look like: input-based scores indicating the proportion of the total population sampled for the experiment (under different assumptions about the form of the leakage); intermediate value-based scores indicating the fraction of total sensitive intermediates targeted by the tests; or even strategies whereby profiling information is used to design and test specific corner cases.

With a non-domain expert end user in mind we then consider how to apply recommendations from this deliverable to the tools produced in work package 3. In particular, we explain how to set the default test parameters for the automated simulation and analysis of traces so as to require minimal input from the developer in an iterative design process.

We lastly consider the automation potential of deep learning as a tool for side-channel analysis. WP1 (see, in particular, D1.2) has addressed deep learning from an efficiency perspective, but its ability to extract information from large amounts of raw data with minimal input on the part of the user makes it highly relevant to this work package also. We describe our efforts towards making best use of its capabilities, including via a proposal whereby collections of traces are jointly classified as having been generated under a particular key guess or not, and the inspection of the loss function gradients with respect to the input data during the training phase. We also discuss some of the obstacles to automation, such as the opaque nature of deep learning models, which may make it difficult to draw practically applicable conclusions about the form and location of leakage after an evaluation.

**Deliverable D3.7: Final report on tools**

This document consists of a list of tools that were developed during the course of the REASSURE project by project partners. Tools are categorised as either software, data sets, or simulation tools; they can be internal tools strictly under the control of a single partner, or joint ventures and openly released. We provide an overview of our collective achievements, followed by a list of tools, organised by REASSURE partners. We indicate if a tools is available for public use. Wherever possible we give figures for performance improvements as a result of a tool.

**Deliverable D4.3: Final Report on Standardization**

A declared goal of REASSURE was to have an impact on standardisation and evaluation schemes via influencing stakeholders and decision makers. To do so, the consortium identified relevant targets, e.g. established consortia such as JHAS, ongoing standardisation efforts within ISO, as well as emerging efforts like SESIP, and liaised with them over the past three years.
The consortium provided input to two ISO standards (20085-1 and 20085-2), which matured to publication during the duration of the project. These standards cover technicalities around side channel setups and the calibration of them. The consortium also contributed to the analysis of an existing ISO standard (17825). This standard covers testing methods in the context of side-channel attacks.

Consortium members made several presentations to the JHAS group during the regular group meetings, contributing in particular in the context of the ongoing and rapid developments around the use of deep learning.

The consortium is also represented in EMVCo, and a presentation of project results in the context of deep learning is scheduled.

SGDSN held yearly meetings with CB GIE, which were informed by REASSURE results.

An initiative initially launched by NXP (known as SmartCC) and now run by the Global Platform consortium under the name SESIP (Security Evaluation Standard for IoT Platforms) aims at driving the harmonisation of standardisation in the context of Internet of Things (IoT) devices. REASSURE results, in particular intended for the IoT use case (leakage detection as “conformance style testing”) have been discussed with the relevant NXP liaison people, w.r.t. to implications for the SESIP scheme.

**Deliverable D5.2: Data management plan**

This document represents the first version of the Data Management Plan (DMP) of the REASSURE project. It is a living document that will be updated throughout the project. The document focuses on identifying the type of data that will be openly shared, namely leakage traces acquired and processed during a side-channel analysis of an embedded security device, as well as the intended target audience and the data format.

**Deliverable D5.6: Advanced SCA Training **

This document is a placeholder for the full training (which can be accessed here). In this document we present a high level overview of how the course and the online webinar (https://www.riscure.com/news/understanding-leakage-detection-webinar/) are organized and the impact in terms of number of participants.

The decision to dedicate the advanced SCA training to leakage detection is a consortium decision, as the application of leakage detection techniques is a complex topic, not always well understood, of vital interest in the hardware security evaluation industry.