Tools

Reference Traces

REASSURE produced a series of datasets obtained by measuring power traces on various physical platforms. These datasets are freely available and can be used to test attacks or leakage detection methods, or for any other purpose. Our only requirement is that the sources be acknowledged when communicating about subsequent work (see the license files in the repositories).

Software tools

REASSURE developed various tools to help develop or assess side-channel-resistant implementations. These tools are released under public licenses (see the license files in the repositories). Here too, our main requirement is that the sources be acknowledged when communicating about subsequent work.

Online trainings

Side Channel Analysis and Countermeasures for Software Developers

Click here to start the training

Are you looking for an easy introduction to side channel analysis? Developed by Riscure and powered by the REASSURE project, this online training is open to anyone who is interested in learning more about SCA, and a trial version is freely available.

This course is for you if any of the following statements are true:

  • You heard about a magic technique of extracting keys from devices simply by looking at them, and want to know how this works
  • You are an IoT developer who wants to add crypto to your product
  • You listened to talks on side channel analysis, found them confusing, and would like to understand the details
  • You are a CTF player who wants to prepare for SCA challenges

The end goal of this training is to enable you to protect your devices and applications against basic side-channel analysis attacks. Your journey will first take you through the theoretical foundations: you will learn what a side channel is, get familiar with practical examples and understand the typical flow of an attack.

CARDIS tutorial scripts

A tutorial devoted to leakage detection was held at CARDIS in November 2018. The material presented at the tutorial, including Python scripts, can still be downloaded by following this link.

Understanding leakage detection

Click here to start the training

The aim of this course is to help the audience grasp the intuition behind leakage detection methodologies and achieve a sound technical appreciation of how and why they work. In the course, we motivate and describe the current popular practice, including correlation-based tests, and expose some of the limitations, with a special focus on ISO standard 17825. The learning goal of this advanced course is to equip evaluators to carry out leakage detection tests sensibly and interpret the outcomes responsibly.

This course is structured as follows. We start off by introducing the problem, namely the presence of data dependencies in side-channel measurements, and the most common strategy to exploit such information: Differential Power Analysis (DPA). We then build a case for why statistical methods are necessary and develop the particular rationale behind the t-test before describing it more formally. Finally, we show how the t-test is applied within the TVLA framework and discuss some of the issues affecting its usefulness.
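To make the fixed-versus-random t-test concrete, the following minimal NumPy sketch (illustrative only, not part of the course material; the trace arrays are placeholders) computes Welch's t-statistic independently for every trace sample and flags the samples whose statistic exceeds the usual TVLA threshold of 4.5:

    import numpy as np

    def welch_t(traces_fixed, traces_random):
        # traces_fixed, traces_random: arrays of shape (n_traces, n_samples)
        m1, m2 = traces_fixed.mean(axis=0), traces_random.mean(axis=0)
        v1 = traces_fixed.var(axis=0, ddof=1)
        v2 = traces_random.var(axis=0, ddof=1)
        n1, n2 = len(traces_fixed), len(traces_random)
        # Welch's t-statistic, computed sample-wise over the whole trace
        return (m1 - m2) / np.sqrt(v1 / n1 + v2 / n2)

    # TVLA-style decision: flag samples whose |t| exceeds the common 4.5 threshold
    t = welch_t(np.random.randn(1000, 500), np.random.randn(1000, 500))
    leaky_samples = np.where(np.abs(t) > 4.5)[0]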

The online training is available on-demand and free of charge on the project website. The approximate duration of this course is 6 hours.

Publications

A Systematic Appraisal of Side Channel Evaluation Strategies

By Melissa Azouaoui, Davide Bellizia, Ileana Buhan, Nicolas Debande, Sébastien Duval, Christophe Giraud, Éliane Jaulmes, François Koeune, Elisabeth Oswald, François-Xavier Standaert, and Carolyn Whitnall, presented at the Security Standardisation Research Conference (SSR) 2020

In this paper we examine the central question of how well side-channel evaluation regimes capture the true security level of a product. Concretely, answering this question requires considering the optimality of the attack/evaluation strategy selected by the evaluator, and the various steps to instantiate it. We draw on a number of published works and discuss whether state-of-the-art solutions for the different steps of a side-channel security evaluation offer bounds or guarantees of optimality, or if they are inherently heuristic. We use this discussion to provide an informal rating of the steps’ optimality and to put forward where risks of overstated security levels remain.

Side-Channel Countermeasures’ Dissection and the Limits of Closed Source Security Evaluations

By Olivier Bronchain and François-Xavier Standaert, published in TCHES 2020

We take advantage of a recently published open source implementation of the AES protected with a mix of countermeasures against side-channel attacks to discuss both the challenges in protecting COTS devices against such attacks and the limitations of closed source security evaluations. The target implementation has been proposed by the French ANSSI (Agence Nationale de la Sécurité des Systèmes d’Information) to stimulate research on the design and evaluation of side-channel secure implementations. It combines additive and multiplicative secret sharings into an affine masking scheme that is additionally mixed with a shuffled execution. Its preliminary leakage assessment did not detect data dependencies with up to 100,000 measurements. We first exhibit the gap between such a preliminary leakage assessment and advanced attacks by demonstrating how a dissection of the countermeasures, exploiting a mix of dimensionality reduction, multivariate information extraction and key enumeration, can recover the full key with less than 2,000 measurements.

We then discuss the relevance of open source evaluations to analyze such implementations efficiently, by pointing out that certain steps of the attack are hard to automate without implementation knowledge (even with machine learning tools), while performing them manually is straightforward. Our findings do not stem from design flaws but from the general difficulty of preventing side-channel attacks in COTS devices with limited noise. We anticipate that high security on such devices requires significantly more shares.
Read more

FENL: an ISE to mitigate analogue micro-architectural leakage

By Si Gao, Ben Marshall, Dan Page and Thinh Pham, published in TCHES 2020

Ge et al. [GYH18] propose the augmented ISA (or aISA), a central tenet of which is the selective exposure of micro-architectural resources via a less opaque abstraction than normal. The aISA proposal is motivated by the need for control over such resources, for example to implement robust countermeasures against micro-architectural attacks. In this paper, we apply an aISA-style approach to challenges stemming from analogue micro-architectural leakage; examples include power-based Hamming weight and distance leakage from relatively fine-grained resources (e.g., pipeline registers), which are not exposed in, and so cannot be reliably controlled via, a normal ISA.

Specifically, we design, implement, and evaluate an ISE named FENL: the ISE acts as a fence for leakage, preventing interaction between, and hence leakage from, instructions before and after it in program order. We demonstrate that the implementation and use of FENL has relatively low overhead, and represents an effective tool for systematically localising and reducing leakage.
Read more

Learning when to stop: a mutual information approach to prevent overfitting in profiled side-channel analysis

By Guilherme Perin, Ileana Buhan and Stjepan Picek, published in IACR Cryptology ePrint Archive

Today, deep neural networks are a common choice for conducting profiled side-channel analysis. Such techniques commonly do not require pre-processing, and yet, they can break targets protected with countermeasures. Unfortunately, it is not trivial to find neural network hyper-parameters that would result in such top-performing attacks. The hyper-parameter governing the length of the training process is the number of epochs. If the training is too short, the network does not reach its full capacity, while if the training is too long, the network overfits and is not able to generalize to unseen examples. Finding the right moment to stop the training process is particularly difficult for side-channel analysis as there are no clear connections between machine learning and side-channel metrics that govern the training and attack phases, respectively.

In this paper, we tackle the problem of determining the correct epoch to stop the training in deep learning-based side-channel analysis. We explore how information is propagated through the hidden layers of a neural network, which allows us to monitor how training is evolving. We demonstrate that the amount of information, or, more precisely, mutual information transferred to the output layer, can be measured and used as a reference metric to determine the epoch at which the network offers optimal generalization. To validate the proposed methodology, we provide extensive experimental results that confirm the effectiveness of our metric for avoiding overfitting in profiled side-channel analysis.
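As a rough illustration of the idea (not the estimator used in the paper), one can track a simple plug-in estimate of the mutual information between the class predicted at the output layer and the true label on a validation set, and keep the epoch where it peaks; the histogram-based estimate below is such a crude stand-in:

    import numpy as np

    def mutual_information(pred_labels, true_labels, n_classes):
        # Joint histogram of (predicted class, true class), normalised to a distribution
        joint = np.zeros((n_classes, n_classes))
        for p, t in zip(pred_labels, true_labels):
            joint[p, t] += 1
        joint /= joint.sum()
        px = joint.sum(axis=1, keepdims=True)
        py = joint.sum(axis=0, keepdims=True)
        nz = joint > 0
        return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

    # During training (pseudo-usage): evaluate the metric once per epoch and keep
    # the model snapshot from the epoch with the highest validation value, e.g.
    # best_epoch = int(np.argmax(mi_per_epoch))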
Read more

Strength in Numbers: Improving Generalization with Ensembles in Profiled Side-channel Analysis

By Guilherme Perin, Lukasz Chmielewski and Stjepan Picek, published in IACR Cryptology ePrint Archive

The adoption of deep neural networks for profiled side-channel attacks provides powerful options for leakage detection and key retrieval of secure products. When training a neural network for side-channel analysis, it is expected that the trained model can implement an approximation function that detects leaking side-channel samples and, at the same time, is insensitive to noisy (or non-leaking) samples. This outlines a generalization situation where the model can identify the main representations learned from the training set in a separate test set. In this paper, we first discuss how output class probabilities represent a strong metric when conducting side-channel analysis. Further, we observe that these output probabilities are sensitive to small changes, like the selection of specific test traces or the weight initialization of a neural network.

Next, we discuss hyper-parameter tuning, where one commonly uses only a single model out of the dozens trained, each of which results in different output probabilities. We show how ensembles of machine learning models based on averaged class probabilities can improve generalization. Our results emphasize that ensembles increase the performance of a profiled side-channel attack and reduce the variance of results stemming from different groups of hyper-parameters, regardless of the selected dataset or leakage model.
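For intuition, the following sketch (not the authors' code; predict_proba and the labels_for_key mapping are placeholder names) averages the output class probabilities of several trained models before accumulating per-key log-likelihoods in the usual way:

    import numpy as np

    def ensemble_log_likelihoods(models, traces, plaintexts, labels_for_key, n_keys=256):
        # Average the class probabilities predicted by each model in the ensemble
        probs = np.mean([m.predict_proba(traces) for m in models], axis=0)
        probs = np.clip(probs, 1e-36, None)            # avoid log(0)
        scores = np.zeros(n_keys)
        for k in range(n_keys):
            cls = labels_for_key(k, plaintexts)        # expected class of each trace
            scores[k] = np.log(probs[np.arange(len(traces)), cls]).sum()
        return scores                                  # highest score = best key guess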
Read more

Leakage Certification Revisited: Bounding Model Errors in Side-Channel Security Evaluations

By Olivier Bronchain, Julien M. Hendrickx, Clément Massart, Alex Olshevsky, François-Xavier Standaert, presented at CRYPTO 2019

Leakage certification aims at guaranteeing that the statistical models used in side-channel security evaluations are close to the true statistical distribution of the leakages, hence can be used to approximate a worst-case security level. Previous works in this direction were only qualitative: for a given amount of measurements available to an evaluation laboratory, they rated a model as “good enough” if the model assumption errors (i.e., the errors due to an incorrect choice of model family) were small with respect to the model estimation errors. We revisit this problem by providing the first quantitative tools for leakage certification. For this purpose, we provide bounds for the (unknown) Mutual Information metric that corresponds to the true statistical distribution of the leakages based on two easy-to-compute information theoretic quantities: the Perceived Information, which is the amount of information that can be extracted from a leaking device thanks to an estimated statistical model, possibly biased due to estimation and assumption errors, and the Hypothetical Information, which is the amount of information that would be extracted from an hypothetical device exactly following the model distribution.

This positive outcome derives from the observation that while the estimation of the Mutual Information is in general a hard problem (i.e., estimators are biased and their convergence is distribution-dependent), it is significantly simplified in the case of statistical inference attacks where a target random variable (e.g., a key in a cryptographic setting) has a constant (e.g., uniform) probability. Our results therefore provide a general and principled path to bound the worst-case security level of an implementation. They also significantly speed up the evaluation of any profiled side-channel attack, since they imply that the estimation of the Perceived Information, which embeds an expensive cross-validation step, can be bounded by the computation of a cheaper Hypothetical Information, for any estimated statistical model.
Read more

Provable Order Amplification for Code-based Masking: How to Avoid Non-linear Leakages due to Masked Operations

By Weijia Wang, Yu Yu, François-Xavier Standaert, published in IEEE Transactions on Information Forensics and Security

Code-based masking schemes have been shown to provide higher theoretical security guarantees than Boolean masking. In particular, one interesting feature put forward at CARDIS 2016 and then analyzed at CARDIS 2017 is the so-called security order amplification: under the assumption that the leakage function is linear, it guarantees that an implementation performing only linear operations will have a security order in the bounded moment leakage model larger than d−1, where d is the number of shares.
The main question regarding this feature is its practical relevance. First of all, concrete block ciphers do not only perform linear operations. Second, it may be that actual leakage functions are not perfectly linear (raising questions regarding what happens when one deviates from such assumptions).

Multi-Tuple Leakage Detection and the Dependent Signal Issue

By Olivier Bronchain, Tobias Schneider and François-Xavier Standaert, published in TCHES 2019

Leakage detection is a common tool to quickly assess the security of a cryptographic implementation against side-channel attacks. The Test Vector Leakage Assessment (TVLA) methodology using Welch’s t-test, proposed by Cryptography Research, is currently the most popular example of such tools, thanks to its simplicity and good detection speed compared to attack-based evaluations. However, as any statistical test, it is based on certain assumptions about the processed samples and its detection performances strongly depend on parameters like the measurement’s Signal-to-Noise Ratio (SNR), their degree of dependency, and their density, i.e., the ratio between the amount of informative and non-informative points in the traces.

In this paper, we argue that the correct interpretation of leakage detection results requires knowledge of these parameters, which are a priori unknown to the evaluator, and therefore poses a non-trivial challenge to evaluators (especially if restricted to only one test).

For this purpose, we first explore the concept of multi-tuple detection, which is able to exploit differences between multiple informative points of a trace more effectively than tests relying on the minimum p-value of concurrent univariate tests. To this end, we map the common Hotelling’s T²-test to the leakage detection setting and, further, propose a specialized instantiation of it which trades computational overheads for a dependency assumption. Our experiments show that there is not one test that is the optimal choice for every leakage scenario. Second, we highlight the importance of the assumption that the samples at each point in time are independent, which is frequently considered in leakage detection, e.g., with Welch’s t-test. Using simulated and practical experiments, we show that (i) this assumption is often violated in practice, and (ii) deviations from it can affect the detection performances, making the correct interpretation of the results more difficult. Finally, we consolidate our findings by providing guidelines on how to use a combination of established and newly-proposed leakage detection tools to infer the measurement parameters. This enables a better interpretation of the tests’ results than the current state-of-the-art (yet still relying on heuristics for the most challenging evaluation scenarios).
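For readers who want to experiment with the multi-tuple idea, a textbook two-sample Hotelling's T²-test over a few selected trace points can be computed as follows (a generic sketch, not the specialized instantiation proposed in the paper):

    import numpy as np
    from scipy.stats import f

    def hotelling_t2(group_a, group_b):
        # group_a, group_b: arrays of shape (n_traces, n_points)
        n1, n2 = len(group_a), len(group_b)
        p = group_a.shape[1]
        d = group_a.mean(axis=0) - group_b.mean(axis=0)
        s_pooled = ((n1 - 1) * np.cov(group_a, rowvar=False) +
                    (n2 - 1) * np.cov(group_b, rowvar=False)) / (n1 + n2 - 2)
        t2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(s_pooled, d)
        # Under the null hypothesis, a rescaled T² follows an F(p, n1+n2-p-1) distribution
        f_stat = t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
        return t2, f.sf(f_stat, p, n1 + n2 - p - 1)    # statistic and p-value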
Read more

Beyond Algorithmic Noise or How to Shuffle Parallel Implementations?

By Itamar Levi, Davide Bellizia, François-Xavier Standaert, published in the International Journal of Circuit Theory and Applications

Noise is an important ingredient for the security of side-channel-analysis countermeasures. However, physical noise is in most cases not sufficient to achieve high security levels. As a result, designers traditionally aim to emulate noise by harnessing shuffling in the time domain and algorithmic noise in the amplitude domain. On the one hand, harnessing algorithmic noise is limited in architectures/devices which have a limited data-path width. On the other hand, the performance degradation due to shuffling is considerable.

A natural complement to operation shuffling is hardware-based intra-cycle shuffling (ICS), which typically shuffles the sample time of bits within a clock cycle (instead of micro-processor operations). Such an architecture eliminates the performance overhead, since shuffling happens within a single cycle; it is algorithm-independent, i.e. there is no need to partition operations; and, as it is hardware-based, the data-path width can be tailored to better exploit algorithmic noise. In this manuscript, we first analyze the noise components in physical designs to better model the algorithmic noise. We then perform an information-theoretic (IT) analysis of both shuffling countermeasures. The last part of the manuscript deals with real-world architecture analysis: an IT analysis of an AES core implemented over a 32- and 128-bit wide data-path embedded with intra-cycle shuffling and two flavors of shuffling generation (memory-based and on-line permutation generation). The manuscript is concluded by underlining the benefits which can be achieved with the ICS architecture.
Read more

Gradient Visualization for General Characterization in Profiling Attacks

By Loïc Masure, Cécile Dumas and Emmanuel Prouff, presented at COSADE 2019

In Side-Channel Analysis (SCA), several papers have shown that neural networks could be trained to efficiently extract sensitive information from implementations running on embedded devices. This paper introduces a new tool called Gradient Visualization that aims to perform a post-mortem information leakage characterization after the successful training of a neural network. It relies on the computation of the gradient of the loss function used during the training. The gradient is no longer computed with respect to the model parameters, but with respect to the input trace components. Thus, it can accurately highlight temporal moments where sensitive information leaks. We theoretically show that this method, based on Sensitivity Analysis, may be used to efficiently localize points of interest in the SCA context.

The efficiency of the proposed method does not depend on the particular countermeasures that may be applied to the measured traces, as long as the profiled neural network can still learn in the presence of such difficulties. In addition, the characterization can be made for each trace individually. We verified the soundness of our proposed method on simulated data and on experimental traces from a public side-channel database. Finally, we empirically show that Sensitivity Analysis is at least as good as state-of-the-art characterization methods, in the presence (or absence) of countermeasures.
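In frameworks with automatic differentiation the core of this idea fits in a few lines; the PyTorch sketch below (a simplified illustration, with a placeholder model assumed to output log-probabilities) differentiates the loss with respect to the input trace and uses the gradient magnitude as a per-sample sensitivity map:

    import torch
    import torch.nn.functional as F

    def sensitivity_map(model, traces, labels):
        # traces: tensor of shape (n_traces, n_samples); labels: class per trace
        x = traces.clone().requires_grad_(True)
        loss = F.nll_loss(model(x), labels)     # model is assumed to return log-probabilities
        loss.backward()                         # gradient w.r.t. the *input*, not the weights
        return x.grad.abs().mean(dim=0)         # high values = likely points of interest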
Read more

A Comprehensive Study of Deep Learning for Side-Channel Analysis

By Loïc Masure, Cécile Dumas and Emmanuel Prouff, published in TCHES 2020

Recently, several studies have been published on the application of deep learning to enhance Side-Channel Attacks (SCA). These seminal works have practically validated the soundness of the approach, especially against implementations protected by masking or by jittering. Concurrently, important open issues have emerged. Among them, the relevance of machine (and thereby deep) learning based SCA has been questioned in several papers, based on the lack of relation between the accuracy, a typical performance metric used in machine learning, and common SCA metrics like the guessing entropy or the key-discrimination success rate. Also, the impact of the classical side-channel counter-measures on the efficiency of deep learning has been questioned, in particular by the semi-conductor industry. Both questions highlight the importance of studying the theoretical soundness of deep learning in the context of side-channel analysis and of developing means to quantify its efficiency, especially with respect to the optimality bounds published so far in the literature for side-channel leakage exploitation.

The first main contribution of this paper directly concerns the latter point. It is indeed proved that minimizing the Negative Log Likelihood (NLL for short) loss function during the training of deep neural networks is actually asymptotically equivalent to maximizing the Perceived Information introduced by Renauld et al. at EUROCRYPT 2011 as a lower bound of the Mutual Information between the leakage and the target secret. Hence, such a training can be considered as an efficient and effective estimation of the PI, and thereby of the MI (known to be complex to accurately estimate in the context of secure implementations). As a second direct consequence of our main contribution, it is argued that, in a side-channel exploitation context, choosing the NLL loss function to drive the training is sound from an information theory point of view. As a third contribution, classical counter-measures like Boolean masking or execution flow shuffling, initially dedicated to classical SCA, are proved to remain sound against deep-learning-based attacks.
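For a uniformly distributed target (e.g., a 256-valued key byte), this link can be turned into a very simple estimator: the Perceived Information is roughly the entropy of the target minus the cross-entropy of the model expressed in bits. The sketch below (with placeholder inputs, not the authors' evaluation code) computes this quantity from validation predictions:

    import numpy as np

    def perceived_information(pred_probs, true_labels, n_classes=256):
        # pred_probs: (n_traces, n_classes) model probabilities on validation traces
        p_true = np.clip(pred_probs[np.arange(len(true_labels)), true_labels], 1e-36, None)
        cross_entropy_bits = -np.mean(np.log2(p_true))
        return np.log2(n_classes) - cross_entropy_bits   # PI estimate, in bits per trace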
Read more

Share-slicing: Friend or Foe?

By Si Gao, Ben Marshall, Dan Page and Elisabeth Oswald, published in TCHES 2019

Masking is a well-loved and widely deployed countermeasure against side channel attacks, in particular in software. Under certain assumptions (w.r.t. independence and noise level), masking provably prevents attacks up to a certain security order and leads to a predictable increase in the number of required leakages for successful attacks beyond this order. The noise level in typical processors where software masking is used may not be very high, thus low masking orders are not sufficient for real world security. Higher-order masking, however, comes at a great cost, and therefore a number of techniques have been published over the years that make such implementations more efficient via parallelisation in the form of bit or share slicing.

We take two highly regarded schemes (ISW and Barthe et al.), and some corresponding open source implementations that make use of share slicing, and discuss their true security on an ARM Cortex-M0 and an ARM Cortex-M3 processor (both from the LPC series). We show that micro-architectural features of the M0 and M3 undermine the independence assumptions made in masking proofs and thus their theoretical guarantees do not translate into practice (even worse it seems unpredictable at which order leaks can be expected). Our results demonstrate how difficult it is to link theoretical security proofs to practical real-world security guarantees.
Read more

Key Enumeration from the Adversarial Viewpoint: When to Stop Measuring and Start Enumerating?

By Melissa Azouaoui, Romain Poussier, François-Xavier Standaert, Vincent Verneuil, presented at CARDIS 2019

In this work, we formulate and investigate a pragmatic question related to practical side-channel attacks complemented with key enumeration. In a real attack scenario, after an attacker has extracted side-channel information, it is possible that, although the entropy of the key has been significantly reduced, she cannot yet achieve a direct key recovery. If the correct key lies within a sufficiently small set of most probable keys, it can then be recovered with a plaintext and the corresponding ciphertext, by performing enumeration.

Our proposal relates to the following question: how does an attacker know when to stop acquiring side-channel observations and when to start enumerating with a given computational effort? Since key enumeration is an expensive (i.e. time-consuming) task, this is an important question from an adversarial viewpoint. To answer this question, we present an efficient (heuristic) way to perform key-less rank estimation, based on simple entropy estimations using histograms.
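For context, the standard histogram-based rank estimation (which, unlike the key-less variant studied in the paper, assumes the correct key is known) can be sketched as follows; log_probs and correct_key are placeholders:

    import numpy as np

    def estimate_rank(log_probs, correct_key, n_bins=2048):
        # log_probs: one array of log2-probabilities over the candidates of each subkey
        lo = min(lp.min() for lp in log_probs)
        hi = max(lp.max() for lp in log_probs)
        width = (hi - lo) / n_bins
        hist_total, key_bin = None, 0
        for lp, k in zip(log_probs, correct_key):
            idx = np.minimum(((lp - lo) / width).astype(int), n_bins - 1)
            h = np.bincount(idx, minlength=n_bins).astype(float)
            key_bin += idx[k]
            hist_total = h if hist_total is None else np.convolve(hist_total, h)
        # Estimated number of full keys that are more likely than the correct one
        return hist_total[key_bin + 1:].sum()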
Read more

Neural Network Model Assessment for Side-Channel Analysis

By Guilherme Perin, Baris Ege and Lukasz Chmielewski, published in IACR Cryptology ePrint Archive

Leakage assessment of cryptographic implementations with side-channel analysis relies on two important assumptions: the leakage model and the number of side-channel traces. In the context of profiled side-channel attacks, having these assumptions correctly defined is a sufficient first step to evaluate the security of a crypto implementation with template attacks. This method assumes that the features (leakages or points of interest) follow a univariate or multivariate Gaussian distribution for the estimation of the probability density function. When trained machine learning or neural network models are employed as classifiers for profiled attacks, a third assumption must be taken into account: the correctness of the trained model or learning parameters. It has already been demonstrated that convolutional neural networks have advantages for side-channel analysis, like bypassing trace misalignments and defeating first-order masking countermeasures in software implementations.

However, if this trained model is incorrect and the test classification accuracy is close to random guessing, the correctness of the first two assumptions (number of traces and leakage model) will be insufficient and the security of the target under evaluation can be overestimated. This could lead to wrong conclusions in leakage certifications. One solution to verify whether the trained model is acceptable relies on identifying the input features that the neural network considers as points of interest. In this paper, we implement the assessment of neural network models by using the proposed backward propagation path method. Our method is employed during the profiling phase as a tool to verify what the neural network is learning from side-channel traces and to support the optimization of hyper-parameters. The method is tested against a masked AES implementation. One of the main results highlights the importance of L2 regularization for the automated points-of-interest selection from a neural network.
Read more

A Critical Analysis of ISO 17825 (‘Testing methods for the mitigation of non-invasive attack classes against cryptographic modules’)

By Carolyn Whitnall and Elisabeth Oswald, presented at Asiacrypt 2019

The ISO standardisation of ‘Testing methods for the mitigation of non-invasive attack classes against cryptographic modules’ (ISO/IEC 17825:2016) specifies the use of the Test Vector Leakage Assessment (TVLA) framework as the sole measure to assess whether or not an implementation of (symmetric) cryptography is vulnerable to differential side-channel attacks. It is the only publicly available standard of this kind, and the first side-channel assessment regime to exclusively rely on a TVLA instantiation. TVLA essentially specifies statistical leakage detection tests with the aim of removing the burden of having to test against an ever increasing number of attack vectors. It offers the tantalising prospect of ‘conformance testing’: if a device passes TVLA, then, one is led to hope, the device would be secure against all (first-order) differential side-channel attacks.

In this paper we provide a statistical assessment of the specific instantiation of TVLA in this standard. This task leads us to inquire whether (or not) it is possible to assess the side-channel security of a device via leakage detection (TVLA) only. We find a number of grave issues in the standard and its adaptation of the original TVLA guidelines. We propose some innovations on existing methodologies and finish by giving recommendations for best practice and the responsible reporting of outcomes.
Read more

Fast Side-Channel Security Evaluation of ECC Implementations: Shortcut Formulas for Horizontal Side-Channel Attacks Against ECSM with the Montgomery Ladder

By Melissa Azouaoui, Romain Poussier, and François-Xavier Standaert, presented at COSADE 2019

Horizontal attacks are a suitable tool to evaluate the (nearly) worst-case side-channel security level of ECC implementations, due to the fact that they allow extracting a large amount of information from physical observations. Motivated by the difficulty of mounting such attacks and inspired by evaluation strategies for the security of symmetric cryptography implementations, we derive shortcut formulas to estimate the success rate of horizontal differential power analysis attacks against ECSM implementations, for efficient side-channel security evaluations. We then discuss the additional leakage assumptions that we exploit for this purpose, and provide experimental confirmation that the proposed tools lead to good predictions of the attacks’ success.

Reducing a Masked Implementation’s Effective Security Order with Setup Manipulations And an Explanation Based on Externally-Amplified Couplings

By Itamar Levi, Davide Bellizia and François-Xavier Standaert, published in IACR Transactions on Cryptographic Hardware and Embedded Systems, 2019(2)

Couplings are a type of physical default that can violate the independence assumption needed for the secure implementation of the masking countermeasure. Two recent works by De Cnudde et al. put forward qualitatively that couplings can cause information leakages of lower order than theoretically expected. However, the (quantitative) amplitude of these lower-order leakages (e.g., measured as the amplitude of a detection metric such as Welch’s T statistic) was usually lower than the one of the (theoretically expected) dth-order leakages. So the actual security level of these implementations remained unaffected. In addition, in order to make the couplings visible, the authors sometimes needed to amplify them internally (e.g., by tweaking the placement and routing or iterating linear operations on the shares). In this paper, we first show that the amplitude of low-order leakages in masked implementations can be amplified externally, by tweaking side-channel measurement setups in a way that is under control of a power analysis adversary.

Our experiments put forward that the “effective security order” of both hardware (FPGA) and software (ARM-32) implementations can be reduced, leading to concrete reductions of their security level. For this purpose, we move from the detection-based analyses of previous works to attack-based evaluations, allowing us to confirm the exploitability of the lower-order leakages that we amplify. We also provide a tentative explanation for these effects based on couplings, and describe a model that can be used to predict them as a function of the measurement setup’s external resistor and the implementation’s supply voltage. We posit that the effective security orders observed are mainly due to “externally-amplified couplings” that can be systematically exploited by actual adversaries.
Read more

Study of Deep Learning Techniques for Side-Channel Analysis and Introduction to ASCAD Database

By Ryad Benadjila, Emmanuel Prouff, Rémi Strullu, Eleonora Cagli and Cécile Dumas, published in J. Cryptographic Engineering

To provide assurance on the resistance of a system against side-channel analysis, several national or private schemes are today promoting an evaluation strategy, common in classical cryptography, which focuses on the most powerful adversary who may train to learn about the dependency between the device behaviour and the sensitive data values. Several works have shown that this kind of analysis, known as Template Attacks in the side-channel domain, can be rephrased as a classical Machine Learning classification problem with a learning phase. Following the current trend in the latter area, recent works have demonstrated that deep learning algorithms were very efficient to conduct security evaluations of embedded systems and had many advantages compared to the other methods. Unfortunately, their hyper-parametrization has often been kept secret by the authors, who only discussed the main design principles and the attack efficiencies.

This is clearly an important limitation of previous works since (1) the latter parametrization is known to be a challenging question in Machine Learning and (2) it does not allow for the reproducibility of the presented results. This paper aims to address these limitations in several ways. First, completing recent works, we propose a comprehensive study of deep learning algorithms when applied in the context of side-channel analysis and we discuss the links with the classical template attacks. Secondly, we address the question of the choice of the hyper-parameters for the class of multi-layer perceptron networks and convolutional neural networks. Several benchmarks and rationales are given in the context of the analysis of a masked implementation of the AES algorithm. To enable perfect reproducibility of our tests, this work also introduces an open platform including all the sources of the target implementation together with the campaign of electromagnetic measurements exploited in our benchmarks. This open database, named ASCAD, has been specified to serve as a common basis for further works on this subject. Our work confirms the conclusions made by Cagli et al. at CHES 2017 about the high potential of convolutional neural networks. Interestingly, it shows that the approach followed to design the algorithm VGG-16 used for image recognition also seems to be sound when it comes to fixing an architecture for side-channel analysis.
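To give an idea of the kind of architectures involved, here is a minimal multi-layer perceptron in PyTorch for 256-class profiling of trace windows (layer sizes are purely illustrative and are not the tuned hyper-parameters reported in the paper):

    import torch.nn as nn

    def make_mlp(trace_length, n_classes=256, hidden=200, depth=4):
        layers, width = [], trace_length
        for _ in range(depth):
            layers += [nn.Linear(width, hidden), nn.ReLU()]
            width = hidden
        layers += [nn.Linear(width, n_classes), nn.LogSoftmax(dim=1)]
        return nn.Sequential(*layers)

    # model = make_mlp(trace_length=700)   # e.g., for 700-sample trace windows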
Read more

Study of Deep Learning Techniques for Side-Channel Analysis and Introduction to ASCAD Database (Long Paper)

By Ryad Benadjila, Emmanuel Prouff, Rémi Strullu, Eleonora Cagli and Cécile Dumas, published in IACR Cryptology ePrint Archive

This is the extended ePrint version of the preceding article; its abstract is identical to the journal version above.
Read more

Masking Proofs Are Tight and How to Exploit it in Security Evaluations

By Vincent Grosso and François-Xavier Standaert, presented at Advances in Cryptology – EUROCRYPT 2018

Evaluating the security level of a leaking implementation against side-channel attacks is a challenging task. This is especially true when countermeasures such as masking are implemented since in this case: (i) the amount of measurements to perform a key recovery may become prohibitive for certification laboratories, and (ii) applying optimal (multivariate) attacks may be computationally intensive and technically challenging. In this paper, we show that by taking advantage of the tightness of masking security proofs, we can significantly simplify this evaluation task in a very general manner.

More precisely, we show that the evaluation of a masked implementation can essentially be reduced to the one of an unprotected implementation. In addition, we show that although optimal attacks against masking schemes are computationally intensive for large numbers of shares, heuristic (soft analytical side-channel) attacks can approach optimality efficiently. As part of this second contribution, we also improve over the recent multivariate (aka horizontal) side-channel attacks proposed at CHES 2016 by Battistello et al.
Read more

How (not) to Use Welch’s T-test in Side-Channel Security Evaluations

By François-Xavier Standaert, presented at CARDIS 2018

The Test Vector Leakage Assessment (TVLA) methodology is a qualitative tool relying on Welch’s T-test to assess the security of cryptographic implementations against side-channel attacks. Despite known limitations (e.g., risks of false negatives and positives), it is sometimes considered as a pass-fail test to determine whether such implementations are “safe” or not (without clear definition of what is “safe”). In this note, we clarify the limited quantitative meaning of this test when used as a standalone tool. For this purpose, we first show that the straightforward application of this approach to assess the security of a masked implementation is not sufficient. More precisely, we show that even in a simple (more precisely, univariate) case study that seems best suited for the TVLA methodology, detection (or lack thereof) with Welch’s T-test can be totally disconnected from the actual security level of an implementation.

For this purpose, we put forward the case of a realistic masking scheme that looks very safe from the TVLA point-of-view and is nevertheless easy to break. We then discuss this result in more general terms and argue that this limitation is shared by all “moment-based” security evaluations. We conclude the note positively, by describing how to use moment-based analyses as a useful ingredient of side-channel security evaluations, to determine a “security order”.
Read more

Leakage Detection with the χ²-Test

By Amir Moradi, Bastian Richter, Tobias Schneider and François-Xavier Standaert, published in IACR Transactions on Cryptographic Hardware and Embedded Systems, 2018(1)

We describe how Pearson’s χ² test can be used as a natural complement to Welch’s t-test for black box leakage detection. In particular, we show that by using these two tests in combination, we can mitigate some of the limitations due to the moment-based nature of existing detection techniques based on Welch’s t-test (e.g., for the evaluation of higher-order masked implementations with insufficient noise). We also show that Pearson’s χ² test is naturally suited to analyze threshold implementations with information lying in multiple statistical moments, and can be easily extended to a distinguisher for key recovery attacks. As a result, we believe the proposed test and methodology are interesting complementary ingredients of the side-channel evaluation toolbox, for black box leakage detection and non-profiled attacks, and as a preliminary before more demanding advanced analyses.
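A minimal version of the binned two-group test can be sketched with SciPy as follows (an illustration of the statistic, not the paper's evaluation code; the sample arrays are placeholders):

    import numpy as np
    from scipy.stats import chi2_contingency

    def chi2_leakage_test(samples_a, samples_b, n_bins=9):
        # samples_a, samples_b: values observed at one trace point for the two groups
        edges = np.histogram_bin_edges(np.concatenate([samples_a, samples_b]), bins=n_bins)
        table = np.vstack([np.histogram(samples_a, bins=edges)[0],
                           np.histogram(samples_b, bins=edges)[0]])
        table = table[:, table.sum(axis=0) > 0]     # drop empty bins
        stat, p_value, dof, _ = chi2_contingency(table)
        return stat, p_value                        # small p-value = detected leakage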

Lowering the bar: deep learning for side-channel analysis

By Guilherme Perin, Baris Ege, Jasper van Woudenberg, white paper

Deep learning can help automate the signal analysis process in power side channel analysis. So far, power side channel analysis relies on the combination of cryptanalytic science, and the art of signal processing. Deep learning is essentially a classification algorithm, which can also be trained to recognize different leakages in a chip. Even more so, we do this such that typical signal processing problems such as noise reduction and re-alignment are automatically solved by the deep learning network. We show we can break a lightly protected AES, an AES implementation with masking countermeasures and a protected ECC implementation. These experiments show that where previously side channel analysis had a large dependency on the skills of the human, first steps are being developed that bring down the attacker skill required for such attacks. This paper is targeted at a technical audience that is interested in the latest developments on the intersection of deep learning, side channel analysis and security.

Start Simple and then Refine: Bias-Variance Decomposition as a Diagnosis Tool for Leakage Profiling

By Liran Lerman, Nikita Veshchikov, Olivier Markowitch, François-Xavier Standaert, published in IEEE Transactions on Computers 2018

Evaluating the resistance of cryptosystems to side-channel attacks is an important research challenge. Profiled attacks reveal the degree of resilience of a cryptographic device when an adversary examines its physical characteristics. So far, evaluation laboratories launch several physical attacks (based on engineering intuitions) in order to find one strategy that eventually extracts secret information (such as a secret cryptographic key). The certification step represents a complex task because in practice the evaluators have tight memory and time constraints. In this paper, we propose a principled way of guiding the design of the most successful evaluation strategies thanks to the (bias-variance) decomposition of a security metric of profiled attacks. Our results show that we can successfully apply our framework on unprotected and protected algorithms implemented in software and hardware.

Towards Sound and Optimal Leakage Detection Procedure

By Liwei Zhang, A. Adam Ding, François Durvaux, François-Xavier Standaert, and Yunsi Fei, presented at CARDIS 2017

Evaluation of side channel leakage for embedded crypto systems requires sound leakage detection procedures. We relate the test vector leakage assessment (TVLA) procedure to the statistical minimum p-value (mini-p) procedure, and propose a sound method of deciding leakage existence in the statistical hypothesis setting. To improve detection, an advanced statistical procedure, Higher Criticism (HC), is applied. The detection of leakage existence and the identification of exploitable leakage are separated when there are multiple leakage points.

For leakage detection, the HC-based procedure is shown to be optimal in that, for a given number of traces with given length, it detects existence of leakage at the signal level as low as possibly detectable by any statistical procedure. We provide theoretical proof of the optimality of the HC procedure. Numerical studies show that the HC-based procedure performs as well as the mini-p-based procedure when leakage signals are very sparse, and can improve the leakage detection significantly when there are multiple leakages.
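The classical Higher Criticism statistic of Donoho and Jin, computed from the per-sample detection p-values, can be sketched as follows (an illustration of the underlying idea rather than the authors' exact procedure and thresholds):

    import numpy as np

    def higher_criticism(p_values, alpha0=0.5):
        p = np.sort(np.asarray(p_values, dtype=float))
        n = len(p)
        i = np.arange(1, n + 1)
        hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p) + 1e-300)
        keep = i <= max(1, int(alpha0 * n))        # only the smallest p-values
        return hc[keep].max()                      # a large HC value suggests leakage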
Read more

Very High Order Masking: Efficient Implementation and Security Evaluation

By Anthony Journault, François-Xavier Standaert, presented at CHES 2017

In this paper, we study the performances and security of recent masking algorithms specialized to parallel implementations in a 32-bit embedded software platform, for the standard AES Rijndael and the bitslice cipher Fantomas. By exploiting the excellent features of these algorithms for bitslice implementations, we first extend the recent speed records of Goudarzi and Rivain (presented at Eurocrypt 2017) and report realistic timings for masked implementations with 32 shares. We then observe that the security level provided by such implementations is difficult to quantify with current evaluation tools.

We therefore propose a new “multi-model” evaluation methodology which takes advantage of different (more or less abstract) security models introduced in the literature. This methodology allows us to both bound the security level of our implementations in a principled manner and to assess the risks of overstated security based on well understood parameters. Concretely, it leads us to conclude that these implementations withstand worst-case adversaries with more than 2⁶⁴ measurements under falsifiable assumptions.
Read more

Private Multiplication over Finite Fields

By Sonia Belaïd, Fabrice Benhamouda, Alain Passelègue, Emmanuel Prouff, Adrian Thillard, and Damien Vergnaud, presented at CRYPTO 2017

The notion of privacy in the probing model, introduced by Ishai, Sahai, and Wagner in 2003, is nowadays frequently used to assess the security of circuits manipulating sensitive information. However, provable security in this model still comes at the cost of a significant overhead both in terms of arithmetic complexity and randomness complexity. In this paper, we deal with this issue for circuits processing multiplication over finite fields. Our contributions are manifold. Extending the work of Belaïd, Benhamouda, Passelègue, Prouff, Thillard, and Vergnaud at Eurocrypt 2016, we introduce an algebraic characterization of the privacy for multiplication in any finite field and we propose a novel algebraic characterization for non-interference (a stronger security notion in this setting). Then, we present two generic constructions of multiplication circuits in finite fields that achieve non-interference in the probing model.

Denoting by d the number of probes used by the adversary, the first proposal reduces the number of bilinear multiplications (i.e., of general multiplications of two non-constant values in the finite field) to only 2d + 1, whereas the state-of-the-art was O(d²). The second proposal reduces the randomness complexity to d random elements in the underlying finite field, hence improving the O(d log d) randomness complexity achieved by Belaïd et al. in their paper. This construction is almost optimal since we also prove that d/2 is a lower bound. Eventually, we show that both algebraic constructions can always be instantiated in large enough finite fields. Furthermore, for the important cases d ∈ {2, 3}, we illustrate that they perform well in practice by presenting explicit realizations for finite fields of practical interest.
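For context, the classical ISW multiplication gadget over GF(2), whose O(d²) bilinear-multiplication and randomness costs the constructions of this paper improve upon, can be sketched as follows (shares are bits; this is the textbook baseline, not the paper's new schemes):

    import secrets

    def isw_and(a_shares, b_shares):
        # a_shares, b_shares: lists of n = d + 1 bits whose XOR equals the secret bit
        n = len(a_shares)
        c = [a_shares[i] & b_shares[i] for i in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                r = secrets.randbits(1)
                c[i] ^= r
                c[j] ^= (r ^ (a_shares[i] & b_shares[j])) ^ (a_shares[j] & b_shares[i])
        return c   # the XOR of c equals (XOR of a_shares) AND (XOR of b_shares)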
Read more

Getting the Most Out of Leakage Detection: Statistical Tools and Measurement Setups Hand in Hand

By Santos Merino del Pozo and François-Xavier Standaert, presented at COSADE 2017

In this work, we provide a concrete investigation of the gains that can be obtained by combining good measurement setups and efficient leakage detection tests to speed up evaluation times. For this purpose, we first analyze the quality of various measurement setups. Then, we highlight the positive impact of a recent proposal for efficient leakage detection, based on the analysis of a (few) pair(s) of plaintexts. Finally, we show that the combination of our best setups and detection tools allows detecting leakages for a noisy threshold implementation of the block cipher PRESENT after an intensive measurement phase, while either worse setups or less efficient detection tests would not succeed in detecting these leakages. Overall, our results show that a combination of good setups and fast leakage detection can turn security evaluation times from days to hours (for first-order secure implementations) and even from weeks to days (for higher-order secure implementations).

Applying Horizontal Clustering Side-Channel Attacks on Embedded ECC Implementations

By Erick Nascimento and Łukasz Chmielewski, presented at CARDIS 2017

Side-channel attacks are a threat to cryptographic algorithms running on embedded devices. Public-key cryptosystems, including elliptic curve cryptography (ECC), are particularly vulnerable because their private keys are usually long-term. Well known countermeasures like regularity, projective coordinates and scalar randomization, among others, are used to harden implementations against common side-channel attacks like DPA.
Horizontal clustering attacks can theoretically overcome these countermeasures by attacking individual side-channel traces. In practice, horizontal attacks have been applied to overcome protected ECC implementations on FPGAs. However, it was not yet known whether such attacks can be applied to protected implementations working on embedded devices, especially in a non-profiled setting.

In this paper we mount non-profiled horizontal clustering attacks on two protected implementations of the Montgomery Ladder on Curve25519 available in the μNaCl library targeting electromagnetic (EM) emanations. The first implementation performs the conditional swap (cswap) operation through arithmetic of field elements (cswap-arith), while the second does so by swapping the pointers (cswap-pointer). They run on a 32-bit ARM Cortex-M4F core.
Our best attack has success rates of 97.64% and 99.60% for cswap-arith and cswap-pointer, respectively. This means that at most 6 and 2 bits are incorrectly recovered, and therefore, a subsequent brute-force can fix them in reasonable time. Furthermore, our horizontal clustering framework used for the aforementioned attacks can be applied against other protected implementations.
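As a toy illustration of the clustering step only (not the full framework of the paper), one can feed the per-iteration EM segments of a scalar multiplication to k-means and read the cluster labels as guessed scalar bits (up to a global flip); the segments array is a placeholder:

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_scalar_bits(segments):
        # segments: array of shape (n_iterations, n_samples), one row per ladder iteration
        labels = KMeans(n_clusters=2, n_init=10).fit_predict(segments)
        return labels   # one guessed bit per iteration, possibly all flipped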
Read more

Connecting and Improving Direct Sum Masking and Inner Product Masking

By Romain Poussier, Qian Guo, François-Xavier Standaert, Sylvain Guilley, Claude Carlet, presented at CARDIS 2017

Direct Sum Masking (DSM) and Inner Product (IP) masking are two types of countermeasures that have been introduced as alternatives to simpler (e.g., additive) masking schemes to protect cryptographic implementations against side-channel analysis. In this paper, we first show that IP masking can be written as a particular case of DSM. We then analyze the improved security properties that these (more complex) encodings can provide over Boolean masking. For this purpose, we introduce a slight variation of the probing model, which allows us to provide a simple explanation to the “security order amplification” for such masking schemes that was put forward at CARDIS 2016. We then use our model to search for new instances of masking schemes that optimize this security order amplification. We finally discuss the relevance of this security order amplification (and its underlying assumption of linear leakages) based on an experimental case study.

A Systematic Approach to the Side-Channel Analysis of ECC Implementations with Worst-Case Horizontal Attacks

By Romain Poussier, Yuanyuan Zhou, François-Xavier Standaert, presented at CHES 2017

The large number and variety of side-channel attacks against scalar multiplication algorithms makes their security evaluations complex, in particular in case of time constraints making exhaustive analyses impossible. In this paper, we present a systematic way to evaluate the security of such implementations against horizontal attacks. As horizontal attacks allow extracting most of the information in the leakage traces of scalar multiplications, they are suitable to avoid risks of overestimated security levels. For this purpose, we additionally propose to use linear regression in order to accurately characterize the leakage function and therefore approach worst-case security evaluations. We then show how to apply our tools in the contexts of ECDSA and ECDH implementations, and validate them against two targets: a Cortex-M4 and a Cortex-A8 micro-controller.
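A minimal example of such a linear-regression ('stochastic model') characterisation at a single trace sample, regressing the measurement on the bits of the processed value, is sketched below (values and samples are placeholders; this is a generic illustration, not the paper's evaluation code):

    import numpy as np

    def fit_linear_leakage_model(values, samples, n_bits=8):
        # values: processed intermediate values (integer array); samples: measurements at one point
        bits = ((values[:, None] >> np.arange(n_bits)) & 1).astype(float)
        basis = np.hstack([np.ones((len(values), 1)), bits])   # constant + one term per bit
        coeffs, *_ = np.linalg.lstsq(basis, samples, rcond=None)
        return coeffs   # coeffs[0] = offset, coeffs[1:] = per-bit leakage weights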

Ridge-based DPA: Improvement of Differential Power Analysis For Nanoscale Chips

By Weijia Wang, Yu Yu, François-Xavier Standaert, Junrong Liu, Zheng Guo, Dawu Gu, published in IEEE Transactions on Information Forensics & Security

Differential power analysis (DPA), as a very practical type of side-channel attack, has been widely studied and used for the security analysis of cryptographic implementations. However, as the development of the chip industry leads to smaller technologies, the leakage of cryptographic implementations in nanoscale devices tends to be nonlinear (i.e., leakages of intermediate bits are no longer independent) and unpredictable. These phenomena make some existing side-channel attacks less suitable, decreasing their performance and making some commonly used prior power models (e.g., Hamming weight) much less respected in practice. To solve the above issues, we introduce the regularization process from statistical learning to the area of side-channel attacks and propose the ridge-based DPA.

We also apply the cross-validation technique to search for the most suitable value of the parameter for our new attack methods. Besides, we present theoretical analyses to deeply investigate the properties of ridge-based DPA for nonlinear leakages. We evaluate the performance of ridge-based DPA in both simulation-based and practical experiments, comparing it to state-of-the-art DPAs. The results confirm the theoretical analysis. Further, our experiments show the robustness of ridge-based DPA in coping with the difference between the leakages of profiling and exploitation power traces. Therefore, by showing a good adaptability to the leakage of nanoscale chips, ridge-based DPA is a good alternative to the state-of-the-art ones.
Read more

Convolutional Neural Networks with Data Augmentation against Jitter-Based Countermeasures – Profiling Attacks without Pre-Processing

By Eleonora Cagli, Cécile Dumas, Emmanuel Prouff, presented at CHES 2017

In the context of the security evaluation of cryptographic implementations, profiling attacks (aka Template Attacks) play a fundamental role. Nowadays the most popular Template Attack strategy consists in approximating the information leakages by Gaussian distributions. Nevertheless, this approach suffers from the difficulty of dealing with both trace misalignment and the high dimensionality of the data. This forces the attacker to perform critical preprocessing phases, such as the selection of the points of interest and the realignment of measurements. Some software and hardware countermeasures have been conceived exactly to create such a misalignment.

In this paper we propose an end-to-end profiling attack strategy based on Convolutional Neural Networks: this strategy greatly facilitates the attack roadmap, since it does not require a previous trace realignment nor a precise selection of points of interest. To significantly increase the performance of the CNN, we moreover propose to equip it with the data augmentation technique that is classical in other applications of Machine Learning. As a validation, we present several experiments against traces misaligned by different kinds of countermeasures, including the augmentation of the clock jitter effect in a secure hardware implementation over a modern chip. The excellent results achieved in these experiments prove that the Convolutional Neural Network approach combined with data augmentation gives a very efficient alternative to the state-of-the-art profiling attacks.
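One simple augmentation in this spirit is to add randomly shifted copies of each trace so the network learns to tolerate misalignment; the NumPy sketch below uses a circular shift for brevity (the paper's deformations are more refined):

    import numpy as np

    def augment_with_shifts(traces, labels, max_shift=10, copies=4, rng=None):
        rng = rng or np.random.default_rng()
        out_t, out_y = [traces], [labels]
        for _ in range(copies):
            shifts = rng.integers(-max_shift, max_shift + 1, size=len(traces))
            # Shifted copy of every trace, keeping the original label
            out_t.append(np.stack([np.roll(t, s) for t, s in zip(traces, shifts)]))
            out_y.append(labels)
        return np.concatenate(out_t), np.concatenate(out_y)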
Read more

Towards Practical Tools for Side Channel Aware Software Engineering: ‘Grey Box’ Modelling for Instruction Leakages

By David McCann, Elisabeth Oswald, Carolyn Whitnall, presented at USENIX Security Symposium 2017

Power (along with EM, cache and timing) leaks are of considerable concern for developers who have to deal with cryptographic components as part of their overall software implementation, in particular in the context of embedded devices. Whilst there exist some compiler tools to detect timing leaks, similar progress towards pinpointing power and EM leaks has been hampered by limits on the amount of information available about the physical components from which such leaks originate.

We suggest a novel modelling technique capable of producing high-quality instruction-level power (and/or EM) models without requiring a detailed hardware description of a processor or information about the process technology used (access to both of which is typically restricted). We show that our methodology is effective at capturing differential data-dependent effects as neighbouring instructions in a sequence vary. We also explore register effects, and verify our models across several measurement boards to comment on board effects and portability. We confirm its versatility by demonstrating the basic technique on two processors (the ARM Cortex-M0 and M4), and use the M0 models to develop ELMO, the first leakage simulator for the ARM Cortex-M0.
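
The flavour of such 'grey box' instruction-level modelling can be conveyed with a short sketch: fit a linear model from operand-dependent features (here, simplified Hamming weights and the Hamming distance to the previous operands) to per-instruction power measurements. The feature set and data layout below are simplifying assumptions for illustration, not the ELMO model specification.

    # Sketch of fitting an instruction-level power model by ordinary least squares.
    import numpy as np

    def hamming_weight(x):
        return int(bin(int(x) & 0xFFFFFFFF).count("1"))

    def features(op1, op2, prev_op1, prev_op2):
        return [hamming_weight(op1), hamming_weight(op2),
                hamming_weight(op1 ^ prev_op1), hamming_weight(op2 ^ prev_op2), 1.0]

    def fit_instruction_model(operand_log, measured_power):
        """operand_log: iterable of (op1, op2, prev_op1, prev_op2) tuples;
        measured_power: the corresponding per-instruction power samples."""
        X = np.array([features(*row) for row in operand_log])
        coeffs, *_ = np.linalg.lstsq(X, np.asarray(measured_power), rcond=None)
        return coeffs  # predicted power for new operands is features(...) @ coeffs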
Read more

Categorising and Comparing Cluster-Based DPA Distinguishers

By Xinping Zhou, Carolyn Whitnall, Elisabeth Oswald, Degang Sun, Zhu Wang, presented at SAC 2017

Side-channel distinguishers play an important role in differential power analysis, where real-world leakage information is compared against hypothetical predictions in order to guess the underlying secret key. A class of distinguishers which can be described as ‘cluster-based’ has the advantage of being able to exploit multi-dimensional leakage samples in scenarios where only loose, ‘semi-profiled’ approximations of the true leakage forms are available. This is in contrast with univariate distinguishers exploiting only single points (e.g. correlation), and with Template Attacks requiring concise fitted models which can be overly sensitive to mismatch between the profiling and attack acquisitions.

This paper collects together, to our knowledge for the first time, the various proposals for cluster-based DPA (concretely, Differential Cluster Analysis, First Principal Components Analysis, and Linear Discriminant Analysis), and shows how they fit within the robust ‘semi-profiling’ attack procedure proposed by Whitnall et al. at CHES 2015. We discuss the theoretical similarities and differences of the separately proposed distinguishers and provide an empirical comparison of their performance in a range of (real and simulated) leakage scenarios and with varying parameters. Our findings have application for practitioners constrained to rely on ‘semi-profiled’ models who wish to make informed choices about the best known procedures to exploit such information.
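
A toy sketch of the cluster-based idea: traces are partitioned according to the (hypothesised) intermediate-value class under each key guess, and guesses are scored by the between-cluster variance of the class means. This conveys only the general flavour of such distinguishers; the labelling function and the variance-based score are illustrative placeholders rather than any specific published variant.

    # Toy cluster-style distinguisher over multi-dimensional traces.
    import numpy as np

    def cluster_score(traces, classes):
        """traces: (n, d) array; classes: per-trace class label under a key guess."""
        overall_mean = traces.mean(axis=0)
        score = 0.0
        for c in np.unique(classes):
            members = traces[classes == c]
            score += len(members) * np.sum((members.mean(axis=0) - overall_mean) ** 2)
        return score

    def rank_keys(traces, inputs, labelling, n_guesses=256):
        """labelling(inputs, k) maps inputs to a coarse class under key guess k."""
        scores = [cluster_score(traces, labelling(inputs, k)) for k in range(n_guesses)]
        return np.argsort(scores)[::-1]  # highest-scoring guesses first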

Read more

Deliverables

Deliverable D1.2: Interim Portfolio of Best Methods and Improved Evaluation Techniques

This is the second deliverable produced within work package 1 (WP1) of the project. The goal of this report is to act as a portfolio of improved evaluation techniques and their figures of merit.

In the context of our previously introduced detect-map-exploit core framework (see Deliverable 1.1), we start off by discussing the advantages and limitations of the widely deployed Test Vector Leakage Assessment (TVLA) methodology. Properly redefining the objectives of leakage detection enables us to identify and address existing open questions and take steps towards better and faster leakage detection. Performing leakage detection efficiently is a central question for side-channel leakage evaluation. We therefore compare and contrast two methods of computing moments for detection tests, namely an off-line method and an online (histogram-based) method.

We consider both methods for parallelised architectures, and compare results based on two types of architectures: one based on Intel x86 server-grade processors, and one based on ARM A53 processors. In addition, we present experiments performed to implement distributed-memory leakage detection computation using a High-Performance Computing (HPC) facility in an industrial context. The implementation uses MPI for distributed memory and OpenMP for shared-memory systems. We conclude that the histogram method is not usable in its current state for realistic examples, as storing the histograms using dense arrays requires a huge amount of memory. A benchmark run on a real-world example shows that the scalability of the off-line method could still be improved.
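
To make the trade-off concrete, the sketch below shows the sums-of-powers flavour of an online moment computation: traces are consumed batch by batch and only running sums per sample point are kept, from which means and variances can be derived. The histogram-based variant discussed above would instead keep one histogram per sample point; the class below is a simplified illustration, not the deliverable's implementation.

    # Minimal online accumulator of raw moments per sample point.
    import numpy as np

    class MomentAccumulator:
        def __init__(self, n_samples, max_order=2):
            self.n = 0
            self.sums = np.zeros((max_order, n_samples))  # running sums of t, t^2, ...

        def update(self, batch):  # batch: (n_traces, n_samples)
            self.n += batch.shape[0]
            for k in range(self.sums.shape[0]):
                self.sums[k] += (batch ** (k + 1)).sum(axis=0)

        def mean(self):
            return self.sums[0] / self.n

        def variance(self):
            return self.sums[1] / self.n - self.mean() ** 2

Keeping one such accumulator per group (e.g. fixed and random inputs) is then enough to evaluate a Welch t-test without ever storing the raw traces.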

Next, we discuss Deep Learning (DL) techniques as black-box solvers that have proven effective for side-channel evaluation. We provide an overview of the methodology that is commonly applied to side-channel attacks when neural networks are considered as the model for key recovery. We present two case studies using convolutional neural networks (CNNs): the first on a protected AES implementation (DPA Contest V4), and the second on an ECC implementation protected with misalignment. We study practical issues and limitations that arise with DL-based techniques and outline directions to improve or overcome these obstacles by tweaking the different parameters of the analysis. Finally, we propose a novel idea for DL-based SCA in Section 6.4, based on a multi-channel approach.

Read more

Deliverable D1.3: White Paper on Evaluation Strategies for AES and ECC

This deliverable is a white paper describing side-channel evaluation strategies of cryptographic implementations, particularly for AES and ECC. Through this paper, the REASSURE consortium provides general guidance and directions to improve how cryptographic implementations are evaluated both practically and soundly.

First, we begin by describing our alternative structured evaluation. It is a backwards approach based on first defining a worst-case adversary and then relaxing particular capabilities to reach the best practical attack possible. This first step of the evaluation stems from the fact that defining a worst-case attack strategy is commonly easier than defining the best practical attack strategy. This allows for a sound and well-defined starting point for any implementation under investigation. Accordingly, we then detail the different capabilities that must be taken into account and described during an evaluation, and additionally discussed when arguing about the feasibility of attack strategies with relaxed capabilities. Next, we describe the three main evaluation steps following the Detect-Map-Exploit framework.

We illustrate the proposed evaluation strategy with three concrete examples: an AES implementation protected with combined affine masking and shuffled execution, an unprotected ECC implementation and the ECC point randomization countermeasure.

We connect our approach to current evaluation strategies such as the BSI ECC evaluation guidelines and the CC scheme, and describe how it can be used as a refinement of the latter to yield more efficient and representative evaluations. We finally discuss how our approach addresses the gap between concrete worst-case security and current evaluation processes.

Read more

Deliverable D1.4: Final portfolio of best methods and improved evaluation techniques

This document describes the best techniques and methods created or improved by the REASSURE consortium and aimed at optimizing high-security vulnerability assessment. This deliverable describes the steps for which interesting positive results were found. The techniques are organized according to the steps of our evaluation framework, which is the core of our approach to optimize and streamline evaluation methodology.

Section 2 covers the methodology and best practices for building a measurement setup. Naturally, the choice of equipment and the setup parameters directly impact the outcome of the analysis. However, additional tests should also be performed at the corner cases of a device's proper working conditions. An improvement of up to 100x is reported when comparing the best- and worst-case scenarios.

Section 3 assesses whether low-cost/low-power processors can potentially rival classical infrastructures from a cost-efficiency point of view. The distributed leakage detection method considered here is based on the central moments method. The two setups compared in this case are a cluster of low-power ARM CPUs on one hand, and a High-Performance Computing (HPC) infrastructure on the other. Experiments indicate that the former setup can leverage a parallel architecture in an industrial environment. From a practical point of view, the speed-up can be more than 100x compared to a fully serial implementation, or around 10x compared to a multi-threaded implementation.

Section 4 is dedicated to the application of neural networks to side-channel analysis and describes tools that give insight into what the neural network is learning, improving explainability and, as a welcome consequence, generalization. The latter can be achieved by accurate hyperparameter selection, and we show that using the mutual information transferred to the last layer we can improve attack accuracy by up to 30%. A second improvement to generalization is achieved by the use of ensembles, which can lead to up to a 40% improvement in attack performance compared to using the best model alone.

Section 5 explores multi-channel classifiers, which prove useful in cases where the leakage model is unknown, and shows evidence of improvements of one order of magnitude.

Section 6 compares the efficiency of profiled linear regression attacks (LRA) with Scatter, a new technique which aims to improve attacks in the case of misaligned traces. Overall, LRA always reaches a success rate close to 1, while Scatter often fails, even with a high number of traces. When both reach a success rate close to 1, Scatter requires at least twice the number of traces.

Read more

Deliverable D1.5: White Paper: Assurance in Security Evaluations

Security evaluations are complex, and two distinct approaches exist today: conformance-style testing as in FIPS 140, and attack-driven testing as in CC. Within the REASSURE project we studied attack vectors with regard to their optimality and potential for automation, with the aim of improving existing evaluation regimes. By optimality we mean that we can prove that any practical instantiation of an attack reaches its theoretical limit; by potential for automation we mean that it can be executed with minimal user interaction. In this white paper we comment on which steps offer some potential for automation.

Considering conformance-style testing as an evaluation methodology, it is clear that it cannot offer any guarantees regarding optimality: leakage detection in a black-box setting is extremely challenging to set up correctly. We have found that the current standard ISO 17825 requires improvement, and this white paper provides some information regarding more sensible parameter choices.

Our research points towards the fact that any optimality can only ever be achieved when considering worst-case adversaries. These are adversaries that get full access to implementation details, can select secret parameters, and thereby control countermeasures during an initial profiling phase. The reason for this is that it is only in this setting that we can utilise statistical (or machine/deep learning) tools which are well understood and for which we can assess/argue their optimality. Any attack vector which requires dealing with higher order or multivariate data leads to a loss of theoretical guarantees.

Within the REASSURE proposal for a so-called “backwards” approach to evaluations, we postulate that any evaluation should first attempt to instantiate a worst-case adversary (even if this requires open samples and/or samples with known secrets, or even the developers executing these attacks in their own environment). In this setting we argue that we have the strongest guarantees for optimality. If necessary, an evaluation should then also instantiate a more “practical” attack by relaxing some assumptions. The difference in effort required between the worst-case and the best practical attack then gives an indication of the (potential) security gap.

We recommend that any reporting that is based on a points-based system like [46] should make the ratings explicit for the worst-case adversary (separating out the points for identification/profiling and the exploitation) and the “best practical adversary” (again making explicit how points are awarded for the different phases), and it should also be reported how likely it is that the results from identification/profiling translate to other devices.

Finally, we observe that the role of formal verification in the context of side-channel evaluations is perhaps different from the role it plays in other contexts. Formal methods prove properties of implementations at an abstract level: this requires the assumption of a leakage model. In practice, however, devices show multiple, context-dependent leakage characteristics. Formal verification in the context of leakage evaluations therefore only shows that the necessary conditions are fulfilled, but it does not provide evidence about the sufficiency of these conditions. These sufficient conditions can only be ascertained via testing. Consequently, formal verification is suitable for low assurance levels. By contrast, high assurance essentially relates to the risk that (e.g., backwards) security evaluations, even when the evaluator is provided with worst-case capabilities, may be sub-optimal due to attack steps that are inherently heuristic and hard to analyse theoretically.

Read more

Deliverable D2.1: Shortcut Formulas for Side Channel Evaluation

This deliverable surveys a number of approaches to shortcut the effort required to assess implementations and devices with respect to their susceptibility to side-channel attacks. Our approach aligns with the divide-and-conquer nature of most side-channel attacks, and hence we touch on shortcuts that apply to the distinguisher statistics (the divide step) and to the key rank (the conquer step).

We notice that shortcuts make significant assumptions about leakage characteristics (in particular, independence of leakages and equal variances) that do not hold in many of the challenging device evaluation scenarios. In addition, it is not always possible to characterise the signal and noise: early on in a design this information is not yet available, whereas later on, in an evaluation within a set scheme, the evaluator often cannot turn off countermeasures to establish the nature of the “original” signal. When it comes to key rank computations, it has been shown that the variance of the key rank is huge in precisely those cases where the key rank is most relevant. The tightness with which the average rank is estimated is thus not so important as long as the estimate also gives some information about the spread of the rank.

Within REASSURE we set ourselves the challenge of assessing how such shortcut approaches could nevertheless be leveraged to improve evaluations. This deliverable is hence the foundational step from which we will continue in two directions. Firstly, within the next 6 months, industrial partners will provide some insight into the stages at which shortcuts are particularly appropriate. Secondly, some shortcuts will be implemented within WP3.2 in the context of the REASSURE simulation tool.

Read more

Deliverable D2.2: Interim report on automation

This document represents interim progress of the REASSURE consortium towards the goal of flexible evaluation methods that are usable by non-domain experts.

We begin by clarifying what we mean by automation: the facility to perform a task with a minimum of user input and interaction (expert or otherwise). This implies that the necessary parameters for a sound, fair and reproducible analysis are derived or estimated from the available data as far as possible. It also requires that the assumptions and limitations of the analysis be made transparent in the output, in order to guide (potentially non-expert) users towards sound conclusions.

Deliverable 1.1 identifies three main components of a typical evaluation process: measurement and (raw) trace processing, detection/mapping, and exploitation. These seldom proceed in a linear workflow, as information gained in later steps is used to repeat and refine earlier steps. For example, it can be hard to confirm the suitability of acquired traces without actually performing some sort of detection or attack, so that the measurement stage may need to be revisited in the light of observed outcomes. Such backtracking creates challenges for automation, in particular implying the need for improved quality metrics at every stage of evaluation — a matter of ongoing investigation.

On the basis that detection and mapping tasks are the most promising for automation, we then provide a more detailed account of statistical hypothesis testing and of how to perform the multiple comparison corrections and statistical power analyses required to carry out such procedures fairly and transparently. In the case of leakage detection strategies based on simple assumptions, such as those relying on t-tests, it is possible to arrive at analytical formulae for the latter; for complex methods, such as those estimating information theoretic quantities, the only options currently available (as far as we are aware) involve referring to empirically-derived indicators from (hopefully) representative scenarios. A number of currently used methods (in particular, those based on ANOVA-like tests, and those based on correlation) sit in a middle ground, where the side-channel literature has not yet fully explored the solutions already proposed in the statistics literature. This represents an avenue for future work, which could potentially make it possible to safely automate a wider range of tests, as well as providing insights into the trade-offs between them.
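
For the t-test case, the analytical route mentioned above is standard statistical power analysis: given a standardised effect size, a significance level and a desired power, the required number of traces per group follows directly. The sketch below uses statsmodels; the numerical values are illustrative, not recommendations from the deliverable.

    # Sample-size calculation for a two-sample t-test via statistical power analysis.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    n_per_group = analysis.solve_power(effect_size=0.01,   # standardised mean difference
                                       alpha=1e-5,         # per-test false-positive rate
                                       power=0.95,         # desired detection probability
                                       alternative='two-sided')
    print(f"traces needed per group: {n_per_group:.0f}")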

We discuss the particular challenges of extending known results for univariate, ‘first order’ leakages to the sorts of higher-order univariate and multivariate leakages arising from protected schemes. Tricks to ‘shift’ information into distribution means via pre-processing inevitably violate the assumptions which make the statistical power analysis of t-tests straightforward. Meanwhile, unless evaluators have control over (or at least access to) any randomness, as well as sufficient details about the implementation specification, the task of searching for jointly leaking tuples can quickly become infeasible as the leakage order increases. We consider that more work needs to be done from a basic methodological perspective before higher-order detection can realistically be considered a candidate for automation.

Finally, we propose that there exists a need to devise measures of ‘coverage’, inspired by the notion of code coverage in software testing. We offer some suggestions of what this might look like: input-based scores indicating the proportion of the total population sampled for the experiment (under different assumptions about the form of the leakage); intermediate value-based scores indicating the fraction of total sensitive intermediates targeted by the tests; or even strategies whereby profiling information is used to design and test specific corner cases.

We intend to make the formal definition of coverage metrics an avenue for exploration during the remainder of the project, along with other outstanding questions identified in this document. A future deliverable (D2.5) will report on progress made to that end, and finalise our findings and recommendations.

Read more

Deliverable D2.3: Shortcut Formulas for Side-Channel Evaluation

This deliverable surveys a number of approaches to shortcut the effort required to assess implementations and devices with respect to their susceptibility to side channel attacks. Our approach aligns with the divide and conquer nature of most side channel attacks and hence we touch on shortcuts that apply to the distinguisher statistics (the divide step) and the key rank (the conquer step).

We notice that shortcuts make significant assumptions about leakage characteristics (in particular independence of leakages and equal variances) that do not hold in many of the more challenging device evaluation scenarios. In addition, it is not always possible to characterise the signal and noise as required: early on in a design this information is not yet available, whereas later on in an evaluation within a set scheme the evaluator often cannot turn off countermeasures to establish the nature of the “original” signal. When it comes to key rank computations it has been shown that the variance of the key rank is huge in those cases where the key rank is most relevant. The tightness by which the average rank is estimated is thus not so important as long as the estimate also gives some information about the spread of the rank.

Within REASSURE we set ourselves the challenge of assessing how such shortcut approaches could nevertheless be leveraged to improve evaluations. Firstly, this deliverable attempts to highlight the stages at which shortcuts are particularly appropriate. Secondly, some shortcuts will be implemented within WP3.2 in the context of the REASSURE simulation tool.

Read more

Deliverable D2.4: Report on Instruction Level Profiling

This document reports on interim progress of the REASSURE consortium towards methodologies to create representative instruction-level profiles of microprocessors of at least medium complexity.

We begin by reviewing possible approaches to modelling leakage, ranging from detailed transistor level simulations derived from back-annotated netlists, through ‘profiled’ models fitted to sample data, to simulations based on simplifying assumptions about typical device behaviour. We identify the requirements for our planned REASSURE simulator, which will need to be able to process source code as input and should be flexible with respect to the (profiled or unprofiled) models by which it produces leakage predictions.

Next we summarise a range of existing tools from industry and academia, including one (ELMO) which we argue best sets the precedent that we plan to follow for REASSURE. We explain the modelling procedure used in the construction of ELMO and report on the results of our own model-building endeavour using data from an alternative implementation of the same board (an ARM Cortex M0). In particular, we show that there are broad similarities between the implementations, but some differences that will require attention if models are expected to port from one to another.

The next steps for REASSURE will be to explore improvements to the model building procedure, as well as its extension to more complex devices, in order to eventually release our own simulator tools to the wider side channel research and evaluation community.

Read more

Deliverable D2.6: White Paper: Security Testing Via Simulator

This white paper guides users of the trace simulator (ELMO) released by the REASSURE consortium in the particular task of security-testing code during the design phase. It is intended for software developers who need to write or integrate cryptography on ARM Cortex-M devices, in particular the M0. We assume familiarity with symmetric encryption (AES), an awareness of side-channel analysis (in particular power analysis and masking as a countermeasure), and ARM Thumb assembly.

We begin by explaining the different options and challenges for leakage simulation. Broadly speaking, there are two key aspects to simulation: the accurate tracking of data flow (at some appropriate level) and the mapping of the flow to some meaningful prediction of the power consumption (or other side-channel). Unlike most of the other existing tools, which focus more on one or other of these ends, ELMO (as we describe) makes a combined effort towards both, and was therefore chosen as the starting point for development within the project.

We next show how to set up and configure ELMO – for simulation generally, and in the context of ‘fixed-versus-random’ leakage detection. We provide some non-technical intuition for choosing the desired error rates of the tests and setting an appropriate sample size.
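
As a concrete illustration of the workflow, the few lines below apply a pointwise Welch t-test to two sets of simulated traces (fixed versus random inputs) and flag the sample points whose statistic exceeds a threshold. The |t| > 4.5 threshold is the conventional TVLA choice and, like the function name, is shown purely for illustration.

    # Fixed-versus-random leakage detection on simulated traces.
    import numpy as np
    from scipy.stats import ttest_ind

    def detect_leakage(fixed_traces, random_traces, threshold=4.5):
        """Both inputs: (n_traces, n_samples) arrays of simulated leakage values."""
        t, _ = ttest_ind(fixed_traces, random_traces, axis=0, equal_var=False)  # Welch's t-test
        return np.flatnonzero(np.abs(t) > threshold)  # indices of potentially leaky points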

Finally, we give some case-study examples that show ELMO’s leakage detection capabilities in practice and illustrate the types of problems it is able to flag up. We suggest ways to address the particular problems in question, thus demonstrating a workflow of testing, adjusting, and retesting that we recommend to developers in the code development phase of the design process.

The appendix of this white paper lists and reviews a number of existing leakage simulators.

Read more

Deliverable D2.7: Final report on automation

This document is an update to deliverable 2.5 detailing further progress of the REASSURE consortium towards the goal of flexible evaluation methods that are usable by non-domain experts.

We begin by clarifying what we mean by automation: the facility to perform a task with a minimum of user input and interaction (expert or otherwise). This implies that the necessary parameters for a sound, fair and reproducible analysis are derived or estimated from the available data as far as possible. It also requires that the assumptions and limitations of the analysis be made transparent in the output, in order to guide (potentially non-expert) users towards sound conclusions.

Deliverable 1.1 identifies three main components of a typical evaluation process: measurement and (raw) trace processing, detection/mapping, and exploitation. These seldom proceed in a linear workflow, as information gained in later steps is used to repeat and refine earlier steps. For example, it can be hard to confirm the suitability of acquired traces without actually performing some sort of detection or attack, so that the measurement stage may need to be revisited in the light of observed outcomes. Such backtracking creates challenges for automation, in particular implying the need for improved quality metrics at every stage of evaluation – a matter of ongoing investigation.

On the basis that detection and mapping tasks are the most promising for automation, we then provide a more detailed account of statistical hypothesis testing and of how to perform the statistical power analyses and multiple comparison corrections required to carry out such procedures fairly and transparently. In the case of leakage detection strategies based on simple assumptions, such as those relying on t-tests, it is possible to arrive at analytical formulae for the latter; for complex methods, such as those estimating information theoretic quantities, the only options currently available (as far as we are aware) involve referring to empirically-derived indicators from (hopefully) representative scenarios.

We discuss the particular challenges of extending known results for univariate, ‘first order’ leakages to the sorts of higher-order univariate and multivariate leakages arising from protected schemes. Tricks to ‘shift’ information into distribution means via pre-processing inevitably violate the assumptions which make the statistical power analysis of t-tests straightforward. We consider that more work needs to be done from a basic methodological perspective before higher-order detection can realistically be considered a candidate for automation.

Inspired by the notion of code coverage in software testing, we suggest the need for measures of leakage evaluation coverage. We offer some suggestions of what these might look like: input-based scores indicating the proportion of the total population sampled for the experiment (under different assumptions about the form of the leakage); intermediate value-based scores indicating the fraction of total sensitive intermediates targeted by the tests; or even strategies whereby profiling information is used to design and test specific corner cases.

With a non-domain expert end user in mind we then consider how to apply recommendations from this deliverable to the tools produced in work package 3. In particular, we explain how to set the default test parameters for the automated simulation and analysis of traces so as to require minimal input from the developer in an iterative design process.

We lastly consider the automation potential of deep learning as a tool for side-channel analysis. WP1 (see, in particular, D1.2) has addressed deep learning from an efficiency perspective, but its ability to extract information from large amounts of raw data with minimal input on the part of the user makes it highly relevant to this work package also. We describe our efforts towards making best use of its capabilities, including via a proposal whereby collections of traces are jointly classified as having been generated under a particular key guess or not, and the inspection of the loss function gradients with respect to the input data during the training phase. We also discuss some of the obstacles to automation, such as the opaque nature of deep learning models, which may make it difficult to draw practically applicable conclusions about the form and location of leakage after an evaluation.
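
The gradient-inspection idea mentioned above can be prototyped in a few lines of PyTorch: compute the gradient of the classification loss with respect to the input traces and average its magnitude per sample point, giving a rough sensitivity map. The model, loss and tensor shapes below are placeholder assumptions, not the consortium's implementation.

    # Sketch of loss-gradient inspection with respect to input traces.
    import torch

    def input_sensitivity(model, traces, labels, loss_fn=torch.nn.CrossEntropyLoss()):
        """traces: (batch, 1, trace_len) float tensor; labels: (batch,) class indices."""
        traces = traces.clone().requires_grad_(True)
        loss = loss_fn(model(traces), labels)
        loss.backward()
        return traces.grad.abs().mean(dim=0).squeeze()  # per-sample-point sensitivity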

Read more

Deliverable D3.7: Final report on tools

This document consists of a list of tools that were developed during the course of the REASSURE project by project partners. Tools are categorised as software, data sets, or simulation tools; they can be internal tools strictly under the control of a single partner, or joint ventures that are openly released. We provide an overview of our collective achievements, followed by a list of tools organised by REASSURE partner. We indicate whether a tool is available for public use. Wherever possible we give figures for the performance improvements resulting from a tool.

Deliverable D4.3: Final Report on Standardization

A declared goal of REASSURE was to have an impact on standardisation and evaluation schemes via influencing stakeholders and decision makers. To do so, the consortium identified relevant targets, e.g. established consortia such as JHAS, ongoing standardisation efforts within ISO, as well as emerging efforts like SESIP, and liaised with them over the past three years.

The consortium provided input to two ISO standards (20085-1 and 20085-2), which matured to publication during the project. These standards cover technicalities around side-channel setups and their calibration. The consortium also contributed to the analysis of an existing ISO standard (17825), which covers testing methods in the context of side-channel attacks.

Consortium members made several presentations to the JHAS group during the regular group meetings, contributing in particular in the context of the ongoing and rapid developments around the use of deep learning.

The consortium is also represented in EMVCo, and a presentation of project results in the context of deep learning is scheduled.

SGDSN held yearly meetings with CB GIE, which were informed by REASSURE results.

An initiative initially launched by NXP (known as SmartCC) and now run by the GlobalPlatform consortium under the name SESIP (Security Evaluation Standard for IoT Platforms) aims at driving the harmonisation of standardisation in the context of Internet of Things (IoT) devices. REASSURE results, in particular those intended for the IoT use case (leakage detection as “conformance-style testing”), have been discussed with the relevant NXP liaison people with respect to implications for the SESIP scheme.

Read more

Deliverable D5.2: Data management plan

This document represents the first version of the Data Management Plan (DMP) of the REASSURE project. It is a living document that will be updated throughout the project. The document focuses on identifying the type of data that will be openly shared, namely leakage traces acquired and processed during a side-channel analysis of an embedded security device, as well as the intended target audience and the data format.

Deliverable D5.6: Advanced SCA Training

This document is a placeholder for the full training (which can be accessed here). In this document we present a high-level overview of how the course and the online webinar (https://www.riscure.com/news/understanding-leakage-detection-webinar/) are organised, and of their impact in terms of the number of participants.

The decision to dedicate the advanced SCA training to leakage detection was a consortium decision, as the application of leakage detection techniques is a complex topic that is not always well understood yet is of vital interest to the hardware security evaluation industry.