Simple and tight device-independent security proofs

Proving security of device-independent (DI) cryptographic protocols has been regarded as a complex and tedious task. In this work we show that a newly developed tool, the "entropy accumulation theorem" of Dupuis et al., can be effectively applied to give fully general proofs of DI security. At a high level our technique amounts to establishing a reduction to the scenario in which the untrusted device operates in an identical and independent way in each round of the protocol. This makes the proof much simpler and yields significantly better, essentially tight, quantitative results when considering general quantum adversaries, compared to what was known before. As concrete applications we give simple and modular security proofs for DI quantum key distribution and randomness expansion protocols based on the CHSH inequality. For both tasks we establish essentially optimal key rates and noise tolerance. As loophole-free Bell tests are finally being realised, our results considerably decrease the gap between theory and experiments, thereby marking an important step towards practical DI protocols and their implementations.


Introduction
Device-independent cryptographic protocols aim at achieving an unprecedented level of security, with guarantees that hold (almost) irrespective of the quality, or trustworthiness, of the complex physical devices used to implement the protocol [ER14]. Security in such protocols is based on the statistics observed by the honest parties when running the protocol, which allow them to decide whether the possibly faulty or even malicious devices used pose any security risk.
As an example, consider the task of device-independent quantum key distribution (DIQKD). In a DIQKD protocol the honest parties, called Alice and Bob, share a two-component device (where one component is held by Alice and the other by Bob). The manufacturer of the device, who can be incompetent or even malicious 1 , claims that by interacting with the device according to the protocol the honest parties will eventually produce (with high probability) identical and secret keys, unknown even to her, as the result of alleged measurements made by the device on the quantum state it contains. The device is far too complex for Alice and Bob to open and assess whether it works as claimed and, possibly, even the manufacturer herself cannot guarantee that its actions are exact and non-faulty at all times. Alice and Bob must therefore treat the device as a black box with which they can only interact according to the protocol, and the protocol must be "strong enough" that its security can be proven based on those interactions alone.
It is already well known that device-independent protocols are made possible by the phenomenon of quantum non-locality and the so-called Bell inequalities (see [Sca13, BCP+14] for excellent reviews on the topic). In general, a Bell inequality [Bel64] can be thought of as a game 2 played by the honest parties using the device they share. The game has a special "feature": any classical strategy (i.e., a convex combination of deterministic strategies) describing the behaviour of the device achieves a winning probability of at most $\omega_c < 1$, while there exists a quantum strategy (i.e., measuring some entangled state) achieving a greater winning probability $\omega_q > \omega_c$. Hence, if the honest parties observe that their device wins the game with probability $\omega_q$, they can conclude that the device must be non-local. The value of the winning probability in a game is directly related to the amount of secret randomness produced during the game [PAM+10, AMP12]. This can be used to prove security of the relevant protocol.
For many cryptographic tasks proving security amounts to bounding the knowledge that an adversary (a malicious party, or an eavesdropper) can gain about the output of the protocol. This knowledge, or uncertainty, is modelled using the conditional min-entropy, which for the case of a classical output corresponds to the maximum probability with which the adversary may guess the output of the protocol. In the case of QKD, for example, the output is the raw key $K$, and proving security is essentially equivalent 3 to establishing a lower bound on the smooth conditional min-entropy $H^{\varepsilon}_{\min}(K|E)$, where $E$ is the quantum system held by Eve, which can be initially correlated with the device producing $K$ (for formal definitions see Section 2).
Evaluating the smooth min-entropy of a large system is often difficult, especially in the device-independent setting where not much is known about the way $K$ is produced. One assumption commonly used to simplify this task is that the bits of $K = K_1, \ldots, K_n$ are created in an independent and identical way and hence $K$ itself is an independent and identically distributed (i.i.d.) random variable. That is, it is assumed that the device held by Alice and Bob makes the same measurements on the same quantum states in every round $i \in [n]$ of the protocol. This means that the device is initialised with some (unknown) state which has a tensor product structure $\rho_{AB}^{\otimes n}$ and that the measurements have a tensor product structure as well. In that case, the total entropy in $K$ can be easily related to the sum of the entropies in each round separately. 4 A bound on the entropy accumulated in one round can usually be derived using the expected winning probability in the game played in that round (i.e., the Bell violation), which in turn can be easily estimated during the protocol in the i.i.d. case using standard Chernoff-type bounds, since the same game is just being played repeatedly with the same strategy.
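As a rough illustration of the estimation step in the i.i.d. case, the following sketch uses Hoeffding's inequality, one standard Chernoff-type bound (the text above does not fix a particular bound, so this choice is an assumption). It computes how far the observed winning frequency can deviate from the true winning probability after $n$ rounds. The function names and parameters are hypothetical.

```python
import math


def hoeffding_deviation(n: int, eps: float) -> float:
    """Half-width delta such that, over n i.i.d. rounds, the empirical winning
    frequency differs from the true winning probability by more than delta
    with probability at most eps (two-sided Hoeffding bound, an assumed choice)."""
    if n <= 0 or not 0.0 < eps < 1.0:
        raise ValueError("need n > 0 and 0 < eps < 1")
    return math.sqrt(math.log(2.0 / eps) / (2.0 * n))


def rounds_needed(delta: float, eps: float) -> int:
    """Smallest number of rounds for which the half-width is at most delta."""
    if delta <= 0.0 or not 0.0 < eps < 1.0:
        raise ValueError("need delta > 0 and 0 < eps < 1")
    return math.ceil(math.log(2.0 / eps) / (2.0 * delta ** 2))


if __name__ == "__main__":
    for n in (10**4, 10**6, 10**8):
        print(n, hoeffding_deviation(n, 1e-6))
```

The half-width shrinks like $1/\sqrt{n}$, which is why the winning probability is easy to estimate once the same strategy is known to be used in every round. Without the i.i.d. assumption this simple argument is not available.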
Unfortunately, even though quite convenient (and, in many cases, seemingly necessary) for the analysis, the i.i.d. assumption is a very strong one in the device-independent scenario. 5 In particular, under such an assumption the device cannot use any internal memory (i.e., its actions in one round cannot depend on the previous rounds) or even display time-dependent behaviour (due to inevitable imperfections for example).
When considering device-dependent QKD protocols, such as the BB84 protocol [BB84], and other device-dependent quantum protocols, de Finetti theorems [Ren07, CKR09] can usually be used to reduce the task of proving security in the most general case to that of proving security with an i.i.d. assumption as described above. Unfortunately, the same theorems cannot be used in the device-independent case (one reason, for example, is that they depend on the dimension of the underlying states, which is unknown in the device-independent setting), and while other de Finetti-type theorems [CT09, AFR15] were developed for the case in which the states and measurements are unknown, it is still not clear how to use them in DIQKD security proofs. Hence, in the device-independent setting, one could not simply reduce the general security proof to the one proven under the i.i.d. assumption.
Without this assumption about the device, however, not much is a priori known about the structure of $K$, the expected winning probability in one round of the protocol, or the way the total entropy of $K$ is accumulated one round after the other (as the device might correlate the different rounds in an almost arbitrary way). Therefore, security proofs that estimated $H^{\varepsilon}_{\min}(K|E)$ directly for the most general case had to use far more complicated techniques and statistical analysis compared to the i.i.d. case. 6

Results and contributions
In this work we demonstrate that a newly developed tool, the "entropy accumulation theorem" [DFR16], can be effectively applied to give proofs of security with essentially tight parameters for a broad range of device-independent protocols. Our main contribution is a general framework, consisting of a flexible protocol and analysis, for applying the EAT to establish quantitative results on device-independent security.
Our technique amounts to establishing a reduction to the i.i.d. setting, with the major advantage that the reduction is virtually lossless in terms of parameters. As a consequence we are able to extend the tight results known for, e.g., DIQKD under the i.i.d. assumption to the most general setting. Our technique is simpler and allows for more modular protocols than previous ad-hoc approaches. As more and more device-independent protocols are being considered, it is highly important to have a simple, yet strong, technique to prove security.
The significance of our quantitative results is twofold. First, they establish the a priori surprising result that general quantum adversaries do not force weaker rates compared to those achieved in less general scenarios. That is, it is possible to achieve rate vs. noise tradeoffs which are as good as those achieved in much more restricted settings such as under the i.i.d. assumption. Second, as loophole-free Bell tests (a necessity for device-independent cryptography) are finally being realised [HBD+15], our rates further and considerably decrease the gap between theory and experiments, thereby marking another important step towards practical device-independent protocols and their implementations.
We provide two concrete applications. To begin with, we consider a DIQKD protocol based on the CHSH inequality and prove its security. The achieved key rates and noise tolerance are significantly higher than in previous works. For a large enough number of rounds $n$, the key rate as a function of the noise essentially coincides with the optimal result of [PAB+09], derived for the restricted i.i.d. and asymptotic case. In particular, as in [PAB+09], we show that the protocol can tolerate up to the optimal error rate of 7.1% while still producing a positive key rate. (For comparison 7 , in [VV14] the maximal noise tolerance was 1.6%.) Moreover, the achieved key rates are comparable to those achieved in device-dependent QKD protocols [SR08a, SR08b] already starting from $n = 10^6$. (For further details and curves see Section 5.5.2.)
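To make the 7.1% figure concrete, the sketch below locates the noise threshold numerically. It rests on two assumptions that are not stated in this section: the asymptotic i.i.d. key rate is taken to be $r = 1 - h(Q) - h\big((1 + \sqrt{(\beta/2)^2 - 1})/2\big)$, the form commonly attributed to [PAB+09], and the noise is modelled as depolarising, so that an error rate $Q$ corresponds to a CHSH value $\beta = 2\sqrt{2}(1 - 2Q)$. All function names are hypothetical.

```python
import math


def h(p: float) -> float:
    """Binary entropy in bits, with h(p) = 0 outside the open interval (0, 1)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def asymptotic_rate(q: float) -> float:
    """Assumed asymptotic i.i.d. key rate for error rate q (may be negative,
    meaning that no key can be extracted under this formula)."""
    beta = 2 * math.sqrt(2) * (1 - 2 * q)       # assumed depolarising-noise model
    gap = max((beta / 2) ** 2 - 1, 0.0)         # clamp when there is no violation
    return 1 - h(q) - h((1 + math.sqrt(gap)) / 2)


def noise_threshold(lo: float = 0.0, hi: float = 0.1, iters: int = 60) -> float:
    """Bisection for the error rate at which the assumed rate reaches zero."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if asymptotic_rate(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    print(noise_threshold())
```

Under these assumptions the rate equals 1 at $Q = 0$ and crosses zero at an error rate of roughly 0.071, in line with the tolerance quoted above.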
As a second application we consider a randomness expansion protocol based on the CHSH inequality. Here as well, we obtain an expansion rate which is essentially the same as the optimal rate achieved in [PAM+10] in the case of classical adversaries only, while our result holds against quantum adversaries. This is much better than the rates obtained in previous works [VV12, MS14a, MS14b].
Main ideas of the proof. The main tool used in the proof is the above-mentioned entropy accumulation theorem (EAT) [DFR16]. The EAT can be understood as a kind of chain rule for the conditional smooth min-entropy, which quantifies how entropy "accumulates" across many random variables generated through a certain iterative quantum process, as long as the process fulfils a number of conditions (see Section 2.6 for the exact statement).
Our technical contribution consists in showing how the EAT can be used to provide a complete analysis of security in the device-independent setting, with the additional benefit of yielding essentially optimal parameters. Towards this we take the following three main steps. First, we propose a modular hypothetical protocol that can be considered as a "skeleton" for many device-independent cryptographic tasks. This protocol is fine-tuned such that the entropy generated throughout it can be analysed using the EAT, by ensuring that the appropriate conditions are met, and such that it can be used as a building block in the analysis of more complex protocols while preserving strong quantitative results. Next, we combine the results of [PAB+09], derived for the i.i.d. case, with our protocol to get a good lower bound on the generated entropy rate when using the CHSH inequality. Finally, we show how the hypothetical protocol can be related to actual, more complex protocols of interest, such as DIQKD protocols. This relation is used to prove security of a DIQKD protocol that we propose, with essentially optimal key rate and noise tolerance.

Related work
The idea of basing the security of cryptographic protocols (QKD especially) on the violation of Bell inequalities originates in the celebrated work of Ekert [Eke91]. Later, Mayers and Yao [MY98] recognised that by using Bell inequalities the underlying devices need not be trusted. Barrett et al. [BHK05] were the first to combine both ideas and derive a proof of security for QKD in the device-independent scenario. Their security proof holds even in the presence of a super-quantum adversary, limited only by the non-signalling principle. The protocol of [BHK05], however, could not tolerate any amount of noise and produced just one secret bit when using the device many times.
Following these initial works a long line of research [AGM06, AMP06, SGB+06, ABG+07, Mas09, PAB+09, HRW10, HR10, MPA11, MRC+14] led to protocols, and proof techniques, that establish non-vanishing key rates with a positive noise tolerance in the i.i.d. setting, against quantum or super-quantum adversaries (the former typically leading to better rates and noise tolerance). Most relevant for our work are the results of Pironio et al. [PAB+09], where security of a DIQKD protocol was proven in the asymptotic limit, i.e., when the device is used $n \to \infty$ times, and under the i.i.d. assumption described above. Their protocol is based on the CHSH inequality [CHSH69]. To the best of our knowledge, they provide the best known rates under these assumptions.
The simpler tasks of randomness certification and expansion, first considered in [Col06], are the first to have received a complete proof of security in the device-independent setting. Pironio et al. [PAM+10] showed, in the non-i.i.d. setting, that a quadratic expansion was possible, but their analysis was limited to the case of classical side information. Security against quantum side information was established in [VV12], where it was shown that exponential expansion is possible. The analysis of [VV12], however, does not tolerate noise in the devices; this was solved in [MS14a].
The maximum amount of randomness that can be generated from one system violating a specific Bell inequality by a given amount has been well studied. Pironio et al. [PAM+10] give tight bounds for the CHSH game; see, e.g., [DPA13, LBS+14] for recent works exploring different aspects of the question. However, when using the device repeatedly, in the non-i.i.d. setting, few works give explicit rates; to the best of our knowledge the only quantitative results available are from [MS14b] (see also [PM13, FGS13] for an analysis in the non-i.i.d. case but under the assumption that the adversary holds only classical side information), and remain relatively weak in comparison to the best one may expect from the known results under the i.i.d. assumption.
For the more challenging scenario of QKD, security in the non-i.i.d. setting was first established in [VV14]; see also [RUV13], who give a secure protocol but with vanishing rate and no noise tolerance. A more recent proof of security by Miller and Shi [MS14a] is closest to our results in that it bounds the amount of entropy generated in the protocol in a round-by-round fashion, similar in spirit to (but technically very different from) our use of the EAT (see Section 2.6 for a description). The security proofs of the existing works are quite complex and achieve relatively low key rates and noise tolerance (if any).
Structure of the paper. The paper is organised as follows. We start with some preliminaries in Section 2. In Section 3 we show how the EAT can be used in device-independent protocols for a general Bell inequality. Then, in Section 4 we explicitly calculate and plot the entropy rates for the case of the CHSH inequality. We continue in Sections 5 and 6 with our DIQKD and randomness expansion protocols, respectively. We end in Section 7 with some open questions.

2 Preliminaries

2.1 General notation
All logarithms are in base 2. Random variables (RV) are denoted by capital letters while specific values are denoted by small letters. We denote vectors in bold face; for example, $\mathbf{X} = X_1, \ldots, X_n$ is a vector of RV. Sets are denoted with calligraphic fonts.
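Given a value $\mathbf{c} = c_1, \ldots, c_n \in \mathcal{C}^n$, where $\mathcal{C}$ is a finite alphabet, we denote by $\mathrm{freq}_{\mathbf{c}}$ the probability distribution over $\mathcal{C}$ defined by $\mathrm{freq}_{\mathbf{c}}(\tilde{c}) = \frac{|\{i \mid c_i = \tilde{c}\}|}{n}$ for $\tilde{c} \in \mathcal{C}$. If $\rho_{CE}$ is a state classical on $C$ we write $\Pr[c]_{\rho}$ to denote the probability that $\rho$ assigns to $c$.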
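The empirical distribution $\mathrm{freq}_{\mathbf{c}}$ is simply the vector of relative frequencies of the symbols in $\mathbf{c}$. A minimal sketch, in which the function name `freq` and the use of exact fractions are illustrative choices rather than part of the paper:

```python
from collections import Counter
from fractions import Fraction


def freq(c, alphabet):
    """Empirical distribution of the sequence c over a finite alphabet:
    each symbol is mapped to (number of positions holding it) / len(c)."""
    n = len(c)
    if n == 0:
        raise ValueError("the sequence must contain at least one symbol")
    counts = Counter(c)
    # Symbols that never occur get probability 0 (Counter returns 0 for them).
    return {symbol: Fraction(counts[symbol], n) for symbol in alphabet}


if __name__ == "__main__":
    print(freq([0, 1, 1, 1], alphabet=(0, 1)))
```

Exact fractions avoid rounding issues when the distribution is later compared against a threshold.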
For $m \in \mathbb{N}_+$, $\rho_{U_m}$ denotes the completely mixed state on $m$ qubits and $I$ is the identity operator. For convenience all important parameters, constants, and random variables used in the paper are listed in the tables in the appendix.

Entropies and Markov chains
Entropies and conditional entropies. $h$ is used for the binary entropy function $h(p) = -p \log(p) - (1 - p) \log(1 - p)$. The von Neumann entropy $H(\rho)$ of a quantum state $\rho$ is given by $H(\rho) = -\mathrm{Tr}(\rho \log \rho)$. Given a bipartite state $\rho_{AE}$ on $\mathcal{H}_A \otimes \mathcal{H}_E$, the conditional von Neumann entropy is defined as $H(A|E)_{\rho_{AE}} = H(\rho_{AE}) - H(\rho_E)$. When the state on which the entropy is evaluated is clear from the context we drop the subscript and write $H(A|E)$.
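Min-entropy. Given a state classical on $A$, $\rho_{AE} = \sum_a p_a |a\rangle\langle a| \otimes \rho^a_E$, the conditional min-entropy is
$$H_{\min}(A|E) = -\log p_{\mathrm{guess}}(A|E),$$
where $p_{\mathrm{guess}}(A|E)$ is the maximum probability of guessing $A$ given the quantum system $E$:
$$p_{\mathrm{guess}}(A|E) = \max_{\{M^a_E\}_a} \sum_a p_a \, \mathrm{Tr}\left(M^a_E \rho^a_E\right),$$
and the maximum is taken over all POVMs $\{M^a_E\}_a$ on $E$. For any quantum state $\rho_{AE}$, $H(A|E) \geq H_{\min}(A|E)$. The smooth conditional min-entropy with smoothness parameter $\varepsilon$ of a state $\rho_{AE}$ is defined to be
$$H^{\varepsilon}_{\min}(A|E)_{\rho} = \max_{\sigma_{AE} \in B^{\varepsilon}(\rho_{AE})} H_{\min}(A|E)_{\sigma},$$
where $B^{\varepsilon}(\rho_{AE})$ is the set of sub-normalised states within purified distance $\varepsilon$ of $\rho_{AE}$.
Max-entropy. The quantum smooth max-entropy of a state $\rho_{AE}$ is given by
$$H^{\varepsilon}_{\max}(A|E)_{\rho} = \log \inf_{\tilde{\rho}_{AE} \in B^{\varepsilon}(\rho_{AE})} \sup_{\sigma_E} \left\| \tilde{\rho}_{AE}^{1/2} \left(I_A \otimes \sigma_E\right)^{1/2} \right\|_1^2,$$
where the supremum is taken over all quantum states $\sigma_E$ on $E$. We will also use the closely related $H^{\varepsilon}_0$ entropy. For classical $X$ and $Y$ distributed according to $P_{XY}$,
$$H_0(X|Y) = \max_y \log \left|\mathrm{Supp}\left(P_{X|Y=y}\right)\right|,$$
where $\mathrm{Supp}(P_{X|Y=y}) = \{x \mid P_{X|Y=y}(x) > 0\}$. Its smooth version is given by
$$H^{\varepsilon}_0(X|Y) = \min_{\Omega} \max_y \log \left|\mathrm{Supp}\left(P_{X\Omega|Y=y}\right)\right|,$$
where the minimum ranges over all events $\Omega$ with probability at least $1 - \varepsilon$.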
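For intuition, the guessing probability is easy to evaluate in the special case where the side information $E$ is classical: the optimal strategy is then to read $E$ and output the most likely value of $A$. The sketch below covers only this classical case (an assumption; it does not handle quantum side information), uses the standard relation $H_{\min}(A|E) = -\log p_{\mathrm{guess}}(A|E)$, and compares the result with the conditional Shannon entropy. All names are hypothetical.

```python
import math


def guessing_probability(p_joint):
    """p_guess(A|E) for classical E: sum over e of max_a P(A=a, E=e).
    p_joint maps pairs (a, e) to joint probabilities."""
    best = {}
    for (a, e), p in p_joint.items():
        best[e] = max(best.get(e, 0.0), p)
    return sum(best.values())


def min_entropy(p_joint):
    """Conditional min-entropy in bits for classical side information."""
    return -math.log2(guessing_probability(p_joint))


def shannon_conditional_entropy(p_joint):
    """Conditional Shannon entropy H(A|E) in bits."""
    p_e = {}
    for (_, e), p in p_joint.items():
        p_e[e] = p_e.get(e, 0.0) + p
    return -sum(p * math.log2(p / p_e[e])
                for (_, e), p in p_joint.items() if p > 0)


# Illustrative distribution: A is a uniform bit and E equals A with probability 0.9.
EXAMPLE = {(0, 0): 0.45, (1, 0): 0.05, (0, 1): 0.05, (1, 1): 0.45}

if __name__ == "__main__":
    print(min_entropy(EXAMPLE), shannon_conditional_entropy(EXAMPLE))
```

In this example the min-entropy is strictly smaller than the Shannon entropy, which illustrates the inequality $H(A|E) \geq H_{\min}(A|E)$.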

Non-local games
We consider general two-player non-local games $G$ and treat them as equivalent to bipartite Bell inequalities. In a game $G$, the two players, Alice and Bob, share a bipartite quantum state. Given a question for Alice and a question for Bob, they can choose how to measure their parts of the state, and then use the measurement outcomes to supply an answer each. They win if their answers fulfil a pre-defined requirement, called the winning criterion. More formally, the game $G$ is defined via sets of questions and answers for Alice and Bob, $\mathcal{X}, \mathcal{Y}$ and $\mathcal{A}, \mathcal{B}$, a distribution over $\mathcal{X} \times \mathcal{Y}$ (we will generally assume this is a product distribution), and a winning criterion $w : \mathcal{X} \times \mathcal{Y} \times \mathcal{A} \times \mathcal{B} \to \{0, 1\}$. 9 A strategy for the players in a game $G$ is specified by a bipartite state $\rho_{Q_A Q_B}$, where Alice holds register $Q_A$ and Bob register $Q_B$, and local measurements that each player performs on his or her register in order to determine the answer to the given question. We use $\omega \in [0, 1]$ to denote the winning probability of a strategy in the game $G$.
The CHSH game. We use a variant of the CHSH game previously used in [PAB+09, VV14] in the context of DIQKD. In this game Alice has two possible inputs $\mathcal{X} = \{0, 1\}$ and Bob three possible inputs $\mathcal{Y} = \{0, 1, 2\}$. The output sets are $\mathcal{A} = \mathcal{B} = \{0, 1\}$. The winning condition is the following: 10
$$w_{\mathrm{CHSH}}(a, b, x, y) = \begin{cases} 1 & x, y \in \{0, 1\} \text{ and } a \oplus b = x \cdot y, \\ 1 & (x, y) = (0, 2) \text{ and } a = b, \\ 1 & (x, y) = (1, 2), \\ 0 & \text{otherwise.} \end{cases}$$
The optimal quantum strategy for this game is the same as in the standard CHSH game [CHSH69], except that if Bob's input is 2 he applies the same measurement as Alice's measurement on input 0. Since the underlying state is maximally entangled this ensures that their outputs will always match when $(x, y) = (0, 2)$.
Conditioned on Bob's input not being 2, the game played is the CHSH game. We denote by $\beta \in [2, 2\sqrt{2}]$ the value of the CHSH Bell violation of a given state and measurements. The winning probability is denoted by $\omega \in \left[\frac{3}{4}, \frac{2 + \sqrt{2}}{4}\right]$ and the relation $\omega = 1/2 + \beta/8$ holds. $\beta = 2$ is the optimal classical value while $\beta = 2\sqrt{2}$ is the optimal quantum one.
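The numbers above can be checked with a short calculation. The sketch below makes several assumptions that are not spelled out in the text: inputs $x, y \in \{0, 1\}$ are uniform, the standard CHSH condition $a \oplus b = x \cdot y$ is used, the shared state is the maximally entangled state $|\Phi^+\rangle$, and each measurement is an observable $\cos\theta \, Z + \sin\theta \, X$, for which the correlator is $\cos(\theta_a - \theta_b)$. The angles and function names are illustrative.

```python
import math
from itertools import product


def wins(a: int, b: int, x: int, y: int) -> bool:
    """Standard CHSH condition for x, y in {0, 1}: a XOR b equals x AND y."""
    return (a ^ b) == (x & y)


def classical_value() -> float:
    """Best winning probability over deterministic strategies, uniform inputs."""
    best = 0.0
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        alice, bob = (a0, a1), (b0, b1)
        won = sum(wins(alice[x], bob[y], x, y)
                  for x, y in product((0, 1), repeat=2))
        best = max(best, won / 4)
    return best


def correlator(theta_a: float, theta_b: float) -> float:
    """Expectation of (-1)^(a+b) on |Phi+> for observables at the given angles."""
    return math.cos(theta_a - theta_b)


ALICE = (0.0, math.pi / 2)
BOB = (math.pi / 4, -math.pi / 4, 0.0)  # third setting (y = 2) copies Alice's x = 0


def chsh_beta() -> float:
    """CHSH value E00 + E01 + E10 - E11 for the angles above."""
    e = [[correlator(ALICE[x], BOB[y]) for y in (0, 1)] for x in (0, 1)]
    return e[0][0] + e[0][1] + e[1][0] - e[1][1]


def quantum_win_probability() -> float:
    """Winning probability of the strategy above, conditioned on y in {0, 1}."""
    total = 0.0
    for x, y in product((0, 1), repeat=2):
        p_equal = (1 + correlator(ALICE[x], BOB[y])) / 2
        total += p_equal if (x & y) == 0 else 1 - p_equal
    return total / 4


if __name__ == "__main__":
    print(classical_value(), chsh_beta(), quantum_win_probability())
```

Under these assumptions the classical value is $3/4$, the strategy reaches $\beta = 2\sqrt{2}$, and the two quantities satisfy $\omega = 1/2 + \beta/8$. The correlator for Bob's third setting with Alice's input 0 equals 1, so the outputs always match in that case.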

Untrusted device
In a device-independent protocol the honest parties interact with an untrusted device. We now explain what is meant by this term and what are the assumptions regarding such a device. For simplicity we consider the case of two honest parties, Alice and Bob, but this can be extended to more parties in the obvious way. A device D is modelled by a tripartite apparatus (including both state and measurements devices), distributed between Alice, Bob, and the adversary Eve. We think of the device as being prepared by Eve, and hence we call it untrusted. This allows Eve, in particular, to keep a purification of Alice and Bob's quantum state in a quantum register in her possession. 11 Although the device is untrusted we always assume that the following requirements hold (some of these requirements can be verified).
The device can be used to run the considered protocol. That is, Alice and Bob can interact with $D$ according to the relevant protocol. 12 Alice and Bob's components of $D$ implement the protocol by making sequential measurements on quantum states. In each round of the protocol, we say that the device is implementing some strategy for the game $G$ being played. The device may have memory, and thus apply a different strategy each time the game is played, depending on the previous rounds. Therefore, the measurement operators may change in each round, and the state on which the measurements are performed may be the post-measurement state from the previous round, a new state, or any combination of these two. We sometimes use the terminology honest device or honest implementation. A device is said to be honest if it implements the protocol by using a certain pre-specified strategy. In that case, the actions of the device are known and fixed (noise can still be present), but Eve may still hold a purification of the quantum states inside Alice and Bob's components of $D$.
9 A general Bell inequality would allow for an $\mathbb{R}$-valued $w$; we will not need this here.
10 For the inputs $(x, y) = (1, 2)$ one can set either $w_{\mathrm{CHSH}} = 1$ or $0$ (it is not relevant later on); for completeness we choose $w_{\mathrm{CHSH}} = 1$ in this case, following previous works.
11 We emphasise that Eve is not required to measure her quantum state at any particular point. During the run of the considered protocol, Eve can eavesdrop on all the classical communication between the honest parties, and can later choose to measure her quantum register depending on this information.
12 For an example of a protocol, see Protocol 1 below.
Communication (signalling) between the components of the device. The communication between Alice, Bob, and Eve's components is restricted in the following way:
1. Alice and Bob's components of $D$ cannot signal to Eve's component.
2. Alice and Bob can decide when to allow communication (if any) between their components. This ensures that the underlying quantum state of Alice and Bob's components of the device is (at least) bipartite and that the measurements made in the two components, in each round, are in tensor product with one another.
3. Alice and Bob can decide when to receive communication (if any) from Eve's component.
The requirement of a tripartite device is necessary for device-independent cryptography. It is of course necessary to assume that Alice and Bob's components can at no point send any information to Eve, as asserted in Item 1 above, as otherwise the device could directly send all the raw data it generated. Moreover, Alice and Bob's components must be (at least) bipartite, as follows from Item 2, so that the violation of the considered Bell inequality will be meaningful and imply security.
Allowing some communication between the components of the device, as described in Items 2 and 3, is advantageous to actual implementations of certain protocols. To be specific, we consider the following scenario. In-between different rounds of the protocol, Alice and Bob's components of the device are allowed to communicate freely. During the execution of a single round, however, no communication is allowed. In particular, when the game is being played, there is no communication between the components once the honest parties' inputs are chosen and until the outputs are supplied by the device. 13 Furthermore, in-between rounds Eve may send information to the device, but not receive any from it. In actual implementations this implies that entanglement can be distributed "on the fly" for each round of the protocol, instead of maintaining large quantum memories.

Security definitions
DIQKD. A DIQKD protocol (see Section 5 for a description of an explicit protocol) consists of an interaction between two trusted parties, Alice and Bob, and an untrusted device as defined in Section 2.4. At the end of the protocol each party outputs a key, $\tilde{K}_A$ for Alice and $\tilde{K}_B$ for Bob. The goal of the adversary, Eve, is to gain as much information as possible about Alice and Bob's keys without being detected (i.e., in the case where the protocol is not being aborted).
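Correctness, secrecy, and overall security of a protocol are defined as follows (see also [PR14, Bea15]):
Definition 1 (Correctness). A DIQKD protocol is said to be $\varepsilon_{\mathrm{corr}}$-correct, when implemented using a device $D$, if Alice and Bob's keys, $\tilde{K}_A$ and $\tilde{K}_B$ respectively, are identical with probability at least $1 - \varepsilon_{\mathrm{corr}}$. That is,
$$\Pr\left[\tilde{K}_A \neq \tilde{K}_B\right] \leq \varepsilon_{\mathrm{corr}}.$$
Definition 2 (Secrecy). A DIQKD protocol is said to be $\varepsilon_{\mathrm{sec}}$-secret, when implemented using a device $D$, if, conditioned on not aborting the protocol, Alice's key $\tilde{K}_A$ is $\varepsilon_{\mathrm{sec}}$-close to a uniformly random key, even given Eve's side information $\rho_E$. That is, for a key of length $l$,
$$\left(1 - \Pr[\mathrm{abort}]\right) \left\| \rho_{\tilde{K}_A E} - \rho_{U_l} \otimes \rho_E \right\|_1 \leq \varepsilon_{\mathrm{sec}}.$$
If a protocol is $\varepsilon_{\mathrm{corr}}$-correct and $\varepsilon_{\mathrm{sec}}$-secret (for a given $D$), then it is $\varepsilon^{s}_{\mathrm{QKD}}$-correct-and-secret for any $\varepsilon^{s}_{\mathrm{QKD}} \geq \varepsilon_{\mathrm{corr}} + \varepsilon_{\mathrm{sec}}$.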
Definition 3 (Security). A DIQKD protocol is said to be $(\varepsilon^{s}_{\mathrm{QKD}}, \varepsilon^{c}_{\mathrm{QKD}}, l)$-secure if:
1. (Soundness) For any implementation of the device $D$, either the protocol aborts with probability greater than $1 - \varepsilon^{s}_{\mathrm{QKD}}$ or it is $\varepsilon^{s}_{\mathrm{QKD}}$-correct-and-secret.
2. (Completeness) There exists an honest implementation of the device $D$ such that the protocol does not abort with probability greater than $1 - \varepsilon^{c}_{\mathrm{QKD}}$.
The protocols that we consider below take into account possible noise in the honest implementation. That is, even when there is no adversary at all, the actual implementation of the devices might not be perfect. Thus, the completeness of the protocol implies its robustness to the desired amount of noise.
Lastly, a remark regarding the composability of this security definition is in order. A security definition is said to be composable [Can01,BOM04,PR14] if it implies that the protocol can be arbitrarily used and composed with other protocols (proven secure by themselves), without compromising security. Obviously, if Alice and Bob wish to use the keys they produced in the DIQKD protocol in some other cryptographic protocol (i.e., they compose the two protocols), it is necessary for them to use protocols which were proven to have composable security.
For the case of (device-dependent) QKD, Definition 3 was rigorously proven to be composable [PR14]. This suggests that the same security definition should also be the relevant one in the device-independent context and, indeed, as far as we are aware, it is the definition that has been used in all prior works on device-independent cryptography. Nevertheless, the claim that Definition 3 is composable for device-independent protocols as well has never been rigorously proven, and the result of [BCK13] suggests that this is not the case when the same devices are reused in the composition. We still use this definition as it seems to be the most promising security definition to date.
Randomness expansion. In the task of randomness expansion there is a single user interacting sequentially with an untrusted device. At the start of the interaction the user is presented with a source $R \in \{0, 1\}^r$ of uniformly random bits. The user then interacts sequentially with the device in a deterministic way (the only sources of randomness being the initial string $R$ and any randomness which may be present in the device's outputs). At the end of the protocol the user returns a string $Z \in \{0, 1\}^m$ of $m$ bits that is statistically close to uniform, conditioned on $R$ as well as any side information of the adversary. (See Section 6 for a concrete example of a randomness expansion protocol.) More formally, we require the following.
Definition 4 (Security of randomness expansion). A protocol is called an $(\varepsilon^{c}_{\mathrm{RE}}, \varepsilon^{s}_{\mathrm{RE}})$-secure $r \to m$ randomness expansion protocol 14 if, given $r$ uniformly random bits:
1. (Soundness) For any implementation of the device $D$, either the protocol aborts with probability greater than $1 - \varepsilon^{s}_{\mathrm{RE}}$, or, conditioned on not aborting, it returns a classical string $Z \in \{0, 1\}^m$ such that
$$\left\| \rho_{ZRE} - \rho_{U_m} \otimes \rho_{U_r} \otimes \rho_E \right\|_1 \leq \varepsilon^{s}_{\mathrm{RE}},$$
where $E$ is a quantum register that may initially be correlated with $D$.
2. (Completeness) There exists an honest implementation of the device such that the protocol does not abort with probability greater than $1 - \varepsilon^{c}_{\mathrm{RE}}$.
As in the case of DIQKD, this security definition was not proven to be composable in general.
14 All parameters $\varepsilon^{c}_{\mathrm{RE}}$, $\varepsilon^{s}_{\mathrm{RE}}$, $r$ and $m$ will in general be functions of a parameter $n$ that also parametrises the protocol and the number of rounds of interaction between the user and the device.

The entropy accumulation theorem
The main tool used in this work is the EAT [DFR16, Theorem 4.4]. Below we give the necessary details in a notation appropriate for our work (although less general than the original EAT).
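We work with channels with the following properties:
Definition 5 (EAT channels). EAT channels $\mathcal{N}_i : R_{i-1} \to R_i A_i B_i I_i C_i$, for $i \in [n]$, are CPTP maps such that for all $i \in [n]$:
1. $A_i$, $B_i$, $I_i$ and $C_i$ are finite-dimensional classical systems (RV). $R_i$ are arbitrary quantum registers.
2. For any input state $\sigma_{R_{i-1}R'}$, where $R'$ is a register isomorphic to $R_{i-1}$, the output state $\sigma_{R_i A_i B_i I_i C_i R'} = (\mathcal{N}_i \otimes \mathcal{I}_{R'})(\sigma_{R_{i-1}R'})$ has the property that the classical value $C_i$ can be measured from the marginal $\sigma_{A_i B_i I_i}$ without changing the state.
3. For any initial state $\rho^{0}_{R_0 E}$, the final state $\rho_{\mathbf{ABIC}E} = \left((\mathrm{Tr}_{R_n} \circ \mathcal{N}_n \circ \cdots \circ \mathcal{N}_1) \otimes \mathcal{I}_E\right) \rho^{0}_{R_0 E}$ fulfils the Markov-chain conditions $A_{1 \ldots i-1} B_{1 \ldots i-1} \leftrightarrow I_{1 \ldots i-1} E \leftrightarrow I_i$ for all $i \in [n]$.
Definition 6 (Tradeoff functions). Let $\mathcal{N}_1, \ldots, \mathcal{N}_n$ be a family of EAT channels. Let $\mathcal{C}$ denote the common alphabet of $C_1, \ldots, C_n$. A function $f_{\min}$ from the set of probability distributions $p$ over $\mathcal{C}$ to the real numbers is called a min-tradeoff function for $\{\mathcal{N}_i\}$ if it satisfies
$$f_{\min}(p) \leq \inf_{\sigma_{R_{i-1}R'} \,:\, \mathcal{N}_i(\sigma)_{C_i} = p} H(A_i B_i | I_i R')_{\mathcal{N}_i(\sigma)}$$
for all $i \in [n]$, where the infimum is taken over all input states of $\mathcal{N}_i$ for which the marginal on $C_i$ of the output state is the probability distribution $p$.
Similarly, a function $f_{\max}$ from the set of probability distributions $p$ over $\mathcal{C}$ to the real numbers is called a max-tradeoff function for $\{\mathcal{N}_i\}$ if it satisfies
$$f_{\max}(p) \geq \sup_{\sigma_{R_{i-1}R'} \,:\, \mathcal{N}_i(\sigma)_{C_i} = p} H(A_i B_i | I_i R')_{\mathcal{N}_i(\sigma)}$$
for all $i \in [n]$, where the supremum is taken over all input states of $\mathcal{N}_i$ for which the marginal on $C_i$ of the output state is the probability distribution $p$.
Theorem 7 (EAT). Let $\mathcal{N}_i : R_{i-1} \to R_i A_i B_i I_i C_i$ for $i \in [n]$ be EAT channels as in Definition 5, $\rho_{\mathbf{ABIC}E} = \left((\mathrm{Tr}_{R_n} \circ \mathcal{N}_n \circ \cdots \circ \mathcal{N}_1) \otimes \mathcal{I}_E\right) \rho_{R_0 E}$ be the final state, $\Omega$ an event defined over $\mathcal{C}^n$, $p_{\Omega}$ the probability of $\Omega$ in $\rho$, and $\rho_{|\Omega}$ the final state conditioned on $\Omega$. Let $\varepsilon_s \in (0, 1)$. For $f_{\min}$ a min-tradeoff function for $\{\mathcal{N}_i\}$, as in Definition 6, and any $t \in \mathbb{R}$ such that $f_{\min}(\mathrm{freq}_{\mathbf{c}}) \geq t$ for any $\mathbf{c} \in \mathcal{C}^n$ for which $\Pr[\mathbf{c}]_{\rho_{|\Omega}} > 0$,
$$H^{\varepsilon_s}_{\min}(\mathbf{AB}|\mathbf{I}E)_{\rho_{|\Omega}} > nt - v\sqrt{n},$$
where $v = 2\left(\log(1 + 2 d_{A_i B_i}) + \lceil \|\nabla f_{\min}\|_{\infty} \rceil\right) \sqrt{1 - 2\log(\varepsilon_s \cdot p_{\Omega})}$ and $d_{A_i B_i}$ denotes the dimension of the systems $A_i B_i$. Similarly, for $f_{\max}$ a max-tradeoff function for $\{\mathcal{N}_i\}$ as in Definition 6 and any $t \in \mathbb{R}$ such that $f_{\max}(\mathrm{freq}_{\mathbf{c}}) \leq t$ for any $\mathbf{c} \in \mathcal{C}^n$ for which $\Pr[\mathbf{c}]_{\rho_{|\Omega}} > 0$,
$$H^{\varepsilon_s}_{\max}(\mathbf{AB}|\mathbf{I}E)_{\rho_{|\Omega}} < nt + v\sqrt{n},$$
where $v$ is defined as above with $f_{\max}$ in place of $f_{\min}$.
15 The infimum and supremum over the empty set are defined as plus and minus infinity, respectively.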
To gain a bit of intuition on how Theorem 7 is going to be used, note the following. The event $\Omega$ will usually be the event of the considered protocol not aborting (or a closely related event). The relevant state for which the smooth min- or max-entropy is going to be evaluated is $\rho_{|\Omega}$. To use the theorem, it should be possible to define some EAT channels $\{\mathcal{N}_i\}$ that produce the final state $\rho$ from the initial state $\rho_{R_0}$ by applying the channels sequentially; these channels are not necessarily the channels used in the actual protocol to produce $\rho$. The tradeoff functions can be seen as a bound on the entropy accumulated in one round $i$, and, if such a bound $t$ exists, then Theorem 7 asserts that the total amount of entropy, accumulated in all rounds $i = 1$ to $n$ together, is roughly $n$ times $t$. It is in this sense that the theorem essentially allows us to perform a reduction to the i.i.d. setting.
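The phrase "roughly $n$ times $t$" can be made tangible with a small numerical sketch. It assumes that the bound has the form of a first-order term $nt$ minus a correction of order $\sqrt{n}$, written here as $v\sqrt{n}$. In the code, $v$ is a free, hypothetical parameter and is not computed from the theorem. The function names are illustrative.

```python
import math


def accumulated_entropy_bound(n: int, t: float, v: float) -> float:
    """Lower bound of the assumed form n*t - v*sqrt(n) on the total entropy.
    v is a placeholder constant supplied by the caller."""
    if n <= 0:
        raise ValueError("the number of rounds must be positive")
    return n * t - v * math.sqrt(n)


def entropy_rate(n: int, t: float, v: float) -> float:
    """Entropy per round implied by the bound; approaches t as n grows."""
    return accumulated_entropy_bound(n, t, v) / n


if __name__ == "__main__":
    for n in (10**4, 10**6, 10**8, 10**10):
        print(n, entropy_rate(n, t=0.5, v=10.0))
```

The per-round rate differs from $t$ by $v/\sqrt{n}$, so the gap to the i.i.d. rate vanishes as the number of rounds grows, whatever the value of the constant.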
Device-independent entropy accumulation protocol
The main task in proving security of DIQKD and other protocols is to prove a bound on the (smooth) min-entropy of the raw data held by Alice and Bob, conditioned on all the information available to the adversary Eve. The goal of this section is to show how the EAT (Theorem 7) can be used in a general device-independent setting to achieve such a bound.
For this we consider the entropy accumulation protocol, described as Protocol 1 below. Although we call it a "protocol", one should see it more as a mathematical tool which allows us to use the EAT rather than an actual protocol to be implemented. 16 To be more specific, the EAT channels (as in Definition 5) will be defined via the steps made in the entropy accumulation protocol. The relevance of the protocol stems from the fact that the final state at the end of the protocol, on which a smooth min-entropy bound can be proven using the EAT, is the same state as (or can easily be related to) the final state in the actual protocol to be executed (depending on the specific application).

The protocol
Protocol 1 is used to generate raw data for Alice and Bob by using an untrusted device D. It is based on an arbitrary non-local game G as defined in Section 2.3, together with a definition of test and generation inputs for Alice and Bob. The test inputs, X t ⊂ X and Y t ⊂ Y, are used by the parties during the test rounds (T i = 1 below) from which the Bell violation is estimated, while the generation inputs, X g ⊂ X and Y g ⊂ Y, are used in the other rounds (the sets are not necessarily disjoint). Ideally, one should use a game G for which Alice and Bob's outputs are perfectly correlated (or anti-correlated) with sufficiently high probability when the parties use the generation inputs. 17 We now define the EAT channels using the rounds of the protocol (where one round includes Steps 2-6 in Protocol 1). For this, the following notation is used. For every i ∈ {0} ∪ [n], the (unknown) quantum state of the device D shared by Alice and Bob after round i of the protocol is denoted by ρ i QAQB . We denote the register holding this state by R i . In particular, R 0 ≡ Q A Q B at the start of the protocol. At Step 4 in Protocol 1, the quantum state of the devices is changed from ρ i−1 QAQB in R i−1 to ρ i QAQB in R i by the use of the device. 18 Our EAT channels are then N i : by the CPTP map describing the i-th round of Protocol 1, as implemented by the untrusted device D (see Figure 1). We prove in Lemma 9 below that they indeed satisfy the conditions given in Definition 5.
In the following we are interested in the state of Alice, Bob, and Eve after the n-th round of the protocol, both before and after Alice and Bob decide whether to abort or not in Step 7. The state before Step 7 is

Protocol 1 Entropy accumulation protocol
Arguments:
G - two-player non-local game
X_g, X_t ⊂ X - generation and test inputs for Alice
Y_g, Y_t ⊂ Y - generation and test inputs for Bob
D - untrusted device of (at least) two components that can play G repeatedly
n ∈ N_+ - number of rounds
γ ∈ (0, 1] - expected fraction of test rounds
ω_exp - expected winning probability in G for an honest (perhaps noisy) implementation
δ_est ∈ (0, 1) - width of the statistical confidence interval for the estimation test

1:
For every round i ∈ [n] do Steps 2-6:

2:
Alice and Bob choose T_i ∈ {0, 1} at random such that Pr(T_i = 1) = γ.

3:
If T i = 0 Alice and Bob choose inputs X i ∈ X g and Y i ∈ Y g respectively. If T i = 1 they choose inputs X i ∈ X t and Y i ∈ Y t .

4:
Alice and Bob use D with X i , Y i and record their outputs as A i and B i respectively.

5:
(Optional symmetrisation step:) Alice and Bob choose together a (random) value F i , and respectively update their outputs A i , B i depending on F i .

6:
If T_i = 0 then Bob updates B_i to B_i = ⊥, and they set C_i = ⊥. If T_i = 1 they set C_i to 1 if (A_i, B_i, X_i, Y_i) wins the game G and to 0 otherwise.

The initial quantum state shared by Alice, Bob, and Eve is ρ^0_{Q_A Q_B E} and the sequence of maps N_i creates the state ρ^n_{Q_A Q_B EO}.

In Step 7 Alice and Bob decide whether they should abort the protocol or not according to the estimated Bell violation in the test rounds. Let Ω denote the event that they do not abort 19 , i.e., The final state, conditioned on not aborting, is denoted by ρ_{ABXYTCE|Ω} or just ρ_{|Ω} to ease notation. Below we bound the entropy which is accumulated in this state during the rounds of the protocol.

Completeness
Suppose that Alice and Bob execute Protocol 1 with a device D which performs i.i.d. measurements on a tensor product state ρ ⊗n QAQB such that the winning probability achieved in game G by the device D executed on a single state ρ QAQB is ω exp . We call any such implementation an honest implementation. The following lemma bounds the probability of Protocol 1 aborting in an honest implementation.
Lemma 8. Protocol 1 is complete with completeness error ε c EA ≤ 2 exp(−γ min n), where That is, the probability that the protocol aborts for an honest implementation of the devices D is at most ε c EA . Proof. Alice and Bob abort in Step 7 when the estimated Bell violation is not sufficiently high. In the honest implementation C i are i.i.d. RVs with E [C i ] = ω exp . Therefore, we can use Hoeffding's inequality. For any γ min ∈ [0, 1], One can therefore choose γ min such that the completeness error ε c EA is minimised.
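The completeness statement can be checked numerically. The following Python sketch simulates the classical record (T_i, C_i) of an honest i.i.d. implementation and estimates how often the estimation test fails. It assumes that Step 7 of Protocol 1 uses the same abort rule as the parameter estimation steps of Protocols 2 and 3, namely Σ_j C_j < (ω_exp − δ_est) · Σ_j T_j; all function names are illustrative and not part of the protocol description.

```python
import random

def run_honest_rounds(n, gamma, omega_exp, rng):
    """Simulate the classical record (T_i, C_i) of n honest i.i.d. rounds."""
    T, C = [], []
    for _ in range(n):
        t = 1 if rng.random() < gamma else 0      # test round with probability gamma
        T.append(t)
        if t == 1:
            # In a test round the game is won with probability omega_exp.
            C.append(1 if rng.random() < omega_exp else 0)
        else:
            C.append(None)                        # None stands for the symbol ⊥
    return T, C

def aborts(T, C, omega_exp, delta_est):
    """Assumed abort rule: sum_j C_j < (omega_exp - delta_est) * sum_j T_j."""
    wins = sum(c for c in C if c is not None)
    return wins < (omega_exp - delta_est) * sum(T)

def estimate_abort_probability(n, gamma, omega_exp, delta_est, trials, seed=0):
    """Monte Carlo estimate of the abort probability of the honest implementation."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        T, C = run_honest_rounds(n, gamma, omega_exp, rng)
        if aborts(T, C, omega_exp, delta_est):
            count += 1
    return count / trials
```

For example, `estimate_abort_probability(2000, 0.5, 0.85, 0.05, 200)` estimates the completeness error for n = 2000 rounds; the estimate shrinks quickly as n or δ_est grows, in line with the exponential form of the Hoeffding bound.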

Soundness
The EAT, Theorem 7, almost immediately provides a general lower bound on the amount of entropy generated by Protocol 1. We state the result as Lemma 9 below; in Section 4 we will obtain a more refined bound based on an instantiation of the protocol with the game G taken to be the CHSH game.
Lemma 9. Let D be any device, and for i ∈ [n] let implemented by the i-th round of Protocol 1. Let ρ be the state generated by the protocol (as defined in Equation (1)), Ω the event that the protocol does not abort (as defined in Equation (2)), and ρ |Ω the state conditioned on Ω. Let f min be a real-valued function defined on the set of probability distributions p over the alphabet {⊥, 0, 1} of C i such that where the infimum over an empty set is defined as infinity. Then, for any ε EA , ε s ∈ (0, 1), either the protocol aborts with probability 1 − Pr(Ω) ≥ 1 − ε EA or, where t = min Proof. In order to apply the EAT we first verify that the conditions stated in Definition 5 are fulfilled. Using that C i is a function of A i , B i , X i , and Y i the first two conditions in Definition 5 clearly hold. Moreover, the Markov chain condition holds as well since the values of X i+1 , Y i+1 , T i+1 , and F i+1 are chosen independently of everything else at each round. To conclude, note the event Ω of the protocol not aborting implies that the fraction of successful game rounds is at least ω exp − δ est , i.e., for any c for which Pr [c] ρ |Ω > 0.
The main work remaining for a successful use of Protocol 1 for entropy generation consists in obtaining a good lower bound in Equation (5), i.e., devising an appropriate min-tradeoff function f min satisfying Equation (4). In order to understand the task to be accomplished note that N i defines X i , Y i , T i , and F i , so although the infimum in Equation (4) is taken over all states σ the distributions of X i , Y i , T i , and F i are fixed. Moreover, the infimum is only taken over states with N i (σ) Ci = p, a condition which fixes the Bell violation achieved by σ under the bipartite measurement performed by the device. This is precisely the sense in which the EAT can be understood as providing a reduction to the i.i.d. case.
Lower bounds of the form of Equation (4) of different quality can be obtained depending on the specific Bell inequality employed in the protocol. A general method consists in using the chain rule to write Note that here the random variable F i depends on the (optional) symmetrisation step, and was introduced precisely to enable an easier lower bound on the quantities above; we will show how it can be used in the specific case of the CHSH game in the next section. A bound using the min-entropy H min , instead of H itself, is not tight in general, and one can expect to lose quite a lot by performing the relaxation above (see for example Figure 2). The advantage, however, is that a lower bound on H min (A i |X i Y i T i F i R ′ ) Ni(σ) can be found using general techniques based on the semidefinite programming (SDP) hierarchies of [NPA08]. For a slightly better bound one should not drop the second term in Equation i−1 can be taken from [MPA11]; both bounds are tight. For non-optimal Bell violation the min-entropy is significantly lower than the entropy.

A bound for the CHSH game
In this section we devise a specific min-tradeoff function f min which, through an application of Lemma 9, leads to a concrete bound on the entropy generated by Protocol 1 when the game G is the CHSH game (described in Section 2.3).
We use Protocol 1 with the following choices: In order to fully specify the protocol it suffices to describe the symmetrisation step. In this step, Alice and Bob choose together a uniform bit F i , and they both flip their output bits if and only if F i = 1. This symmetrisation is helpful in the proof of the main theorem below. The downside is that it costs a lot of randomness to implement, which can be problematic for some applications such as randomness expansion. At the end of the section we show that the step is in fact not necessary in any real implementation of the protocol.
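The effect of the symmetrisation step can be illustrated with a few lines of Python. The sketch assumes the standard CHSH winning condition a ⊕ b = x · y (Section 2.3 is not reproduced here, so this is an assumption), and the helper names are hypothetical. It shows that flipping both outputs with a shared bit F_i never changes whether a round is won, while making Alice's output bit uniform.

```python
from itertools import product

def chsh_win(a, b, x, y):
    """Assumed CHSH winning condition: a XOR b equals x AND y."""
    return int((a ^ b) == (x & y))

def symmetrise(a, b, f):
    """Both parties flip their output bit if and only if f = 1."""
    return a ^ f, b ^ f

def prob_a_is_one_after_flip(p_a_is_one):
    """Probability that Alice's bit is 1 after flipping with a uniform bit F."""
    return 0.5 * p_a_is_one + 0.5 * (1.0 - p_a_is_one)

def flip_preserves_win():
    """Exhaustively check that the flip never changes the game outcome."""
    return all(
        chsh_win(*symmetrise(a, b, f), x, y) == chsh_win(a, b, x, y)
        for a, b, x, y, f in product((0, 1), repeat=5)
    )
```

Since a ⊕ b is invariant under a joint flip, the observed Bell violation is unaffected, which is consistent with the later claim that the step can be omitted in an actual implementation.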
The rate η opt as a function of the expected Bell violation ω exp is plotted in Figure 3 for 20 γ = 0 and several choices of values for ε EA , δ est , and n. For comparison, we also plot in Figure 3 the asymptotic rate (n → ∞) under the assumption that the state of the device is an (unknown) i.i.d. state ρ ⊗n QAQB E . In this case, the quantum asymptotic equipartition property [TCR09, Theorems 1 and 9] implies that the optimal rate is the Shannon entropy accumulated in one round of the protocol (as given in Equation (11)). This rate, appearing as the dashed line in Figure 3, is an upper bound on the entropy that can be accumulated. One can see that as the number of rounds in the protocol increases our rate η opt approaches this optimal rate.
Proof of Theorem 10. Based on Lemma 9, it will suffice to define a min-tradeoff function f min such that Equation (4) is satisfied. Using the chain rule, Due to the bipartite requirement on the untrusted device D used to implement the protocol and since Alice's actions (and her device's) are independent of Bob's choice of Y i and T i for the case X i = 0 we have 21 Using the definition of the conditional entropy one can rewrite H (A i |F i R ′ , X i = 0) Ni(σ) as follows: where ) and the last equality follows from the symmetrisation step, Step 5. Using that F i and A i are independent (even conditioned on X i ), For states leading to a CHSH violation of β ∈ [2, 2 √ 2] (for inputs restricted to {0, 1} × {0, 1}) a tight bound on χ (A i : R ′ |F i , X i = 0) was derived in [PAB + 09, Section 2.3]: Since ω = 1 8 β + 1 2 , for ω ∈ 3 4 , 2+ √ 2 4 (i.e. a violation in the quantum regime) we get Combining this bound with Equation (9) we conclude that for all p such that ω(p) ∈ 3 4 , 2+ Define a function g on [0, 1] by From Equation (11) it follows that any choice of f min (p) that is differentiable and satisfies f min (p) ≤ g(ω(p)) for all p will satisfy Equation (4).
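To make the single-round bound concrete, the following Python sketch evaluates it in the quantum regime ω ∈ [3/4, (2+√2)/4]. The displayed equations are not reproduced above, so the formula is an assumption: it is the commonly quoted form of the bound of [PAB+09], χ ≤ h(1/2 + 1/2 √((β/2)² − 1)), rewritten with ω = β/8 + 1/2, which gives g(ω) = 1 − h(1/2 + 1/2 √(16ω(ω − 1) + 3)). Any additional factors that the paper's own definition of g may carry are not modelled.

```python
import math

OMEGA_CLASSICAL = 0.75                      # best classical winning probability
OMEGA_QUANTUM = (2 + math.sqrt(2)) / 4      # best quantum winning probability

def binary_entropy(x):
    """Binary entropy h(x) in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def g(omega):
    """Assumed single-round bound 1 - h(1/2 + 1/2 sqrt(16 w (w - 1) + 3))."""
    if not OMEGA_CLASSICAL <= omega <= OMEGA_QUANTUM:
        raise ValueError("omega must lie in the quantum regime [3/4, (2+sqrt 2)/4]")
    radicand = 16 * omega * (omega - 1) + 3
    radicand = min(1.0, max(0.0, radicand))  # guard against rounding at the end points
    return 1.0 - binary_entropy(0.5 + 0.5 * math.sqrt(radicand))
```

At ω = 3/4 the bound vanishes (no violation, no certified entropy), and at the maximal quantum value it reaches one bit per round.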
For ω = (2+√2)/4 the derivative of g is infinite. For the final bound of the EAT to be meaningful, f_min should be chosen such that ‖∇f_min‖_∞ is finite. To remedy this problem we choose f_min by "cutting" the function g and "gluing" it to a linear function at some point ω_t (which is later optimised), while keeping the function differentiable. By doing this we ensure that the gradient of f_min is bounded, at the cost of losing a bit of entropy for ω > ω_t. Towards this, denote We then make the following choice 22 for the min-tradeoff function f_min (see Figure 4): From the definition of a and b in Equation (13) this function is differentiable and fulfils the condition given in Equation (4). Furthermore, by definition, for any choice of ω_t it holds that ‖∇f_min(·, ω_t)‖_∞ ≤ a(ω_t).
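The cut-and-glue construction can be sketched in Python as follows. This is an illustration under assumptions, not the paper's exact definition: g is taken in the commonly quoted form 1 − h(1/2 + 1/2 √(16ω(ω − 1) + 3)), the slope a(ω_t) is approximated by a central finite difference rather than an analytic derivative, and the function is written directly in terms of ω instead of the distribution p.

```python
import math

OMEGA_QUANTUM = (2 + math.sqrt(2)) / 4

def _h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def g(omega):
    """Assumed single-round bound; the radicand is clamped to [0, 1]."""
    radicand = min(1.0, max(0.0, 16 * omega * (omega - 1) + 3))
    return 1.0 - _h(0.5 + 0.5 * math.sqrt(radicand))

def g_slope(omega, step=1e-7):
    """Numerical stand-in for the derivative of g (central difference)."""
    return (g(omega + step) - g(omega - step)) / (2 * step)

def f_min(omega, omega_t):
    """Equal to g up to omega_t, then continue along the tangent line at omega_t."""
    if omega <= omega_t:
        return g(omega)
    a = g_slope(omega_t)            # slope of the linear piece
    b = g(omega_t) - a * omega_t    # intercept chosen so the pieces meet at omega_t
    return a * omega + b
```

Because g is convex in the quantum regime, the tangent line lies below g, so the glued function stays a valid lower bound while its slope never exceeds the slope at ω_t. Choosing ω_t closer to (2+√2)/4 recovers more entropy near the maximal violation at the price of a larger gradient.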

Plugging this into Equation (15) the theorem follows.
22 Note that f_min is nonpositive for ω(p) ≤ 3/4, but this regime is not relevant as it would lead to the protocol aborting; the extension of f_min to that range of values is only for mathematical convenience. Also note that formally, using the notation of Lemma 9, f_min should be a function of a probability distribution p defined over {0, 1, ⊥}. Nevertheless, we write it as a function of ω for clarity, and the relation between ω and p is given by ω(p) = Pr[p = 1] / (1 − Pr[p = ⊥]).
23 Previously, for ease of notation, we wrote AB for the flipped outputs; here we denote the same bits as g_F(AB) to make the flipping operation explicit.

We end this section by showing how the particular implementation of the symmetrisation step, Step 5, of Protocol 1 made here for the CHSH game can be ignored in any implementation of the protocol. For this, rewrite Equation (27) more formally as 23
where g_F is the function that flips the bits according to F. Since for any fixed value of F, g_F is a deterministic function, it follows from [SR08b, Lemma 1] that for any ε_s ≥ 0, Combining Equations (16) and (17) proves the following corollary.
Corollary 11. Under the same assumptions as Theorem 10, but for an implementation of Protocol 1 in which the symmetrisation step, Step 5, is omitted, for any ε EA , ε s ∈ (0, 1), either the protocol aborts with probability greater than 1 − ε EA or where η opt is defined in Equation (7).

The protocol
Our protocol for DIQKD is described as Protocol 2 below. An honest implementation is described in Section 5.2.
In the first part of the protocol Alice and Bob use their devices to produce the raw data, similarly to what is done in the entropy accumulation protocol, Protocol 1 (with the game G equal to the CHSH game, as in Section 4). The main difference is that Bob's i-th output always contains his i-th measurement outcome (instead of being set to ⊥ in all rounds for which T_i = 0); to make the distinction explicit we denote Bob's outputs in Protocol 2 with a tilde, B̃.
In the second part of the protocol Alice and Bob apply classical post-processing steps to produce their final keys. We choose classical post-processing steps that optimise the key rate, but which may not be optimal in other aspects, e.g., computation time. The protocol and the analysis can easily be adapted for other choices of classical post-processing.
We now describe the three post-processing steps, error correction, parameter estimation and privacy amplification, in detail. 24
Error correction. Alice and Bob use an error correction protocol EC to obtain identical raw keys K_A and K_B from their bits A, B̃. In our analysis we use a protocol, based on universal hashing, which minimises the amount of leakage to the adversary [BS93,RW05] (see also Section 3.3.2 in [Bea15] for details). To implement this protocol Alice chooses a hash function and sends the chosen function and the hashed value of her bits to Bob. We denote this classical communication by O. Bob uses O, together with his prior knowledge B̃XYT, to compute a guess Â for Alice's bits A. If EC fails to produce a good guess Alice and Bob abort; in an honest implementation this happens with probability at most ε^c_EC. If Alice and Bob do not abort then they hold raw keys K_A = A and K_B = Â, and K_A = K_B with probability at least 1 − ε_EC.
Due to the communication from Alice to Bob, leak_EC bits of information are leaked to the adversary. The following guarantee follows for the described protocol [RW05]: for ε^c_EC = ε′_EC + ε_EC and where H A|B̃XYT can be bounded from above using the asymptotic equipartition property [TCR09] (see Equation (26) below for the explicit bound in that case). If a larger fraction of errors occurs when running the actual DIQKD protocol (for instance due to adversarial interference) the error correction might not succeed, as Bob will not have a sufficient amount of information to obtain a good guess of Alice's bits. If so, this will be detected with probability at least 1 − ε_EC and the protocol will abort.

24 We remark that in Step 2 of Protocol 2 Alice and Bob choose T_i together (or exchange its value between them) in every round of the protocol and choose their inputs accordingly. This is in contrast to choosing Alice and Bob's inputs from a product distribution and then adding a sifting step, as usually done in QKD protocols. It follows from our proof technique that making T_i public as we do does not compromise the security of the protocol.
Parameter estimation. After the error correction step, Bob has all of the relevant information to perform parameter estimation from his data alone, without any further communication with Alice. Using B̃ and K_B, Bob sets C_i = w_CHSH((K_B)_i, B̃_i, X_i, Y_i) for the test rounds and C_i = ⊥ otherwise. He aborts if the observed Bell violation is too low, that is, if Σ_j C_j < (ω_exp − δ_est) · Σ_j T_j.
As Bob does the estimation using his guess of Alice's bits, the probability of aborting in this step in an honest implementation, ε c PE , is bounded by since conditioned on the error correction protocol succeeding the probability of aborting in the honest case is exactly as in the entropy accumulation protocol (Protocol 1).
Privacy amplification. Finally, Alice and Bob use a (quantum-proof) privacy amplification protocol PA (which takes some random seed S as input) to create their final keysK A andK B of length ℓ, which are close to ideal keys, i.e., uniformly random and independent of the adversary's knowledge. For simplicity we use universal hashing [RK05] as the privacy amplification protocol in the analysis below. Any other quantum-proof strong extractor, e.g., Trevisan's extractor [DPVR12], can be used for this task and the analysis can be easily adapted.
The secrecy of the final key depends only on the privacy amplification protocol used and the value of H εs min (A|XYTOE), evaluated on the state at the end of the protocol, conditioned on not aborting. For universal hashing, for every ε PA , ε s ∈ (0, 1) a secure key of maximal length is produced with probability 25 at least 1 − ε PA − ε s . The main theorem of this section is the following security result for Protocol 2: Theorem 12. The DIQKD protocol given in Protocol 2 is (ε s QKD , ε c QKD , ℓ)-secure according to Definition 3, with ε s QKD ≤ ε EC + ε PA + ε s + ε EA , ε c QKD ≤ ε c EC + ε c EA + ε EC , and where η opt is specified in Equation (7).
In Section 5.5 we plot the resulting key rates, ℓ/n, for different choices of parameters. Theorem 12 follows by combining Lemmas 13 and 15 that we prove in the following sections.
Protocol 2 CHSH-based DIQKD protocol
Arguments:
D - untrusted device of two components that can play CHSH repeatedly
n ∈ N_+ - number of rounds
γ ∈ (0, 1] - expected fraction of test rounds
ω_exp - expected winning probability in an honest (perhaps noisy) implementation
δ_est ∈ (0, 1) - width of the statistical confidence interval for the estimation test
EC - error correction protocol which leaks leak_EC bits and has error probability ε_EC
PA - privacy amplification protocol with error probability ε_PA
1: For every round i ∈ [n] do Steps 2-4:

4:
Alice and Bob use D with X i , Y i and record their outputs as A i andB i respectively.
5: Error correction: Alice and Bob apply the error correction protocol EC, communicating O in the process. If EC aborts they abort the protocol. Otherwise, they obtain raw keys denoted by K A and K B .
6: Parameter estimation: Using B̃ and K_B, Bob sets C_i = w_CHSH((K_B)_i, B̃_i, X_i, Y_i) for the test rounds and C_i = ⊥ otherwise. He aborts if Σ_j C_j < (ω_exp − δ_est) · Σ_j T_j.
7: Privacy amplification: Alice and Bob apply the privacy amplification protocol PA on K_A and K_B to create their final keys K̃_A and K̃_B of length ℓ as defined in Equation (22).

The honest implementation
The honest (but possibly noisy) implementation of the protocol is one where the device D performs in every round i of the protocol the measurements M^{a_i}_{x_i} ⊗ M^{b_i}_{y_i} on Alice and Bob's state ρ_{Q_A Q_B}. The state and measurements are such that the winning probability achieved in the CHSH game in a single round is ω_exp. 26 For the measurements (X_i, Y_i) = (0, 2) we denote the quantum bit error rate, i.e., the probability that A_i ≠ B_i while using these measurements, by Q. Thus, in the honest case we assume the device D behaves in an i.i.d. manner (and in particular an i.i.d. noise model for the quantum channels used in the protocol): it is initialised in an i.i.d. bipartite state, ρ^{⊗n}_{Q_A Q_B}, on which it makes i.i.d. measurements.
As an example, one possible realisation of such an implementation is the following. Alice and Bob share the two-qubit Werner state ρ_{Q_A Q_B} = (1−ν)|Φ+⟩⟨Φ+| + νI/4 for |Φ+⟩ = 1/√2 (|00⟩ + |11⟩) and ν ∈ [0, 1]. The state ρ_{Q_A Q_B} arises, e.g., from the state |Φ+⟩ after going through a depolarisation channel. For every i ∈ [n], Alice's measurements X_i = 0 and X_i = 1 correspond to σ_z and σ_x respectively, and Bob's measurements Y_i = 0, Y_i = 1, and Y_i = 2 to (σ_z + σ_x)/√2, (σ_z − σ_x)/√2 and σ_z respectively. The winning probability in the CHSH game is then ω_exp = (2 + √2(1−ν))/4 and Q = ν/2.
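The Werner-state example can be verified directly. The Python sketch below builds the state and the observables as small real matrices (no external libraries) and computes the CHSH winning probability and the quantum bit error rate. It assumes uniformly random test inputs and the standard winning condition a ⊕ b = x · y; the helper names are illustrative.

```python
import math

SZ = [[1.0, 0.0], [0.0, -1.0]]   # Pauli sigma_z
SX = [[0.0, 1.0], [1.0, 0.0]]    # Pauli sigma_x

def lin(c1, m1, c2, m2):
    """Entrywise linear combination c1*m1 + c2*m2 of two 2x2 matrices."""
    return [[c1 * m1[i][j] + c2 * m2[i][j] for j in range(2)] for i in range(2)]

def kron(a, b):
    """Kronecker product of two 2x2 matrices (a 4x4 matrix)."""
    return [[a[i][k] * b[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]

def werner_state(nu):
    """(1 - nu)|Phi+><Phi+| + nu * I/4 with |Phi+> = (|00> + |11>)/sqrt(2)."""
    v = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
    return [[(1 - nu) * v[r] * v[c] + (nu / 4 if r == c else 0.0)
             for c in range(4)] for r in range(4)]

def correlator(rho, obs_a, obs_b):
    """Expectation value Tr(rho (obs_a tensor obs_b))."""
    m = kron(obs_a, obs_b)
    return sum(rho[r][c] * m[c][r] for r in range(4) for c in range(4))

def chsh_win_probability(nu):
    rho = werner_state(nu)
    alice = [SZ, SX]
    s = 1 / math.sqrt(2)
    bob = [lin(s, SZ, s, SX), lin(s, SZ, -s, SX)]
    total = 0.0
    for x in range(2):
        for y in range(2):
            sign = -1.0 if (x and y) else 1.0   # outputs must differ only for x = y = 1
            total += (1 + sign * correlator(rho, alice[x], bob[y])) / 2
    return total / 4                            # uniform inputs (assumption)

def qber(nu):
    """Probability that A_i != B_i for the inputs (X_i, Y_i) = (0, 2)."""
    return (1 - correlator(werner_state(nu), SZ, SZ)) / 2
```

For ν = 0 the winning probability equals the maximal quantum value (2+√2)/4, and for general ν the outputs reproduce ω_exp = (2 + √2(1−ν))/4 and Q = ν/2.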

Completeness
The completeness of the protocol follows from the honest i.i.d. implementation described in Section 5.2 and the completeness of the entropy accumulation protocol as shown in Section 3.2.
Lemma 13. Protocol 2 is complete with completeness error ε^c_QKD ≤ ε^c_EC + ε^c_EA + ε_EC. That is, the probability that the protocol aborts for an honest implementation of the device D is at most ε^c_QKD.
Proof. There are two steps at which Protocol 2 may abort: after the error correction (Step 5) or in the Bell violation estimation (Step 6). By the union bound, the total probability of aborting is at most the sum of the probabilities of aborting in each of these steps. Using this and Equation (20) we get:

26 Note that in our notation, the noise that affects the winning probability in the CHSH game is already included in ω_exp.

Soundness
To establish soundness, first note that by definition, as long as the protocol does not abort it produces a key of length ℓ. Therefore it remains to verify correctness, which depends on the error correction step, and security, which is based on the privacy amplification step. To prove security we start with Lemma 14, in which we assume that the error correction step is successful. We then use it to prove soundness in Lemma 15. LetΩ denote the event of Protocol 2 not aborting and the EC protocol being successful, and let ρ ABXYTOE|Ω be the state at the end of the protocol, conditioned on this event.
Success of the privacy amplification step relies on the min-entropy H εs min (A|XYTOE)ρ |Ω being sufficiently large. The following lemma connects this quantity to H εs 4 min (AB|XYTE) ρ |Ω , on which a lower bound is provided by Corollary 11.
Lemma 14. For any device D, letρ be the state generated in Protocol 2 right before the privacy amplification step, Step 7. Letρ |Ω be the state conditioned on not aborting the protocol and success of the EC protocol. Then, for any ε EA , ε EC , ε s ∈ (0, 1), either the protocol aborts with probability greater than 1 − ε EA − ε EC or Proof. Consider the following events: 1. Ω: the event of not aborting in the entropy accumulation protocol, Protocol 1. This happens when the Bell violation, calculated using Alice and Bob's outputs (and inputs), is sufficiently high.
2.Ω: Suppose Alice and Bob run Protocol 1, and then execute the EC protocol. The eventΩ is defined by Ω and K B = A.
3.Ω: the event of not aborting the DIQKD protocol, Protocol 2, and K B = A.
The state ρ |Ω then denotes the state at the end of Protocol 1 conditioned onΩ.
As we are only interested in the case where the EC protocol outputs the correct guess of Alice's bits, that is K B = A (which happens with probability 1 − ε EC ), we haveρ AXYTE|Ω = ρ AXYTE|Ω (noteB and B were traced out fromρ and ρ respectively). Hence, Using the chain rule given in [Tom15, Lemma 6.8] together with Equation (24) we get that where the first inequality is due to the chain rule [Tom15, Equation (6.57)] and the second is due to strong sub-additivity of the smooth max-entropy.
Using Lemma 14, we prove that Protocol 2 is sound.
Lemma 15. For any device D letρ be the state generated using Protocol 2. Then either the protocol aborts with probability greater than 1 − ε EA − ε EC or it is (ε EC + ε PA + ε s )-correct-and-secret while producing keys of length ℓ, as defined in Equation (22).
Proof. Denote all the classical public communication during the protocol by J = XYTOS where S is the seed used in the privacy amplification protocol PA. Denote the final state of Alice, Bob, and Eve at the end of Protocol 2, conditioned on not aborting, byρK AKB JE|Ω . We consider two cases. First assume that the EC protocol was not successful (but did not abort). Then Alice and Bob's final keys might not be identical. This happens with probability at most ε EC .
Otherwise, assume the EC protocol was successful, i.e., K B = A. In that case, Alice and Bob's keys must be identical also after the final privacy amplification step. That is, conditioned on K B = A,K A =K B .
We continue to show that in this case the key is also secret. The secrecy depends only on the privacy amplification step, and for universal hashing a secure key is produced as long as Equation (21) holds. Hence, a uniform and independent key of length ℓ as in Equation (22) is produced by the privacy amplification step unless the smooth min-entropy is not high enough (i.e., the bound in Equation (23) does not hold) or the privacy amplification protocol was not successful, which happens with probability at most ε PA + ε s . According to Lemma 14, either the protocol aborts with probability greater than 1 − ε EA − ε EC , or the entropy is sufficiently high for us to have Combining both cases above we get that Protocol 2 is sound (that is, it produces identical and secret keys of length ℓ for Alice and Bob) with soundness error at most ε EC + ε PA + ε s .

Key rate analysis
Theorem 12 establishes a relation between the length ℓ of the secure key produced by our protocol and the different error terms. As this relation, given in Equation (22), is somewhat hard to visualise, we analyse the key rate r = ℓ/n for some specific choices of parameters and compare it to the key rates achieved in device-dependent QKD with finite resources [SR08a,SR08b] and DIQKD with infinite resources and a restricted set of attacks [PAB+09].
The key rate depends on the amount of leakage of information due to the error correction step, which in turn depends on the honest implementation of the protocol. We use the honest i.i.d. implementation described in Section 5.2 and assume that in the honest case the state of each round is the two-qubit Werner state ρ_{Q_A Q_B} = (1−ν)|Φ+⟩⟨Φ+| + νI/4 (and the measurements are as described in Section 5.2). The quantum bit error rate is then Q = ν/2 and the expected winning probability is ω_exp = (2 + √2(1 − 2Q))/4. We emphasise that this is an assumption regarding the honest implementation and it does not in any way restrict the actions of the adversary (and, in particular, the types of imperfections in the device). Furthermore, the analysis done below can be adapted to any other honest implementation of interest.

Leakage due to error correction
To compare the rates we first need to explicitly upper bound the leakage of information due to the error correction protocol, leak EC . As mentioned before, this can be done by evaluating H The non-asymptotic version of the asymptotic equipartition property [TCR09, Theorem 9] (see also [Tom12,Result 5]) tells us that for τ = 2 2 Hmax(Ai|BiXiYiTi) + 1 and δ(ε ′ EC , τ ) = 4 log τ 2 log 8/ε ′ 2 EC . For the honest implementation of Protocol 2 where the first equality follows from the definition of conditional entropy and the second from the way T i is chosen in Protocol 2. The last equality holds since for generation rounds the error rate (i.e., the probability that A i andB i differ) in the honest case is Q and for test rounds givenB i , X i and Y i Bob can guess A i correctly with probability ω exp .
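The single-round entropy of the honest implementation described in the last sentences can be evaluated numerically. The Python sketch below reads the argument as H(A_i | B̃_i X_i Y_i T_i) = (1 − γ) h(Q) + γ h(ω_exp), i.e. the error rate Q governs generation rounds (probability 1 − γ) and the guessing probability ω_exp governs test rounds (probability γ). This reading is an assumption, since the displayed equation is not reproduced above, and the finite-size correction term δ(ε′_EC, τ) is deliberately left out. Function names are illustrative.

```python
import math

def binary_entropy(x):
    """Binary entropy h(x) in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def honest_single_round_entropy(gamma, qber, omega_exp):
    """Assumed form (1 - gamma) h(Q) + gamma h(omega_exp) of the honest entropy."""
    return (1 - gamma) * binary_entropy(qber) + gamma * binary_entropy(omega_exp)

def werner_parameters(nu):
    """Q and omega_exp for the Werner-state implementation of Section 5.2."""
    qber = nu / 2
    omega_exp = (2 + math.sqrt(2) * (1 - nu)) / 4
    return qber, omega_exp
```

For instance, `honest_single_round_entropy(0.01, *werner_parameters(0.05))` gives the per-round entropy for a 1% test fraction and Q = 2.5%; multiplied by n it approximates the dominant term of the leakage, which grows with both Q and γ.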
We thus have

Key rate curves
In Figure 5 the key rate r is plotted as a function of the quantum bit error rate Q for several values of n. For n = 10^15 the curve already essentially coincides with the key rate achieved in the asymptotic i.i.d. case, that is, when restricting the adversary to collective attacks [PAB+09, Equation (12)] (see also Figure 2 therein).
As the key rate for the asymptotic i.i.d. case was shown to be optimal in [PAB + 09] (for practically the same protocol) it acts as an upper bound on the key rate and the amount of tolerable noise for the general case considered in this work. Hence, for large enough n our key rate becomes optimal and the protocol can tolerate up to the maximal error rate Q = 7.1%.
In an asymptotic analysis (i.e., with infinite resources n → ∞) it is well understood that the soundness and correctness errors ε^s_QKD, ε^c_QKD should tend to zero as n increases. However, in the non-asymptotic scenario considered here these errors are always finite. We therefore fix some values for them which are considered to be realistic and relevant for actual applications. We choose the parameters such that the security parameters are at least as good as (and in general even better than) those in [SR08a], such that a fair comparison can be made. All other parameters are chosen in a consistent way while (roughly) optimising the key rate.

Figure 6: The key rate r = ℓ/n as a function of the number of rounds n for several values of the quantum bit error rate Q. For Q = 0.5%, 2.5%, and 5% the achieved key rates are approximately r = 87%, 53%, and 22% respectively. The following values for the error terms were chosen: ε_EC = 10^{−10}, ε^s_QKD = 10^{−7} and ε^c_QKD = 10^{−2}.
In Figure 6 r is plotted as a function of n for several values of Q. As can be seen from the figure, the achieved rates are significantly higher than those achieved in previous works. Moreover, they are practically comparable to the key rates achieved in device-dependent QKD (see Figure 1 in [SR08a]). The main difference between the curves for the device-dependent case and the device-independent one is the minimal value of n which is required for a positive key rate. (That is, for the protocols considered in [SR08a] one can get a positive key rate with fewer rounds.) It is possible that by further optimising the parameters a positive key rate can also be achieved in our setting in the regime n = 10^4 to 10^5 for the different error rates.

Randomness expansion
We show how the entropy accumulation protocol can be used to perform randomness expansion. This can be achieved based on any non-local game for which one is able to prove a good bound in Equation (4). For concreteness we focus on the CHSH game, for which an explicit bound is provided by Corollary 11. Although the protocol can be used to achieve larger expansion factors, we give specific bounds that optimise the linear output rate, under the assumption that a small linear number of uniformly random bits is available to the experimenter for the execution of the protocol.
In order to minimise the amount of randomness required to execute the protocol we adapt the main entropy accumulation protocol, Protocol 1, by deterministically choosing inputs in the generation rounds from X g = {0} and Y g = {0}. In particular there is no use for the input 2 to the B device, and no randomness is required for the generation rounds. 27 Aside from the last step of randomness amplification the remainder of the protocol is essentially the same as Protocol 1 (in its instantiation with the CHSH game considered in Section 4). The complete protocol is described as Protocol 3.

Protocol 3 Randomness expansion protocol
Arguments:
D - untrusted device of two components that can play G repeatedly
n ∈ N_+ - number of rounds
γ ∈ (0, 1] - expected fraction of test rounds
ω_exp - expected winning probability in G for an honest (perhaps noisy) implementation
δ_est ∈ (0, 1) - width of the statistical confidence interval for the estimation test

2:
Bob chooses a random bit T_i ∈ {0, 1} such that Pr(T_i = 1) = γ.

3:
If T i = 0 Alice and Bob choose (X i , Y i ) = (0, 0). If T i = 1 they choose uniformly random inputs

4:
Alice and Bob use D with X i , Y i and record their outputs as A i and B i respectively.

5:
6: Alice and Bob abort if j C j < (ω exp − δ est ) · j T j . 7: They return Ext(AB, Z) where Ext is the extractor from Lemma 18 and Z is a uniformly random seed.
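The round structure and the abort test of Protocol 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `honest_device` is a hypothetical i.i.d. device that reproduces a given CHSH winning probability (it does not model a quantum strategy or an adversary), Python's pseudo-random generator stands in for uniformly random bits, and the extraction of Step 7 is left out, so the function returns the raw output strings.

```python
import random

def chsh_win(a, b, x, y):
    """CHSH winning condition: a XOR b must equal x AND y."""
    return (a ^ b) == (x & y)

def honest_device(x, y, omega, rng):
    """Hypothetical i.i.d. device that wins CHSH with probability omega.

    Assumption for illustration only: it reproduces the input/output
    statistics of a noisy implementation, nothing more.
    """
    a = rng.randrange(2)
    target = a ^ (x & y)              # value of b that wins the game
    b = target if rng.random() < omega else target ^ 1
    return a, b

def run_protocol3(n, gamma, omega_exp, delta_est, device, rng):
    """Run Steps 1-6; return the raw outputs (A, B), or None on abort."""
    A, B = [], []
    wins = tests = 0
    for _ in range(n):
        t = 1 if rng.random() < gamma else 0            # choose T_i
        if t == 0:
            x, y = 0, 0                                 # generation round
        else:
            x, y = rng.randrange(2), rng.randrange(2)   # test round
        a, b = device(x, y)
        A.append(a)
        B.append(b)
        if t == 1:                                      # score test rounds only
            tests += 1
            wins += chsh_win(a, b, x, y)
    if wins < (omega_exp - delta_est) * tests:          # abort condition
        return None
    return A, B                                         # Step 7 would apply Ext here
```

Called with a device whose winning probability matches $\omega_{exp}$, the run typically passes the test; a device limited to the classical winning probability 0.75 is rejected once n is large enough for the estimate to concentrate.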
Corollary 11 provides a lower bound on the min-entropy generated by the protocol. Given we are concerned here not only with generating randomness, but also with expanding the amount of randomness initially available to users of the protocol, we now evaluate the total number of random bits that is needed to execute Protocol 3.
Input randomness. Random bits are required to select which rounds are generation rounds, i.e. the random variable $T$; to select inputs to the devices in the testing rounds, i.e. those for which $T_i = 1$; and to select the seed for the extractor in Step 7.
The random variables T i are chosen independently according to a biased Bernoulli(γ) distribution. The following lemma shows that approximately γ 3 n uniformly random bits are sufficient to generate the T i , provided one allows for the possibility of a small deviation error.
Lemma 16. Let $\gamma > 0$. There is an efficient procedure such that for any integer $n$, given $r = 6\gamma n$ uniformly random bits as inputs, the procedure either aborts, with probability at most $\varepsilon_{SA} = \exp(-\Omega(\gamma^3 \log^{-2}(\gamma)\, n))$, or outputs $n$ bits $T_1, \dots, T_n$ whose distribution is within statistical distance at most $\varepsilon_{SA}$ of $n$ i.i.d. Bernoulli($\gamma$) random variables.
Proof. It is well known that using the interval algorithm [H+97] it is possible to sample exactly from $m$ i.i.d. Bernoulli($\gamma$) random variables using an expected number of random bits at most $h(\gamma) m + 2$; furthermore the maximum number of random bits needed is at most $C m \log \gamma^{-1}$ for some constant $C$.
In order to obtain a bound on the maximum number of random bits used that holds with high probability, let $\alpha = h(\gamma)$ and partition $\{1, \dots, n\}$ into at most $t = \lceil \alpha n \rceil$ chunks of $m = \lceil 1/\alpha \rceil$ consecutive integers each. Suppose we repeat the interval algorithm for each chunk. Let $N_i$ be the number of uniform bits used to generate the $T_j$ associated with the $i$-th chunk. Then by the above $E[N_i] \le h(\gamma) m + 2$ and $N_i \le C m \log \gamma^{-1}$. Applying the Hoeffding inequality, for some constants $C', C'' > 0$ and given our choice of $\alpha$.
the purposes of randomness expansion this does not even require communication as we may assume the parties are co-located.
Remark 17. If one is willing to settle for a bound on the number of uniform bits used in expectation then using the procedure from [H+97] it is possible to exactly sample $n$ i.i.d. Bernoulli($\gamma$) random variables using an expected number of random bits at most $h(\gamma) n + 2$.
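To make the sampling step concrete, the following is a minimal sketch of interval-style sampling of i.i.d. Bernoulli($\gamma$) bits from fair bits. It is our own illustration in the spirit of the interval algorithm, not the exact procedure of [H+97], and it omits the chunking and abort logic of Lemma 16. The function name and the `bit_source` iterator are hypothetical. Exact rational arithmetic keeps the output distribution exact when $\gamma$ is passed as a `Fraction`.

```python
from fractions import Fraction

def sample_bernoulli_bits(gamma, m, bit_source):
    """Draw m i.i.d. Bernoulli(gamma) bits from fair bits by interval refinement.

    bit_source is an iterator of fair bits (0 or 1).
    Returns (samples, number_of_fair_bits_consumed).
    """
    gamma = Fraction(gamma)
    lo, hi = Fraction(0), Fraction(1)   # interval pinned down by the fair bits read so far
    a, b = Fraction(0), Fraction(1)     # interval of the output prefix emitted so far
    out, used = [], 0
    while len(out) < m:
        # Split the prefix interval: [a, split) encodes 0, [split, b) encodes 1.
        split = a + (1 - gamma) * (b - a)
        if hi <= split:
            out.append(0)
            b = split
        elif lo >= split:
            out.append(1)
            a = split
        else:
            # The next symbol is not determined yet: halve [lo, hi) with a fair bit.
            mid = (lo + hi) / 2
            if next(bit_source) == 0:
                hi = mid
            else:
                lo = mid
            used += 1
    return out, used
```

For $\gamma = 1/2$ the procedure consumes exactly one fair bit per output bit and simply copies its input; for small $\gamma$ most output bits are emitted without reading any new fair bit, which is the source of the saving over naive sampling.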
It remains to account for the random bits required to generate inputs in the testing rounds, for which $T_i = 1$. By Hoeffding's inequality there are at most $2\gamma n$ such rounds except with probability $\exp(-\Omega(\gamma^2 n)) \le \varepsilon_{SA}$ for large enough $n$. Together with Lemma 16 we conclude that $10\gamma n$ uniformly random bits are sufficient to execute the protocol with a probability of success (up to but not including Step 6) of at least $1 - e^{-\Omega(\gamma^3) n}$. We also note that if one is only concerned with the expected number of random bits used then $(h(\gamma) + \gamma) n + 2$ bits are sufficient.
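The bit counts in this paragraph can be evaluated numerically with the short sketch below. The function names are hypothetical. The split of the high-probability budget into $6\gamma n$ bits for sampling the $T_i$ plus two bits for each of at most $2\gamma n$ test rounds is our reading of how the figure $10\gamma n$ arises; the expected-case formula is transcribed from the text as stated.

```python
import math

def binary_entropy(p):
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def high_probability_budget(n, gamma):
    """Uniform bits sufficient except with small probability (the 10*gamma*n figure)."""
    sampling = 6 * gamma * n             # Lemma 16: bits used to generate T_1, ..., T_n
    test_inputs = 2 * (2 * gamma * n)    # assumed: 2 input bits per test round, at most 2*gamma*n rounds
    return sampling + test_inputs

def expected_budget(n, gamma):
    """Expected number of uniform bits, as stated in the text."""
    return (binary_entropy(gamma) + gamma) * n + 2
```

For instance, `high_probability_budget(10**6, 0.01)` evaluates the $10\gamma n$ bound for a million rounds with one percent test rounds.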
Extraction. In the last step of the protocol, Step 7, the user applies a quantum-proof extractor to AB in order to produce a random string that is close to being uniformly distributed. This step requires the use of an additional seed of uniformly random bits. We use the following construction based on Trevisan's extractor, designed to maximise the output length while not using too much seed.
Lemma 18. For any $\delta > 0$ there is a $c = c(\delta) > 0$ such that the following holds. For all large enough integers $n$ and any $k \ge \delta n$ there is an efficient procedure $\mathrm{Ext} : \{0,1\}^{2n} \times \{0,1\}^{d} \to \{0,1\}^{m}$ such that $d = \lceil \delta n \rceil$ and $m = \lceil k - 9 \log k \rceil$, and is such that for $\varepsilon_{EX} = \exp(-c\,(n/\log n)^{1/2})$ and any classical-quantum state $\rho_{AE}$ such that $H_{\min}^{\varepsilon_{EX}}(A|E)_{\rho} \ge k$ it holds that where $Z \in \{0,1\}^{d}$ is a uniformly distributed random seed and $\rho_{U_m}$, $\rho_{U_d}$ are totally mixed states on $m$ and $d$ bits respectively.
Proof. We use the construction given in [DPVR12, Corollary 5.1]. To get the parameters stated here we note that, provided $c$ is chosen small enough with respect to $\delta$, our choice of $\varepsilon_{EX}$ ensures that the seed length $d = O(\log^2(n/\varepsilon_{EX}) \log m)$ can be made smaller than $\delta n$. The conclusion on the trace distance follows from the strong-extractor guarantee given by [DPVR12, Corollary 5.1] using an argument similar to the proof of [AFPS15, Lemma 17].
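The seed-length claim can be unpacked as follows. This is a sketch under assumptions that are ours: logarithms are natural (another base only changes constants), $K$ denotes the implicit constant in the $O(\cdot)$ bound, $n \ge 2$, and $m \le 2n$ because the extractor input has $2n$ bits.

```latex
\begin{align*}
\log\frac{n}{\varepsilon_{EX}}
  &= \log n + c\left(\frac{n}{\log n}\right)^{1/2}, \\
\log^{2}\frac{n}{\varepsilon_{EX}}
  &\le 2\log^{2} n + 2c^{2}\,\frac{n}{\log n}, \\
K \log^{2}\frac{n}{\varepsilon_{EX}}\,\log m
  &\le 2K \log^{2} n \,\log(2n) + 2Kc^{2}\, n\,\frac{\log(2n)}{\log n}
   \le o(n) + 4Kc^{2}\, n .
\end{align*}
```

The second line uses $(u+v)^2 \le 2u^2 + 2v^2$ and the last step uses $\log(2n) \le 2\log n$. Choosing $c$ so that $4Kc^2 \le \delta/2$ therefore brings the seed length below $\delta n$ for all large enough $n$.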
We summarise the above discussion in the following theorem, which states the guarantees of the randomness expansion protocol.
Proof. Let $D$ be any device and $\rho$ the state (as defined in Equation (1)) generated right before Step 6 of Protocol 3. Let $\Omega$ (as defined in Equation (2)) be the event that the protocol does not abort, and $\rho_{|\Omega}$ the state conditioned on $\Omega$. Then, applying Corollary 11, we obtain that for any $\varepsilon_{EA}, \varepsilon_s \in (0, 1)$, either the protocol aborts with probability greater than $1 - \varepsilon_{EA}$ or $H_{\min}^{\varepsilon_s}(AB|XYTE)_{\rho_{|\Omega}} > n \cdot \eta_{opt}(\varepsilon_s, \varepsilon_{EA})$, where $\eta_{opt}$ is defined in Equation (7). Given the bound in Equation (27), the guarantee on $m$ claimed in the theorem follows from Lemma 18.
Finally, completeness of the protocol follows directly from completeness of the entropy accumulation protocol, Protocol 1, as stated in Lemma 8.
Assuming a choice $\delta = \gamma$, the number of random bits required in the protocol scales linearly, roughly as $\sim 3\gamma n$. For $\gamma \to 0$ the values of $\eta_{opt}$ plotted in Figure 3 give a good idea of the rate of randomness expansion that can be achieved from Protocol 3 for different choices of the security parameters. The rate is to be compared to the optimal rate that was shown achievable for randomness expansion in the case of classical adversaries only in [PAM+10] (Figure 2; see also [PM13]). In the range of small constant $\gamma > 0$ we obtain essentially the same rate, but our result holds against quantum adversaries. The rate is much better than the ones obtained in [VV12, MS14a].

Open questions
Several questions are left open.
1. Is it possible to get a better dependency of the rate curves on the number of rounds n? As can be seen from Figures 3 and 5 our rate curves approach (and essentially coincide with) the optimal curves as n increases. One thing that can perhaps still be further optimised is the dependency on n, or in other words, how fast the curves approach the optimal curve. The explicit dependency on n given in Equation (22) is already close to optimal, but the numerical analysis used to plot the curves can be made somewhat better for the range $n = 10^4 - 10^6$. Although this seems like a minor issue, it can make actual implementations more feasible.
2. Are there similar protocols, based on a different Bell inequality, that can lead to better entropy rates? To apply our proof to other Bell inequalities one should find a good bound on the min-tradeoff function, as done in Equation (11) for the CHSH inequality. For many Bell inequalities such bounds are known, but for the min-entropy instead of the von Neumann entropy. In most cases using a bound on the min-entropy will result in far from optimal rate curves. Therefore, to adapt our protocol to other Bell inequalities one should probably bound the min-tradeoff function using the von Neumann entropy directly. Unfortunately, we do not know of any general technique to achieve such tight bounds.
3. Are there other protocols, e.g., with two-way classical post-processing, which achieve better key rates?
The optimality of our key rates is only with respect to the structure of the considered protocol.

Final key length in the DIQKD protocol: given in Eq. (22)
$\varepsilon^c_{QKD}$: completeness error of the DIQKD protocol; $\varepsilon^c_{QKD} \le \varepsilon^c_{EC} + \varepsilon^c_{EA} + \varepsilon_{EC}$
$\varepsilon^s_{QKD}$: soundness error of the DIQKD protocol; $\varepsilon^s_{QKD} \le \varepsilon_{EC} + \varepsilon_{PA} + \varepsilon_s$
$\varepsilon_{SA}$: error probability of the input sampling procedure used in the randomness expansion protocol; given in Lemma 16
$\varepsilon_{EX}$: error probability of the extractor used in the randomness expansion protocol; given in Lemma 18