# [Busec] Notes from RWC

Sarah Scheffler sscheff at bu.edu
Sat Jan 7 18:27:44 EST 2017

OOPS

On Sat, Jan 7, 2017 at 6:13 PM Sarah Scheffler <sscheff at bu.edu> wrote:

> Hey all,
>
> I've gotten four separate requests for my notes from RWC, so here they
> are.  They're not terribly consistently formatted or well-edited, but there
> you go.
>
> Cheers,
> Sarah
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs-mailman.bu.edu/pipermail/busec/attachments/20170107/da16f167/attachment-0001.html>
-------------- next part --------------

RWC 2018 is in Zurich!!!!!  Jan 10-12

Mentions of the old US export regulations:
3

Colored hair count:
9, maybe 10 but I think one was a repeat

# Rich Salz (Akamai, OpenSSL dev team) - OpenSSL
rsalz@{akamai.com,openssl.org}

* Lots of issues with how they managed code.  Testing basically meant "does this commit run on my machine? good, take the patch".  Perl scripts generating assembly code, etc etc.  Basically no money, codebase was ridiculously arcane and difficult to learn or read, no community building, no ability to have a plan or keep to it.
* I like this guy.  Makes fun of his own slides, refers to The Event in openssl history.  (The CVE That Must Not Be Named)
* First real CVE that had a logo and a name. Useful because you were sure you were talking about the same thing.  Encourage researchers to come up with ~~~names~~~, not just websites/logos/CVEs.  Having a crisis actually helped!  Being colocated helped.  And then *keeping up* the meetings.
* Recovery led to new blood (and new enthusiasm).  Got actual release policies, security policies, coding style...  Actual consensus as to where to put the curly braces!  Recovery stuff: transparency - build community, documentation, making website less wordy.  Positive feedback loop - when something isn't a black hole, people contribute!  Moving to modern practices, code coverage of tests, continuous integration, fuzzing (openssl is oooooold)
* Tension between "safe crypto" and "everyone's crypto".
* Stuff to come: ~~~fix the RNG~~~

# Thai Duong (Google Security) - Project Wycheproof ("witch-uh-proof"); scaling crypto testing thaidn@ some google domain???

* Lots of in-house crypto at google (for products like android, cloud, etc; for protoocls like e2e encryption, database enc, url signing, verified boot; and for APIs like AEAD, MAC, hybrid encryption, digital signatures).  And only third party in OpenSSL, OpenJDK, Bouncy Castle, Conscrypt.....
* Good crypto implementation guidelines are hard to come by!!!
* Wycheproof: 80+ open source unit tests uncovering 40+ bugs in popular implementations of ECC, RSA, DH, DSA, AEAD (most tests "defense in depth"). There are also test runners for bouncy castle, spongy castle, openjdk
* Wouldn't it be nice: common crypto interfaces for C++, Python, Go, Javascript, etc.  Need robust and readable interfaces, allow switching algorithms in an existing application.  Show crypto properties right in the code.  never ask users to provide critical input.
* Most libraries do not allow the user to easily switch algorithms

# X.509 sucks - L Jean Camp (we met her at starbucks!) - Indiana University

* They used a dataset of certificates people used on a daily basis - this is not all IPv4, this is many top phishing sites (?????)
* The small blue one is SHA 1 (yay) the disappearing grey one is MD5 (yayyy), but SHA256 goes up and then goes back *down*
* Big competition for phishing is cloud providers...... - different CAs provide phishing than the big 3 overall... - date of issue, date seen, lack of extensions, other features... you can distinguish them very well with random forest and some other methods.
* The good news: if you have TLS you're probably not a phishing site.  The bad news: if you're a social engineer you probably won't bother
* All of this does NOT include cloud providers........................
* TLS phishing is small compared to overall payment fraud, but is growing.  Low-expertise victims, though high impact to those individuals... at least I think that's what she's getting at?
* Example of misguided response: a lot of people, post-Heartbleed, got their CA to resign their cert but didn't change keys.  A few people downgraded (?)
* "There are people who went from SHA1 to MD5.  We're bad and we could be worse!"
* Google Play requires lifetime of 25 years, does not require revocation information (!)
* Sen.se: Verifies connection to server.  Then sends data through second unsecured web socket.
* PKI IoT; they're putting this together in the seattle area around august
* We have the infrastructure, but we don't have the interfaces, we don't have the incentive structure.

# Quan Nguyen cryptanalysis of json web signature/encryption (go-jose) in some GCM implementation (didn't quite catch it)

* bugs in OpenSSL's GCM wrapper.  Something about Square, Inc
* Encryption.signing input is mostly under "our" control, whereas decryption.signature verification input is always under attacker's control
* Json token provides (multiple) signatures; ECDH, CBC-HMAC encryption.  Header, payload, signature.
* RFC7515, JWK (JSON Web Key) header parameter is public key that corresponds to key used to sign
* Apparently you can stick the HMAC key in there instead and it doesn't complain!
* Turn integer overflow into HMAC bypass.  Shift boundary between AAD and nonce.
* Sometimes if only one signatures valid, verify() returns payload... I didn't quite get all the details of this, but I guess you can muck with the stuff inside the payload and it'll still return if one of the signatures are invalid?  Could be a revocation key or something?
* I hope these slides are available somewhere because it's all going by very quickly
* OpenSSL GCM's wrapper - input auth_tag.size() rather than hardcoded 16; auth_tag is what you get on the wire so it's under attacker's control.  Auth tag trucnation attack: attacker sends 1 byte auth_tag.  After 2^32 blocks, counter will wrap around and cause counter collision - this leaks plaintext and auth key????????? wtf?

----------------------------------------------------------------------------------------------

# Sharon - NSEC5 - provably preventing DNSSEC Zone Enumeration (implemented, optimized)

* DNSSEC in a nutshell: provide authentication to prevent a MITM from mucking with the connection between the DNS resolver and the authoritative nameserver
* NSEC5: a proposal forDNSSEC authenticated denial of existence (q.bu.edu does not exist) - also has integrity even if the nameserver is compromsed, and prevents offline zone enumeration
* NSEC: zone enumeration attack: Old denial of existence mechanisms: RFC4034.  Take consecutive pairs and sign them (a.com, c.com; c.com, z.com; z.com, a.com); zone signing key ZSK.  Resolver can pretty easily learn all domain names in the zone.  You might not want to just let anyone learn all the names in your zone...
* NSEC3: Hash each name (in this, SHA1).  Sign each pair of hashes.  Attacker can get all hashes in the same way as before, and you can then just do an offline dictionary attack similar to password cracking.
* So now we have to move to online signing.  Give secret ZSK to nameserver (means if nameserver is compromised, you're screwed).  Take +1, -1 of each hash value and return that.  Hash values are independent of any names actually in the zone  "NSEC3 white lies"
* For any scheme that both prevents offline zone enumeration and provides integrity against outsiders, nameservers must compute a public-key operation for each negative response.
* NSEC5 prevents offline zone enumeration while still providing integrity against compromised nameserver
* NSEC5: Replace the hash H with a verifiable random function VRF that resolvers cannot compute offline.  It's a keyed hash that's a public key version.  <- find out what this is
* Old: NSEC5-RSA: H(Pi(a.com)) = 9ae3e.  Pi is deterministic RSA signature, H is SHA256.
* New: NSEC5-ECC: Pi is from a 2013 paper, they prove it's a VRF; For 256-it elliptic curves, Pi gives 641-bit outputs.
* Send H(Pi(q.com)).  Also send "proof" that is just Pi(q.com).  Does proof match query?
* Proof vales are unique given query and public VRF key
* Attacker can send proof, but doesn't knkow secret ZSK, so can't forge NSEC5 pairs!  There's nothing to replay!
* Elliptic curve versions of both NSEC3 and NSEC5 fit in a single packet (yay!)
* Replay from a time when there was no record?  Answer: not really my problem.  They have time records.....
* Comment from audience: "Offline signing is cool, but most people can't do it reliably." <- hmmmmm

# Daniel Franke (Akamai) - NTP

* "There are more things that use NTP than are dreamt of in your philosophy"
* Authentication options: nothing, symmetric (slap a MAC on there, which depends on you sharing a key with your server...), autokey (thoroughly broken)
* Main things in packet (which is basically entirely header) - origin (client sent), receive (server receive), transmit (server response sent), destination (client receives response)
* RTT = delta = (t4-t1)-(t3-t2).   theta = t_server - t_client \approx ((t2-t1)+(t3-t4))/2 - this is based on assumption that latency is symmetric, and you can imagine that an attacker could break that assumption pretty easily
* Mode - most NTP uses client(3)/server(4).  But there's also a symmetric mode (where two things sync to each other), and broadcast (Daniel is not a fan of this mode and he doesn't know where it gets used).  "Private mode" 7 is mostly used in DDos amplification.
* Symmetric auth problems: MD5(Key||message) - not as bad as it could be, because length extensions infeasible; Replay protection - i mean it's just a timestamp so you could use anything; you could use random values! though birthday paradox squishes this a bit); Same key used both ways which means you can replay a received packet as a response packet; Until 2015, any packet that contained a MAC would accept!  An error flag was sent but nothing ever checked the error flag!
* Autokey - basically it wasn't a bad attempt in the 80s/90s but it sucks and the creator has widely said to stop using it.
* NTP crypto difficulties: merely delaying a packet can change its semantics; can't tell if certificates have expired (chicken-egg), but you can at least query multiple servers.; TLS can't deal with broadcast mode, answer to this is to not use broadcast mode;
* Important to allow NTP servers to be stateless.  NTP scales ridiculously - a busy NTP server has a HUGE amount of clients - 10s of millions - and if you need even a small amount of state you're gonna have hella memory pressure.
* Network Time Security
* Client/server mode: TLS handshake on a separate port; one volley of TLS application data to negotiate algorithms and to provide cookies to client, then close connection.  Servers discard state.  Export pair of AEAD keys derived from master secret.  Via NTP extension fields, client sends cookie and nonce, authenticated as additional data.  Plaintext nulll.  Server echoes back nonce, sends fresh encrypted cookie to protect client privacy (adv can't see same cookies sent as client moves between networks).  Analogous to how TLS 1.3 does this.
* Shoutout to Aanchal!
* Roughtime - not NTP or even close, targets precision to +/-10s, no handshake.  Main benefit: publicly verifiable proof of misbehaving servers (blockchain-ish)
* aw heck i missed his github.  Never mind got it later: https://github.com/dfoxfranke/nts
* DNSSEC signatures sometimes very long - state.gov has a year????  Audience commenter: don't do that.
* Is it really too slow to just sign the responses?  Yes.

# Levchin Prize for Real World Cryptography

* Max Levchin is a pretty cool dude (celebrity in startup world; paypal, yelp, big support for crypto, etc etc, he funds the Levchin prize in perpetuity, his head will be here in 100 years)
* Joan Daemen - development of AES and SHA3 (!!!!!)
* Thanks NIST for choosing a proposal by two Belgians over a lot of prominent US company proposals
* Moxie Marlinspike, Trevor Perrin - for Signal protocol! :D
* Totally tries to justify the "we access your address book" thing (which imo is fine but i'm still amused)

----------------------------------------------------------------------------------------------

# Joppe W Bos - Grey box crypto

* Black box.  Endpoints are trustedd parties, attacker observes data being transferred
* Grey box: Hardware implementations tend to leak key-correlated implementation- passive: time, power, em radiation; active: inject faults, modify hardware, modify environment
* White box: Adversary owns device running the software
* Original use case for white box crypto: DRM.  Streaming content, protecting DVDs, etc.  Then, recent trend: Host Card Emulation (HCE) to communicate using Near Field Communication (NFC) - replace the secure element with software.  Protection of the cryptographic key is white-box implementation.
* 2014, Visa+Mastercard.
* Why not normal crypto software code?  Entropy attack - locate unusually high entropy of cryptographic key in a memory dump using (e.g.) sliding windows (Playing 'Hide and Seek' with Stored Keys, Financial Cryptography 1999); S-box blanking attack - locate publicly defined S-boxes in the binary and rewrite it with all 0's such that S(x)=0 for any x.
* Is WB theoretically possible?  "Ideal" WB AES implementation is one big lookup table; requires 2^92 TB storage.  Generic idea: transform cipher into network of randomized key-instantiated lookup tables.
* In practice, the white box is only a small part of the software implementation (strong code obfuscation; binary is "glued" to the environment; Engineering Code Obfuscation (Christian Collberg EC 2016)
* All WB implementation efforts have been broken, but the attacks involve time-consuming reverse-engineering; identify WB scheme + target specific lookup tables; *then* alg attack
* These authors used GB methods on WB implmeentation, were automatic (CHES challenge), without knowledge of implementation choices (only algorithm), ignore code obfuscation (?)
* Idea: Pin or Valgrind (dynamic binary instrumentation tools -> visual representation 0> traces to find correlation); record all instructions + mem accesses
* Can read code (instructions).  If they're unrolled, can read data reads.  And if that's unrolled, can still read the stack (and it's pretty darn hard to hide access patterns on the stack)
* Differential computation analysis: naive approach: port white-box to a smartcard and measure power consumption
* HW analogy: this is like probing each bus-line individually without any error.
* They tried WB implementations and thought they wouldn't leak any side-channel information, but they did work!  Intuition: encodings do not sufficiently hide correlations when the correct key is used
* Countermeasure: random masks/delay (but attacker could disable entropy source), static random data; threshold implementation; detect whether you're working within valgrind/pin (not great); DCA might fail when using larger encodings, larger lookup tables, but this makes thinsg way slower and possibly infeasibly slow, and anyway the algebraic attacks will still work; riscure has software fault attacks;
* TL;DR software-only solutions are becoming more popular, but those rely heavily on WB crypto, and the security of all(?) of these are questionable at best

# NIST and post-quantum crypto, Rene Peralta

* "Large" quantum computers would break most public key (RSA, DHKE, ECC), symmetric crypto would be affected but not broken (longer key sizes)
* Most importantly, full transition to alternatives takes a long time, probably 10+ years, so the time to worry about this is now.
* NIST's PQC project: monitor QC/QA progress.  Find, standardize quantum-resistant alternatives for PKE, key-agreement, digital signatures.  Also important: ensure *transparency* of process and legitimacy of outcome.
* Choice of PQC algs not a competition; NIST hopes for community consensus.  Evaluation criteria not set in stone, it will evolve over the next few years.
* Call for proposals is out!  csrc.nist.gov/groups/ST/post-quantum-crypto/cfp-program.html
* PQC-forum at nist.gov
* Current things: Signatures: hash based, code-based, lattice-based, multivariate...; PKE: lattice-based, code-based, multivariate...; Key agreement: PKE, lattice-based, isogeny-based...
* Current status: speed looks good (surprising, yay!), key sizes may increase significantly, signature sizes look big, possibly significant increase in ciphertext size for short plaintexts (not sure what this would do to industry), we need industry to do an impact assessment, and that impact assessment needs to happen *now*.  When only a few candidates are left, that's too late for an impact assessment.
* Ongoing discussion regarding security levels and derived parameterization
* Nov 30, 2017: deadline for submissions.  Early 2018 - workshop.  3-5 years - analysis phase (1-2 workshops), 2 years later - draft standards.  This looks long-term to industry but pretty short-term as far as govt is concerned.

# Tancrede Lepoint - Cryptogrpahic Suite for Algebrayic Lattices - CRYSTALS

* Lattice crypto in strongSwan - NTRUEncrypt, BLISS signature, NewHope key exchange
* today: key exchange.  ClientHello ---- ServerHello; CetificateChain; ServerKeyExchange ---- ClientKeyExchange; ---- ClientComputeKey; Finished ---- ServerComputeKey; Finished.  Now they have a shared key.  ServerKeyExchange: key gen, send public key pk.  Client key exchange: sample random value, encrypt value using pk.  Send ciphertext c.  ClientComputeKey: key = KDF(value).  ServerComputeKey: decrypt c to get value, KDF(value)
* People use RLWE over LWE (that is, people use rings) because we're now in polynomials of ints rather than entire matrices of ints.
* CRYSTALS: simplicity (no reconciliation, no Gaussian sampling, CCA-secure KEM, no NTRU assump), modularity (easy to increase security, KEM can be used for encryption (KEM-DEM), KE, AKE)
* KEM - Kyber; digital signature - Dilithium
* Module lattices: "more general" than ring lattices - finitely generated modules over ring of ints in a number field
* Code WILL exist but doesn't yet: https://github.com/pq-crystals/kyber
* If we build it in right now
* Performs on par/better than other lattice-based schemes when plugged into ./openssl speed
* they say they have https://pq-crystals.org but their cert is broken :c
* Internships - side-channel protection aspects of post-quantum crypto - Belgium (NXP)
* post-quantum IOT -
* Post quantum signatures for V2V communication and secure post-quantum impl - wwhyte at securityinnovaction.com

# Valeria Nikolaenko: New Hope and Frodo (the post-quantum KE algs)

* Primitives used in TLS: public key crypto (RSA, DH, DSA, ECDH, ECDSA), symmetric key crypto (AES-128), hash functions (SHA-256, SHA3-256) - all PKC here no longer secure with quantum; symm+hash need longer keys but there's no exponential increase
* Important note - if someone in the future gets a quantum computer, and has logs of messages we send today, they WILL be able to read them.  Also, as mentioned previously, standardization takes a long time
* LWE - new foundation for key agreement (Ax+e \approx_c random).  Can expand this to n being a matrix (nxm) rather than a vector (n).  Frodo reduces to this.
* RLWE - (NTRU is similar) - can send only first row and other party can reconstruct (and actually you can do more optimization on top of that).  NewHope reduces to this.
* C      S
Y, E'      X, E
<-- AX+E ---
--- YA+E' -->
\approx YAX	\approx YAX
* Has nonzero probability of failure, but we can minimize this probability with some parameters
* Use fresh A every time (A := prg(seed)) <--- not sure how both parties get the same A? don't they need the same seed?
* (R)LWE considered to be quantum resistant; has worst- to average-case reductions; different than factoring, DH, in that you don't have to pick matrices according to specific standards, random is fine; and can also use this to get other things like FHE
* Cramer ducas Wesolowski'16 - recent quantum poly-time alg for sub-eponential gamma: Ideal-SVP-gamma < Ring-LWE < NewHope
* I'm pretty sure they missed a golden opportunity to name their RLWE thing Frodo; instead RLWE is NewHope and Frodo is just LWE.
* NewHope and Frodo are both fast compared to RSA3072, slow compared to ECDHA,NTRU; and they generate a lot more traffic
* Switching to hybrids: e.g. ECDHE+NewHope (idea: session key secure if at least one problem is hard. i don't believe that until i see implementation but it's not impossible)

# Isogeny-based KE (Michael Naehrig).

* Nist: P-256, P-384, P-521; IRTF/CFRG: Curve25519, Curve448; Widely used in TLS, OpenSSL, OpenSSH, Signal...
* Subgroups as secrets.   Assign 2-power torsion to Alice, and one to Bob (?).  Also works for 3.
* Isogeny: Homomorphic map between two elliptic curves.  More formally, non-constant rational map that is a group homomorphism.  Finite subgroup corresponds to a unique (up to isomorphism) curve and isogeny within that kernel.  Degree of seperable isogeny is number of elements in its kernel, same as its degree as a rational map
* https://www.microsoft.com/en-us/research/project/sidh-library/
* Basically new KE that uses these things called isogenies, which I haven't thought through enough to understand all the way, but they seem not unreasonable.
* Lattice based schemes all have this problem where if you use them statically (rather than ephemerally) you run into issues.
* "In the interest of keeping elliptic curves in cryptography, please take a look at isogenies."
* Needs to be broken (or attempted to be broen), needs to be sped up, and there are some open problems like public key validation and efficient signatures

----------------------------------------------------------------------------------------------

# Mike Hanburg (Rambus) - Strobe protocol framework

* "Talking about a new crypto protocol framework called Strobe.  It makes the screen black which is why it's called strobe."
* Simple protocols and handshakes, encrypt, MAC, hash, sign...; simple and easy to analyze, non-terrible performance; can be an instance of NIST cSHAKE agorithm (?)
* As an embedded engineer, he's seen a lot of one-off protocols or researchware.  "Why not TLS?" sometimes TLS doesn't work for your scenario (maybe one-way, maybe need stateless, maybe code size limited...)
* Academic protocols are great but something gets lost between "moral idea" -> math -> implementation
* You can have a provably secure protocol, but nitty gritty details get in the way (what if you have to hash a tuple of non-fixed-length things?)
* TLS 1.2 is another example
* The modern solution: hash all the things! (And by modern we mean probably 20 years but it's taken this long to get this used)
* Strobe idea: all things pass through strobe, and it updates all ur hashes.  Idea between two parties: if a message is changed on the wire, the next MAC will fail.
* Strobe ops: key/rekey, hash, PRF, send/recv in the clear, send/recv encrypted with key derived from everything sent so far (!), send/verify MAC, ratchet
* All of these described by four features
* Pattern is a duplex sponge construction (Keccak, etc)
* Goal: output of Strobe is a random oracle; input is all previous operations; (H("abc") != H("a","bc"); includes operation type.data; includes intended use of output)
* Operations with metadata: Output depends on intended use; maybe strobe-level ops are not quite enough.  Direct support in library for metadata operations.  Work the same but they're marked as metadata ops, and they specify what the next one is gonna be.  I think?  You're probably doing something like this anyway, sending things like tag length, so that the receiver can parse it.
* https://strobe.sourceforge.io

# Patrick Longa - FourQ (ECC) based crypto - high performance, low-power

* Nist ECC workshop 2015 - real motivation for work in CFRG is better performance and side-channel resistance of new curves.
* Consistent speedup (2.4x-2.9x) over Curve25519
* TBH I didn't super follow this; if it ends up being useful I should probably just watch the video
* SchnorrQ: signatures

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------

Day 2.

# High-throughput MPC - Yehuda Lindell

* 3 parties, >=2 honest
* Low latency = GC (constant rounds); high throughput = SS (low bandwidth, simple computations)
* Their semi-honest thing really is pretty speedy; they made a thing where you only have to share a single bit per AND gate (and you get XOR for free)
* Active directory breach in Kerberos will kill entire thing - Kerberos with protected credentials
* They rewrote Kerberos ticket-granting server and client to work with CTR mode (CBC cannot be parallelled).  Every user login requires 32 encryptions.
* User latency: 200 ms, single core 3,000 logins per second, 20 cores 41,000 per second (for a single server).
* Malicious security - prevent corrupted party from changing values, force corrupted party to behave honestly
* Lots of multiplication triples, open and check them.
* "A lot of work on this protocol was optimizing every single bit that was sent".  Bottleneck became cache misses, so they made a cache-efficient shuffling method that is suitable for cut-and-choose + combinatorial analysis
* Most optimized protocol sends 7 bits per AND gate.  Another variant has 10 bits per AND gate but has better online performance
* It's better than 1/7 of semi-honest protocol (mostly due to more optimization efforts focused here)

# MPC at Google - Ben Kreuter

* What is separating the academic work from the work that Google is doing in this field
* MPC on consumer devices
* Academic view - most interest is in generic protocols.  Garbled circuits, OT-based, linear secret sharing, FHE, malicious model security.  Major results: security models (simulation paradigm, non-malleable, concurrent, UC, etc.  Protocols exist for any functionality (feasibility results), one honest party can be secure against any number of malicious parties (GMW compiler), complete system implementations - compiler, runtime, etc.....)
* Security definitions do give us a way to reason about security in designs, and feasibility results let us know we're not spinning our wheels.  Impossibility results let us know what would be a waste of time.
* "More than once I have spotted a security problem by trying to write out a simulator!"
* Security models in practice:
* Semihonest is okay (parties can be bound by contracts).
* Forward secrecy - results from before an attack remain safe
* Privacy often more important than correctness (but privacy without correctness is hard to define)
* Covert model sometimes good enough (<- most useful to achieve for business-to-business settings)
* Malicious model desirable but costly (for people you can't bring to court (lol))
* Non-malleable security is sometimes needed
* Think COST, not SPEED !!!!!!!!!
* Academic work tends to focus on end-to-end running time, or throughput.  Costs: resources used for MPC cannot be used for other things.  Network has more contention therefore costlier than CPU.  Latency is costly too (application-specific).  MPC *could* be used if benefits > costs
* Google has lots and lots of servers, other party has questionable amounts of CPU/net/etc resources.  Contention on network.
* Challenges:
* "Consumer devices" could be malicious virtual machines - Sibyl attack
* Hard to deal with - remote attestation is not universally available and
* Security against malicious majority = cannot survive aborting parties
* Non-colluding service provider? Not *impossible* but......
* Sugar beets (?)
* Do all providers benefit equally? How do we resolve inequality?
* What if one provider goes out of business?
* Use another cloud provider???
* Business-to-business MPC: "Half the money I spend on advertising is wasted; the trouble is I don't know which half." lol, this guy sold tractors and wasn't sure who saw ads and then bought them
* Set intersection functionality is super duper useful111!!!!!  Good for aggregate information about users/customers in common (e.g. want to compute f(intersection); use data is too sensitive to share with anyone
* Ads attribution.  Stores know who buys their products and what they spend.  Google knows who sees which ads.  Can determine effectiveness of ads using both data sets.  (Only need to know set intersection ~size~, must not reveal elements in intersection.  Also want to compute revenue from intersection - additive homomorphism)
* Non-technical advantage: easy to explain!  Have to convince managers, lawyers, software engineers, etc etc etc
* First have to go and undo everything they've learned about crypto
* They compared their protocol against a generic garbled circuit approach, and they had issues with gluing and it wasn't actually any faster.; Google protocol only had to send bits per set member.  Generic GC approach bit vector cost grows with size of universe (rather than size of input sets)
* Related protocols: need to agree on common identifiers without revealing user data.
* AUDIT MECHANISM - think covert model
* Arbitrary lawyer-imposed things.
* Future stuff: malicious, non-malleable, more functionality than just additions (maybe GC)
* MPC on consumer devices.  Training machine learning models.  Merge results from many local models to create better model for everyone  "merge"=linear combination of local models
* Local model can leak information (what you typed)
* Aggregation is lienar, basically a weighted average..... Mask vectors in such a way that if you add all them together, the masks will cancel
* Idea: each party will encrypt its vector with a OTP.  Must handle device failures.
* Differential privacy maybe?  open question: efficient distributed noise generation?  Theoretically possible but seems to be expensive.  Needs to be secure against malicious. <---

# Privacy-preserving classification using deep learning - Herve/ Chabanne

* Training phase: weights w_j of coordinates x_j are learned (to minimize some cost function).  Classification phase: assignment of label (0/1) h(x^) for a new input vector x^
* Layers of perceptrons.  Convolutional NN is an extension of multi-layer perceptron with different layers...  Weighted-sum, max/mean pooling, activation function
* They use HE for privacy.  Efficiency = (# non-linear layrs) * (multiplicative depth of non-linear layer)
* CryptoNets - Applying Neural Networks to Encrypted Data with High Throughput And Accuracy.  High multiplicative depth layer couldn't previously be evaluated with FHE (ReLU, Max pooling); cryptonets replaces ReLU by square function and max pooling by sum pooling
* More stuff happened but I'm no neural nets person, watch video if need more info.

----------------------------------------------------------------------------------------------

# e2e in Messenger - Jon Millican (FB)

* Messenger by itself: Identity is indexed by fb account, this identity is accessible across all devices, message history always available.  Devices could be shared between people (web browsers, also mobile)
* e2e messenger desiderata: unreadable by provider, third parties. perfect forward secrecy. Ideally also future secrecy.  Verifiable identities.  Messages in device storage restricted to authenticated account.
* => Need well-defined set of endpoints.  Need consistent verifiable identity.  No message history on new devices.  Account data isolated in storage.
* Other apps: WhatsApp, Signal, Viber.  These are indexed by phone number, secondary devices still under same identity, vendor controls client capabilities.  Messenger is not like this.
* Options: 1. completely change messenger; 2. build encryption as a separate feature and not force anyone into it.  They did option 2.
* Off-the-shelf crypto (what tools?)
* Each thread has one device per person.
* Single device per account.
* They just went with iOS and Android, didn't bother with web re: javascript and other stuff.
* Signal protocol (though it had a different name at the time).
* Abuse reporting.  This is a really good point!  It seems they include a "franking tag" that allows FB to later verify that a certain plaintext corresponds to a certain ciphertext
* Rewatch this talk later - I got distracted by thinking about the abuse reporting thing and whether or not it breaks privacy (i think it doesn't; it's basically explicitly-non-deniable encrypion which i think is cool)
* Securing data on a shared device - provide account key from FB associated with a given login; wiped from mem on the phone when you logout.  This is cool
* Account key from fb -> storage key -> { private keys, [thread storage keys] -> {message data, session keys}}
* They want to be able to add multiple devices.  How to enroll, un-enroll, under what circumstances should you trust?
* Lololol javascript crypto code (seriously i need to find more recent stuff for this)
* jmillican at fb.com @JonMillican

# Snapchat.  Moti Yung.

* We're talkin' 'bout snapchat.  Cover specification and design of security aspects of memories (not entire story) released by snapchat in june 2016.  A cloud self-storage protocol with security against both servers and other users.
* User must be mobile (login from any device), made accessible to user only, not viewable by servers.....
* PBK -> too weak.  Enforce 128-bit entropy password? no usability. PW encrypts strong key at device -> no mobility. Cloud is a hardware device SHM (secure hardware module): user to SHM end-to-end-protection; not so easy change of servers...
* They made something kind of like password-protected secret sharing... User deposits a secret by sharing and protects itself via password in distribution and recovery.  Like PAKE (password authenticated key exchange) with multi servers.
* They made their own solution, they added symmetric key crypto.  Authentication flows in systems so we can use them (user password as first factor)......... slide too fast again :c
* Have initial 3-party design - two servers and a user (?) - wait i thought they were trying to do stuff between two users but what
* Numerous extensions for lifetime of system (more servers, hardening...)
* "Flexible key management" (whatever that means). Snaps encrypted with a snap key, master key encrypts all key; ciphertext of key attached to the content.  Share and reconstruct master (symmetric) key.  Wait who even has this master key????? Snapchat? Why not just encrypt all the images under master key if you're just gonna need it to decrypt the other key anyway.  Speed?  Encrypting twice doesn't make it harder to crack, does it?  Wub wub I have confusion.
* 3-out-of-3 reconstruction where not all keys are equally random (for some reason?)
* So they're basically doing cloud storage for stuff that only they want to access
* Give/take.  User creates and gives key to servers.
* Wait so. They make a key, give it to the server, then later get back the stuff.  No sending it to anyone, just look at it yourself later? Fine, I guess...
* http://i.imgur.com/Kp6RN00.png
* He says "people say multi-server based security is theoretical". I'm not sure what that is and who says that, unless it's like... he thinks secret sharing doesn't exist? idk man.
* Something about political correctness at the end I didn't quite catch (darn) where the heck did that come from ?

# DMCA - Mitch Stoltz from EFF (oh man oh man hype)

* HYPE HYPE HYPE DMCA talk by EFF lawyer!!!!
* DMCA passed in 1998.  "17 U.S.C. s 1201 (1)(A) No person shall circumvent a technological measure that effectively controls access to a [copyrighted] work..."
* Copyrighted - generally any kind of creative work - INCLUDING SOFTWARE
* "17 U.S.C. s 1201 (3)(A) 'circumvent a technological measure' means to descramble a scrambled work, to decrypt an encrypted work, or otherwise to avoid, bypass, remove, deactivate, or impair a technological measure, without the authority of the copyright owner.""
* Made with 1998 technology in mind.  So really what they had in mind were encryption of DVDs, and any kind of digial audio encryption.  None of which are really relevant anymore.
* "17 U.S.C. s 1201 (b)(1) No person shall manufacture, import, offer to the public, provide, or otherwise traffic in any technology, product, service, device, component, or part thereof..." - this has been used AGAINST RESEARCHERS of encryption in general! - against a lot of the kinds of work we've done here!!!  Law that was written to add an extra level of safeguard to DVDs was very quickly (2000,2001) used against security and encryption research!
* Prof Edward Felten's research team and SDMI, 2600 magazine and DeCSS, Dimitry Sklyarov arrested at DEFCON, and lots of self-censorship.
* Courts at the time really dismissed arguments about freedom of speech, etc.
* Unintended consequences report: https://www.eff.org/wp/unintended-consequences-16-years-under-dmca
* Narrow permanent exemption: "s 1201 (g)(2) it is not a violation... for a person to circumvent a technological measure... in the course of an act of good faith encryption research"
* "... factors include whether it was disseminated in a manner reasonably calculated to advance the state of knowledge of development of encryption technology, versus whether it was disseminiated in a manner that facilitates infringement under this title or a violation of applicable law other than this section"
* Lawyers love stuff like this, they see lots of billable hours in here.
* "Whether the person is engaged in a legitimate course of study, is employed, or is appropriately trained or experienced, in the field of encryption technology..."
* There's a similar one for "security testing".
* Language slightly wishy washy, it's not one or the other and like what if someone does something bad with what you do....... hmm
* Way to in theory protect against these legal threats: every 3 years there's a copyright summit that lasts like 1.5 years (sheesh).  Temporary exemptions created:
* A device or machine primarily designed for use by individual consumers (including voting machines); A motorized land vehicle; or A medical device designed for whole or partial implantation in patients or a corresponding personal monitoring system, that is not and will not be used by patients or for patient care"
* Seems to basically be whatever things researchers brought at the time?
* "Accessing a computer program solely for purposes of good-faith testing, investigation and/or correction of a security flaw or vulnerability, where such activity is carried out in a controlled nevironment designed to avoid any harm to individuals or the public, and where the information derived from the activity is used primarily to promote the security or safety o........"
* Basically congress could fix this but congress can't even tie their shoes
* Lawyers have a hammer and they want to use lawsuits.  This makes sense in US because courts have ability to strike down laws that are unconstitutional
* Green v DOJ: constitutional challenge
* Copyright + US constitution: initial courts in CA were very skeptical of free speech argument. But supreme court decided a few copyright cases.  They said "copyright law has some built-in first amendment safeguards". Fair use, and idea-expression dichotomy.  Now including interoperable software
* EFF's case says DMCA 1201 "goes beyond traditional contours of copyright protection" and violates 1st amendment
* Dr. Matthew Green, Andrew "bunnie" Huang, Alphamax LLC: want DOJ to say DMCA 1201 doesn't accomodate fair use, says it's overboard, and says 3-year exemption process is an unconstitutional speec licensing regime.  Basically they say it violates free speech.
* The encryption that prevents home video going over HDMI cable - HDPC I think?????
* Basically: Code is speech.  DMCA 1201 restricts research *and discussion of research*.  Circumvention is a "necessary predicate" to speech.  Basically, any law that restricts free speech has to be written as narrowly as possible.  DMCA 1201 bans far more speech than necessary, especially because *it's not limited to illegal behavior*.  Also arguing that the rulemaking is a speech licensing regime that doesn't pass muster.  Too slow "you can apply for an exception in 2018!", too arbitrary "basically up to whoever's deciding at the time whether it's allowed or not, no guidance in law itself".
* mitch at eff.org @mitchstoltz
* "Even the most minimal of encryption is enough to count.  Probably a label that says 'please don't steal' isn't enough, but probably ROT13 is"
* It talks about *encryption* research, not cryptographic research! And that's "scrambling or unscrambling", it doesn't include any of the other crypto tools in the toolbox.  !!!!!

----------------------------------------------------------------------------------------------

# Lightning talks!

* thanks 2 bristol peeps
* Nico Vanserverein?? Director of something at Linux foundation, they try to make open source software more secure.  One way they do this is by paying people to write OS software that actually is good!  OpenSSL, GnuPG, BouncyCastle.  Google "linux foundation core infrastructure initiative" and email "nico at linuxfoundation.org". He doesn't guarantee that he'll give you money but pls submit grant proposals especially for useful crypto code
* Daniel from ACLU - ACLU is super pumped about crypto.  Tech fellowship  /2017/tech-fellow  /2017/summer-tech-intern # follow up on this!!!!!!!1
* Guy from EC2 says he's got a new instance type called F1 that have FPGAs (ooh fancy).  They're closed but if you do research you can talk.  If you're working on crypto research (esp formal verification) those are things they like to fund.  AWS is hiring on a ton of crypto stuff.  They have their own TLS, IPSec implementations, etc etc.  Tiny crypto, enormous crypto, yay.
* Phillipe and he works for CloudSomethingorother.  Talked about it at ccc.  Wants to collaborate, crypto at cloudfor.com (?)
* Aaron tom from Galois.  1. they're hiring, 2. they've got cool open source toooools.  60-some people in Portland OR.  Want to make critical systems more trustworthy.  Cryptol - quick and consise to write descriptions of crypto algorithms and then do analysis on the properties of those algs.  Also a thing that takes those descriptions and proves that a concrete implementation fulfills it.  cryptol.net, galois.com.  and he's lonely he wants people to talk to him.
* InfosecGlobal. VPN sutff it sounds like. Some side-channel defense.  Sounds like they rehearsed this approximately 11 times.  They want post-quantum research, especially isogeny-based stuff and hash-based sigs.
* Marco from Sharemind.  They have a new SDK available, it's got a key-value database module, they updated performance algs in the emulator.  He's reading off of his phone lol.  There are compiler improvements, documentation improvements.  SDK is free to use and can be downloaded from sharemind's github.
* Elaine Shi from Cornell.  Announces from IC3, which is a cryptocurrency grp.  Hosting their second retreat in san francisco in feb.  First day open to public.  They also take postdocs.  Google Cornell IC3.
* Zuko: 1. Z-Cash ("Zero cash") it's encrypted bitcoin.  Uses ZK snarks.  Probably the crypto-i-est thing intended for wide use.  To use it they invented an MPC and they're pretty sure it's the fanciest crypto ceremony ever performed, and they got hacked during the ceremony, but their thing worked and it didn't get all the way ruined.  Search for Z-cash ceremony.  Next: Tahoe LAFS.  Encrypted distributed storage.  Check out their key management.  And b2sum - md5sum is officially out forever and ever amen.  They're gonna start using b2sum, it's way faster and more secure yay.
* Dave Borse from MITRE.  Has tons of jobs for CS/cyber/crypto/malware etc.  Don't send your resumes to him, but if we have questions send them to him dborse at mitre.org.  Working in consortium with NIST,DHS,... to implement specs for swidtags - software ID tags - to do IT asset management or something idk sounds metadata-y.
* Jeff Bridges - missed where he worked for.  They have a free software stack for doing blind signature transactions in the browser.  So if you know any bankers who are interested in providing anonymous transactions have them go talk to him.
* If you heard about smart contracts and ethereum you probably think ethereum is terrible and that smart contracts are dumb. But no, they show otherwise. :| they use the highest standards. mhm. they found a bunch of vulnerabilities, and they Dimitri Koratorsh in Luxemburg.
* Something Marcel. Security research team at Visa Research.  They're hiring, they do systems security, etc etc, go talk to him or there's a table.
* Linus from Melbourne, Australia.  Shoutout to academics/students who might want to do an academic peer review for some of the crypto systems they're developing.  He's a developer but he does want to save data to the cloud in an encrypted format.  The ideal solution is to have some kind of encrypted FS but what was available was insecure. So they've been working with cryptographers to make one.  And they're implementing it with phd-level devs.  They want to make sure they do peer review NOW rather than after release.  me at linuschang.com
* Jason Law - from a project called Sovereign.  They have a distributed ledger that uses byzantine agreement, etc etc, sits somewhere between bitcoin POW and what you'd see in enterprise FT system.  Philosophiaclly based on concepts of "self-sovereign identify". something about feudal lords lol.  Trying to move away from serf-sovereign i guess? They've got a government, or something. No idea what they're doing tbh.  Anonymous credentials, blockchain, etc etc.  Open source sobrain.org (?)
* Ivan I... From a startup called inpher, missed the details.
* NIST - lightweight crypto project, initiated three years ago and now they want to make a portfolio of algs dedicated to lightweight cyrpto.  They published a report that summarizes where they are and what they want to do next.  If you have an app and you have opinions, go talk to them.
* Missed his name, he's a PhD candidate at a school, there are flyers in the back for a crypto school june 5-9th.  You all need to do drawings and it sucks, so he points out: tikz for cryptographers in IACR ! super convenient crypto drawings in latex :D
* nccgroup - lots of stuff, their gig is basically "real world crypto".  They want employees!
* People are building chat apps and we've basically failed at email for 30 years.  But there's a group of devs and UI designers who are working on a thing called autocrypt, which is trying to figure out how to do actual deployable mass e2e encryption.  Replace cleartext email, roadmap for how to get this to become something they could actually work in.  Autocrypt.org, or you can email him.  dkg at aclu.org.  Set of conceptions about how to try to convince mail user agents to provide a minimalist user experience to provide e2e for at least the content of email.
* Vladmir Klevonof (?) If you've ever had to implement an RNG or if you use one and you're not sure it's working properly.  Until recently, answer is code review.  Now, static analysis for RNG.  Entroposcope.  (I'm skeptical because NIST work)
* Missed this guy's name, something about DANE (?)  Put it in more TLS/SSL stuff.  And other stuff.  If you want to do things beyond postfix, he wants to work with you.
* Deadline for something is coming up soon and I missed the first sentence dang.  kiayias.com
* Kenny Patreson - U Bristol is hiring postdocs! 6 entire people!  Talk to Nigel

----------------------------------------------------------------------------------------------

# Tervor Perrin (one of the people who got the award for signal) - message encryption

* 1940s-1970s: key distribution center made n keys, distributed them among m people
* 1980s: symmetric key infrastructure (SKI) <- central key distribution center makes individual keys for individual users; then if you want to have a group, you send the people in that group the group key encrypted under the invididual keys ----- this doesn't even approach modern messaging systems
* 1990s: PKI.  Register public key with certificate authority.  CA sends users certificates and revocation lists.  Or make directory do most of this work.
* ~~~crypto wars~~~
* 2000s: disillusionment.  Tried a lot of PKI stuff, tried identity-based encryption that wasn't really that different.....
* 2010s: resurgence! 1. Increased awareness of massive breaches and amss surveillance.  2. Emergence of mobile messaging apps!  Enter encrypted messaging apps.
* How do you know you got your key honestly?  Most common way is to allow some kind of out-of-band authentication.  Like a safety number (PK fingerprints); use QR code scans, etc etc.
* Most people aren't going to do this, because it's optional.  But some people will do it.  And if some people do it, and there's no way to know who is, then you can potentially keep a large system honest (cut-and-choose style)
* Certificate Transparency / CONIKS.  Trees!  Notified about paths... didn't quite catch this, look it up later if it's important
* Encrypt-then-authenticate.  From a user experience perspective, the user gets "free" encryption, and then they can authenticate if they want.  Engineers build enc layer first, work out how to shuffle around keys, etc, then think about how to add better auth... This is VERY different than how people try to manage PKI (in which auth comes first, and then people build apps on top of it)
* He's making a metaphor here I'm not sure he meant to make.  He's comparing auth-then-enc to early crypto days, enc-then-auth to modern.  I'm not sure he means to compare to encrypt-then-sign, but it's happening accidentally.  I think he's talking about development.
* 1990s: pretty clear what you were trying to do, you were either encrypting/signing or you were securing a protocol.
* 2000s: OTR, added idea of deniable key agreement, "ratcheting" - update forward secrecy after AKE
* He's got a really good DH ratchet slide
* 2010s: TextSecure -> Signal.  Key agreement (added pre-keys and DH-based key agreement).  Ratcheting, added symmetric-key ratcheting.
* Muti-party messaging, built on top of two-party protocol.  Then build multi-device on top of multi-party.
* There are still questions!  How 2 metadata, especially in multi-party scenario!  Key questions for multi-device...
* trevp at trevp.net
* https://moderncrypto.org

# Formal security analysis of Signal protocol - Luke Garratt - University of Oxford CS

* "And then after your session, something bad could happen, represented by the orange explosion"
* Post-compromise(d) security: recovering from compromise.  Sort of the opposite of forward secrecy.  Forward secrecy ensures that if a future session is compromised, past sessions are not also compromised.  Post-compromise security means that if a prior session is compromised, future sessions are not also compromised (!).
* This is useful - older protocols have no forward secrecy.  Adversary can store ciphertext traffic, and obtain long-term keys later and then decrypt.  Now, with FS, adv must now obtain long-term keys first, wait for interesting target session, then launch a MITM.
* Fancy protocols have post-compromise security.  Adv must now obtain long-term keys, immediately attack and keep on attacking if it wasnts to compromise future targeted sessions.
* PCS, CSF '16.  "Security guarantees even after your peer's key is compromised."  They've got a game definition and everything!
* Adapted Bellare-Rogaway style ("Entity Authentication and Key Distribution"), multi-stage key exchange model (Fischlin and Gunther, "multi-stage key exchange").
* Signal seurity model: adv has full network control, perfect forward secrecy.  Key compromise impersonation attacks.  Some (but not all) random numbers can be compromised. <---- look at this more later
* Thm: The Signal protocol is a secure multi-stage key exchange protocol, under the GDH assumption and assuming all KDFs are random oracles.
* Proof tree is full of splits and long, but not complicated.
* Limitations: This was all theoretical/algorithmic.  Long-term identity key is used in initial handshake and to sign medium-term key.  We assume for simplicity that the medium-term key is authentic.  They assume honest key distribution.  Multiple devices not yet considered.
* ia.cr/2016/221
* ia.cr/2016/1013
* Before the formal methods vs provable security war breaks out in front of the industry partners, let's move on.

# Felix Gunther - 0-RTT Key Exchange with Full Forward Secrecy

* Premise: You have to wait 1RTT to do a KE, this "can be infeasible"
* Ideally, only need to send one message from C -> S to get key.  Do not have to wait for response from server.
* This is not a new concept: QUIC (2013), TLS 1.3 (2015+++++++)
* Security drawbacks! Replay attacks! (partially unavoidable)  Also, no forward secrecy (if adv compromises server, they just get that key and there's not really anything you can do about it)
* Related scenario: asynchronous messaging.  Can be solved by PKE.  This also lacks FS.
* Coarse forward secrecy (CHK'03).  Divide time into coarse intervals, secret key correspond to interval.  When adv compromises secret key, only decrypts messages within a single interval.
* "Fine-grained puncturable forward-secret encryption" (GM'15) - somehow use ABE to re-encrypt messages you've seen before or something.
* Contributions: "punctuable forward-secret key encapsulation".  And then then use this to create a forward-secret 0-RTT key exchange.
* "Puncture" - remove capability to use same public key again (?).
* Ah, I think I see.  They make a tree.  Parent: sk_B.  Then uses "sk_01".  Eliminate path to this key, therefore cannot reuse key.
* Keys will grow really fast!  Thus, can make time intervals!  Subtree only corresponds to set time interval!
* Gives you full forward secrecy and also replay protection!
* Time performance: Enc pretty fast (I question whether it's faster than 1RTT but w/e). Dec can be much slower because we now have to do this entire tree key derivation thing.  Puncturing takes a VERY long time (like, minutes on a laptop).

# Toward 5G authenticated KE - Cristina Onete

* This time the proof took 26 years!
* Typically, 1. auth+KE; 2. secure channel (use keys from 1).  We're talking about part 1.
* Typically, we talk about network MITM in AKE security.  Also talk about FS; even if LTK of party is compromised, prior communication still secure
* Security of protocol only proven for 2-party use.  But sometimes handshakes are proxied by semi-trusted third-parties.
* We talk about only one party trusted (typical in mobile, etc)
* AKA and 3G/4G networks: 3 actors: client (mobile user), operator (mobile) - clients trust the operator, server (mobile management entities - MMEs) - affiliated either with client's operator - or in the case of roaming the server is affiliated with another operator which the client and old operator may not trust.  Client-server communicate via unsecured radio link.  Server-operator assumed to have secure channel.
* Operator trusted with all data.  Client trusted with almost everything (not actual operator key).  Server not trusted to know anything other than current session data.
* AKA protocol standardized in 1990s by 3GPP does not answer requirements set down already.
* Stateful between client and operator: sk_C, sk_op, Sqn_C, UID.
* Use sequence number to auth; if seq no is close enough to client seqno, then accept auth.  Resync when sqn out of range (if MAC_OP verifies).  Force operator to go back to client's seqno.
* SIM cards do not produce randomness (!!!!!).  Next gen sim cards will.  But wow seriously??
* Problem: client/operator are not running this protocol; client/server are!  How can authentication work without client secrets?
* Server is used as a proxy, does only identity management.  Client identifiers are stored by client/server.  Area identifier.  Between these two identifiers, global uniqueness.
* Get server ID info, server/op get batch auth vectors; server/client challenge-response;
* security weaknesses of AKA.  Server impersonation by offline relays.  Impersonation/key-distinguishing attack (corrupted servers can reuse unused auth vectors).  AKA security server corruption attacks reveal session keys; no auth on client/server resynch
* 3GPP requirements: ID-hiding (nobody can trace client's IMSI), location-hiding (nobody can trace client's LAI); untraceable: nobody can link services to clients.  ---------  IMSI catcher S07 breaks tracability like whoa
* THey have counter-proposals that fix AKA protocol.  But is it good enough for 5G?
* Other AKA problems: computation done all at operator end; legal interceptions; strong deviation in practical implementations; application-layer primitives problematic; no concept of e2e, all goes through operator.
* What she actually suggests is a fundamental leap (TLS 1.3 vs 1.2).  Need protocol for unfederated e2e security, including p2p.  Usability.  Handshakes for different situation.  App-level primitives.  Efficiency and legal compliance!

----------------------------------------------------------------------------------------------

# Cryptographic enhancements to password protocols - Hugeo Krawczyk - IBM research

* Passwords still all over the place
* leakedsource.com
* Maybe crypto can help us fix passwords without getting rid of them entirely (unrealistic)
* Mostly blinded DH [Chaum, Ford-Kaliski, Boyen,..] "oblivious PRF"
* Efficient, mature, ready for deplyment.  Talk to him if you have applications.
* Offline dictionary attacks are main source of password compromise.  Dedicated hardware, many millions++ of passwords tested per second.  Offline attacks often unavoidable - if the server can check, so can the attacker.
* Hope is to make offline dictionary attacks less successsful.
* Part 1: remove burden of choosing/memorizing passwords from users.
* Simple solution: password store, a.k.a. password manager.  Just remember one (non-trivial) master password.  Problems though!  Attacker obtains list -> offline attack on master password (i find this unlikely); inside compromise - attacker could learn master password as user types it (still seems very unlikely); user-device communication compromise (possible); typical password managers keep user-chosen passwords (though lastpass and others allow generation of high-entropy passwords). But for now, premise is that password stores aren't good, I think?
* Dream password store: all passwords kept in password store in a user's device or online, user memorizes a single master password, all passwords random and independent of each other, AND an attacker getting a hold of a store or even in full control of the device LEARNS NOTHING.  (And he means info-theoretic nothing!)  Info stored in device is independent of user's individual passwords and of master password.  Cleartext master password is never entered into the device! Eavesdropper learns nothing, attacker in device learns nothing.  SPHINX: A password Store that Perfectly Hides from Itself (No eXaggeration)
* PRF-based solution: User enters master password pwd to computer.  Computer sends to device.  Device has device key K_d.  Device calculates PRF(K_d, pwd); computer sends rwd to server; rwd=PRF(K_d, pwd)  Rwd is pseudo random password that user registers with serer.  Independent rwd with each service (e.g. PRF(K_d, pwd|url))  Works with any password protocol between client and server.
* This solution gives you offline attacks being infeasible, storage in device is chosen independently of master pwd and of rwd's.  But master pwd is sent unprotected to device.  To avoid this, move to oblivious PRF
* OPRF-based solution.  Client machine chooses 1-time-key, under which it encrypts password.  Device does "something" with that, and outputs encryption under that same key, Enc(PRF(K_d, pwd)).  So still infeasible offline attacks, storage in device still independent of pwd and individual rwds, but now master pwd hidden over wire and from device.  But how to implement?
* FHE does it.  But FHE is not exactly mature technology.  So instead OPRF(K_d, pwd) = H(pwd)^{K_d} (H hashes into a group, K_d exponentiates within the group).  Make the one-time thing chosen by client machine a random value r; send H(pwd)^r to device, device puts this to the k)d, now rwd is received value to the 1/r.  So now master pwd is prefectly hidden on wire and from device.  (I guess we're assuming hash has as much entropy as pwd?)
* So that's super duper secure, it also performs well (single round C-D, 1 exponentiation for D, 2 for C, and one hash into group for C (~~~any~~~ DH group works, no bilinear needed, etc etc)).  It's server agnostic.  no need for C-D channel protection (self-protected), and can use any communication channel.  Also, can replace D with online service.  Pwd, rwd never seen by server, no need to secure communication with server!
* Device compromise: unconditional secure pwd/rwd (online only attack); network attacks: unconditional security ddevice-cient communication.  Offline dictionary attacks infeasible (random rwd).  Offline against master pwd only if both server and device compromised.  Online dictionary attacks infeasible; password leakage defended b/c rwd useless inaother server, master pwd useless w/o device, url hashing prevents phishing.
* Two-factor authentication mentioned but I didn't catch it; check slides.
* They built SPHINX password manager.  Implementation as Android app and usability study.  Reference will be at the end.
* Part 2: How to protect (secrecy, availability) a valuable secret (bitcoin wallet, private keys, etc etc) when all you remember is a password.  PPSS - password protected secret sharing (I think?)
* Need multi-server (otherwise, single point of failure for both secrecy and availability)
* Secret share encryption key in multiple servers.  Share among n servers, retrieve from t+1 servers.
* Pretty sure this was our original MPC project idea (me, Aanchal, Kyle)
* But server needs to authenticate the user before delivering a share.  So now rather than 1 point of failrue we have n points of failure.
* What we really want: Password Protected Secret Sharing [BJSL'12]
* Init: user secret shares a secret among n servers, forgets secret entirely and keeps a single password.  Retrieval: user contacts t+1 servers, authenticates using single password and reconstructs secret.  Attacker that breaks into t servers learns nothing about secret or password
* Don't all the servers now need to keep a password file?????
* Single parallel message from user to t+1 servers, one message back from each server.  No inter-server communciation.  no assumed PKI or secure channels
* Cost is less than a TLS server
* Single-server PAKE (asymmetric/augmented PAKE, aPAKE).  Desiderata: attacker must run dictionary attack upon server compromise.  No pre-computation prior to server compromise.  Server never sees password in plaintext (duh), reduce/eliminate reliance on PKI, performance; offload hash iterations to client (key stretching)
* Password over TLS achieves attacker first two, but not the rest.  PKI-free aPAKE gets 1,3,4 and not 25.
* X-PAKE has all properties.
* To get X-PAKE, you move the server to the site of the device (now you run that part with the server) and then yay you're done!

# Jeremiah Blocki - Towards a theory of Data-Independent Memory Hard Functions

* Pretty sure I took notes on this talk before at a crypto day or something
* Premise: memory costs tend to be equitable across architectures.
* memory-hard functions: computation costs dominated by memory costs.
* sCrypt is one example of this, next talk will give positive results for scrypt, but scrypt also induces a data-dependent memory access pattern.  This makes the password vulnerable to side-channel attacks like cache timing attacks.
* iMHF: data independent memory hard function - memory access pattern does not depend on input (and it's a MHF)
* i took a picture the image->text conversion was too hard
* evaluating an iMHF uses pebbling - goal is to pebble the last node.  Can only compute a value (and put a pebble on it) if we have all the dependent values in memory.  So in the graph, can pebble a child when we have pebbles on all parents.
* Measuring cost attempt 1: space * time (max).  But then you have that thing where if your thing isn't rectangular it can stack on top of itself.  Parallel computation!!!  (If you have a spike, can move around the spikes and still use way more memory)
* Instead, use cumulative complexity, which is just the sum of the memory used at each step.  This makes way more sense anyway.
* Theorem from AS15: high pebbling complexity of G implies high amortized memory complexity for the iMHF f_{G,H} (intense paraphrasing going on rt)
* We also add the cost of storing a random orale on chip, etc etc.  Doesn't change asymptotics, but brings us closer to reality.
* Naive pebbling algorithm: sequential.
* Depth-robustness: necessary and sufficient function for iMHFs
* Had a cool method about (e,d) reducability that I unfortunately didn't follow well enough to record, just go check the slides.
* Multi-phase thing where you switch between "light" phases where you discard pebbles, and "balloon" phases, where you greedily recover missing pebbles.  Because depth of graph is small, can do this very quickly.
* Sort of lost track of this talk, check slides if necessary.
* Thm: ideal iMHFs don't exist; any grpah G with constant in-degree is at least somewhat depth-reducable.  But we can't rule them out in practce; memory parameters start becoming practical abouve 2^51, which is way higher than any real actual parameter.
* Depth robustness is sufficient (and necessary)

# Memory hardness of scrypt - Stefano Tessaro

* Scrypt, Catena, Argon2d/2i, Balloon Hashing all are reasonably practical MHF
* Provable security still relevant b/c they give us some confidence in their security into the future.
* Password hashing competition - validation mostly via cryptanalysis
* Recent advances, but still lacking designs that are both practical AND MH
* Can look for new tools to prove an existing practical algorithm secure
* Proof that scrypt achieves optimal memory hardness (this is sort of surprising, because it was the first conjectured MHF - percival, 2009)
* Scrypt is data-dependent.  Theory work has focused on data independent things because they're resistant to side-channels and also there are all these proof techniquest that make proving it easier.  But they're still WIPs.  Surprisingly little theory for dMHFs.  And we have some practical designs unaffected by attacks.
* ROMix: Core of scrypt uses a Salsa/20 based "hash function" H.
* So far no formal analysis of what the best n is...
* Show that for every adversary, even parallel ones, total memory used is still lots.
* Two naive solutions: 1. store all intermediate values afer computing them.  Then amount of memory needed is n*w where w is width of value.  So cumulative memory complexity is n^2*w.  2. recompute X_i whenever needed.  So now it's n^2*w (actually n^2*w/2 maybe?)
* We could conjecture that all other strategies are at least as bad, but we don't know for sure!
* The result: Theorem: for any adversary evaluating ROMix, the cumulative memory complexity is at least (1/25)*n^2 * (w - 4*logn) w overwhelming probability over the choice of H.
* Earlier proof attempts were incorrect (percival 2009) or weaker guarantees
* Important point: shows separation between dMHFs and iMHFs.  We know for iMHFs we know there's an asymptotically efficienty strategy.  But of course, constants matter a lot!!
* Want indices in second phase (accesses to first phase) to be unpredictable.  Only achievable via unknown data dependencies.
* You can model this as a special case of a pebbling game.  Put pebbles on line.
* Pebbling round game; in every round from 1..n, receive challenge, win round when you have pebbled the challenge.
* For EVERY strategy, with high probability, area under the curve (CMC) > n^2
* Inverse relationship between # pebbles had when challenge revealed, and # steps needed to answer random challenge.
* Area of rectangles corresponding to rounds is at least linear.  But that's not quite what we need; we have no guarantee that advs won't just drop pebbles.
* Insight: In order to have a pebble, we need to put it there.
* What about Argon2d? eventually gets same lower bound, but it's not written yet.
* iMHFs vs dMHFs: dMHFs have higher memory-hardness and seem to be easier to get right.  But side channels are a huge concern.  Memory access patterns do depend on the input.  Must analyze specific applications.  If you're not concerned about side channels, there is NO REASON NOT TO USE a dMHF.
* Note: proofs ARE important, but they are not the sole component!  Should be complemented with cryptanalysis, analysis of validity of moddels.  Concrete security crucial.

# Cloudflare CAPTCHAs - aw heck I missed their names.

* CAPTCHAs are actually hard: assumptions like culture, language, vision/hearing, mobility, social class, definitions of "house" and "storefront"
* as a result, CloudFlare tries to use these only as a last resort.
* CloudFlare sits in front of websites and tries to protect them from DDos, comment spam, etc etc...
* It uses a bunch of signals.  TorBrowser obscures these signals..  Mostly they use IP reputation of Tor exits  So usually they send a CAPTCHA to th exits from Tor.  This means that Tor users get a LOT of CAPTCHAs.
* "Fuck Cloudflare" lol
* Blocking innocent Tor users is a big problem.  They'ev tried intentionally blacklisting office IP reputation, they've tried reCAPTCHA v2 (that backfired oops),.....
* We need some sort of portable proof of humanness that can meet security requirements of Cloudflare and isn't crap for Tor users.
* Look, a clever crypto thing!  Use blind signatures to rate-limit anonymous account creation on twitter without deanonymizing people
* Tor Browser plugin + an edge service.
* User solves a CAPTCHA and submits many blinded tokens for signing.  Later, unblinds and submits a token instead of solving a CAPTCHA.  Users solve only one challenge per N websites visited.  TOkens are unlinkable, work cross-domain over multiple circuits unlike cookies, maintains Tor Browser's strong first-party isolation.
* This seems to do what we want!
* They're doing plain-jane best-of-the-80s RSA.  https://github.com/cloudflare/challenge-bypass-specification
* Theyve looked at other things, but there really just doesn't seem to be an alternative that's any good.
* Questions about protocols of this kind in general: is this a new deanonymization vector  How to deal with botnets and limit token farming?  How to stop a malicious site from draining tokens?
* Attacks? the next PETS deadline is feb 28 lolol
* Alex Davidson - alex.davidson2014 at rhul.ac.uk - @Alex_AD10, George Tankersley - gtank at cloudflare.com - @gtank__, Filippo Valsorda - filippo at cloudflare.com - @FiloSottile
* What's wrong with RSA? Well, any kind of like, revocation/expiry needs to go with public keys.  Policy as key management.  Some more stuff.
* What kind of attacks from Tor users are you blocking that you are still going to be blocking?  Tor not suitable for DDoS.... A: comment spam, high bandwidth web scraping that has no concept of rate limiting and will take down your site without even thinking about it.  Also, weirdness like sqlmap has a --tor option that will send sql injections over tor...
* He says he's talking about a WAF (?)

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------

Day 3.

# Quantum computing from Google - missed the name :c - Emmett something?

* Intro to quantum computing - mostly he's telling us what a qubit is.  He tells us about the Bloch sphere.  Gates are linear operators on SU(2).  Only 2-qubit gates people care about are CNOT (conditional NOT) and CZ (conditional Z, which is {1 0; 0 -1} of second bit conditioned on first bit)
* Shows off a cool double-slit diffraction from ~~~electrons~~~ (not photons)
* smbc-comics.com/comic/the-talk-3
* Their setup is pretty intense and is a reminder as to how far we still have to go.
* No-cloning theorem prevents duplicating immediate results; two types of errors to correct (bit flips/phase flips)
* Several error-correction methods - they're using a surface code where they stick these little + shaped detectors in there for detecting parity changes (alternating phase flip checks and bit flip checks)
* If your error rate is above ~1% then adding a bigger patch makes your errors worse, and if it's below that it makes errors better.  Google's best devices have just under 1% error.  They think they need to get it down to 0.1%.  But even with that there are hella error problems.
* Clifford gates (rotate octahedron onto itself) is almost good enough, but not quite universal.  You also need a T-gate single-qubit pi/8 rotation.  Can actually implement a T gate by initiating ancilla wire then performing 2-qubit gate.  So Clifford gates plus that awkward 2-qubit gate work!  But this causes a lot of overhead!
* Shows a picture of small Clifford gate implementations, and then shows a huuuuge T-gate implementation.  99% of a surface code quantum computer is computing ancillas and given to T gates.
* Scalable, fault-tolerant quantum computing is still very distance.
* Shor's algorithm for factoring 2048-bit nmber with surface code, using 99.9% accurate gate operations: 10k logical qubits for state registers, 10M physical qubits for state registers, 250M physical qubits for T-gates, 10 hrs @ 500 ns per cycle
* Current state of the art: 9 qubits with 99.5% accuracy.
* Good news: we are at or near the necessary fidelity threshold; basic physics seems okay.  Many engineering challenges required to scale up to a million physical qubits.
* Rules of thumb: nuclear fusion power is 50 years away. Quantum computers are 10-15 years away.  lololol
* What is QC good for: shor's alg, quantum simulation, quantum chemistry - Haber-Bosch process: N_2 + 3H_2 -> 2NH_3.  Even a small improvement in efficiency here has potentially huge benefits but we can't simulate even this chemical reaction with current classical computers.  Simulations of superconductors also super difficult.
* "Quantum annealing - D-Wave has made a lot of claims about how their machine can solve NP-complete problems in polynomial times, which is wrong." lol
* Basically if you don't know things about your problem, simulated annealing is gr8, but if you know more things, then maybe it's not what you want.
* TL;DR fault-tolerant quantum computing is still super duper far away
* But the good news is, there are a lot of problems that can tolerate innaccuracy - quantum chemistry, simulation, etc.  We think maybe we can achieve results in this field before we have to deal with 250M fault-tolerant physical qubits.
* "Quantum supremacy" - get a quantum computer to do something, anything, no matter how contrived, that a physical computer can't, as a proof of concept. <- I think this is a good idea!
* They're working on a 50-qubit, 2D array.  + physical infrastructure required
* When you achieve quantum supremacy, could you please talk very very carefully to the press? lol
* Talk about science being in good shape but lots of engineering required.  Is google gonna be the one to do it?  A: yes!  Speaker is physicist turned EE and he's pretty sure his team now is better equipped, funded, etc etc etc than U he was at before.  Getting all sorts of software engineering help.... Google definitily interested in the software engineering side
* Estimate to 1 sig fig of how many billions of dollars, years, etc, until you got a quantum computer?  A: it depends on who you give the money to (heh).  EU just gave a billion dollars to a QC project, but it probably won't "build" a QC because it's split among 10 universities
* Entire investment in field is "a few billion dollars over the next 5-10 years".  Scalable part is further off than that, but not by a factor of 10.

# Erasing secrets from RAM - Laurent Simon - from Cambridge

* Non-malicious program P handling secrets and then allegedly erasing them.  Ideal: give attacker access to all of a system's volatile memory and CPU state (this includes OS kernel, userspace, virtualization primitives....).  SO THIS TALK: only userspace memory and CPU state
* Main concern for erasing data from RAM in practice: compiler optimization.  Call to zeromem() removed, because sensitive buffer is going out of scope when the function returns (so it's wasteful to change the variable)
* No code analysis to account for compiler optimizations.
* Register spills - when a compiler runs out of registers, it starts "spilling" them onto the stack.
* They did dynamic code analysis, taint tracking - taint is binary, assign taint sources from which data becomes tainted, propagate taint as the binary goes.
* Variable assignment, pointer arithmetic (table lookups with tainted index all over the place in AES or format conversions) propagate taint
* OWF remove taint
* https://github.com/lmrs2/secretgrind
* based on Valgrind
* Initial tests don't see much removing of zeroing function, do see some developer mistakes.
* IO APIs caching optimizations lead to problems tho... E.g.: read first line of PEM file to detect if it's a private or public key file.  Zeroing the buffer does not erase memory. mmap works at page level - even though you only asked for the first line, mmap takes the entire page!
* Formatting functions (printf, scanf) leave copies of sensitive data on stack.  Recursion often leaves sensitive data on stack.
* Missed a bunch of info here because I was rescuing my notes file
* Call graph based tracking - no performance hit besides actual zeroing.  Good for statically linked programs or embedded systems but maybe not other stuff....
* Plugin to Clang/LLVM to automatically erase stack/registers.  Code is kind of a hack and not released yet.
* lmrs2 at cam.ac.uk
* https://www.cl.cam.ac.uk/~lmrs2/

----------------------------------------------------------------------------------------------

# TPM2.0 and DAA - Direct Anononymous Attestation - Anja Lehmann - IBM Zurich

* TPM - Trusted Platform Module - secure crypto processor for key creation/storage/use.  Makes remote "attestations" of host status.  Is host really using keys stored on TPM>
* If we had standard certificates, all attestations would be linkable and TPM's ID would be revealed.  Not great from privacy standpoint.
* DAA - signs attestation anonymously.  Unforgeability, anonymity, unlinkability, non-frameability
* DAA first proposed in 2004, Brickell Camenisch Chen.  RSA-based, TPM1.2
* Now in TPM2.0 (2014), elliptic curve and pairing based (pairings? orly?)
* Also new interest because mobile, IoT devices, FIDO auth, SGX, EPID.
* To make DAA provably secure real-world model, need (1) security model, (2) protocol secure according to 1, (3) secure implementation of 2
* Original proposal lacked ability to output signatures (could only verify by interactive protocol).  Follow-up paper CMS09 they gave random values as output signatures (uh), and this was not realizable by *any* construction (whoops)
* Game-based definitions arrived, BCL09, BL10, BFG+12.  But they were all either flawed (not correct) or had the no-signatures issue.
* BFG+13 was probably best, but considered TPM+host one party - missed most interesting case of honest TPM and corrupted host.
* Basically all the security definitions suck.
* Camenisch Drijvers Lehmann CDL16a - security model in UC, TMP and host separate parties, signatures modeled as concrete values for random TPM keys! Seems more reasonable.
* Now we need to get a protocol.  Issuer issues membership credential on committed TPM key.  Verifier gets {anonymous attestation of m, NZIK: m is signed with certified sk} from TPM
* Lots of attempts, two parallel lines.  ALL of them are at least partially broken.  Some are insecure (no privacy, forgeable).  One (ISO approved) is trivially forgeable - seems like a backdoor tbh? but idk), some other stuff...
* But they have a protocol.  Didn't quite catch it.
* "Our" real world - TPM = lightweight device.  "real" real world - TPM accessible via only a few limited APIs.
* TPM.Create(); TPM.Hash(t,m); TPM.Commit(P); TPM.Sign(c, ctr)
* Signature proof of knowledge: prover has sk, verifier has pk = g^sk.  Prover sends g^r, verifier sends c = H(g^r, m).  Prover sends s <- r + c*sk; Verifier does g^s ?= t*pk^c
* Diffie-Hellman Oracle problem: having g^sk, g^sk^2, g^sk^3, ... somehow makes it easier to find sk or something????? This seems wrong because can't you do the squaring by yourself? what?
* Cleared generators Xi et al 14.  Generator is P = g^y, Issuer knows y and tehrefore K = pP^sk = pk^y.  Not 100% sure what this is about.
* This looked super cool but I wasn't really able to follow it.  Definitely look up the slides.
* Add TPM.Bind(P, K, pi) - verify that pi is valid, store P as "cleared point"; TPM.Commit2(bsn) - choose r in Zq, store (ctr, r), P = H(bsn), t = P^r, output(ctr,t)
* Except noooo, adding new TPM stuff is really ahrd and invovles a lot of people agreeing with each other.
* If we could just change commit to commit2 then maybe it'd be okay?  Only if it's also backwards-compatible.
* So they restarted, and they actually got it, but they had to change a lot of stuff.  But they did prove it.  And now they have a revised interface with no static DH.
* What about unforgeability and anonymity?
* Not actually unforgeable or anonymous.  Fix by Xi et al fixes unforgeability, introduces "subliminal" channel (?).
* Fix to fix: instead of using nonce jointly computed by host and TPM (TPM commits to nonce, etc) - gets us unforgeability and anonymity (nice!)
* Some changes have been accepted and others are under review by TPM people
* Concept: provably secure crypto and the real world should be compatible
* ia.cr/2015/1246 ; ia.cr/ aw heck

# Helena Handschuh - DPA resistance for real people - Senior director for cryptography research from Rambus

* DPA resistance - ohhhh DPA is differential power analysis.  Low-cost, non-invasive attacks on crypto HW/SW, attacks on "all crypto algorithms that use keys" hehehe.
* Started off on smart cards which are maybe easier to break than other platforms.  Now moved to ASICs, FPGAs, CPU, embedded.... Same techniquest work across digital sources.  Timing, EM, RF...
* Example RSA implementation - if you do a square-and-multiply algorithm for exponentiation, then at each 0 bit you do a square, and at each 1 bit you do a square and multiply.  So a power trace tells you *exactly* what's going on.  You can read the exponent off of one power trace.  (Simple Power Analysis, as opposed to Differential)
* AES - differential power analysis - have to use several different power traces and then do statistical analysis.  Sort traces with "predicted LSB=1, predicted LSB=0".  Take each  bucket and average all traces, compute difference.  If you guessed right, then you'll get clear peaks.  If you get basically nothing, then you guessed wrong.
* Classical countermeasures: obfuscation, leak reduction, balanced HW/SW, dynamic differential logics (?), amplitude and temporal noise (random voltage spikes and dummy operations), add randomness (mask, blind, threshold impls) - ALL of these assume you can change your actual algorithm.
* What if you can't change algorithm?  What about protocol level?  That is this talk.  We'll talk about symmetric crypto.
* If each time I use the key I leak a little information, then my goal is to not use the same key for too many ciphertexts.  Use key carefully and only a few times.
* Use a key tree. {K_ROOT, {K_{ROOT,0}, K_{ROOT,1}}, ..... }  Walk down a tree of keys.  Follow a path down a tree, apply OWFs f0 and f1 to evolve it.
* Challenge response.  Say PATH=SHA-256(challenge).  Go down tree to get response to challenge = K_{ROOT,PATH}
* No matter how many times the keytree is traversed, no key is involved in more than 3 different operations, which is not enough to run a DPA attack.
* You can go further: leakage resistant encryption/decryption (can't really handle malicious ciphertexts, so means authentication so not vulnerable to CCA)
* Cool thing with diagram of what was sent that I didn't really manage to capture; check slides.
* Real world product testing - test vector leakage assessment methodology (TVLA) !!!!!!!!!!
* Goal of TVLA: test methods should be repeatable, precise, automated, objective, low cost, fast time to market
* Validation vs evaluatino.  In smart card world, evaluation used.  You basically have a defined security env/threat model, and if you crack your own thing, you're good.  For validation, have defined tasks, demonstrate conformance to spec, want lab consistency.  There are tradeoffs - eval is limited by lab expertise, potential inconsistency of evaluations, higher cost.  Val may not find new vulnerabilities, no pen testing, limited by test spec and coverage.  Is more standardizable, easier, etc.  They're hoping the validation approach will end up in ISO standards.
* Testing is *hard*.  It should like, be its own PhD course.  I have minimal interest, mind you, but it's way harder than we give it credit for.
* Test vector approach.  Change from attack-style methodology to white-box validation.  Similar to FIPS CAVP.  Test vectors designed by experts to detect implementation flaws.  Does not require super lab expertise.  Spec controls keys, data, and other snsitive parameters.  Instead of trying attacks, target intermediates and leakage modes.  Attempt partial key guessing.
* Use Welch's t-Test to determine leakage within confidence interval.  Measures significance of "difference of means"
* Crap I really need to review some stats.  I'd have been able to tell you about all this three years ago.  Yikes.  In fact, on that note, need a better way to stay up on topics that I'm not actively studying anymore.
* Trying to standardize side-channel resistace testing: ISO/IEC 19790, 17825, 20085............

----------------------------------------------------------------------------------------------

# Order Revealing Encryption - David Cash

* A superset of OPE
* Usual reason you'd do this - you can do range queries.
* New security issues with ORE.  Attacks on ORE with correlated columns; attacks on ORE with non-uniform data (apparently people haven't really analyzed this before)
* Meta-conclusion: Need to cryptanalyze definitions.models for scure-but-leake ORE in practice.  [I'm pretty sure everyone agrees that ORE is basically entirely broken by dictionary attacks? Or even with any meta-knowledge at all? But w/e]
* ORE - inherently less secure; easy plaintext recovery etc etc.  Very "brittle"
* Research approach: Construct ORE schemes with best-possible security against passive attackers who only capture ciphertexts.
* Two flavors of ORE: idael and leaky.  Ideal leaks *only* order.  Leaky could also leak some bits or something.  Leaky has fast, block cipher based constructions.  Could be statistics, basically looks terrible from my perspective.  ~~~The only way people can use leaky ORE securely is when input data are random.~~~
* Tux used to show how ORE reveals the penguin
* Datasets - california road intersections, and mobile phone location history
* Showing location info, you can still see LA and the bay and stuff.  Also, if the bounding box is known, you can guess about 30% of the points to within 50km.  Shape/density of image is preserved.
* Mobile location history - Picked subsets of data for one day/week/month.  Shapes of leakage leak definite road locations.  Could go from nothing to "this guy lives in Berlin"
* Lessons: ideal ORE on location data is way the hell too leaky.  Correlated columns can combine to leak information in encrypted databases.
* Nothing here seems particularly surprising, but it's good to get it all spelled out.
* ROPF security conjecture - "ROPF-secure construction leaks roughly (1) all the leading 0's and 1's of each message, (2) the most significant half of the remaining bits of each message".  Potentially last bits could still be secure (hm)
* Security of CLWW'15.  For each x_i, x_j \in \{0,1\}^n, it reveals index of first bit where they differ.
* Replace x's with (1/2).  On location data, still pretty silly because you can reassemble California.  Could also use a guessing algorithm, and sure enough that gets you California.
* TL;DR you better be careful as hell before using ORE

# Paul Grubbs - pag_crypto - Breaking web applications built on top of encrypted data

* Web applications super vulnerable to data breach.  Solution, amirite? Encrypt the data!  But then lots of functionality is broken! Search, etc etc.
* Property-revealing encryption (PRE) - reveal a property of the data that isn't everything but it's enough to deliver some kind of functionality
* BoPETs - "Building on Property-revealing EncrypTion" - OPE, searchable enc, deterministic enc.  (BoPET is a molecule, and might be Mylar)
* Mylar client talks to mylar server, which talks to [princiipal graph - metadata about docs, access control, etc] and [data (enc)] - wants to protect confidentiality of data against advs with full access to servers - mylar CLAIMS active security
* This talk - ignore access patterns, timing, etc etc.
* Example attack has access-control credential "party" in cleartext (available in metadata), and members of group known.  So boss knows there's a party, and who's going.
* Metadata: crucial for key management, important for basically every app, etc etc.
* Claim: metadata by itself can break confidentiality of user data.
* Ways adversary can get token: adversary shares a document with victim (or vice versa); adversary corrupts one of users with whom victim is sharing documents.  [Hypothetical defense that isn't implemented somehow attached to allowing search over document or something]
* Not gonna lie, Mylar sounds like it sucks.
* Even if you share a single document with a trusted friend and they are later hacked, all your private docs are gone.  S H E E S H
* Security of BoPETs "poorly understood", he says, but he also makes it sound like it sucks.  Active adversary very powerful; netadata leaks info about encrypted data.  Access patterns do too, though they didn't mention it in this talk.
* We still don't know how to integrate property-revealing encryption to systems

# Raluca Ada Popa - Building web applicatinos on top of encrypted data (lololol omg FIGHT FIGHT FIGHT)

* Mylar (NSDI'14) -> Verena (Oakland'16)
* Looks like basically same setup as previous talk
* Security spectrum - end-to-end encryption (IND-CCA2) gets you past passive attackers.  Dealing with active attacks: lvl1: no tampering with client-side code/key distr; lvl2: no tampering with data/query results. lvl3: hide metadata; lvl4. hide access patterns, leakage from queries/computation; lvl5: hide ops performed, timing, runtime, data size, structure
* We do not know how to build practical systems with all of this.  We have a bunch of relevant theory: ORAM, GarbledRAM, FHE, but they're too slow.  She proposes reaching security target one step at a time: each system's contribution is what it gives in addition to what existed.
* Mylar is end-to-end encryption for annotated fields, and provides no tampering with client-side code/key distribution (up through level 2)  Verena goes one step further (lvl3): no tampering with data and query results.
* New system: Opaque [NSDI17] also claims to hide metadata and to hide access patterns
* Approach: dev specifies sensitive field, browser encypts it and only browser sees these fields in plaintext.  Server does not get keys.  [Curious how they deal with multiple devices etc]
* Common web framework arch (Django, Ruby on Rails) incompatible with enc.  Burdensome data sharing (key dist)
* API same as before, dev specifies sensitive fields, provides principle graph.
* Keys encrypted with some fancy keychain thing that only lets users with access in principal graph can decrypt.
* Active adv might serve incorrect key.  Mylar certifies keys using certificate paths.  E.g. users can sign keys to vouch for them, I guess?  ..... I think this also implies that the clients have some kind of independent communication with each other or that somehow they already know each other's long term keys.  I guess that's reasonable for a slow amount, but like, if Bob queries the server for Alice's VK, how are we supposed to know we get the right one?  I guess this is the normal PKI problem but it seems like a big problem in a thing like this.
* BUILD 'EM UP, BREAK 'EM DOWN
* allow_search - this is the thing mentioned above that isn't actually implemented yet i think - attacker compromises server and then creates a principal "attacker"; gives it to bob.  bob's client generates delta and then when bob searches for a word encrypted with his own key, attacker coverts it to encryption with their key.  Delta should only be given to trustworthy principals.
* Can allow_search be revoked?
* SHOTS FIRED AT PREVIOUS PAPER OMG - metadata: out of scope; access patterns: out of scope; attack on search: does not work within a correct usage of mylar
* diplomatic "if things are out of scope, they're still useful"..... "we just have to be aware they're out of scope"
* murmuring from the audience
* I think the access pattern one is reasonable - ORAM still too expensive / not mature
* Attack on search - claimed prevented by allow_search, though of course prev paper says not actually implemented...
* Verena - ensures active server attacker cannot tamper with data, computation, code, or keys.  Benefits: conf against active attacks (when used in concert with mylar); "e2e integrity" (ensures users get correct results)
* Proof returned with results that verifies: data was supplied only by who we said it was, that it's complete, ... ... (missed words)
* Related work is either only designed for single user or wasn't appropriate for web setting
* Impossibility result - if server actively compromised and users do not interact with each other, no freshness.
* Introduce "hash server" - does not collude with other server.   Simple, mostly a key-value store for hashes; small TCB; stores most recent hashes to get freshness.  One or the other of the servers may be corrupted, but not both.
* Deverloper specifies integrity policy attached to queries (not data).  Query runs in a "trust context" - only members of the trust context can affect queries result.  hmm
* Enforcement: authenticated data structures; values in children and then higher layers have partial aggregate; build merkle tree on top of this
* More in papers; claims modest performance overhead
* Building real-world crypto-based systems is challenging!  But we don't have to do it all at once.  I agree with this (as long as you advertise exactly the ways in which you hav not reached the 'target' yet...)
* Audience member points out that users are incredibly crappy at making good decisions.  Another audience member points out that in some sense, swiss cheese is worse than no coverage, because people don't know where the holes are and you need experts to understand it.  [I'm not quite that cynical yet though I'm sure I'll get there.....]

----------------------------------------------------------------------------------------------

# PRNG failures and TLS vulnerabilities in the wild - david mcgrew

* Threat-driven approach to crypto:
* you, an infosec professional, have threats, controls, vulnerabilities, assets to work with and you have to defend.
* shoutout to Phil Rogaway's paper about moral character of crypto work
* Crypto visibility - do you have understanding of where/what kind of crypto is being used?  Is encryption used where needed? where bad certs/keys in use? where weak crypto being used? are there active atks? is crypto used weak/does it have flaws?
* Multisession monitoring - e.g. if you want to find PRNG duplicate state, you need to keep lots of state.  Single monitoring state adequate (though not ideal)
* set of "typical PRNG states" may be significantly smaller than "all possible PRNG states" (typica in shannon entropy sense)
* monitoring tool: https://github.com/davidmcgrew/joy does pcap -> json
* Virtual machines: snapshot is set of bits that can be cloned...
* Autoscaling: server with a hypervisor, say it runs an image, then when load increases, it'll spin up a clone...
* Diff between volume snapshot and full snapshot.  Volume: image of bootable disk; boot latency; not vulnerable.  Full snapshot: RAM+disk; no boot latency; vulnerable.
* There ARE things that store RNG seed on volume (bad), but this study did not consider them.
* VM experiments: sandbox (ThreatGRID with windows 7 SP1 - full snapshot - can introduce executable (presumably malware) sample, it'll run it, and then when it's done, it checks to see if it was "good" or "bad" ; linked clones (VMWare) ; Docker
* Turtles all the way down (application < container < virtual machine < operating system)
* VM people always glad to hear that the turtle below them (OS) isn't acting in a way to undermine security.  Turtles below can always subvert you.
* TLS failure scenarios: aberrant implementation, PRNG failure...
* Field in ServerHello and ClientHello called Random; serves as a random nonce.  Provides replay protection, used as unique input for key derivation purposes.  For client it's supposed to have 32 bits of time and then random bytes.  Server is supposed to just have the random.  RFC is slightly loose as to this; some people don't put time in client, some people put time in server even though they're not supposed to...
* Simple TLS model: clock -> time; entropy -> PRNG -> {nonce generator -> random ; rsa/ecdh -> client key exchange}
* Four main failure modes
* Aberrant implementation - "misc" failure mode - fails to conform to spec, probably not trying to conform to spec.  Fields may be fixed or otherwise repeating.  May target clock, nonce generator, RSA/[EC]DH - fields fixed or predictable in some unspecified way
* PRNG flaw - e.g. slow updating (say, it only updates 1/ms).  - address will be same and failure will be successive
* MRIFS (missed this dang) - repeats are far apart; addresses may be different
* Observations in the wild - ~1000 hosts on 2 enterprise networks - 28M TLS sessions - 4 aberrant instances, 1 PRNG instance, 1 VM snapshot instance.
* Crypto attacks - replay and cut/paste records between colliding sessions.
* Flawed server with [EC]DSA -> private key recovery
* Flawed client - act as server for colliding session -> get session keys; downgrade colliding session to EXPORT-type cipher -> break EXPORT cipher -> find original keys
* flawed client with RSA: replay colliding session -> cause symmetric keys to collide.  RC4, GCM, CCM, ChaCha: plaintext leakage. CBC: partial plaintext leakage. GCM: authentication key leakage.
* FULL SNAPSHOTS MUST NOT BE USED IN PRODUCTION
* TLS SHOULD NOT USE RSA ENC KEY TRANSPORT (thank u TLS 1.3)
* [EC]DSA SHOULD USE DETERMINISTIC VARIANT [RFC6979]
* or stir PRF(message) into entropy prior to sig (?)
* robust AEAD e.g. AES-GCM-SIV should be DEFAULT and unless you absolutely can't use it, you should use it
* And if you don't do these things, test!
* mcgrew at cisco.com - they're hiring for both research and coders

# concerto - Olivier Levillain - SSL/TLS data analysis tool

* SSL/TLS data collection: criteria - lots of choice of things like version, suites, extensions.  Might want to look at protocol features/crypto capabilities, certs, cerver behavior....
* Different methodologies: full IPv4 scans, domain names scans, passive observation.
* motivation for "concerto": did a study in 2012, then wanted to do one in 2015, and wasn't sure how to compare situation now/then, how to include new external datasets, many criteria had to evolve, etc etc. Concerto built as a thing to encourage reproducible analysis: keep raw data, ... ...
* Context perparation: cert store extraction, metadata injection for posterity. Answer injection, answer type analysis, raw cert extraction.  Cert analysis (parse, build all possible chains), and then produce statistics.
* If you propose two ciphersuites, you expect the server to send back either one of the two suites or an alert, but sometimes it returns an entirely different cipher suite, or the NULL ciphersuite, or somethimes you get a ServerHello *missing two bytes* (why???)
* They used parsifal (open source framework) to do binary parsers, and use metadata to spot inconsistencies.
* TLS 1.2 87% these days (much better than 30% in 2014 and 0% in 2011) - "Just keep in mind it was a specification published in 2008)".  Brief blip of TLS 1.1 and SSLv3 gone now.
* Usually Certificate message: server cert first, each following CA cert must sign preceding one, root CA may be ommitted.  In reality though, unordered messages, all sorts of goofy stuff going on.  He's going through these slides really fast so I didn't catch all the other stuff.
* RFC compliant percentage actually going *down*
* A lot of the time you'll get certs that have nothing to do with the stuff you want.  This makes building cert chains hard.
* concerto does not exhaustively build all possible chains; X.509v1 certs generated by appliances have no extension, sometimes considered CA, but not here.
* Crazy cross-certification - there are mutually cross-signed CAs.  Each CA has emitted several distinct certificates with the same public key but with different valid dates.  Lots of possible ways to decide how to deal with this data and how to extend the chain.
* Weird stuff like X.509v4 certs (from the fuuuuuuuuuture????), lotsa people still using MD5, all sorts of silliness.  People still accept SSLv2
* concerto today mostly CSV tables.  Cool stuff in way you build chains and in parsers.  Lots of work still.

# Rupture API - Eva Sarafianou

* Victim does regular HTTP browsing; attacker injects javascript.  Opens C&C channel.  Now later, victim does HTTPS requests.  Attacker cannot read data, but can see them passing encrypted through network and can measure things.  To initiate attack, attacker must guess part of secret.  (e.g. gmail token).  E.g. attacker makes victim issue multiple search queries, knows what some of the response is going to be.  And then byte-by-byte can   Uses it in reflection.  Compressed/encrypted response is shorter if correct!!!!!!
* By adaptively choosing reflections, can lead to full recovery.  But challenges: noise; antagonistic compression methods (Huffman coding) - can alter predictions about length of response; unrelated static content on page matching candidates.
* Rupture is OS, production level
* "Injector" lives on victim network, uses ARP spoofing to inject, injects in every HTTP connection.
* "Client" dump, ... (also in victim network)
* Sniffer listens to HSTS, HTTPS, talks HTTP to ~somewhere else~ (also in victim network)
* backend django, sqlite db, adv, etc etc.
* Usable open sourec tool, attack is easy and practical via web UI, reusable RESTful API
* Gain new victim, say, by nmap or directly by IP
* https://github.com/dionyziz/rupture
* https://dionyziz.com
* http(s)://kiayias.com
* https://esarafianou.github.io

----------------------------------------------------------------------------------------------

# Elaine Shi - Blockchain -> internet-scale consensus

* "I guess it's okay if you tell me I can't fly on a plane today, but if you tell me I can't do science I will be very cranky"
* Need replication and robustness of web; we've had these concepts in distributed computing since Forever (30 years ago)
* State-machine replication (a.k.a. linearly ordered log, consensus) - two important security properties - consistency (honest noeds agree on log), and liveness (transactions incorporated quickly)
* In the year 1 A.B. (after bitcoin), there was Genesis
* internet-scale consensus (issues: millions of nodes, mutually distrustful, networks are unpredictable) - traditional consensus protocols not robust enough, but maybe blockchain?
* What is robustness?  What happens when you ask 300M people to vote? 1. only 160M show up.  Hopefully, 51% of those show up are "honest".  Otherwise we can't predict outcome of election.
* "Sleepy" model of conseusus": nodes are either "sleepy" (offline) or "awake" (online/active); status of each node can change.  Malicious nodes can behave arbitrarily, can delay/reorder messages.  But online honest nodes can recv messages quickly (may be out of order, but will get relatively quickly, otherwise we treat them as being offline).
* Can we achieve consensus when 51% of online nodes are honest?  1. LB: consensus impossible if <50% online nodes are honest.  No classical protocol will work even if 99% of online nodes are honest!
* Classical protocols either synchronous (messages delivered immediately), or asynchronous (adv can delay messages indefinitely).  We know a lot about both of these models...
* Sleepiness makes message delays indefinite.  (Waking nodes received queued messages).  So definitely synchronous model doesn't work.  But what about asynchronous?  We're going to pessimistically assume that sleeping nodes are corrupt.  Protocol that tolerates 1/3 corruption fails when 99% offline.
* Note: there are other robustness requirements, but we're ignoring them for now.
* How to achieve robustness.  Premise: Bitcoin's Nakamoto blockchain is robust.  It's actually amazing, Bitcoin has been running without much interruption for the past 8 years.  (the "honeybadger" of money lol)
* Nakamoto blockchain DOES achieve consensus when 51% of online nodes are honest.
* Amt of energy consumed by bitcoin: 1.5GW, which is >10% of all solar energy generated in U.S. (WOW i had no idea bitcoin was this wasteful)
* Can we achieve Nakamoto blockchain robustness but not pay expensive PoW computation? (without breaking security?)
* Brief blockchain explanation ommitted
* Protocol is not secure; once elected a leader adversary can sign many blocks and mine into the future (timestamps).
* Fix: time in blocks have to strictly increase.  Honest nodes reject "future" blocks.
* New protocol is secure, but not trivially so.
* Basically adversary can still reuse elected slots on multiple forks.  (Not possible in PoW blockchain; therefore previous blockchain analyses fail)
* https://eprint.iacr.org/2016/918
* Application of sleepy consensus: consortium blockchain.  Banks want this!  They want a distributed ledger, each bank contributes some consensus nodes, this allows faster inter-bank settlement.  Lots of cryptocurrency stuff that are trying to help this.
* www.initc3.org

# Ripple+Privacy - Pedro Moreno-Sanches - Purdue

* Transactions in real world: Bob gives Alice 100, Alice gives IOU 100 to Bob.  Use trust for this.
* Credit network  - Carol pays Alice 10 who pays Bob 110 who pays Dave 10.  So really Carol has paid Dave 10 and Alice has not lost any money as a result.
* Create a directed graph, lower credit along paths to settle.
* Credit networks mitigate malicious users and Sybill attacks.  Malicious users can spin up as many users as they want but it's hard to get trust from honest users.
* Can be used to prevent spam, strengthen e-commerce, can be involved in voting (somehow)
* Ripple credit network.  Supports many currencies, cryptocurrencies, allows users to define own.  Make intercurrency transactions in seconds, with no fees, and with public verifiability
* Has 1M trade volume, used by several banks.
* We have crytocurrencies, why do we need credit networks?  Cryptocurrency is actual currency (is it though? :<) IOU credit networks are transaction networks.  Transfer of funds are direct in CC, transactions are via paths with enough credit in IOUCN.  Cryptocurrencies not scalable (hm) and IOUCN highly scalable (HMMMMMM)
* Both have public verifiability of transactions.
* Ripple provides pseudonymity to its users by using public-key hashes as identities.
* Man everyone uses hash functions as OWFs.  Sheesh.
* Ripple has #Problems (at least with privacy)
* Sort of tuned out for part of this, sorry :c
* at Ripple, discussions on privacy (demanded by banks, wanted by Ripple community)
* What does privacy mean in credit networks?  Either "transaction value" privacy.  Or "transaction receiver privacy" or "transaction sender privacy".  Basically, do you hide the amount, or do you hide the recipient?  Or the sender?
* Distributed approach - links are locally stored (all we care about is net flow).  That way we don't need a ledger or a PoW.  Strong privacy guarantees, and ops work the same as Ripple.  (Security? Hmm.)  NDSS '17. PathShuffle.
* They've also got cool things like atomic interactions.  Simple smart contracts.

# Hyperledger fabric - Christian Cachin - IBM Zurich

* Hyperledger Fabric - blockchain platform for the enterprise (hm)
* Smart contracts
* Open source, collaboratively in the Hyperledger Project
* Four elements characterize blockchain: replicated ledger (history of transactions, append-only, distributed); cryptography (integrity, authenticity, privacy); consensus (again, decentralized, tolerates disruption of nodes, validated transactions); business logic ("even though it was born with Bitcoin, it has more applications, e.g. smart contracts")
* I still don't get why everyone is so excited about blockchain
* Old trust model of financial business: point-to-pinot.  Promise of blockchain is to "reduce trust and replace it by technology" - no trusted entity.
* I'm skeptical
* Model blockchain as a state machine
* Three kinds of blockchain consensus: decentralized(permissionless) e.g. bitcoin; somewhat decentralized (e.g. ripple)); consortium/permissioned
* Could consider bitcoin a lottery race. (winner's block is executed and "mining" has occurred).  Idea was "once CPU one vote".  So much for that.  Also, you have to wait a bit (6 blocks, generally) before a transaction is confirmed.
* Consortium consensus (BFT, Hyperledger) - tolerates f-out-of-n faulty/adversarial nodes.  (Byzantine Agreement).  Central entity controls group membership here.
* State awkwward - 100 Gb of state required to verify new block correct (why would you want this).  State is kept off-chain and produces verifiable transactions.
* Zoned out a bit here; didn't really consider it relevant to me.

# Leo - authenticating dynamic dictionaries with apps to cryptocurrencies

* Code released yesreday whoa
* motivation: validate transactions.  Alice wants to pay 14 bitcoins to Bob.  Stateless validation: check stuff you can see from the transaction itself.  But how do you actually know Alice has 14 bitcoins to give away?  Now need stateful validation; have to look things up in key-val store.  This thing is enormous (1.5GB serialized in bitcoin rn, gonna get a lot worse)
* Where do we keep the state?  Either disk storage (slow validation, has been used for dos), or ram (only can be done on powerful computers which leads to centralization)
* Transaction verifier doesn't need *entire* data structure, just need to know how much alice has.  Or at least a proof that she has enough.
* Idea: Miller'12, Ethereum/Wood'14,White'15: use ADS and include proof of balance with transactions.
* Merkle treeeeeeeees (transactions include merkle path)
* This should be short enough (log of length of key-value store) to not incur DoS
* Problem: Verifier doesn't have new root hash........ Must be able to compute new root hash without knowing entire tree.  In typical ADS, verifier not checking the new root; this is harder.
* Need ADS that can be rebalanced a lot! Of course we're changing values, but we're also adding keys, and maybe getting rid of keys.  This is hard in binary trees!  This means tree rebalancing!!!!!
* Prior work: mostly based on merkle trees. diffs are how to structure/rebalance underlying binary tree.
* Skip list? (Papamanthou-Tamassia'07) - update: 1.5HlogN; insert new - 1.5 H log N (let H = # PKs, H = length of hash) (50% worse than optimal).  This requires a trusted source of randomness!!!!!1  bad randomness => unbalanced tree => very long proofs => DoS
* Generic data structures (red-black+ used), Miller-Hicks-Katz-Shi '14 - update existing: (H+K)log N; insert new - 3(H+K)logN
* Various 3 party solutions don't work.
* AVL+ trees - trees from the 60's.  update and insert are H log N.  3x more efficient than etherium in terms of length! ;
* A few improvements, including combining proofs.
* ia.cr/2016/994
* github.com/input-output-hk/scrypto