LangSec Workshop

at IEEE Security & Privacy, Thursday May 25, 2017

Abstracts

Keynote

Opening Keynote: Perry Metzger, "Slow But Steady: Achieving Real Security Within Two Decades"

It is impossible to fix security for everyone in all cases. Nothing can be made foolproof, because fools are too ingenious.

However, currently, even users who "do everything right" are remarkably unsafe, because important software fails to be secure even when properly configured.

I present the case that, surprisingly, enough is finally known to dramatically improve this problem.

I will argue that, if appropriate and (entirely affordable) methods are adopted by the authors of important systems, slow incremental improvements could at last fix the infrastructure problem, and possibly even before computer security issues destroy our civilization.

I will discuss the necessary techniques, and present a plan for their adoption.

Papers

Stefan Lucks, Norina Grosch and Joshua Koenig, "Taming the Length Field in Binary Data: Calc-Regular Languages"

When binary data are sent over a byte stream, the binary format sender and receiver are using is a "data serialization language", either explicitly specified, or implied by the implementations. Security is at risk when sender and receiver disagree on details of this language. If, e.g., the receiver fails to reject invalid messages, an adversary may assemble such invalid messages to compromise the receiver's security.

Many data serialization languages are length-prefix languages. When sending/storing some F of flexible size, F is encoded at the binary level as a pair (|F|,F), with |F| representing the length of F (typically in bytes).

This paper's main contributions and results are as follows.

(1) Length-prefix languages are not context-free. This might seem to justify the conjecture that parsing those languages is difficult and not efficient.

(2) The class of "calc-regular languages" is proposed, a minimalistic extension of regular languages with the additional property of handling length-fields. Calc-regular languages can be specified via "calc-regular expressions", a natural extension of regular expressions.

(3) Calc-regular languages are almost as easy to parse as regular languages, using finite-state machines with additional accumulators. This disproves the conjecture from (1).

Pierre Chifflier and Geoffroy Couprie, "Writing parsers like it is 2017"

Despite being known for a long time, memory violations are still a very important cause of security problems in low-level programming languages containing data parsers. We address this problem by proposing a pragmatic solution to fix not only bugs, but class of bugs. First, using a fast and safe language such as Rust, and then using a parser combinator. We discuss the advantages and difficulties of this solution, and we present two cases of how to implement safe parsers and insert them in large C projects. The implementation is provided as a set of parsers and projects in the Rust language.

Tobias Bieschke, Lars Hermerschmidt, Bernhard Rumpe and Peter Stanchev, "Eliminating Input-Based Attacks by Deriving Automated Encoders and Decoders from Context-free Grammars"

Software systems nowadays communicate via a number of complex languages. This is often the cause of security vulnerabilities like arbitrary code execution, or injections. Whereby injections such as cross-site scripting are widely known from textual languages such as HTML and JSON that constantly gain more popularity. These systems use parsers to read input and unparsers write output, where these security vulnerabilities arise. Therefore correct parsing and unparsing of messages is of the utmost importance when developing secure and reliable systems. Part of the challenge developers face is to correctly encode data during unparsing and decode it during parsing.

This paper presents McHammerCoder, an (un)parser and encoding generator supporting textual and binary languages. Those (un)parsers automatically apply the generated encoding, that is derived from the language's grammar. Therefore manually defining and applying encoding is not required to effectively prevent injections when using McHammerCoder. By specifying the communication language within a grammar, McHammerCoder provides developers with correct input and output handling code for their custom language.

Yi Lu, Sora Bae, Paddy Krishnan, and Raghavendra K.R., "Inference of Security-Sensitive Entities in Libraries"

Programming languages such as Java and C# execute code with different levels of trust in the same process, and rely on an access control model with fine-grained permissions to protect program code. Permissions are checked programmatically, and rely on programmer discipline. This can lead to subtle errors. To enable automatic security analysis about unauthorised access or information flow, it is necessary to reason about security-sensitive entities in libraries that must be protected by appropriate sanitisation/declassification via permission checks. Unfortunately, security-sensitive entities are not clearly identified.

In this paper, we investigate security-sensitive entities used in Java-like languages, and develop a static program analysis technique to identify them in large code-bases by analysing the patterns of permission checks. Although the technique is generic, our focus is on Java where checkPermission calls are used to guard potential security-sensitive entities. Our inference analysis uses two parameters called proximity and coverage to reduce false-positive and false-negative reports. The usefulness of the analysis is illustrated by the results obtained while checking the OpenJDK7-b147 for conformance to Java Secure Coding Guidelines that relate to the confidentiality and integrity requirements

Research Reports

Kit S Tse and Peter C Johnson, "A Framework for Validating Session Protocols"

Communication protocols are complex; their implementations are difficult, causing many unintended (and severe) vulnerabilities in protocol parsing. While the problem of packet parsing is solved, session parsing remains challenging. Building on existing systems that reliably parse individual messages, we present our four-component framework for implementing protocol session parsers with the goal to improve security of protocol parsing: specification of a protocol message, description of a protocol state machine, testing routines to validate implementations against fake and real data, and graph generation to visualize implementation. This framework enables the creation of a session parser, which validates individual protocol messages in the context of other messages in the same conversation. This is helpful because more secure parsers lead to more secure communication.

Prashant Anantharaman, Michael Locasto, Gabriela F. Ciocarlie and Ulf Lindqvist, "Building Hardened Internet-of-Things Clients with Language-theoretic Security"

Unprincipled input handling has caused many of the most prevalent and severe vulnerabilities in the Internet era, and this trend appears to continue in the emerging Internet of Things (IoT). In this paper, we present a methodology to build secure input-handling functionality for application-layer IoT protocols by applying the Language-theoretic Security (LangSec) philosophy. We have built working implementations for the XMPP and MQTT protocols and demonstrated that our clients, which consist of less than a hundred lines of code, correctly recognize all valid messages in our tests. With respect to CPU time, our clients compare well against the most widely deployed implementations of these two protocols.

Jacob Torrey, Mark Bridgman and Tomasz Tuzel, "Hardware-Enforcement of Walther-Recursive Program Functions"

Preliminary experiment design and research goals are presented to measure the security implications of using just-in-time (JIT) compilation in conjunction with hardware-enforced restrictions on the computational complexity of general purpose applications. LLVM passes are used during compilation to mark sub-components of an application if it is provably halting (Walter recursive). A modified JIT engine is under development to unroll loops based on run-time semantics, while Intel's Processor Tracing technology enforces those run-time bounds. This paper describes an on-going effort with the goals of improving security through the enforcement of restricted computational environments; an analysis of the impact on the attacker when return-oriented programming is limited to unidirectional execution highlights the impact of the Ristretto system.

TBA

More to be announced by March 25

Invited Talks

Lee Pike, "Programming Languages for High-Assurance Vehicles"

We report on our experiences in synthesizing a fully-featured autopilot from embedded domain-specific languages (EDSLs) hosted in Haskell. The autopilot was built for the DARPA High-Assurance Cyber-Military Systems (HACMS) program. After an independent red-team assessment, one government official remarked that it was likely "the most secure UAV in the world". We will also describe our experiences in transitioning the EDSLs to Boeing for use in large-scale platforms. The EDSLs and autopilot are both open-source.

TBA

LangSec Roundtable

2016: The Year of LangSec Bugs?

Every year since 2014 most bugs that topped LangSec headlines were "LangSec" bugs, findable and preventable using the LangSec approach. Did 2016 have the worst LangSec bugs yet?