It is impossible to fix security for everyone in all cases. Nothing can be made foolproof, because fools are too ingenious.
However, currently, even users who "do everything right" are remarkably unsafe, because important software fails to be secure even when properly configured.
I present the case that, surprisingly, enough is finally known to dramatically improve this problem.
I will argue that, if appropriate and (entirely affordable) methods are adopted by the authors of important systems, slow incremental improvements could at last fix the infrastructure problem, and possibly even before computer security issues destroy our civilization.
I will discuss the necessary techniques, and present a plan for their adoption.
When binary data are sent over a byte stream, the binary format sender and receiver are using is a "data serialization language", either explicitly specified, or implied by the implementations. Security is at risk when sender and receiver disagree on details of this language. If, e.g., the receiver fails to reject invalid messages, an adversary may assemble such invalid messages to compromise the receiver's security.
Many data serialization languages are length-prefix languages. When sending/storing some F of flexible size, F is encoded at the binary level as a pair (|F|,F), with |F| representing the length of F (typically in bytes).
This paper's main contributions and results are as follows.
(1) Length-prefix languages are not context-free. This might seem to justify the conjecture that parsing those languages is difficult and not efficient.
(2) The class of "calc-regular languages" is proposed, a minimalistic extension of regular languages with the additional property of handling length-fields. Calc-regular languages can be specified via "calc-regular expressions", a natural extension of regular expressions.
(3) Calc-regular languages are almost as easy to parse as regular languages, using finite-state machines with additional accumulators. This disproves the conjecture from (1).
Despite being known for a long time, memory violations are still a very important cause of security problems in low-level programming languages containing data parsers. We address this problem by proposing a pragmatic solution to fix not only bugs, but class of bugs. First, using a fast and safe language such as Rust, and then using a parser combinator. We discuss the advantages and difficulties of this solution, and we present two cases of how to implement safe parsers and insert them in large C projects. The implementation is provided as a set of parsers and projects in the Rust language.
Software systems nowadays communicate via a number of complex languages. This is often the cause of security vulnerabilities like arbitrary code execution, or injections. Whereby injections such as cross-site scripting are widely known from textual languages such as HTML and JSON that constantly gain more popularity. These systems use parsers to read input and unparsers write output, where these security vulnerabilities arise. Therefore correct parsing and unparsing of messages is of the utmost importance when developing secure and reliable systems. Part of the challenge developers face is to correctly encode data during unparsing and decode it during parsing.
This paper presents McHammerCoder, an (un)parser and encoding generator supporting textual and binary languages. Those (un)parsers automatically apply the generated encoding, that is derived from the language's grammar. Therefore manually defining and applying encoding is not required to effectively prevent injections when using McHammerCoder. By specifying the communication language within a grammar, McHammerCoder provides developers with correct input and output handling code for their custom language.
Programming languages such as Java and C# execute code with different levels of trust in the same process, and rely on an access control model with fine-grained permissions to protect program code. Permissions are checked programmatically, and rely on programmer discipline. This can lead to subtle errors. To enable automatic security analysis about unauthorised access or information flow, it is necessary to reason about security-sensitive entities in libraries that must be protected by appropriate sanitisation/declassification via permission checks. Unfortunately, security-sensitive entities are not clearly identified.
In this paper, we investigate security-sensitive entities used in Java-like languages, and develop a static program analysis technique to identify them in large code-bases by analysing the patterns of permission checks. Although the technique is generic, our focus is on Java where checkPermission calls are used to guard potential security-sensitive entities. Our inference analysis uses two parameters called proximity and coverage to reduce false-positive and false-negative reports. The usefulness of the analysis is illustrated by the results obtained while checking the OpenJDK7-b147 for conformance to Java Secure Coding Guidelines that relate to the confidentiality and integrity requirements
Communication protocols are complex; their implementations are difficult, causing many unintended (and severe) vulnerabilities in protocol parsing. While the problem of packet parsing is solved, session parsing remains challenging. Building on existing systems that reliably parse individual messages, we present our four-component framework for implementing protocol session parsers with the goal to improve security of protocol parsing: specification of a protocol message, description of a protocol state machine, testing routines to validate implementations against fake and real data, and graph generation to visualize implementation. This framework enables the creation of a session parser, which validates individual protocol messages in the context of other messages in the same conversation. This is helpful because more secure parsers lead to more secure communication.
Unprincipled input handling has caused many of the most prevalent and severe vulnerabilities in the Internet era, and this trend appears to continue in the emerging Internet of Things (IoT). In this paper, we present a methodology to build secure input-handling functionality for application-layer IoT protocols by applying the Language-theoretic Security (LangSec) philosophy. We have built working implementations for the XMPP and MQTT protocols and demonstrated that our clients, which consist of less than a hundred lines of code, correctly recognize all valid messages in our tests. With respect to CPU time, our clients compare well against the most widely deployed implementations of these two protocols.
Preliminary experiment design and research goals are presented to measure the security implications of using just-in-time (JIT) compilation in conjunction with hardware-enforced restrictions on the computational complexity of general purpose applications. LLVM passes are used during compilation to mark sub-components of an application if it is provably halting (Walter recursive). A modified JIT engine is under development to unroll loops based on run-time semantics, while Intel's Processor Tracing technology enforces those run-time bounds. This paper describes an on-going effort with the goals of improving security through the enforcement of restricted computational environments; an analysis of the impact on the attacker when return-oriented programming is limited to unidirectional execution highlights the impact of the Ristretto system.
More to be announced by March 25
We report on our experiences in synthesizing a fully-featured autopilot from embedded domain-specific languages (EDSLs) hosted in Haskell. The autopilot was built for the DARPA High-Assurance Cyber-Military Systems (HACMS) program. After an independent red-team assessment, one government official remarked that it was likely "the most secure UAV in the world". We will also describe our experiences in transitioning the EDSLs to Boeing for use in large-scale platforms. The EDSLs and autopilot are both open-source.
Every year since 2014 most bugs that topped LangSec headlines were "LangSec" bugs, findable and preventable using the LangSec approach. Did 2016 have the worst LangSec bugs yet?