It is impossible to fix security for everyone in all cases. Nothing can be made foolproof, because fools are too ingenious.
However, currently, even users who "do everything right" are remarkably unsafe, because important software fails to be secure even when properly configured.
I present the case that, surprisingly, enough is finally known to dramatically improve this problem.
I will argue that, if appropriate and (entirely affordable) methods are adopted by the authors of important systems, slow incremental improvements could at last fix the infrastructure problem, and possibly even before computer security issues destroy our civilization.
I will discuss the necessary techniques, and present a plan for their adoption.
When binary data are sent over a byte stream, the binary format sender and receiver are using is a "data serialization language", either explicitly specified, or implied by the implementations. Security is at risk when sender and receiver disagree on details of this language. If, e.g., the receiver fails to reject invalid messages, an adversary may assemble such invalid messages to compromise the receiver's security.
Many data serialization languages are length-prefix languages. When sending/storing some F of flexible size, F is encoded at the binary level as a pair (|F|,F), with |F| representing the length of F (typically in bytes).
This paper's main contributions and results are as follows.
(1) Length-prefix languages are not context-free. This might seem to justify the conjecture that parsing those languages is difficult and not efficient.
(2) The class of "calc-regular languages" is proposed, a minimalistic extension of regular languages with the additional property of handling length-fields. Calc-regular languages can be specified via "calc-regular expressions", a natural extension of regular expressions.
(3) Calc-regular languages are almost as easy to parse as regular languages, using finite-state machines with additional accumulators. This disproves the conjecture from (1).
Despite being known for a long time, memory violations are still a very important cause of security problems in low-level programming languages containing data parsers. We address this problem by proposing a pragmatic solution to fix not only bugs, but class of bugs. First, using a fast and safe language such as Rust, and then using a parser combinator. We discuss the advantages and difficulties of this solution, and we present two cases of how to implement safe parsers and insert them in large C projects. The implementation is provided as a set of parsers and projects in the Rust language.
Software systems nowadays communicate via a number of complex languages. This is often the cause of security vulnerabilities like arbitrary code execution, or injections. Whereby injections such as cross-site scripting are widely known from textual languages such as HTML and JSON that constantly gain more popularity. These systems use parsers to read input and unparsers write output, where these security vulnerabilities arise. Therefore correct parsing and unparsing of messages is of the utmost importance when developing secure and reliable systems. Part of the challenge developers face is to correctly encode data during unparsing and decode it during parsing.
This paper presents McHammerCoder, an (un)parser and encoding generator supporting textual and binary languages. Those (un)parsers automatically apply the generated encoding, that is derived from the language's grammar. Therefore manually defining and applying encoding is not required to effectively prevent injections when using McHammerCoder. By specifying the communication language within a grammar, McHammerCoder provides developers with correct input and output handling code for their custom language.
Programming languages such as Java and C# execute code with different levels of trust in the same process, and rely on an access control model with fine-grained permissions to protect program code. Permissions are checked programmatically, and rely on programmer discipline. This can lead to subtle errors. To enable automatic security analysis about unauthorised access or information flow, it is necessary to reason about security-sensitive entities in libraries that must be protected by appropriate sanitisation/declassification via permission checks. Unfortunately, security-sensitive entities are not clearly identified.
In this paper, we investigate security-sensitive entities used in Java-like languages, and develop a static program analysis technique to identify them in large code-bases by analysing the patterns of permission checks. Although the technique is generic, our focus is on Java where checkPermission calls are used to guard potential security-sensitive entities. Our inference analysis uses two parameters called proximity and coverage to reduce false-positive and false-negative reports. The usefulness of the analysis is illustrated by the results obtained while checking the OpenJDK7-b147 for conformance to Java Secure Coding Guidelines that relate to the confidentiality and integrity requirements
Communication protocols are complex; their implementations are difficult, causing many unintended (and severe) vulnerabilities in protocol parsing. While the problem of packet parsing is solved, session parsing remains challenging. Building on existing systems that reliably parse individual messages, we present our four-component framework for implementing protocol session parsers with the goal to improve security of protocol parsing: specification of a protocol message, description of a protocol state machine, testing routines to validate implementations against fake and real data, and graph generation to visualize implementation. This framework enables the creation of a session parser, which validates individual protocol messages in the context of other messages in the same conversation. This is helpful because more secure parsers lead to more secure communication.
Unprincipled input handling has caused many of the most prevalent and severe vulnerabilities in the Internet era, and this trend appears to continue in the emerging Internet of Things (IoT). In this paper, we present a methodology to build secure input-handling functionality for application-layer IoT protocols by applying the Language-theoretic Security (LangSec) philosophy. We have built working implementations for the XMPP and MQTT protocols and demonstrated that our clients, which consist of less than a hundred lines of code, correctly recognize all valid messages in our tests. With respect to CPU time, our clients compare well against the most widely deployed implementations of these two protocols.
Preliminary experiment design and research goals are presented to measure the security implications of using just-in-time (JIT) compilation in conjunction with hardware-enforced restrictions on the computational complexity of general purpose applications. LLVM passes are used during compilation to mark sub-components of an application if it is provably halting (Walter recursive). A modified JIT engine is under development to unroll loops based on run-time semantics, while Intel's Processor Tracing technology enforces those run-time bounds. This paper describes an on-going effort with the goals of improving security through the enforcement of restricted computational environments; an analysis of the impact on the attacker when return-oriented programming is limited to unidirectional execution highlights the impact of the Ristretto system.
Lua is an interpreted, cross-platform, embeddable, performant and low-footprint language. Lua's popularity is on the rise in the last couple of years. Simple design and efficient usage of resources combined with its performance make it attractive for production web applications even to big organizations such as Wikipedia, CloudFlare and GitHub. In addition to this, Lua is one of the preferred choices for programming embedded and IoT devices. This context suggests a large and growing Lua codebase yet to be assessed. This growing Lua codebase could be potentially driving production servers and extremely large number of devices, some perhaps with mission-critical function for example in automotive or home-automation domains.
However, there is a substantial and obvious lack of static analysis tools and vulnerable code corpora for Lua as compared to other increasingly popular languages, such as PHP, Python and JavaScript. Even the state-of-the-art commercial tools that support dozens of languages and technologies actually do not support Lua static code analysis.
In this paper we present the first public Static Analysis for Security Testing (SAST) tool for Lua code that is currently focused on web vulnerabilities. We show its potential with good and promising preliminary results that we obtained on simple and intentionally vulnerable Lua code samples that we synthesized for our experiments. We also present and release our synthesized corpus of intentionally vulnerable Lua code, as well as the testing setups used in our experiments in form of virtual and completely reproducible environments. We hope our work can spark additional and renewed interest in this apparently overlooked area of language security and static analysis, as well as motivate community’s contribution to these open-source projects. The tool, the samples, and the testing VM setups will be released and updated at http://lua.re and http://lua.rocks.
Many vulnerabilities in products and systems could be avoided if better secure coding practices were in place. There exist a number of Integrated Development Environment (IDE) plugins that help developers check for security flaws while they code. In this work, we present a review of these plugins. We specifically focus on the plugins that detect input validation-related vulnerabilities. We list salient features such as their supported IDEs, applicable languages and specific types of vulnerability checks. We believe this work synthesizes information useful for future research on IDE plugins for detecting input validation-related vulnerabilities.
We report on our experiences in synthesizing a fully-featured autopilot from embedded domain-specific languages (EDSLs) hosted in Haskell. The autopilot was built for the DARPA High-Assurance Cyber-Military Systems (HACMS) program. After an independent red-team assessment, one government official remarked that it was likely "the most secure UAV in the world". We will also describe our experiences in transitioning the EDSLs to Boeing for use in large-scale platforms. The EDSLs and autopilot are both open-source.
Modern embedded devices have a near unilateral need to parse complex structured data in order to operate. Historically, the parsers for these formats have been defect prone, difficult to work on, and often inefficient. We present a working example of parsers constructed using a modified version of the Nom parser combinator framework operating on an embedded platform, parsing complex input formats and safely and efficiently converting this into structured data.
Further, we demonstrate that these parsers can safely be integrated with an existing C codebase, clearing the way for incremental migrations to memory safe languages in environments where greenfield rewrites are economically infeasible.
In this presentation I will explain different types of input to firmware interfaces, and different approaches for BIOS analysis in the perspective of formal validation as well as finding vulnerabilities. I will discuss examples of static analysis, dynamic analysis, and symbolic execution analysis of BIOS components.
Every year since 2014 most bugs that topped LangSec headlines were "LangSec" bugs, findable and preventable using the LangSec approach. Did 2016 have the worst LangSec bugs yet?