LangSec

The Language-theoretic approach (LANGSEC) regards the Internet insecurity epidemic as a consequence of ad hoc programming of input handling at all layers of network stacks, and in other kinds of software stacks. (Source: the LANGSEC site.)

LANGSEC posits that the only path to trustworthy software that takes untrusted inputs is treating all valid or expected inputs as a formal language, and the respective input-handling routines as a recognizer for that language. The recognition must be feasible, and the recognizer must match the language in required computation power.
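As a rough illustration of matching the recognizer to the language, consider nested brackets. That language is not regular, so a finite-state recognizer cannot handle it; a stack is the minimum extra power required. The sketch below is only an example under that assumption, and the name `recognize_nested` and the choice of bracket alphabet are hypothetical.

```python
# Hypothetical example: a pushdown-style recognizer for properly nested
# round and square brackets. Any other character is rejected outright.
PAIRS = {")": "(", "]": "["}

def recognize_nested(text):
    """Return True only if text consists solely of properly nested brackets."""
    stack = []
    for ch in text:
        if ch in "([":
            stack.append(ch)          # remember the opener
        elif ch in PAIRS:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
        else:
            return False              # character outside the language
    return not stack                  # every opener must be closed
```

The recognizer answers only yes or no and does nothing else with the input, which keeps the acceptance decision separate from any later processing.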

An input should be verified before it is parsed. A verifier for a regular language is simpler because it has finite state. In our effort to tally JSON we have found that we similarly benefit from the finite variability of our input.
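A minimal sketch of verify-then-parse, assuming a toy regular language of unsigned decimal numbers, `[0-9]+(\.[0-9]+)?`. The state names, the `verify` and `parse` functions, and the language itself are illustrative assumptions, not anything taken from the JSON work mentioned above.

```python
# Hypothetical finite-state verifier for the language [0-9]+(\.[0-9]+)?
START, INT, DOT, FRAC = range(4)
ACCEPTING = {INT, FRAC}

# Explicit transition table: (state, character class) -> next state.
TRANSITIONS = {
    (START, "digit"): INT,
    (INT, "digit"): INT,
    (INT, "dot"): DOT,
    (DOT, "digit"): FRAC,
    (FRAC, "digit"): FRAC,
}

def char_class(ch):
    # Only ASCII digits count; str.isdigit() would also accept other scripts.
    if ch in "0123456789":
        return "digit"
    if ch == ".":
        return "dot"
    return None

def verify(text):
    """Run the finite-state machine; memory use is one state variable."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False              # no transition: reject immediately
    return state in ACCEPTING

def parse(text):
    """Parse only input that the verifier has already accepted."""
    if not verify(text):
        raise ValueError("input rejected by verifier")
    return float(text)
```

Because the verifier is a fixed table, every input it can accept or reject is visible by inspection, and `parse` never sees anything outside the language.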

A regular expression matcher can be subject to combinatorial explosion unless care is taken to handle special cases. Russ Cox discusses efficient implementations due to Ken Thompson (see his page on regular expression matching).
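The idea behind the efficient approach can be sketched by simulating an automaton on a set of states rather than trying one path at a time. The example below hand-builds the automaton for the pattern family `(a?){n}a{n}`, chosen here as an illustration of a pattern that is costly for naive backtracking. It is a simplified sketch, not a general regex engine, and the function names are hypothetical.

```python
# Hypothetical sketch: state-set simulation of the NFA for (a?){n}a{n}.
# States are numbered 0..2n. For i < n, state i reaches i+1 on 'a' or on
# an empty move (the optional a). For n <= i < 2n, state i reaches i+1
# on 'a' only. State 2n is the accepting state.

def eps_closure(states, n):
    """Add every state reachable through empty moves."""
    closed = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        if s < n and s + 1 not in closed:
            closed.add(s + 1)
            stack.append(s + 1)
    return closed

def matches(text, n):
    """Track all live states at once; work per character is bounded by 2n+1."""
    current = eps_closure({0}, n)
    for ch in text:
        if ch != "a":
            return False              # only 'a' has any transitions
        step = {s + 1 for s in current if s < 2 * n}
        current = eps_closure(step, n)
        if not current:
            return False              # no live states remain
    return 2 * n in current
```

Since the set of live states never exceeds the number of states in the automaton, the running time grows with the product of input length and pattern size rather than exponentially.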