Secure input and output handling are secure programming techniques designed to prevent security bugs and the exploitation thereof.
Input handling
Input handling is how an application, server or other computing system handles the input supplied from users, clients, or a computer network.
Secure input handling is often required to prevent vulnerabilities such as code injection and directory traversal.
Input validation
Validating (or sanitizing) user input means ensuring that the input is safe before it is used.
The most secure way to do this is to terminate on suspicious input, using a whitelist strategy to decide whether execution should be terminated. However, this behavior is not always preferred from a usability point of view.
Whitelists and blacklists
In computer security, there are often known good inputs: input the developer is completely certain is safe. There are also known bad characters: input the developer is certain is unsafe (for example, because it can cause code injection). Based on this, two different approaches to managing input exist: a whitelist accepts only the known good inputs, while a blacklist rejects the known bad ones.
Security professionals tend to prefer whitelists, because blacklists may accidentally treat bad input as safe. However, in some cases a whitelist solution may not be easy to implement.
Terminate/stop/abort on input problems
This is a very safe strategy. If unexpected characters occur in input, abort execution. But if implemented poorly, it can lead to a denial-of-service attack in which the attacker floods the system with unexpected input, forcing the system to expend scarce processing and communication resources on rejecting it.
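As a minimal sketch of this strategy, the following Python function aborts as soon as the input fails a whitelist check instead of trying to repair it. The function name `require_username` and the username pattern are assumptions made for illustration; a real application would choose its own rules.

```python
import re

# Hypothetical whitelist: 3 to 16 ASCII letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,16}")


def require_username(raw: str) -> str:
    """Return raw unchanged if it matches the whitelist, otherwise abort."""
    # fullmatch() requires the whole string to match, not just a prefix.
    if USERNAME_PATTERN.fullmatch(raw) is None:
        # Do not try to repair the input; stop processing it.
        raise ValueError("rejected input: unexpected characters or length")
    return raw


print(require_username("alice_01"))
try:
    require_username("alice; ls -l /")
except ValueError as exc:
    print(exc)
```

The check is deliberately cheap (one regular expression match), which helps limit the cost of rejecting a flood of bad input.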
Filter input
Filtering input is a less strict security principle than terminating on input problems: illegal characters are removed and processing continues. For example, if the character "*" is illegal, then "I ***LOVE*** you" will just become "I LOVE you", which is experienced as a minor but acceptable oddity.
Filter input: Automatic taint checking
Some programming languages have built-in support for taint checking. These languages raise compile-time errors or run-time exceptions whenever a variable derived from user input is used in a risky way, e.g. to execute a shell command.
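Python has no built-in taint mode, so the idea can only be emulated by hand. The sketch below is such an emulation, not a language feature: the names `Tainted`, `TaintError`, `untaint` and `build_command` are all hypothetical. User input is wrapped in a marker type, and the risky operation refuses any value that still carries the marker.

```python
import re


class TaintError(Exception):
    """Raised when tainted data reaches a risky operation."""


class Tainted:
    """Wraps a value that came from an untrusted source."""

    def __init__(self, value: str) -> None:
        self.value = value


def untaint(data: Tainted, pattern: str) -> str:
    """Return the raw string only if it fully matches a whitelist pattern."""
    if re.fullmatch(pattern, data.value) is None:
        raise TaintError("input does not match the whitelist pattern")
    return data.value


def build_command(filename) -> list:
    """A risky sink: refuses anything still marked as tainted."""
    if isinstance(filename, Tainted):
        raise TaintError("tainted value used to build a command")
    return ["ls", "-l", filename]


user_input = Tainted("notes.txt")
try:
    build_command(user_input)  # still tainted, so this is refused
except TaintError as exc:
    print(exc)

# After an explicit whitelist check the value may be used.
print(build_command(untaint(user_input, r"[A-Za-z0-9_.]+")))
```

Unlike real taint tracking, this wrapper does not follow values derived from the input; it only illustrates the rule that untrusted data must be checked before it reaches a risky operation.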
Filter input: Whitelist filters (Filter in known goods)
Example:
The whitelist A-Za-z is used to protect a UNIX application from shell injection.
An attacker sends ; ls -l / to attempt shell injection.
The characters ; - / are thrown away by the filter because they are not in the whitelist.
The characters lsl are kept by the filter because they are in the whitelist.
Filter input: Blacklist filters (Filter out known bads)
A strategy that is usually insufficient is to filter out known bads. If the characters in the set [:;.-/] are known to be bad and ; ls -l / is received, the original input is replaced with ls l (;-/ are thrown away). This strategy has several problems; most importantly, a blacklist only removes what the developer anticipated, so unanticipated bad input is treated as safe.
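The whitelist and blacklist filters described in this section can be contrasted in a short Python sketch. The function names are hypothetical, and the second test input (`| cat secret`) is an assumption added to show a case the blacklist was not written for.

```python
import re

# Characters the blacklist filter removes (the set used in the text).
BLACKLIST = set(":;.-/")


def whitelist_filter(text: str) -> str:
    """Keep only characters in A-Za-z; everything else is dropped."""
    return re.sub(r"[^A-Za-z]", "", text)


def blacklist_filter(text: str) -> str:
    """Drop only the characters listed in BLACKLIST."""
    return "".join(ch for ch in text if ch not in BLACKLIST)


attack = "; ls -l /"
print(repr(whitelist_filter(attack)))  # 'lsl'
print(repr(blacklist_filter(attack)))  # ' ls l '

# Input the blacklist author did not anticipate passes through unchanged.
missed = "| cat secret"
print(repr(blacklist_filter(missed)))  # '| cat secret'
print(repr(whitelist_filter(missed)))  # 'catsecret'
```

The whitelist filter stays safe for input nobody thought of, because anything outside the allowed set is removed by default.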
Encode (escape) input
To keep malicious inputs contained, any inputs written to the database need to be encoded.
SQL encoding: ' OR 1=1 --' is encoded to \' OR 1=1 --\'
In PHP this can be done with the function mysqli_real_escape_string() (the older mysql_real_escape_string() was removed in PHP 7.0) or with PDO::quote().
Other solutions
There may be other solutions, depending on which programming language is used and what type of code injection is being prevented. For example, the htmLawed PHP script can be used to remove cross-site scripting code.
In particular, to prevent SQL injection, parameterized queries (also known as prepared statements and bind variables) are excellent for improving security while also improving code clarity and performance.
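A parameterized query can be sketched with Python's standard `sqlite3` module. The table layout, the sample rows and the function names are assumptions made for illustration. The vulnerable variant is included only for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_user_unsafe(conn, name):
    """VULNERABLE: builds the SQL text from user input."""
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return [row[0] for row in conn.execute(query).fetchall()]


def find_user(conn, name):
    """Parameterized: the value is bound separately from the SQL text."""
    cur = conn.execute("SELECT name FROM users WHERE name = ?", (name,))
    return [row[0] for row in cur.fetchall()]


payload = "' OR 1=1 --"
print(sorted(find_user_unsafe(conn, payload)))  # ['alice', 'bob']
print(find_user(conn, payload))                 # []
print(find_user(conn, "alice"))                 # ['alice']
```

With the `?` placeholder the payload is compared as an ordinary string value, so it matches no row instead of changing the meaning of the query.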
Output handling
Output handling is how an application, server or system handles its output (e.g. generating HTML, printing, logging). It is important to keep in mind that output often contains input supplied by users, clients, the network, databases and so on.
Secure output handling is primarily associated with preventing cross-site scripting (XSS) vulnerabilities, but can also be important in similar areas (e.g. when generating Microsoft Office documents through an API, output handling may be required to prevent macro injection).
Encode (escape) output
"Encoding" processes content that is about to be output so that any characters which have potentially special meanings to the receiving application are made safe. Characters from a typical known safe character set for the particular destination medium are often left as they are. A simple encoding might leave alone the alphanumerics a–z, A–Z and 0–9. Any other character could possibly be interpreted in an unexpected manner, and is therefore replaced with the appropriate "encoded" representation.
HTML encoding: <script> is encoded to &lt;script&gt;
In PHP this can be done with the function htmlspecialchars().
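The same encoding step can be sketched in Python with the standard library function `html.escape()`. The wrapper `render_comment` and the surrounding `<p>` element are assumptions made for illustration.

```python
import html


def render_comment(user_text: str) -> str:
    """Encode user-supplied text before placing it inside HTML markup."""
    # html.escape() replaces &, < and >, and by default also " and '.
    return "<p>" + html.escape(user_text) + "</p>"


print(render_comment("<script>alert(1)</script>"))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The encoding is applied at the point of output, so the stored text stays unchanged and can still be encoded differently for another destination.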