Writing Secure Web Applications: Sanitize Browser Input
http://advosys.ca/tips/websecurity.html
Aug 16 2001

Contents:
1. Background
2. Sanitize browser input
3. Use a data directory
4. Avoid the shell
5. 'hidden' fields aren't
6. Don't trust HTTP_REFERER
7. Use POST instead of GET
8. Validate on the server
9. Use taint checking
10. Don't use raw path and file names
11. Use absolute path and filenames
12. Specify the open mode
13. Log suspicious errors
14. Look beyond the web application
15. Recommended reading

Web application development is very different from development for other environments. Web clients and Internet communications pose many security problems not found in traditional client-server applications. Web developers must know how web servers and web browsers interact, the nature of Internet communications, and the attacks most web applications undergo when they are made available on the Internet.

If you think your servers are secured by a firewall, think again. Security flaws in web applications easily bypass firewalls and other basic security measures. It's easy to unknowingly write a web application that allows outsiders to access files on the server, gather passwords and customer information, and even alter the application itself, despite firewalls and other security you may have implemented.

This document presents common web application security problems. The examples use Perl and the CGI.pm library running under Unix; Perl is arguably the most widely deployed language for web apps. Regardless, the flaws and concepts described apply to Java servlets, Cold Fusion, PHP, or any other development language and environment.
1. Input containing special characters such as ! and & could cause the web server to execute an operating system command or have other unexpected behavior.
2. User input stored on the server, such as comments posted to a web discussion program, could contain malicious HTML tags and scripts. When another user views the input, that user's web browser could execute the HTML and scripts.
Solutions
The best practice is to strip unwanted characters, invisible characters and HTML tags from user input.

When stripping unwanted characters, the safest way is to check the input against a list of valid characters, not a list of invalid ones. Why? It's too difficult to determine all possible malicious characters... just when you think you've thought of them all, a cracker invents an unexpected attack like the "poisoned null" characters described above. It's also easier to simply check input against a short list of valid characters such as A-Z and 0-9.

All input should be sanitized, not just selected fields. All input can potentially percolate through to unexpected places. Even if you're certain a particular input field cannot cause problems now, it might become possible in future revisions of the application. Rather than try to guess what input could be dangerous, it's simpler and more effective to just sanitize all input immediately when received from the browser.
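To make the difference between the two kinds of list concrete, here is a minimal sketch. The function names is_acceptable and passes_denylist, the particular characters in the list of invalid characters, and the sample value are all made up for this illustration; they are not part of any library.

```perl
use strict;
use warnings;

# List of VALID characters: only letters and digits are accepted.
# (Which characters to allow is an assumption made for this sketch.)
sub is_acceptable {
    my ($value) = @_;
    return 0 unless defined $value;
    return $value =~ /\A[A-Za-z0-9]*\z/ ? 1 : 0;
}

# List of INVALID characters: rejects only what someone remembered to list.
sub passes_denylist {
    my ($value) = @_;
    return 0 unless defined $value;
    return $value =~ /[;&|`<>!]/ ? 0 : 1;
}

# A "poisoned null" style value: a NUL byte is hidden inside the string.
my $poisoned = "report\0.html";

print "valid-character check accepts it:   ",
    ( is_acceptable($poisoned)   ? 'yes' : 'no' ), "\n";
print "invalid-character check accepts it: ",
    ( passes_denylist($poisoned) ? 'yes' : 'no' ), "\n";
```

The check against valid characters rejects the value without ever having heard of NUL bytes, while the check against invalid characters lets it through because nobody thought to list that character.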
It's simple to strip unwanted characters from a string. However, input from web applications is not always plain ASCII or UTF-8. Characters in HTML form fields, cookies and CGI query strings can also be expressed as HTML Character Entities (see http://www.w3.org/TR/REChtml40/sgml/entities.html). For example, the symbol < ("less than") can be input using the HTML entity code &lt; or in numeric format as &#60;.

Stripping ampersand (&) and semicolon (;) characters from input will disable such sly attempts to bypass input filters, but if your web application requires those characters it is best to decode any HTML character entities in all input to their corresponding characters before stripping. The Perl module HTML::Entities will do that for you.

If your web application is written in Perl using the CGI.pm library, all form input fields can be sanitized at the beginning of the program with a routine similar to the following:
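    use HTML::Entities ();
    use CGI qw/:standard/;

    # Valid characters: letters, digits, space and comma
    my $ok_chars = 'a-zA-Z0-9 ,';

    foreach my $param_name ( param() ) {
        # Decode HTML entities first so they cannot hide metacharacters
        my $value = HTML::Entities::decode_entities( scalar param($param_name) );
        # Remove every character that is not listed in $ok_chars
        $value =~ s/[^$ok_chars]//go;
        # Store the cleaned value back into CGI.pm
        param($param_name, $value);
    }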
The above converts HTML character entities to plain characters, then silently removes all characters except the ones listed in $ok_chars from all HTML form input fields collected by CGI.pm. This method cripples shell metacharacters and the poisoned NULL attack, disables HTML tags, and defeats attempts to hide metacharacters using HTML entities.

The above code snippet works well, but silently stripping input hides the attempt. A more thorough solution is to alert the system administrator when invalid characters are detected and to log the error and the IP address of the user to a file. Then you would know when potential crack attempts are made and by whom.

When sanitizing input, keep in mind that user input is not limited to HTML form fields. A malicious user can potentially alter everything normally sent from a web browser. HTTP cookies and HTTP headers such as REMOTE_USER can be manipulated. It's not difficult to write a Perl script that poses as a web browser and sends hacked versions of form input, complete with forged cookies and HTTP headers. Never trust any input from a browser... all of it could be altered.

The above example Perl code only removes characters from HTML form fields. Input from cookies and HTTP headers should also be sanitized in a similar way before being used in a web application.
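As a rough sketch of the "log it instead of silently stripping it" idea, the routine below cleans one value and writes a line to a log whenever it had to remove something. The name sanitize_and_log, its argument order and the log line format are assumptions made for this example. It takes plain strings, so the same routine can be fed form fields, cookie values or header values.

```perl
use strict;
use warnings;

# Anything NOT in this set is removed (letters, digits, space, comma).
my $NOT_OK = qr/[^a-zA-Z0-9 ,]/;

# Hypothetical helper: returns the cleaned value and writes one log line
# to $log_fh whenever characters had to be removed.
sub sanitize_and_log {
    my ($field, $value, $remote_addr, $log_fh) = @_;
    $value = '' unless defined $value;

    ( my $clean = $value ) =~ s/$NOT_OK//g;

    if ( $clean ne $value ) {
        # The field name comes from the browser too, so clean it before logging.
        ( my $safe_field = $field ) =~ s/$NOT_OK//g;
        printf {$log_fh} "%s suspicious input in field '%s' from %s\n",
            scalar localtime, $safe_field, $remote_addr;
    }
    return $clean;
}

# Demo: log to STDOUT. A real application would pass a handle opened on a
# log file, and would take the address from $ENV{REMOTE_ADDR}.
my $clean = sanitize_and_log( 'comment', 'Hello <script>alert(1)</script>',
    '192.0.2.10', \*STDOUT );
print "cleaned value: $clean\n";
```

The raw input is deliberately left out of the log line so that a hostile value cannot inject fake log entries; whether to record it in an escaped form is a design decision for your own application.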
Here's an example layout of a web server home directory on Unix. Note separate subdirectories in /data to hold data files from each web app used on the site:
    /cgi-bin
    /htdocs
    /data
        /catalog
        /webforum
        /webmail
Keeping all data files in a common location also makes it easier to manage the web site. You have separate locations for HTML files, CGI programs, and data. Separating the data files into subdirectories by application helps eliminate file naming problems, such as two different apps that both create data files named "data.txt".
This avoids a shell being spawned. If you string the program name together with the parameters, system() spawns a shell. Not only is using system() more secure than the backtick method, it's more efficient... by not spawning a shell your application uses fewer system resources and runs faster.
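The following sketch shows one way to run external programs with the arguments kept as separate list elements so that no shell is involved. The helper names run_program and capture_output are invented for this example, and $^X (the path of the running perl) is used as the external program only so the sketch is self-contained. The list form of a piped open assumes Perl 5.8 or later on Unix.

```perl
use strict;
use warnings;

# system() in list form. The "{ $program }" block tells Perl to execute the
# program directly, so no shell is started even when there are no arguments.
sub run_program {
    my ($program, @args) = @_;
    return system( { $program } $program, @args ) == 0;
}

# A replacement for backticks: a piped open in list form also bypasses the
# shell. At least one argument is required; with only a command string Perl
# would fall back to the shell form.
sub capture_output {
    my ($program, @args) = @_;
    die "capture_output needs at least one argument\n" unless @args;
    open( my $pipe, '-|', $program, @args )
        or die "cannot run $program: $!";
    local $/;    # read everything at once
    my $output = <$pipe>;
    close($pipe) or die "$program failed (status $?)\n";
    return $output;
}

# Demo: a hostile value containing shell metacharacters is passed through
# as ordinary text, because no shell ever sees it.
my $hostile = 'hello; echo INJECTED';
print capture_output( $^X, '-e', 'print "got: $ARGV[0]\n"', $hostile );
```

Because the hostile value arrives in the child program as a single argument, the semicolon is just another character and the second command is never run.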
InfoSec Labs (http://www.infoseclabs.com/) also has a very detailed white paper on hidden form field vulnerabilities. See http://www.infoseclabs.com/mschff/mschff.htm. If you write web applications in Perl, check out the CGI::EncryptForm module by Peter Marelas (available from CPAN http://www.cpan.org/). It's one way of encrypting hidden form field information to prevent users from changing hidden fields. A better way of preserving state information and settings is to store data in a file or database on the server then use an HTTP cookie or unique URL ID to reference the file. This is more difficult to program, but important data stays on your server. The Perl module CGI::Persistent by Vipul Ved Prakash is one tool that makes this technique easier to implement.
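Here is a minimal sketch of the "keep the data on the server, send only an ID" approach using nothing but core Perl modules. It is not how CGI::Persistent works internally; the function names, the 32-hex-digit ID format and the ".sto" file suffix are assumptions made for this illustration, and reading /dev/urandom assumes a Unix system.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Spec;
use File::Temp qw(tempdir);

# An ID is exactly 32 hex digits; anything else never reaches the disk.
sub valid_session_id {
    my ($id) = @_;
    return defined $id && $id =~ /\A[0-9a-f]{32}\z/ ? 1 : 0;
}

# Create a hard-to-guess ID from 16 random bytes.
sub new_session_id {
    open( my $rng, '<', '/dev/urandom' ) or die "cannot open /dev/urandom: $!";
    binmode $rng;
    my $got = read( $rng, my $bytes, 16 );
    close $rng;
    die "short read from /dev/urandom\n" unless defined $got && $got == 16;
    return unpack 'H*', $bytes;
}

# Save a hash of settings on the server under the given ID.
sub save_state {
    my ($dir, $id, $state) = @_;
    die "bad session id\n" unless valid_session_id($id);
    store( $state, File::Spec->catfile( $dir, "$id.sto" ) );
    return;
}

# Load the settings for an ID received from the browser, or undef.
sub load_state {
    my ($dir, $id) = @_;
    return unless valid_session_id($id);
    my $file = File::Spec->catfile( $dir, "$id.sto" );
    return -f $file ? retrieve($file) : undef;
}

# Demo: the temporary directory stands in for a data directory outside the
# HTML tree. Only $id would be sent to the browser.
my $dir = tempdir( CLEANUP => 1 );
my $id  = new_session_id();
save_state( $dir, $id, { item => 'widget', price => '19.95' } );
print "value sent to the browser: $id\n";
print "price kept on the server:  ", load_state( $dir, $id )->{price}, "\n";
```

The browser can still send a different ID, but it cannot change the price: a forged ID either fails the format check or simply refers to no file.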
When the web application is called using GET, the above input is visible in the browser's URL location window. However, a more dangerous problem is that URLs are logged in many places:

    - The web server access log
    - The web browser's disk cache and history file
    - Firewall logs
    - Proxy server and web cache logs, such as those kept by Squid

All this logging allows others to see the data sent from HTML forms using GET.

The POST method sends form input in a data stream, not as part of the URL. The data is not visible in the browser location window and is not recorded in web server log files. The POST method is also more practical... there's a limit to how many characters can be sent using the GET method, but POST can send an almost unlimited amount of data from an HTML form.
Even though POST information is generally not logged, like all other plain text information sent from a browser it can still be sniffed as it passes across the Internet. Sniffing, however, must be done in real time as the information is sent, and requires the attacker to have physical access to the data lines between the web browser and web server. The risk of information being sniffed is far less than the risk of information being gathered from log files.
Beware that Perl's taint check is far from foolproof. It is not a substitute for knowing web application security issues and writing secure programs. More information about Perl taint checking is found on the 'perlsec' man page (http://language.perl.com/info/documentation.html) The principle used by Perl's taint checks can be used in other programming languages, but for other languages the check must be applied manually.
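The principle can be sketched as a small routine: data from outside is only used after it has matched a strict pattern, and only the matched part is passed on. When a script runs with taint checking enabled (perl -T), a value captured from a regular expression match like this is what Perl treats as untainted. The name untaint_word and the pattern (1 to 64 letters, digits or underscores) are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value only if it consists of 1 to 64
# letters, digits or underscores; otherwise return undef.
# Under "perl -T" the captured $1 is considered untainted.
sub untaint_word {
    my ($value) = @_;
    return unless defined $value;
    return $value =~ /\A([A-Za-z0-9_]{1,64})\z/ ? $1 : undef;
}

# Demo with one harmless and one hostile value.
for my $candidate ( 'catalog', 'cat; rm -rf /' ) {
    my $safe = untaint_word($candidate);
    print defined $safe ? "accepted: $safe\n" : "rejected input\n";
}
```

A pattern such as /(.*)/ would also "untaint" the value while checking nothing at all, which is one reason taint mode is no substitute for thinking about what each input is allowed to contain.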
Instead, do something like this:

    BETTER> my %filelist = ( "name"    => "/home/data/name.txt",
    BETTER>                  "address" => "/home/data/address.txt" );
    BETTER> my $keyword = param('datafilename');
    BETTER> open(DATAFILE, $filelist{$keyword}) or die;
Using the above method, HTML form input is never passed directly to the 'open' command. If a malicious user tries to pass a bad value (such as '/etc/passwd'), it will fail to find a match in the associative array. Lookup tables also prevent a cracker from using a poisoned NULL attack (see Sanitize browser input, above) to shorten strings.

For flexibility, the locations of the files can be pulled in from an external configuration file, rather than be hard-coded into the application. However, make sure the config file cannot be accessed by web users (i.e. do not put it in the HTML directory) and that it cannot be changed by other users on the web server.

If your web application absolutely must be capable of opening ad hoc files based directly on browser input, rather than a predetermined list, never accept complete path and filenames from HTML form input fields. Your web application should at least prefix the input filename with an absolute path and strip slashes, backslashes, NULLs and double dots from the input. For example:

    $datadir  = '/sites/internet/data';
    $datafile = param('datafilename');
    sanitize($datafile);    # your own routine that strips the characters listed above
    open(DATAFILE, "$datadir/$datafile") or die;
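One possible shape for such a filename check is sketched below. Rather than stripping bad characters, this variant rejects any name that is not made of letters, digits, underscores and hyphens separated by single dots, which rules out slashes, backslashes, NULLs and double dots in one step. The function name safe_data_path and the exact pattern are assumptions made for this example; the directory is the one used in the example above.

```perl
use strict;
use warnings;

my $DATA_DIR = '/sites/internet/data';

# Hypothetical helper: turn a browser-supplied file name into an absolute
# path inside $DATA_DIR, or return undef if the name looks unsafe.
sub safe_data_path {
    my ($name) = @_;
    return unless defined $name;

    # One or more "words" joined by single dots: no slashes, backslashes,
    # NUL bytes, leading dots or ".." sequences can match this.
    return unless $name =~ /\A[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)*\z/;

    return "$DATA_DIR/$name";
}

# Demo: only the first request yields a path.
for my $requested ( 'mydata.txt', '../../etc/passwd', "mydata.txt\0.html" ) {
    my $path = safe_data_path($requested);
    print defined $path ? "would open $path\n" : "rejected request\n";
}
```

Rejecting instead of stripping is a design choice: a name that needed stripping was probably an attack, and a refusal is easier to log and reason about than a silently altered file name.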
However, future versions of Perl may change this, or Perl running on an unusual platform could use a different default. It is better to be specific:

    open(DATAFILE, '</sites/internet/data/mydata.txt');
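Perl 5.6 and later also accept a three-argument form of open, in which the mode is a separate argument and the file name is always taken literally. The sketch below is not from the original article; the helper name read_lines is made up, and a temporary file stands in for a real data file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: open a file strictly for reading and return its lines.
# With the mode passed separately, characters such as '>' or '|' in $path
# are treated as part of the file name, never as an open mode.
sub read_lines {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die "cannot open $path for reading: $!\n";
    my @lines = <$fh>;
    close($fh);
    return @lines;
}

# Demo: write a small temporary file, then read it back.
my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "first line\nsecond line\n";
close($tmp_fh);
print read_lines($tmp_name);
```

If a value such as ">somefile" ever reaches this routine, the open simply fails because no file of that name exists; nothing is created or truncated.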
software on it has the latest security patches installed, it sits behind a well-maintained firewall, and everything is monitored for break-in attempts.

Many organizations believe skilled application developers are also skilled at system and network management. However, that is extremely rare. Being an application developer is a full-time career, as is being a system administrator, network manager or security professional.

Many of the well-publicized attacks on web sites targeted well-known operating system and network problems that were missed simply because the organization was focused on developing the web applications and the design of the web site, or assumed its development team had all the answers. The "Code Red" worm that affected hundreds of thousands of Microsoft IIS web servers was able to spread largely because unskilled or overworked staff didn't patch a security hole in that product soon enough.

Securing web applications requires a combined effort in many areas: application design, server management, network management and security auditing. Professionals specializing in one of these areas require a good knowledge of all the others, but in reality each area is a specialty that requires the attention of a focused individual or team.
Recommended reading:
World Wide Web security FAQ: http://www.w3.org/Security/Faq/wwwsecurityfaq.html
CERT advisory 97.25.CGI: http://www.cert.org/advisories/CA199725.html
CERT advisory CA-2000-02: http://www.cert.org/advisories/CA200002.html
Secure Programming for Linux and Unix HOWTO (David A. Wheeler): http://www.dwheeler.com/secureprograms/
Detecting CGI script abuse: http://www.advosys.prv/tips/cgitrap.html
Preventing HTML form tampering: http://www.advosys.prv/tips/formtampering.html
Hidden form field vulnerability white papers (InfoSec Labs): http://www.infoseclabs.com/mschff/mschff.htm
Programming Perl (Second Edition): O'Reilly & Associates Inc., ISBN 1565921496
Comments, suggestions, criticisms, additions to this document? Please email [email protected] Latest version of this document available at http://advosys.ca/tips/websecurity.html Copyright Advosys Consulting Inc. Ottawa Canada. All Rights Reserved. Last modified Aug 16 2001