Secure coding practices in C under UNIX - as confused by Sorel

Note: This doesn't exactly follow the same thought-train as the talk, because that thought-train was more like a thought-speckling without any direction, sane or otherwise.

*** Do you actually need secure coding practices?

The usual line is that suid programs and web applications are about the only places you need secure coding. While these are very important uses of it, that definition leaves a few things out.

Secure coding practices are needed any time the person giving the program input and the program itself have different access levels, for instance if the input comes from a guest user, or if the program runs as root. This includes some less well attended cases such as mail readers, file editors, user to user messaging, etc.

*** How secure does your code need to be?

There are numerous fronts on which programs receive input. Protecting against every possible thing that could come into the program would in most cases be wasted effort, because many of those things could only be done by an already privileged user.

Your code doesn't have to be perfectly secure. It just has to be secure enough to stop the level of attacker that it will face. Most modern applications don't stand up to full scrutiny without flaws being discovered, but they are still good enough to be used, because most potential attackers won't stop to do that analysis.

Every time you add another paranoid security check to your code, you're making it do more processing, which slows the code down. One of the main arguments against hypersecure coding is that it makes the program slower and hungrier for system resources, and therefore requires excessive amounts of hardware to use at a realistic scale.

When deciding how to secure your program, you should take a well considered but somewhat minimalist approach. Only defend what needs defending; don't bog the program down in unneeded doublechecks.
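
As a rough illustration of the "different access levels" test described above, a program can at least notice the suid, sgid and root cases by comparing its real and effective IDs. This is only a sketch: the function names are made up for the example, and it says nothing about the other cases mentioned (network input, mail readers and so on), which cannot be detected this way.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: pure comparison, kept separate so it can be
 * exercised without actually running as another user. Returns 1 if the
 * IDs suggest the program has more (or different) access than the
 * person who started it, 0 otherwise. */
int ids_cross_boundary(uid_t ruid, uid_t euid, gid_t rgid, gid_t egid)
{
    if (euid == 0)      /* running with root's privileges */
        return 1;
    if (ruid != euid)   /* set-user-ID program */
        return 1;
    if (rgid != egid)   /* set-group-ID program */
        return 1;
    return 0;
}

/* Hypothetical wrapper: applies the comparison to the current process. */
int process_crosses_boundary(void)
{
    return ids_cross_boundary(getuid(), geteuid(), getgid(), getegid());
}
```

A program might call process_crosses_boundary() early on and switch to its most cautious behaviour when it returns 1. Treat a 0 result as "no obvious boundary", not as proof that the input can be trusted.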

*** Where do exploits come from?

Programs don't just randomly exploit themselves. Exploits happen because someone gives the program input it wasn't expecting, and the program does something with that input that the designer or the administrator didn't want.

Especially in a UNIX environment, programs can get input from a wide variety of sources. Not all of these sources are equally accessible to the user, but many of them can be modified with a little deviousness.

-- Network input. This is a biggie since *anyone* can send network data unless you're really careful, and most places where caution can be added are the admin's responsibility, not yours.
-- User input. Any time you're talking with a user, they're sending you strings of data. Even numbers are, at the point of input, strings. These strings must not be trusted, and string handling is a pain and a half.
-- System files. Configuration files, log files, etc. If someone else can change them, you can't trust them.
-- Temporary files. Program generated temp files are an ongoing problem because it's very hard to keep other people out of them.
-- Environment variables. Every program is passed a 'local environment' which consists of certain string variable settings. This local environment is widely relied on, and easily set by the user.
-- Signals. Signals are a form of software interrupt. They can change the flow of program execution at any time, and are therefore troublesome.

*** What should be done to prevent exploits?

** Locking down network inputs.

If your program deals directly with the network, then it potentially deals with completely unauthorized users. This inevitably poses a major security risk, because getting even guest level access to the system would be a large step up for them.

Choose underlying protocols wisely. If you care about any authentication whatsoever, don't use UDP unless you're willing to do it completely manually.
Also, don't rely on the first packet of a TCP connection being unspoofed, since that can't be assured if T/TCP (ttcp) is enabled on that machine's stack (which it frequently is, unfortunately).

Design toplevel protocols carefully. Any interactions between the program and its users will follow some sort of set protocol. As the program designer, you have the freedom to design this protocol. Keep in mind both completeness and ease of parsing. A difficult to parse protocol is difficult to parse securely.

Consider time sensitive operations. Computers have limited resources, and every open connection uses resources. Whenever there is a risk that one open connection may block another from doing something or from being created, timeouts should be set and used.

Allow precise configuration. Allow the administrators to specify via config files where you should allow input from (i.e. what IP blocks), how authentication is to be done, etc. This will encourage daft admins to consider these issues, as well as allowing you to send coherent error replies to disallowed hosts/users.

Use sane default configuration. Set up defaults which do not pose security risks. For instance, a system could be set to listen only on localhost by default. This will annoy newbie admins, but it will force them to stop and think about how it should be configured, and will prevent autoinstall scripts from running insecure setups without their knowledge.

Carefully consider remote administration. Web interfaces for administrative tasks are becoming more common, but they pose added hazards for the system. Think carefully before implementing such a system. Do you really want to cater to admins who can't even ssh to their own machines happily? At *least* turn web interfaces off by default.

** Locking down user input in general.

User input is the classical bane of a secure program's existence.
You don't know what they're going to input, when they're going to input it, how long it will be, whether they are going to wait for a response, and occasionally even whether the input really came from them. Every possible type of input has its own set of issues and annoyances, especially strings. Because of their indeterminate length, strings pose a real pain in the arse situation.

Limit string input lengths. Eventually any string has to end, it's just a matter of how long it takes. You've got a problem if it takes longer than you allocated space for. If you are reading data into a statically allocated buffer, use length-aware functions to read in only the desired number of bytes.

Ensure you got the whole string. When limiting string inputs as above, make sure to check whether the end of the string got chopped off in the process. If the input was truncated, return an error. Do not use it for further processing, just in case it might cause unexpected results.

Diligently check formatting. Check any and all user input against the specification for 'good' input. Ensure that it does not contain any disallowed characters or strings, anything that could later be interpreted as an escape character, etc.

Limit dynamic allocation. If you are dynamically allocating elements read from the user, place upper limits on the allocation lest you run out of memory. Running out of memory at best crashes your program, at worst hangs the machine. You don't want to rely on system resource limits for this, although they should be used as well.

Be careful with number sign/size. Numeric values come in assorted sizes and in both signed and unsigned forms. The C language converts between them implicitly, often without any warning, even when the value changes in the process (for example, a negative int compared against an unsigned value is first converted to a large positive number). If a range check is performed before a conversion and the value is used after the conversion, the check may no longer hold.

Consider timeouts.
If the program is doing a lengthy operation, find some way to inform the user so they don't just walk away on the assumption that the program crashed. If you are tying up resources another user may need (for instance locking system files), place timeouts on these operations.

** Locking down system file access.

File access has all the same issues for untrusted files as user input does, plus the issues related to ensuring that you're accessing the correct file. Those issues are especially problematic for trusted files, since accessing the wrong file can have far reaching consequences in that case.

Disallow symlinks. Make sure you're accessing a regular file, not a symlink to some file other than the intended one. Make sure also that you aren't writing to device or other special files, as doing so can have unexpected consequences.

Watch for race conditions. Race conditions happen when someone changes an aspect of the system after you check it, but before you use it. For instance, someone removes a file and replaces it with a symlink after you checked for symlinks. These are hard to avoid, but important.

Watch for locking issues. Multiple programs can have a file open at the same time; they can even write to / read from it at the same time. This can result in file mangling, unexpected inputs, reading of private data, etc. Because mandatory file locking is not specified in POSIX at this point, the question of a portable way of doing this is still up in the air.

** Locking down temporary files.

Temporary files are generally a bad idea in secure programs, but they are sometimes unavoidable due to the amount of data passing through the system and the type of operations that need to be performed on it. They should be avoided wherever feasible.

Avoid predictability. If the temporary file names are chosen predictably, this eases the task of attacking them. Use temp creation functions which add at least some level of randomness to the filenames.

Check for pre-existence.
If the file already exists, then it could have been created by someone else, and they may have the ability to read from / write to the file. You'll want to either remove it, or choose another name.

Check directory permissions. You don't want someone else being able to change the permissions for your file, delete it, write to it, etc. A lot of the control for this is actually done at the directory level, so it should be checked there as well.

Use full paths. Don't rely on the current working directory to be anything sane. Don't rely on the home directory either, as this is passed as an environment variable (see next section).

Ensure successful creation. This includes checking such things as that the file exists, that it has the correct permissions, that it has the right ownership, etc.

Don't allow users to specify filenames!!! Doing so is generally not needed, and opens you up to a whole bucket of new trouble.

Delete when you're done. Don't leave files lying around on the system. Leaving files is both rude and a bad security practice. It leaves uncertainty as to whether you're the one who left them, clutters your temp file namespace, and leaves information out in the open.

** Locking down environment variables.

Environment variables affect a lot of aspects of system configuration. They are often used to toggle library options or specify search paths. Not being certain within your program of which paths are being searched and what options are in effect can cause flawed assumptions to be made.

Check the documentation for all system functions or external programs that you use. If they list any environment variables which have effects you care about, make sure to sanitize the variables before using those functions or programs.

While that process is somewhat long and tedious, it does have a point. Supported environment variables vary between systems, between libraries, and between versions of the same library.
Some of them, such as IFS, are quite common, but others border on obscure.

Check if the libraries or programs that you are using have 'safemode' options which would allow you to disable environment variable checks or place strong limitations on their capabilities. If such options are present, they should be used.

Add library version checking to your configuration scripts. Your scripts should detect which versions are in use on the system and adjust your parameters accordingly, such that all needed environment variables are covered. They should also warn the user if they are trying to install on a system with unknown characteristics.

An acceptable blanket policy would be to simply delete the environment you were passed and set up one that you know to be sane. If you do this, you should make it admin configurable, as some systems may require odd parameters.

Use all string-handling caution. Environment variables are strings that the user can set, and should be treated as such. This means you can't trust their length, their content, etc. to be anything resembling sane. There are ways of preventing users from setting them, but these methods are under the admin's domain, not yours.

** Locking down signal handling.

Software interrupts under UNIX are implemented via the signal mechanism. Your program can receive signals from itself, hardware sources, other programs run by you, or programs run by root. These signals can be ignored, cause the program to exit, or be handled in a way you specify.

Block signals you aren't interested in. Any signals which your program is not interested in should be blocked. This decreases the number of special cases you have to handle, and can blanket-block any signals you weren't aware of, which decreases the amount of machine-specific knowledge you have to have.

Use external cues to validate signals. Because signals can come from so many sources, it is important to check their origins.
Use of origin checks can be of great help in keeping things like SIGALRM internal to the program.

Rate limit operations. If, for instance, you are using SIGALRM to time database outputs, you should make sure that you are not responding to floods of signals by sending floods of database commands. This is done by querying actual time values, keeping internal counters, etc.

Prevent race conditions during setup. Doublecheck that the handlers are not installed until they are valid, that they are removed as soon as they lose validity, and that the default handlers at the beginning/end of the program cannot cause any hassles.

** Other things you should consider.

Minimize privileges. If your program runs suid (as a user other than the one who started it), then you should make it lose those privileges as rapidly as possible. This doesn't do anything to prevent exploits, but it can limit their effects.

Use modular design. Modular programs are easier to secure because modular design forces you to think about data flow. You have to know what data is accessible when, which then tells you which data needs locking down where.

Take advantage of available tools. Use things like 'splint' on your program. They can automatically detect a lot of possible error conditions, and even suggest ways of fixing them. While these programs can't do the whole job of securing your code, they can be a good doublecheck for the simpler issues.

Read the documentation for any libraries you use. Not just for the environment variables either. You should also take a strong interest in the changelogs, as these often contain notes about misfeatures which could compromise your system security.

Consider change-root setups. If use of change-roots would improve your program's security significantly, you should add at least a document listing everything that needs to be in the change-root, and possibly even a script to autocreate it as an option on install.

Don't ignore efficiency.
Secure code isn't much use if it takes forever to get anything done. Keep your security checks as minimalistic as you reasonably can so they don't bog the system down. Even something as small as checking function return values has a cost, though that particular check is almost always worth it.

*** More reading. (as if this wasn't long enough)

http://www.dwheeler.com/secure-programs/
http://www.whitefang.com/sup/
http://archive.ncsa.uiuc.edu/General/Grid/ACES/security/programming/
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/secure.html

*** Final note / most important tip:

Think when you code! No set of rules will be enough if you just plunge ahead blindly without consideration for your actions or their consequences / side effects. No book or set of guidelines can anticipate every circumstance you will encounter, so you have to think of them yourself as you go.
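
As one concrete case of thinking it through, here is a minimal sketch of the string advice from the user input section ("Limit string input lengths" and "Ensure you got the whole string"). The names read_line and line_status are invented for this example. It assumes line-oriented text input without embedded NUL bytes and a buffer size between 2 and INT_MAX, and it is one possible approach, not the only one.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical status codes for this sketch. */
enum line_status { LINE_OK = 0, LINE_EOF = -1, LINE_TOO_LONG = -2 };

/* Read one line into buf, which holds `size` bytes including the
 * terminating NUL. The trailing newline is stripped. If the line does
 * not fit, the rest of it is discarded, buf is emptied, and
 * LINE_TOO_LONG is returned so the caller can reject the input instead
 * of processing a truncated string. LINE_EOF is also returned on a read
 * error or an unusably small buffer. */
enum line_status read_line(FILE *in, char *buf, size_t size)
{
    size_t len;
    int c;

    if (size < 2 || fgets(buf, (int)size, in) == NULL)
        return LINE_EOF;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';    /* whole line arrived */
        return LINE_OK;
    }

    /* No newline in buf: the line exactly filled the buffer, the input
     * ended without a final newline, or the line is too long. Look at
     * the next character to find out which. */
    c = getc(in);
    if (c == EOF || c == '\n')
        return LINE_OK;

    /* Too long: throw away the remainder so that the next call starts
     * on a fresh line. */
    while (c != '\n' && c != EOF)
        c = getc(in);
    buf[0] = '\0';
    return LINE_TOO_LONG;
}
```

A caller would typically treat LINE_TOO_LONG as an input error and report it, and only go on to the formatting checks when it gets LINE_OK.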