[header=3]Web Application Security: Passwords[/header]
Web application security isn't hard to understand, but it's surprisingly difficult to get right. I'm posting this to educate you on what's out there, if you're curious, and to give you some idea how exploits work. My hope is that you won't then fall prey to information that is incorrect or only half-correct and thus more dangerous to your sanity than your computer. This post is a fairly lengthy and somewhat technical read, but I have put some effort into making it easy to understand. Be aware that some of this article is based on original research; I currently work fairly extensively with web applications and have some knowledge of how they work, how they're implemented, and where they're vulnerable. If you just flat-out don't want to read any of this, you don't have to. I'm providing it exclusively for educational purposes.
This installment actually has nothing to do with web application security with the exception of passwords, but it's pertinent to our little corner of the web. As it stands, there are five main methods that can be used to obtain passwords, and at least one of them depends on other mechanisms for exploiting the host software (meaning that it's not magical--the attacker has to find a way to access privileged data first before the attack will be successful). These attacks are, in order of easiest to most difficult:
- Phishing
- Key loggers
- Rainbow tables*
- Timing attacks**
- Brute force
* Requires access to the stored (hashed) passwords and is dependent upon successful attacks targeting the host site.
** Generally difficult and time consuming enough to render it completely unlikely to be used in practice. Requires knowledge of statistics. Unfortunately, almost every application is vulnerable to this unless it is written by an individual familiar with the vulnerability. The plus side is that timing attacks cannot be used to harvest passwords, so why did I include it? Keep reading.
[header=3]The Password Attacks[/header]
Phishing
Phishing is among the most common and probably most successful methods of harvesting passwords on the Internet. The general idea is to create a site that looks identical to a valid site likely to be visited by the target users. Examples include banks, credit card companies, online vendors, and even World of Warcraft. The most successful recent phishing attacks have been those that claim to offer exclusive access to in-game pets or mounts. Phishing sites will also require you to manually enter your username and password. This attack also requires the least effort on behalf of the attackers; it is up to the users to voluntarily give up their account information.
Incidentally, I would guess this attack is the most difficult to quantify in part because users are liars. Users don't always intend to lie, and it's important to remember that human memory is fallible, which gives BOFH administrators the illusion that users lie. Specifically: Unless victims of phishing attacks are pressed for further details, they will not remember that they willingly gave up their account information, because their brain has already associated the activity of providing their username/password combination with a valid action. Valid actions aren't questioned, and I should think that this is part of the reason why a vast majority of people who probably were victims of phishing attacks never remember. It's not being dishonest--it's because our brains have more important things to do, and when we associate an action with a legitimate activity, it suddenly transfers from a questionable act to one that we're unlikely to remember.
This is also why there is so much speculation by those who have had their accounts cracked that there's obviously some dark and sinister deed occurring beneath the surface that Blizzard is a part of.
Key Loggers
Key loggers are somewhat more difficult for an attacker to pull off than phishing, because they require the attacker to actively install malicious software on the target users' computers. There are plenty of resources here that have been posted by others, so I'm not going to describe a great deal about key loggers and WoW, but for those who need a refresher, here you go: Key loggers are malicious software that can be installed by visiting a compromised site. The software then monitors everything you type (and sometimes all of your mouse movements), records these activities, and then submits them to a web site controlled by the attacker. In order for a key logger to be successful, it must be able to send collected data to another location. Consequently, key loggers are generally only viable for a few months until authorities shut down the sites that collect data from compromised machines.
Key loggers can be defeated by safe browsing habits. There are plenty of tips in our forums on how to browse safely, but I think some of them might need updating.
Rainbow Tables
Despite the fruity name, rainbow tables are a solution that can be fairly effective at "cracking" weak passwords. The only problem with rainbow tables, though, is that the attacker must have access to the stored password hashes. Assuming the attacker does have access to them (which is difficult in its own right), a rainbow table can be very effective.
Before I describe what a rainbow table is, it's important to understand how most passwords are stored in a database, like the one that powers this forum. Passwords are very rarely stored in plain text. It's a poor security practice, and I think it could even be argued that it's a somewhat borderline breach of privacy. Thus, most passwords are scrambled by a method called "hashing." Unlike true encryption, a hash isn't meant to be reversed. Hashing is simply a process that generates, for a given input, output that is uniform in length and, in an ideal situation, unique to that input. MD5 is a popular algorithm for hashing (it has its flaws, which I'll mention in a much later post), and looks something like this:
Code:
purple = bb7aedfa61007447dd6efaf9f37641e3
asdf = 912ec803b2ce49e4a541068d495ab570
OMG something is eating my face = 796df227b8fbb0244b436b4075d48347
The string I input into MD5 is on the left; the output hash from MD5 is on the right. Notice that each hash is of identical length (32 characters) and represented in hexadecimal (the digits 0-9 plus the letters a-f, so the number 10 is written as the letter "a" and 16 is written as "10"). Incidentally, the word "purple" and the home row key combination "asdf" are two of the most common passwords in use. No, I'm not kidding.
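If you'd like to generate hashes like these yourself, here's a minimal sketch using Python's standard hashlib module. The helper name md5_hex is my own invention, and I'm assuming the text is encoded as UTF-8 before hashing.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a hexadecimal string."""
    # Assumption: the input is encoded as UTF-8 before hashing.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    for word in ("purple", "asdf", "a much longer sentence than the others"):
        digest = md5_hex(word)
        # The digest is always 32 hex characters, however long the input is.
        print(f"{word} = {digest} ({len(digest)} characters)")
```

Run it twice and you'll get the same output both times, which is exactly the property the next attack relies on.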
A rainbow table makes use of the fact that, for example, the MD5 hash of "purple" is always going to be "bb7aedfa61007447dd6efaf9f37641e3". Therefore, a rainbow table is, in simplified terms, a massive precomputed list of passwords and their hashes, built from combinations of the 52 upper- and lower-case letters (plus 10 digits and possibly assorted symbols). It's feasible to find rainbow tables that contain passwords of 12 characters in length or more; however, for each additional character added to the target password, the size of the rainbow table increases roughly exponentially. Rainbow tables are therefore a trade-off between speed and space, and it's not uncommon for them to be large enough to fill two DVDs or more.
Using a rainbow table in an actual attack requires that the attacker have access to the hashed password, such as "bb7aedfa61007447dd6efaf9f37641e3" in our case. The attacker then scans through the list of hashes until he or she encounters the string "bb7aedfa61007447dd6efaf9f37641e3", which matches up with the word "purple." The attacker then has the clear text version of a usable password.
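To get a feel for how quickly the number of possible passwords grows, here's a small arithmetic sketch. It assumes an alphabet of 62 characters (52 letters plus 10 digits, no symbols), and the function name keyspace is just something I made up for the illustration.

```python
ALPHABET_SIZE = 62  # assumption: 52 letters + 10 digits; symbols would make this larger


def keyspace(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length


if __name__ == "__main__":
    for length in (4, 6, 8, 10, 12):
        print(f"{length:>2} characters: {keyspace(length):,} possible passwords")
```

Each extra character multiplies the total by 62, which is why tables covering long passwords get so large.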
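The lookup described above can be sketched in a few lines of Python. Keep in mind this is a plain precomputed lookup table; real rainbow tables use chains of hashes to save space, which I'm skipping here. The word list and the function names (build_table, crack) are hypothetical and only there for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_table(candidates):
    """Precompute a mapping from each candidate's hash back to the candidate."""
    return {md5_hex(word): word for word in candidates}


def crack(stolen_hash, table):
    """Return the plain-text password for stolen_hash, or None if it isn't in the table."""
    return table.get(stolen_hash)


if __name__ == "__main__":
    # Hypothetical word list; a real table would hold many millions of entries.
    table = build_table(["password", "asdf", "purple", "letmein"])

    # Stands in for a hash taken from a compromised database.
    stolen = md5_hex("purple")
    print(crack(stolen, table))                   # purple
    print(crack(md5_hex("Tr0ub4dor&3"), table))   # None, it isn't in the table
```

The expensive part is building the table. Once it exists, each lookup is nearly instant, and that's the speed-for-space trade-off in a nutshell.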
There are several ways to mitigate the usefulness of rainbow tables, including a fairly simple tactic called "salt." Just as salting food adds flavor to it, salting a password increases what is known in the field as "entropy" (basically random noise) and makes the password much more difficult to guess. Salt isn't anything magic: It's generally just a string of random numbers, letters, and symbols that gets added to the password prior to the final hash generation. This makes the password somewhat more secure. To illustrate, let's assume that we're still using the password "purple" and that we're going to add a 10 character salt of "RN*7ZndhLa".
First, we'll hash the password purple and obtain the string "bb7aedfa61007447dd6efaf9f37641e3".
Next, we'll prepend the salt to our hashed password and obtain "RN*7ZndhLabb7aedfa61007447dd6efaf9f37641e3".
Finally, we'll run the combined string through MD5 again and wind up with this: 49eea5527a1e8ffe311b4322c7f46f88.
It doesn't look anything like our original hash of "bb7aedfa61007447dd6efaf9f37641e3" anymore, does it?
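Here's a minimal sketch of those three steps in Python, assuming the salt is simply prepended to the first hash as plain text and everything is encoded as UTF-8. The function name salted_hash is my own, and I haven't included the expected digest in the code because the exact result depends on those encoding assumptions.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def salted_hash(password: str, salt: str) -> str:
    """Hash the password, prepend the salt to that hash, then hash the result."""
    inner = md5_hex(password)   # step 1: hash the password on its own
    combined = salt + inner     # step 2: prepend the salt (10 + 32 = 42 characters)
    return md5_hex(combined)    # step 3: hash the combined string


if __name__ == "__main__":
    plain = md5_hex("purple")
    salted = salted_hash("purple", "RN*7ZndhLa")
    print("unsalted:", plain)
    print("salted:  ", salted)
    print("same?    ", plain == salted)
```

Change even one character of the salt and the final hash changes completely, so a table built for unsalted hashes is useless against it.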
One might wonder if it would then be possible for an attacker to generate rainbow tables containing 42 characters. Yes, it is possible, but the tables would probably consume the better part of an 80 gig hard drive and take months to generate.
The important thing to remember about MD5 is that it's an old algorithm, and modern processors can generate billions of MD5 hashes in less than a day's worth of use. This implies that rainbow tables can be generated more or less on the fly when needed, so using MD5 to hash a password is really poor practice. The unfortunate thing is that a great many message boards still use MD5.
The reality of it is pretty much this: If someone can access the password hash, they can very probably find a fancy password (and its salt) in an extremely short period of time. I know I implied--and outright suggested--otherwise, but MD5, SHA1, and other hashes aren't really the best way to store passwords anymore.
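I don't name an alternative above, so take this as one possible direction rather than the answer: a deliberately slow, salted key-derivation function such as PBKDF2, which ships with Python's hashlib. The iteration count and the function names below are assumptions of mine, chosen only for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for this sketch; tune it to your own hardware


def hash_password(password, salt=None):
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password, salt, expected_key):
    """Re-derive the key from the supplied password and compare it safely."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)


if __name__ == "__main__":
    salt, key = hash_password("purple")
    print(verify_password("purple", salt, key))  # True
    print(verify_password("asdf", salt, key))    # False
```

The point of the high iteration count is that every single guess costs the attacker that much more work, which blunts the "billions of hashes" advantage.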
Timing Attacks
Let me start with the bad news: Nearly every web application (yes, including phpBB) is susceptible to timing attacks. The good news, though, is that timing attacks are one of the easiest classes of attack to defeat and one of the most difficult to pull off within a short period of time. If you want to read a post that explains what they are and how to defeat them in code, go here. My summary here is based on Coda Hale's post.
Timing attacks are based on a little known but highly useful trait of comparisons in code where inequality results in an immediate jump from the statement performing the comparison to the next block of code (or the containing block). Unfortunately for the attacker, attempting to use timing-based attacks to guess passwords is ineffective at best and is no different from using a brute-force attack. Unfortunately for the user, timing-based attacks are ideal for something known as "session hijacking."
Session hijacking involves exploiting the way a web application recognizes someone who's logged in. Whenever you log in (such as now while you're reading this), the web application stores a unique key to identify your session. This key is unique to you and you alone and is usually regenerated whenever you log out. More advanced web applications will often regenerate keys regularly to limit the window of opportunity during which a session hijacking attempt can be made.
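For the curious, generating a session key like this can be sketched with Python's standard secrets module. I'm assuming a 32-character hexadecimal key to match the examples in this post; the function name new_session_key is hypothetical, and real applications vary in how they build and store these keys.

```python
import secrets


def new_session_key() -> str:
    """Return a random 32-character hexadecimal session key (16 random bytes)."""
    return secrets.token_hex(16)


if __name__ == "__main__":
    # A fresh key at login, and another one when the application regenerates it.
    print("at login:     ", new_session_key())
    print("regenerated:  ", new_session_key())
```

Regenerating simply means calling this again and forgetting the old key, which is what shrinks the attacker's window.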
It's not obvious at first why session hijacking is bad until you consider this: If an attacker can gain access to the session key, he can then masquerade as you, gaining access to everything in your account. He could obtain your e-mail address, change your password, make posts as you, and generally cause a great deal of havoc and embarrassment. The plus side is that session hijacking is fairly difficult in practice, but it's also the subject of this section.
Here's how a timing-based attack works:
First, let's assume that the server assigned us the session key "7888d65a43501d992cc38638b59964d6". When we log in, the server reads the session key (typically from a cookie) and then compares it with known session keys in the session table. Assuming it has found our exact key, it will perform a comparison one character at a time, which would look something like this:
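Code:
if '7' == '7':          # first character matches, keep going
    if '8' == '8':      # second character matches, keep going
        if '8' == '8':  # third character matches, keep going
            ...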
And so on until it encounters the last character. However, if we supplied only a partially correct session identifier of "79af87723dc295f95bdb277a61189a2a", the comparison would look like this:
Code:
if '7' == '7':      # This passes
    if '8' == '9':  # This doesn't, bail out.
        ...
And therein lies the problem. Inequality forces the comparison operator to bail out the exact instant an inequality is detected. This means that incorrect session identifiers return an invalid response sooner than slightly more valid session identifiers. By using statistics, the attacker can then determine that certain responses are statistically taking a longer period of time to return than others and are therefore more correct. After a period of weeks or months, the attacker can then obtain the session key in its entirety and log in as you.
However, there's some good news: This attack usually won't work in practice, and it certainly won't work for passwords. First, most web applications regenerate the session key periodically; regenerating the key would therefore force the attacker to throw out all of his work up to that point in time and start over. Second, closing your browser usually invalidates your session key, which also narrows the window of opportunity for this attack. The downside is that selecting "remember my password" usually stores a pre-logged in session or even your password in a cookie, which then remains unchanged indefinitely and opens us up to timing-based attacks all over again. Timing-based attacks are easily defeated, however, and are thus not a subject of great concern.
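To make the early bail-out concrete, here's a sketch of a character-by-character comparison that also counts how many characters it looked at before giving up. The function name naive_equals and the counter are my own additions for illustration; a real comparison wouldn't report the count, but the attacker infers it from how long the response takes.

```python
def naive_equals(expected: str, supplied: str):
    """Compare two strings one character at a time, stopping at the first mismatch.

    Returns (matched, comparisons), where comparisons is the number of
    characters that were checked before the function returned.
    """
    comparisons = 0
    if len(expected) != len(supplied):
        return False, comparisons
    for a, b in zip(expected, supplied):
        comparisons += 1
        if a != b:
            return False, comparisons  # bail out the instant a mismatch is found
    return True, comparisons


if __name__ == "__main__":
    real_key = "7888d65a43501d992cc38638b59964d6"
    guesses = [
        "79af87723dc295f95bdb277a61189a2a",  # only the first character is right
        "7888d65a000000000000000000000000",  # the first eight characters are right
        real_key,                            # fully correct
    ]
    for guess in guesses:
        matched, comparisons = naive_equals(real_key, guess)
        print(f"{guess}: matched={matched}, characters checked={comparisons}")
```

The more characters a guess gets right, the more work the comparison does, and that tiny difference in work is what the attacker measures.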
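As for how easily they're defeated: one common fix is to compare the two values in a way that takes the same amount of time no matter where the first mismatch is. Python's standard library provides hmac.compare_digest for exactly this. The wrapper name keys_match below is hypothetical, and this is a sketch of the general idea rather than what any particular forum software does.

```python
import hmac


def keys_match(expected: str, supplied: str) -> bool:
    """Compare two session keys without bailing out at the first mismatch."""
    # compare_digest is designed so its running time does not reveal
    # where the two values first differ.
    return hmac.compare_digest(expected.encode("utf-8"), supplied.encode("utf-8"))


if __name__ == "__main__":
    real_key = "7888d65a43501d992cc38638b59964d6"
    print(keys_match(real_key, "79af87723dc295f95bdb277a61189a2a"))  # False
    print(keys_match(real_key, real_key))                            # True
```

With a comparison like this, a nearly correct guess takes no longer to reject than a completely wrong one, so there's nothing for the statistics to pick up on.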
A similar method to timing-based attacks is the brute force attack. This operates directly on passwords. Keep reading.
Brute-force Attacks
I'll keep this short because brute-force attacks on user passwords are exceedingly simple but they're also exceedingly time consuming. Essentially, a brute-force attack requires knowledge of the user's username and nothing else.
Brute force attacks are called such because they require an attacker to guess passwords by submitting, one character at a time, new passwords in the hopes of guessing a user's password. This would start with something to the effect of submitting the letter "a" then the letter "b" and so on until the attacker reaches "z". Then, the attacker tries "aa", "ab", and so on until encountering "az." Repeat this until the attacker reaches the word "purple," and upon a successful login, the attacker then knows the user's password.
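The guessing order described above can be sketched with Python's itertools. I'm assuming a lowercase-only alphabet, and the check_password callback stands in for "submit the login form and see if it worked"; both are simplifications of mine. I've used a three-letter password in the demo because working up to "purple" this way means hundreds of millions of guesses.

```python
import itertools
import string


def candidates(max_length: int, alphabet: str = string.ascii_lowercase):
    """Yield "a".."z", then "aa".."zz", and so on up to max_length characters."""
    for length in range(1, max_length + 1):
        for letters in itertools.product(alphabet, repeat=length):
            yield "".join(letters)


def brute_force(check_password, max_length: int):
    """Try every candidate in order; return (password, attempts) or (None, attempts)."""
    attempts = 0
    for guess in candidates(max_length):
        attempts += 1
        if check_password(guess):
            return guess, attempts
    return None, attempts


if __name__ == "__main__":
    secret = "cab"  # hypothetical password, kept short so the demo finishes quickly
    found, attempts = brute_force(lambda guess: guess == secret, max_length=3)
    print(f"found {found!r} after {attempts} attempts")
```

Every one of those attempts is a separate login request against the server, which is why the defenses below work so well.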
Brute-force attacks are ridiculously easy to defeat. 1) Use lengthy passwords. A password over 8 characters is generally long enough to render brute-force attacks almost impossible to accomplish in less than a few weeks or months. 2) Limit users to a specific number of failed attempts per hour or day. Locking out an account after 5 incorrect guesses essentially eliminates brute-force attacks. 3) A slightly less draconian approach to #2 is to "throttle" or "slow down" the server response after a certain number of failed login attempts. This achieves roughly the same result with far less inconvenience to the user.
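Here's a rough sketch of the lockout idea from #2: count failed attempts per username inside a time window and refuse further logins once the limit is hit. The class name, the 5-attempt limit, and the one-hour window are assumptions for illustration, and a real application would keep these counts in its database rather than in memory.

```python
import time


class LoginThrottle:
    """Track failed logins per username and lock out after too many in a window."""

    def __init__(self, max_attempts: int = 5, window_seconds: int = 3600):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._failures = {}  # username -> list of failure timestamps

    def is_locked(self, username: str, now: float = None) -> bool:
        """Return True if the account has hit the failure limit within the window."""
        now = time.time() if now is None else now
        # Keep only the failures that are still inside the window.
        recent = [t for t in self._failures.get(username, [])
                  if now - t < self.window_seconds]
        self._failures[username] = recent
        return len(recent) >= self.max_attempts

    def record_failure(self, username: str, now: float = None) -> None:
        """Remember one more failed login attempt for this username."""
        now = time.time() if now is None else now
        self._failures.setdefault(username, []).append(now)

    def record_success(self, username: str) -> None:
        """A successful login clears the failure history."""
        self._failures.pop(username, None)


if __name__ == "__main__":
    throttle = LoginThrottle()
    for _ in range(5):
        throttle.record_failure("purple_fan")
    print(throttle.is_locked("purple_fan"))     # True
    print(throttle.is_locked("someone_else"))   # False
```

The throttling approach from #3 would use the same counter but answer with a growing delay instead of a flat refusal.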
[header=3]Passwords are Tough[/header]
That guessing passwords is a difficult endeavor is part of the reason why phishing and key logging are two of the most successful and widespread attacks. In fact, these two types of attacks are so successful, I sincerely doubt we'll ever have to worry about the other three methods of obtaining passwords! Basically, it all boils down to a single uncomfortable truth: Account security rests on the user. It doesn't rest on an administrator or a secured system somewhere across the country. If you enter your account details even once into a questionable site, you're going to be burned.
In the next post, I'll discuss in detail web application vulnerabilities in the wild. You'll then know enough to understand how web sites are compromised, cracked, and attacked.