Securing User Data

When building a web application, it's vital that user information is kept secure. If it isn't, people's faith in both you and your work will plummet. As I'm developing my own application from scratch, this is a very important part of my code to get right, and a difficult one.

I'm going to look at some of the research I've done on securing user information, covering how best to store data such as user passwords, as well as some of the techniques you can use to do so effectively and securely.

Security is Key

It goes without saying that keeping user data secure is important. A user entrusts you with some of their data, and it is up to you to keep it secure, even from yourself. As much as people shouldn't use the same password everywhere, it is a lamentable fact that they do. This brings me to the first point of securing user details:

Never store passwords in plain text.

Users don't want you knowing their passwords, and they definitely don't want anyone else knowing them if the worst should happen and your database is accessed by people who shouldn't have access to it. So how do we go about this? The first stage is hashing the passwords people use, so that only a one-way fingerprint of each password is stored. (This is often loosely called encrypting, but encryption can be reversed with a key, whereas a hash cannot.) Two hash functions are commonly used for this:

MD5
SHA1

Whilst either of these will work, SHA1 is generally considered the stronger of the two. Be aware, though, that both are fast, general-purpose hashes, and the PHP manual advises against using md5() or sha1() on their own for passwords. So we now know that hashing passwords is important, and we have some functions for doing so... but how do we use them? In PHP, my server-side language of choice, this is quite simple:

  
$md5 = md5('wordpass'); // Generates e80eded141e1295d694cd35cf2b8f675
$sha1 = sha1('wordpass'); // Generates 2939094f35a3badf2a890768ba034fa5eb16e95e

As you can see, either of these functions converts a password into what appears to be a random string of characters. The output is deterministic, however: the same input always produces the same hash. To check that the user has entered the correct password, we simply run the same hash function on what they enter and compare the result to the stored hash.
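As a rough sketch of that check, something like the following would do. The function names are my own invention for illustration, and I'm assuming PHP 7 or later for the type declarations. It reuses the sha1() call shown above and compares the two hashes with PHP's hash_equals(), which takes the same time whether or not the strings match.

```php
<?php
// Hypothetical helpers; the names are illustrative only.

// Hash a password for storage, using the same sha1() call as above.
function hashForStorage(string $password): string
{
    return sha1($password);
}

// Re-hash what the user typed and compare it with the stored hash.
// hash_equals() compares in constant time, which avoids timing leaks.
function passwordMatches(string $submitted, string $storedHash): bool
{
    return hash_equals($storedHash, sha1($submitted));
}

$stored = hashForStorage('wordpass');            // saved at registration

var_dump(passwordMatches('wordpass', $stored));  // bool(true)
var_dump(passwordMatches('passw0rd', $stored));  // bool(false)
```

The plain-text password is never written anywhere; only $stored would go into the database.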

It's not THAT Simple

Unfortunately security isn't quite that simple. MD5 and SHA1 are one-way functions, so a hash can't simply be reversed to get the password back, but if all you do is hash each password once then user details are still at risk. How? Not everyone on the internet has innocent intentions, and there are resources available, to people who look for them, that list huge numbers of common and generated passwords in plain text alongside their hashes.

These resources, often called Rainbow Tables, allow for a basic form of reverse engineering. Rather than reversing the hash, an attacker looks up a stolen hash in the table and reads off the password that produced it. So it's necessary to take things a bit further.
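To make the idea concrete, here is a toy version of such a lookup. The candidate list and the lookUp() helper are made up for this example, and a real Rainbow Table is stored far more compactly (using chains of hashes) than this plain array. From the attacker's point of view the result is the same, though: a hash goes in, a password comes out.

```php
<?php
// A toy stand-in for a precomputed table, mapping hash => plain text.
// Real tables cover millions of candidates; this list is invented.
$candidates = ['password', '123456', 'letmein', 'wordpass'];

$table = [];
foreach ($candidates as $candidate) {
    $table[sha1($candidate)] = $candidate;
}

// Hypothetical helper: look a stolen hash up in the table.
// Returns the matching password, or null if the hash isn't listed.
function lookUp(array $table, string $stolenHash): ?string
{
    return $table[$stolenHash] ?? null;
}

$stolen = sha1('wordpass');                     // as found in a leaked database

var_dump(lookUp($table, $stolen));              // string(8) "wordpass"
var_dump(lookUp($table, sha1('Tr0ub4dor&3')));  // NULL, not in the list
```

No hash was reversed here. The table only works because the password was hashed on its own, with nothing else mixed in.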

Added Security

To take things to the next level there are a few things you need to do. The first is to ensure that the stored hash relies on some other form of input as well as the password. The hash is then of the password plus additional characters, not of the password alone, so it won't match the entry a Rainbow Table holds for that password. This extra input is called a key (you may also see an application-wide secret like this called a pepper) and it adds a layer of protection to the user's information. Because you store the key for this specific use, it can easily be applied to every password when it is first hashed, as well as whenever a user logs in at a future date.

Example of using a key

  
$key = 'vnfuwy8t92bjkvfs';
$md5pass = md5($key.'wordpass'); // Generates 2f5177c377a9d073d82c5685c8eaacc1
$sha1pass = sha1($key.'wordpass'); // Generates 64cfaf4fd8ab086fc4620964dacc67d2707dc587

Because of the additional characters added to the password before hashing, the stored hash no longer matches the one a precomputed table holds for the bare password, so a simple lookup fails.

Is This Enough?

Possibly, but if your site is hacked there is a chance the key could be discovered, rendering this protection useless. Besides, when it comes to other people's information, is there such a thing as too secure?

An additional form of security is to ensure that each password has another, unique string attached to it before it is hashed. This is referred to as salting. Because each account gets its own string, two users with the same password end up with different hashes, and an attacker has to attack each account separately instead of reusing one precomputed table. In the approach described here the salt is not stored as a value in its own right, so working out what it is means looking through the source of the application in question.

For my own hashes I am basing this on elements of the email address that the user registers with, as each email address can only be registered on the site once and so is always a unique identifier for the account. One consequence is that if a user changes their email address, the password has to be re-hashed at the same time. In my own example I append a hashed version of the email address to the password, something like the following:

  
$key = 'vnfuwy8t92bjkvfs';
$md5pass = md5($key.'wordpass'.md5('email@example.com')); // Generates 2f720caab50709a9228e0d985e1f47bf
$sha1pass = sha1($key.'wordpass'.sha1('email@example.com')); // Generates e08dcf66052f9f42802b572680e5f8be5f5a2862

This adds an additional layer of security that isn't stored anywhere in the site as a separate value; it is derived on registration and again when the user logs in. Even so, I'm not 100% certain that this is the best approach to salting the login information, but I am hopeful that, combined with everything else, it is strong enough to provide a secure environment for user details.
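For comparison, PHP (5.5 and later) also ships with password_hash() and password_verify(), which handle salting for you. This is a different technique from the email-based salt above, not a description of my own implementation. It is only a minimal sketch of how those built-in functions behave, but it may be worth weighing up against a hand-rolled scheme.

```php
<?php
// At registration: password_hash() generates a random salt on every call
// and embeds it, along with the algorithm and cost, in the returned string.
$stored = password_hash('wordpass', PASSWORD_DEFAULT);

// At login: password_verify() reads the salt back out of $stored,
// so nothing besides $stored needs to be kept.
var_dump(password_verify('wordpass', $stored));  // bool(true)
var_dump(password_verify('passw0rd', $stored));  // bool(false)

// Hashing the same password again gives a different string,
// because a fresh salt is used each time.
$again = password_hash('wordpass', PASSWORD_DEFAULT);
var_dump($stored === $again);                    // bool(false)
```

Because the salt travels inside the stored string, a change of email address would not affect the password hash with this approach.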

Wrapping Up

Above I have covered some of the most common ways to secure user data, along with basic examples of how they can be used. My own implementations differ slightly from these, but the general structure is similar. Hopefully this information will prove useful to some of the people who read this and, if you can think of ways to improve upon it, I'd love to hear your thoughts on this topic.