Phone Numbers: Input, Storage and Formatting

March 11th, 2006

It's quite common these days for a site to ask for one or more phone numbers. Be it home, work, fax or mobile, everyone's got a number, and you will invariably need to collect that piece of data some time soon.

Looking at some sites, it's blatantly obvious that they do nothing to validate the data before it goes into the database, and when it is later retrieved, it's printed out exactly as it was input. A better approach would be to:

  1. Validate the incoming phone number. Make sure it has enough digits at a minimum
  2. Store the phone number in a format that is easily retrievable and searchable
  3. When displaying this phone number in the future, we should have a standardised pattern of display

I'm going to place a limitation on this from the start. Since I'm from the UK, we'll use UK phone numbers. If you follow this through, you shouldn't find it too hard to adapt it for US phone numbers, or to add checking for international phone numbers.

Validation

Regular expressions come into play in this situation. Let's have a look at some formats of phone numbers that we might come across:

  • 0161 123 1234
  • 020 8123 1234
  • 07123 123 123
  • 02349 123 123
  • 01159 123 123
  • 0800 123 123

That is just a small selection. The separators between the digits should be optional, and there is generally only a small set of non-numeric characters used for separation and standardisation: + - . , and a blank space.

The simplest way to validate the phone number would be to strip out all the characters apart from our digits. We can then check to see if we have the required length of 10 or 11 digits:

PHP:
  $phone = '0161 123 1234';
  // strip everything that is not a digit
  $phone = preg_replace("/[^\d]/", '', $phone);
  // the number must be exactly 10 or 11 digits long
  if ((strlen($phone) !== 10) && (strlen($phone) !== 11)) { // phone number is either too big or too small
    die("Invalid Phone Number");
  }

The regular expression above replaces anything that is not a digit with 'nothing', i.e. it is removed. The cleaned number is then checked to ensure that it is 10 or 11 digits long, and if it is not, the script errors out. Assuming the phone number has passed, we now need to store it.
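
If you would rather not die() in the middle of a page, the same check can be wrapped in a function. This is only a sketch: cleanPhone() is a name I've made up for illustration, and returning null for a bad number is just one possible convention.

```php
<?php
// Hypothetical helper (not part of the original code): strip separators,
// then check the digit count. Returns null instead of calling die().
function cleanPhone(string $input): ?string
{
    // \D matches any character that is not a digit
    $digits = preg_replace('/\D/', '', $input);
    $length = strlen($digits);

    // UK numbers written with the leading 0 have 10 or 11 digits
    if ($length !== 10 && $length !== 11) {
        return null;
    }
    return $digits;
}

var_dump(cleanPhone('0161 123 1234')); // string(11) "01611231234"
var_dump(cleanPhone('0800-123-123'));  // string(10) "0800123123"
var_dump(cleanPhone('12345'));         // NULL
```

The caller can then decide what to do with a null, such as redisplaying the form with an error message.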

Storing our phone numbers

It's quite common for beginner coders to store phone numbers in a VARCHAR (or similar) field in order to maintain formatting. There are several issues with this:

  1. Size: Storing '020 8123 1234' in a VARCHAR field will consume 14 bytes of storage space (13 characters plus a length byte). Storing the same value as an integer in a BIGINT column takes 8 bytes. So it's a 43% saving on disk space
  2. Difficult to query: If your phone number exists as '020 8123 1234' and someone searches for '0208 123 1234', you won't get a match, even though the data is there.
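
To illustrate the second point, here is a rough sketch of how both spellings collapse to the same value once the separators are stripped. phoneToInt() is a hypothetical helper name, not something used elsewhere in this post.

```php
<?php
// Hypothetical helper: reduce a typed phone number to the integer we would store.
function phoneToInt(string $input): int
{
    $digits = preg_replace('/\D/', '', $input);
    // The (int) cast drops the leading 0, e.g. "02081231234" becomes 2081231234
    return (int) $digits;
}

$stored   = phoneToInt('020 8123 1234');
$searched = phoneToInt('0208 123 1234');

var_dump($stored);               // int(2081231234)
var_dump($stored === $searched); // bool(true)
```

Apply the same clean-up to the search term before querying and the two will match.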

The most efficient way is to store the phone number as an integer value. An INT UNSIGNED column gives us values from 0 up to 4294967295. However, this will not accommodate what we need (e.g. 08451231234 is out of range, and will be rejected or clipped to 4294967295 depending on the database's settings), so we need to use a BIGINT column. This has more than enough room for our number, and we will make it UNSIGNED, as we are not likely to have a negative phone number.
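
The range problem is easy to check in PHP itself. This sketch assumes a 64-bit PHP build (on a 32-bit build the cast below would not give the value shown), and INT_UNSIGNED_MAX is just a constant I've defined for the example.

```php
<?php
// Largest value an INT UNSIGNED column can hold, as quoted above
const INT_UNSIGNED_MAX = 4294967295;

// Assumes 64-bit PHP. The cast drops the leading 0.
$phone = (int) '08451231234';

var_dump($phone);                    // int(8451231234)
var_dump($phone > INT_UNSIGNED_MAX); // bool(true): too big for INT UNSIGNED
```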

Formatting our Phone Number

Once our phone number is in the database, retrieving it is a matter of a simple database query. However, what comes back is not really 'usable'. Our '020 8123 1234' has lost its spaces and, because it was stored as an integer, its leading 0 as well. In fact it looks like this: 2081231234.

Fortunately, formatting our number is very easy in PHP. In the UK, we have several different formats, generally following these rules:

  • If a number begins with '01' the format is 01xxx xxx xxx, unless it begins with one of the following, in which case the format is 01xx xxx xxxx:
    • 0121
    • 0131
    • 0141
    • 0151
    • 0161
    • 0191
    • 0113
    • 0114
    • 0115
    • 0116
    • 0117
  • If a number begins with '020' then the format is 020 xxxx xxxx
  • All other '02' numbers are in the following format: 02xxx xxx xxx
  • Mobile phone numbers are 07xxx xxx xxx
  • Freephone, local-rate and national-rate numbers are formatted like: 08xx xxx xxxx or 08xx xxx xxx. 05 is also a valid freephone prefix.
  • Premium Rate 09 numbers are formatted like: 09xx xxx xxxx

That is by no means a full list, but it should serve as a skeleton for us to work from. In order to determine our formats, we have an array of area codes that we can use. They are listed without the leading 0 (which the integer column has dropped), so once the 0 is added back each one is a 4-digit area code.

PHP:
  $national = array("161","115","800","500","870","845","121","131","141","117",
    "118","151","114","116","191","113");

So we will want to check our number against those. We also want to check our number for '07', '09' and '020'. We could check this using strpos(), but on its own that only tells us the digits appear somewhere in the number, not that they are the first few digits. The other options are to use a regular expression, or to use substr() to grab the first few digits and do a direct comparison. We'll use a regular expression for this one: most people understand substr(), but a lot of people have issues with regular expressions, so hopefully this will clear a few things up!
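
For comparison, here is a small sketch of the strpos() and substr() routes. The number is held as a string here purely for illustration.

```php
<?php
$phone = '2081231234'; // stored number as a string, without the leading 0

// strpos() reports where the match starts; a prefix match is exactly position 0.
// The strict === matters, because "not found" is false, and false == 0.
var_dump(strpos($phone, '20') === 0);    // bool(true)
var_dump(strpos($phone, '81') === 0);    // bool(false): found, but at position 2

// substr() grabs the first digits for a direct comparison
var_dump(substr($phone, 0, 2) === '20'); // bool(true)
```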

PHP:
  $phone = 2081231234;
  // A regular expression to check for 07xxx, 09xx and 020 area codes
  $phoneRegex = preg_match("/^(20|7|9)/", $phone, $matches);
  // $phoneRegex will be 1 on a successful match and $matches[1] will contain the digits that matched

  // checking other area codes
  $phone = 8451231234; // example local rate number
  // $national is our array from above
  if (in_array(substr($phone, 0, 3), $national)) {
    // do formatting
  }

So now we know how to determine what kind of phone number we have, it's just a matter of formatting it. This can be done by splitting our number using sscanf() and then simply using the pieces as needed. Here's an example:

PHP:
  $phone = 2081231234;
  print_r(sscanf($phone, "%2d%4d%4d"));
  /* gives:
  Array
  (
      [0] => 20
      [1] => 8123
      [2] => 1234
  ) */

  // We can avoid using an array too:
  sscanf($phone, "%2d%4d%4d", $area, $first, $last);
  /* Gives:
  $area: 20
  $first: 8123
  $last: 1234 */
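
One caveat with %d is worth knowing about: each group is converted to an integer, so a group that starts with 0 loses its leading zeros. The sketch below shows the effect, along with one possible workaround using substr(). splitDigits() is a hypothetical helper I've made up for the example, not a built-in.

```php
<?php
$phone = '2081230012'; // 020 8123 0012 without its leading 0

// %d converts each group to an integer, so "0012" comes back as 12
echo implode(' ', sscanf($phone, '%2d%4d%4d')), "\n"; // 20 8123 12

// Hypothetical helper: cut the string into groups of the given widths,
// keeping every digit exactly as it was stored.
function splitDigits(string $digits, array $widths): array
{
    $parts  = [];
    $offset = 0;
    foreach ($widths as $width) {
        $parts[] = substr($digits, $offset, $width);
        $offset += $width;
    }
    return $parts;
}

echo implode(' ', splitDigits($phone, [2, 4, 4])), "\n"; // 20 8123 0012
```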

One last check we need to do is to check how long our number is. Sometimes we can get a shorter number like 0800 123 123, so we need to account for this as well. So now we know how to determine the format for our number, let's roll it all into one function:

PHP:
  function formatPhone($phone) {
    // format Phone Number
    // define a few variables:
    // our array of national area codes for format like 0xxx xxx xxxx
    $national = array("161","115","800","500","870","845","121","131","141","117",
      "118","151","114","116","191","113");
    $format = '';
    $length = strlen($phone);

    // check if our number matches 020, 07 or 09:
    if (preg_match("/^(20|7|9)/", $phone, $matches)) {
      // we have a match. All these numbers have 11 digits (including the leading 0)
      if ($matches[1] == 20) {
        $format = "%2d%4d%4d";
      } elseif ($matches[1] == 7) {
        $format = "%4d%3d%3d";
      } else {
        $format = "%3d%3d%4d";
      }
    } elseif (in_array(substr($phone, 0, 3), $national)) {
      // matched our $national array; 9 stored digits means a short number like 0800 123 123
      $format = ($length == 9) ? "%3d%3d%3d" : "%3d%3d%4d";
    } else {
      // if it doesn't match, then it's going to be 0xxxx xxx xxx
      $format = "%4d%3d%3d";
    }
    // We now have a pattern. We simply need to match the pattern to our phone number, and return the format
    $result = sscanf($phone, $format);
    return '0' . implode(" ", $result); // return with the leading 0
  }
  echo formatPhone(2081231234), "\n"; // gives 020 8123 1234
  echo formatPhone(8001231234), "\n"; // gives 0800 123 1234
  echo formatPhone(7123123123), "\n"; // gives 07123 123 123

And there we have it. Simple formatting of phone numbers. It is by no means a complete formatting routine, and I'm sure there are numbers that will not be formatted properly. It also doesn't account for use of +44 as an international code, although this could be easily fixed by changing the return line to use '+44' instead of '0'.
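
As a rough sketch of the +44 idea, here is one way to convert an already formatted national number. toInternational() is a hypothetical helper, and the space after +44 is my own choice of layout rather than anything the function above produces.

```php
<?php
// Hypothetical helper: turn a formatted national number into +44 form.
// Assumes $national starts with the leading 0, e.g. "020 8123 1234".
function toInternational(string $national): string
{
    // Drop the leading 0 and put the country code in its place
    return '+44 ' . substr($national, 1);
}

echo toInternational('020 8123 1234'), "\n"; // +44 20 8123 1234
echo toInternational('07123 123 123'), "\n"; // +44 7123 123 123
```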

I'd recommend reading up on sscanf() and getting used to it. It can definitely come in useful! Finally, I used the ternary operator in the above function, which may have looked a bit confusing to beginners. However, it is something else you should get into the habit of using for small if...else statements.
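
For anyone who hasn't met the ternary operator before, here is the length check from the function written both ways. The value of $length is just an example.

```php
<?php
$length = 9; // example: a short number such as 800123123

// Long form
if ($length == 9) {
    $format = "%3d%3d%3d";
} else {
    $format = "%3d%3d%4d";
}

// The same decision with the ternary operator: condition ? if-true : if-false
$short = ($length == 9) ? "%3d%3d%3d" : "%3d%3d%4d";

var_dump($format === $short); // bool(true)
```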



Entry Filed under: Input Validation

8 Comments

  • 1. Rob...  |  March 11th, 2006 at 12:26 pm

    You missed out that 01xxx can also have 5 digits following: i.e. 01xxx xxxxx.

    Certainly Worcesters (01905) and Blackburn have some numbers like this, so presumably others do too.

  • 2. Ben  |  April 9th, 2006 at 8:14 pm

    Also worth noting that "0118" is missing from the list of area codes

    Internal extensions also need a mention here, it's not uncommon to see them tacked on the end of a phone number separated with a #

  • 3. cj  |  May 2nd, 2006 at 11:34 pm

    I'm sick of trying to conform to other countries numbering or typography conventions. Those systems that don't allow a country prefix are broken. And those that won't let me use a "+" prefix to indicate the number is in IDD format annoy me too. Validation is difficult to get right, requires continual maintenance, and the risk of annoying potential customers is large.

  • 4. lslinnet  |  July 27th, 2007 at 2:09 pm

    Have you read about international phone numbers? perhaps you should think about adjusting this to support every known mobilenumber ( http://en.wikipedia.org/wiki/Calling_code )

  • 5. Nick Allen  |  January 7th, 2008 at 3:36 pm

    Well no-one else did, so I'll say thanks! Very handy.

  • 6. Chris Pink  |  February 28th, 2008 at 6:41 pm

    You missed the backslash in the preg_replace line, should read;

    $phone = preg_replace("/[^\d]/",'',$phone);

    I consider it a triumph i realised this (eventually)

  • 7. Peter Lange  |  April 29th, 2008 at 4:48 pm

    Yes, not enough people are doing so, so let me say thank you for your efforts. They are much appreciated and proving extremely handy in the work I am doing currently.

  • 8. John  |  August 8th, 2008 at 5:25 pm

    Surely the preg_replace line, should read;
    $phone = preg_replace("/[^\d]/",'',$phone);

    But, thanks very much. Nice tutorial n UK too!
