Email address validation, again

Tag: regex Author: smile597 Date: 2009-09-05

Looking at the posts here for email address validation, I am looking to be much more liberal about the client side test I am performing.

The closest I have seen so far is:

^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$

That will not match this#[email protected], which is valid according to the RFC. The local part may contain:

  • Uppercase and lowercase English letters (a-z, A-Z)
  • Digits 0 through 9
  • Characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~
  • Character . (dot, period, full stop) provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively.
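The character rules above can be sketched as a single pattern for the unquoted local part. This is only an illustration: the function name isPlainLocalPart is made up here, and quoted local parts are ignored.

```javascript
// Hypothetical helper: checks only the unquoted local part (the text before the @).
function isPlainLocalPart(localPart) {
  // One or more allowed characters, then optional ".more" groups, so a dot
  // can never be first, last, or doubled.
  var allowed = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+";
  var re = new RegExp("^" + allowed + "(?:\\." + allowed + ")*$", "i");
  return re.test(localPart);
}

console.log(isPlainLocalPart("this#that")); // true
console.log(isPlainLocalPart(".this"));     // false
console.log(isPlainLocalPart("a..b"));      // false
```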

I want a pretty simple match:

  • Does not start with .
  • Any character allowed up to the @
  • Any character allowed after the @
  • No consecutive . or @ allowed
  • Part after the last . (tld) must be [a-z0-9-]

I will use the i flag to make the search case insensitive. The consecutive characters are where I am getting hung up.
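One way to express the rules listed above, including the consecutive-character rule, is a negative lookahead. This is a rough sketch rather than a vetted validator; the name looksLikeEmail is made up, and it additionally assumes at least one dot after the @.

```javascript
// Hypothetical helper name; a sketch of the "pretty simple match" rules.
function looksLikeEmail(address) {
  // (?!\.)              does not start with a dot
  // (?!.*(?:\.\.|@@))   no ".." or "@@" anywhere in the string
  // .+@.+               any characters before and after an @
  // \.[a-z0-9-]+$       part after the last dot limited to [a-z0-9-]
  var re = /^(?!\.)(?!.*(?:\.\.|@@)).+@.+\.[a-z0-9-]+$/i;
  return re.test(address);
}

console.log(looksLikeEmail("this#that@example.com")); // true
console.log(looksLikeEmail("a..b@example.com"));      // false
console.log(looksLikeEmail("a@@example.com"));        // false
```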

Duplicate of , and many others.
I have been working on one, and it looks like this is going to cover it broadly: [email protected](?:[-a-z0-9]+\.)+[a-z]{2,10} I do not suppose there will be a TLD longer than 10 characters; .museum seems to be the current record holder in strangeness.
possible duplicate of How to validate an email address in PHP (see the regex pattern in there)

Other Answer1

If you want to match against the official standard, you can use

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

So even when following official standards, there are still trade-offs to be made. Don't blindly copy regular expressions from online libraries or discussion forums. Always test them on your own data and with your own applications.

Other Answer2

/^[^.].*@(?:[-a-z0-9]+\.)+[-a-z0-9]+$/
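A quick check of this pattern against a few made-up addresses, as a sketch only. The i flag is added here as an assumption, following the question. Because .* accepts anything before the last @, the pattern as written still lets ".." and "@@" through in the first part.

```javascript
// The pattern from this answer, with the i flag added (an assumption).
var pattern = /^[^.].*@(?:[-a-z0-9]+\.)+[-a-z0-9]+$/i;

// Made-up sample addresses, not taken from the thread.
var samples = [
  "this#that@example.com", // accepted
  ".this@example.com",     // rejected: starts with a dot
  "a@ex#mple.com",         // rejected: # in the domain
  "a..b@example.com",      // accepted, although ".." is consecutive
  "a@@example.com"         // accepted, although "@@" is consecutive
];

var results = samples.map(function (s) { return pattern.test(s); });
console.log(results); // [ true, false, false, true, true ]
```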

comments:

Seems near perfect, one exception: [email protected] [email protected] [email protected] [email protected] t#[email protected] t#[email protected]#mple.com <- this matches t#[email protected]#mple.c#om. Anything after the last @ should be only [a-z0-9-] (valid domain chars), then a dot, then another [a-z0-9-].
No idea how to reformat that comment, sorry about that.
In your question, you said only the TLD name should be [-a-z0-9]. Fixing that is trivial.

Other Answer3

It depends on who is using your applications. For internal applications, a bare username is often a valid email address. Much of the RFC-822 email spec describes additional fields which may be present in an email address. For example, Allen Town [email protected], is a pretty standard email address which you might type into your favorite mail client. However, for an application, you may want to be the one adding the name to the email address when you send email, and not want that to be part of the user's address.

The most liberal way of validating an email address is to just attempt to send an email to whatever address the user gives. If they receive the email, and can confirm it, then it's a valid address.

comments:

I understand this, but I would like some up-front validation, just so the AOL users cannot make mistakes :) There will be no local delivery, so the email must be in the format of [email protected]

Other Answer4

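function validator(email) {
   // Returns true when the address is BAD, false when it passes these checks.
   var bademail = false;
   bademail = (email.indexOf(".") == 0) ? true : bademail;   // starts with a dot
   bademail = (email.indexOf("..") != -1) ? true : bademail; // consecutive dots
   bademail = (email.indexOf("@@") != -1) ? true : bademail; // consecutive @
   if(!bademail) {
      // Anchored so the whole TLD must consist of [a-z0-9-], ignoring case.
      // Unanchored, a single valid character anywhere in the TLD would pass.
      var tldTest = new RegExp("^[a-z0-9-]+$", "i");
      var lastperiodpos = email.lastIndexOf(".");
      var tldstr = email.slice(lastperiodpos + 1);
      bademail = (!(tldTest.test(tldstr))) ? true : bademail;
   }
   return bademail;
}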

comments:

+1 because some boor gave you a -1 without leaving a comment. I hate that!
Thanks. I just figured I'd actually keep it simple, as requested, rather than involve regex where it's not needed. Wish I could have thought of a way to use it at the end that wasn't convoluted.
-1. I don't think that ".." is generally illegal in an email address. A friend of mine once had such an email address. You should simply remove that rule. And what about special characters in the domain part? They are also allowed, but must be translated according to RFC 3492. So this is not a correct answer.
@pvorb: You are wrong... ".." (consecutive dots) is generally illegal in both the local part and domain part of the address. en.wikipedia.org/wiki/E-mail_address#Local_part
@EricJ.: Thanks for the clarification. It seems that both I and my friend's email provider have been wrong.
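On the RFC 3492 point raised in the comments: one way to get the translated (Punycode) form of a domain before applying an ASCII-only check is to let the standard URL parser do the conversion. This is a sketch; the helper name toAsciiDomain and the sample domain are made up, and it assumes an environment with a global URL (modern browsers, Node.js).

```javascript
// Hypothetical helper: converts an internationalized domain to its ASCII form
// by parsing it as the host of a throwaway http URL.
function toAsciiDomain(domain) {
  // new URL throws a TypeError if the host cannot be parsed.
  return new URL("http://" + domain).hostname;
}

console.log(toAsciiDomain("bücher.example")); // "xn--bcher-kva.example"
console.log(toAsciiDomain("example.com"));    // "example.com"
```

The converted name contains only letters, digits, dots, and hyphens, so it passes a [a-z0-9-] style check on each label.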

Other Answer5

A very Perl-ish RFC822 compliant regular expression can be found here

Other Answer6

The following has been useful for me for quite some time now.

function validateEmail(email) { 
    var re = /^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;
    return re.test(email);
} 

Other Answer7

A perfect validation regex is probably hard to write, but I've used this one for quite some time:

/^([\w-\.\+])+\@([\w-]+\.)+([\w]{2,6})+$/

Only changed it recently to match 6-char TLDs.

comments:

What about 7-x char TLDs or quoted strings?
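A quick check on the question above, using made-up addresses. Because the final group ([\w]{2,6}) is itself followed by +, it can repeat, so the pattern appears to accept TLDs longer than six characters as written. Quoted local parts are rejected, since the first character class has no quote or space.

```javascript
// The pattern from this answer, unchanged.
var pattern7 = /^([\w-\.\+])+\@([\w-]+\.)+([\w]{2,6})+$/;

console.log(pattern7.test("user@example.museum"));      // true  (6-char TLD)
console.log(pattern7.test("user@example.photography")); // true  (11 chars, matched as 6 + 5)
console.log(pattern7.test("user@example.c"));           // false (1-char TLD)
console.log(pattern7.test('"john doe"@example.com'));   // false (quoted local part)
```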