Regular expression for validating domain name
A fairly common code sample from Rails applications with some sort of authentication system validates the format of an email address with a regular expression. If you’re experienced with regular expressions, this seems simple. If (like me when I first saw this) you aren’t, it takes a while to parse.

Sections 3.2.4 and 3.4.1 of the RFC go into the requirements on how an email address needs to be formatted and, well, there’s not much you can’t do in your email address when quotes or backslashes are involved. The local string (the part of the email address that comes before the @) can contain a wide range of characters, so some very odd-looking strings are valid email addresses.

For this reason, for a time I began running any email address against a much simpler regular expression instead. Simple, right? This is often the most I do and, when paired with a confirmation field for the email address on your registration form, it can alleviate most problems with user error.

The only good solution I can think of is to perform the validation so that it accepts rarely used (but valid) syntax but warns the user to double-check the address for typos. The domain part of the address is much easier to handle.

This PHP script uses regular expressions to check whether a given input is a syntactically valid email address. It won’t go into the very basics of regexps, since there are some very good tutorials available on the web. Note that I need a regular expression, not an external web service or tool.
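The approach described above (accept rarely used but valid syntax, warn about it, and check the domain part separately) might be sketched roughly as follows in Ruby. This is not the article's original code: the names `DOMAIN_LABEL`, `DOMAIN_PATTERN`, `COMMON_LOCAL_PART`, `valid_domain?` and `check_email` are hypothetical. The domain rules are the usual hostname conventions (labels of 1 to 63 letters, digits or hyphens, no hyphen at either end, at most 253 characters overall). As a simplifying assumption, the sketch requires at least two labels and rejects IP-literal domains.

```ruby
# One DNS label: 1-63 letters, digits, or hyphens, with no hyphen at either end.
DOMAIN_LABEL = /[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?/i

# A dotted domain name with at least two labels (assumption: bare names such as
# "localhost" and IP literals are rejected), anchored to the whole string.
DOMAIN_PATTERN = /\A#{DOMAIN_LABEL}(?:\.#{DOMAIN_LABEL})+\z/

# Local parts made only of these characters are treated as "ordinary".
COMMON_LOCAL_PART = /\A[a-z0-9._%+-]+\z/i

def valid_domain?(domain)
  domain.length <= 253 && DOMAIN_PATTERN.match?(domain)
end

# Returns [accepted, warning]. Unusual local parts are accepted, but with a warning.
def check_email(address)
  # Split on the LAST "@", since a quoted local part may itself contain "@".
  local, at, domain = address.rpartition("@")
  return [false, nil] if at.empty? || local.empty? || !valid_domain?(domain)

  warning = COMMON_LOCAL_PART.match?(local) ? nil : "Unusual address; please double-check it for typos."
  [true, warning]
end
```

With this sketch, `check_email("user@example.com")` is accepted with no warning, while an address with a quoted local part is accepted with a warning instead of being rejected. Syntax checks of this kind only catch typos; they cannot show that the mailbox exists.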
If you actually check the Google query I linked above, people have been writing (or trying to write) RFC-compliant regular expressions to parse email addresses for years. But what if I told you there were a way to determine whether or not an email is valid without resorting to regular expressions at all? It’s surprisingly easy, and you’re probably already doing it anyway.

The activation email is a practice that’s been in use for years, but it’s often paired with complex validations that the email is formatted correctly. If you’re going to send an activation email to users, why bother using a gigantic regular expression?

A regular expression is a pattern that the regular expression engine attempts to match in input text. A pattern consists of one or more character literals, operators, or constructs. Each section in this quick reference lists a particular category of characters, operators, and constructs that you can use to define regular expressions: character escapes, character classes, anchors, grouping constructs, quantifiers, backreference constructs, alternation constructs, substitutions, regular expression options, and miscellaneous constructs.

The backslash character (\) in a regular expression indicates that the character that follows it either is a special character (as shown in the following table), or should be interpreted literally.

Anchors, or atomic zero-width assertions, cause a match to succeed or fail depending on the current position in the string, but they do not cause the engine to advance through the string or consume characters. The metacharacters listed in the following table are anchors.

Grouping constructs delineate subexpressions of a regular expression and typically capture substrings of an input string.
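As a rough illustration of escapes, anchors and grouping constructs, here is a small Ruby sketch. It is an assumption-laden example, not part of the quick reference: the constant names and the `split_domain` helper are hypothetical, and the pattern is deliberately simplistic (a single name followed by an alphabetic top-level domain).

```ruby
# "\." is a character escape: it matches a literal dot, not "any character".
# Without anchors, the pattern may match anywhere inside the input.
UNANCHORED = /[a-z0-9-]+\.[a-z]{2,}/i

# \A and \z are zero-width anchors for the start and end of the whole string.
# (In Ruby, ^ and $ match at line boundaries, so \A and \z are the safer choice.)
ANCHORED = /\A[a-z0-9-]+\.[a-z]{2,}\z/i

# Named groups delineate subexpressions and capture the substrings they match.
GROUPED = /\A(?<name>[a-z0-9-]+)\.(?<tld>[a-z]{2,})\z/i

# Returns [name, tld] when the whole input looks like "name.tld", else nil.
def split_domain(input)
  m = GROUPED.match(input)
  m && [m[:name], m[:tld]]
end

UNANCHORED.match?("see example.com!")  # => true  (matches a substring)
ANCHORED.match?("see example.com!")    # => false (must match the whole string)
split_domain("example.com")            # => ["example", "com"]
```

The anchors consume no characters themselves; they only constrain where the surrounding pattern is allowed to match.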