As much as JavaScript offers the Web application developer, it has a rather big hole when it comes to URI-handling functions. It’s something that has always seemed a little strange to me: a language now so closely tied to the Web, yet missing the functionality to handle the environment’s main resource identifiers.
Nonetheless, I recently found myself in need of a function to extract the hostname from a URL string, purely for presentational purposes. Unfortunately my framework of choice, MooTools, didn’t have a ready-made solution to hand. Since I could be sure the incoming string would always be a complete URL, and never part of a longer string, the only requirement was to extract the “www.google.com” from “http://www.google.com/something?foo=bar”. No need to be too clever about it; a quick’n’dirty regular expression does the trick:
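```javascript
function getHostname(str) {
    // Scheme (ftp, ftps, http or https) plus ://, then capture up to the first slash
    var re = new RegExp('^(?:f|ht)tp(?:s)?://([^/]+)', 'i');
    var match = str.match(re);
    // match() returns null when the string is not a supported URL
    return match ? match[1] : null;
}
```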
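This function supports URLs with ftp://, ftps://, http://, and https:// prefixes, but nothing more. Note the use of (?:f|ht) and (?:s). These are non-capturing groups: each matches the expression inside, but doesn’t store the match in the output array. We’re only interested in the hostname, so we reduce the amount of cruft coming back in the result array. Strictly speaking, the captured group is everything between the :// and the first slash, so any user name or port in the URL comes back too.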
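To make the difference concrete, the following sketch runs two variants of the pattern against the same URL: one with ordinary capturing groups, and one with the non-capturing groups used above. The variable names are illustrative only.

```javascript
var url = 'https://www.google.com/something?foo=bar';

// Capturing groups: every parenthesised part is stored in the result array.
var capturing = url.match(/^(f|ht)tp(s)?:\/\/([^\/]+)/i);
// capturing[1] is 'ht', capturing[2] is 's', capturing[3] is 'www.google.com'

// Non-capturing groups (?:...): only the hostname group is stored.
var nonCapturing = url.match(/^(?:f|ht)tp(?:s)?:\/\/([^\/]+)/i);
// nonCapturing[1] is 'www.google.com'
```

In both cases index 0 holds the full matched text, so the non-capturing version returns a two-element array instead of a four-element one.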
Can it be improved?
Creating quick functions like this in the global namespace is rather messy (and somewhat lazy!). What happens if someone else comes along and writes their own getHostname() function after this – POP!… it’s gone.
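One common way to reduce that risk, short of touching any prototypes, is to hang the function off a single namespace object. A minimal sketch, assuming a hypothetical namespace called UrlUtils (a name made up for this example, not something from MooTools), and returning null for a non-matching string as a choice of the sketch:

```javascript
// Hypothetical namespace object; reuse it if another script already created it.
var UrlUtils = UrlUtils || {};

UrlUtils.getHostname = function (str) {
    var match = str.match(/^(?:f|ht)tps?:\/\/([^\/]+)/i);
    return match ? match[1] : null;
};

// A global getHostname() written later by someone else no longer clobbers ours.
function getHostname() {
    return 'something else entirely';
}

var host = UrlUtils.getHostname('http://www.google.com/something?foo=bar');
// host is 'www.google.com'
```

The namespace itself can still be overwritten, of course, but there is now only one global name to protect.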
Since this getHostname() function is acting on an input string, and returning a string, why not extend the native String object prototype? Well, we can, and it requires hardly any modification to the code for great benefit. Shocking, a feature of object-oriented programming in a procedural language!
```javascript
String.prototype.getHostname = function () {
    var re = new RegExp('^(?:f|ht)tp(?:s)?://([^/]+)', 'i');
    // `this` is the string the method was called on
    var match = this.match(re);
    return match ? match[1] : null;
};

// now you can do things like mystring.getHostname();
```
All we needed to do was perform our regular expression match on the string instance itself (using the this keyword, which references the current string instance), instead of passing in a string as a function argument each time. Note that we’re not actually changing the value of the string, just returning the result of calling getHostname() on it.
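A short sketch of that last point, repeating the prototype method so the snippet runs on its own. The variable names are made up for illustration, and returning null for a non-matching string is an assumption of this sketch.

```javascript
String.prototype.getHostname = function () {
    // `this` is the string the method was called on
    var match = this.match(/^(?:f|ht)tps?:\/\/([^\/]+)/i);
    return match ? match[1] : null;
};

var link = 'http://www.google.com/something?foo=bar';
var host = link.getHostname();
// host is 'www.google.com'; link still holds the full URL

// A string that doesn't start with a supported scheme gives null here.
var none = 'mailto:someone@example.com'.getHostname();
```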
But wait, what about MooTools?
Indeed. I’m using MooTools, so can it be integrated into MooTools in a similar way? Yes, but the method is just a little different. The object-oriented design of the MooTools framework means that the String object is already extended and built upon into a “MooTools Native”. The concept of Natives in MooTools is explained over at the MooTools Blog. Suffice to say, MooTools makes it possible to do all kinds of magic with the native JavaScript objects, which is beyond the scope of this entry.
To follow by example, we extend the String “Native” through the implement() method that MooTools gives it:
```javascript
String.implement({
    getHostname: function () {
        var re = new RegExp('^(?:f|ht)tp(?:s)?://([^/]+)', 'i');
        // `this` is the string the method was called on
        var match = this.match(re);
        return match ? match[1] : null;
    }
});
```
Now things are much tidier. To see it in action, check the Compendium image galleries and look for links in the lightbox popups. There’s also some test data to play about with:
```javascript
var str1 = 'http://www.google.com';
var str2 = 'https://www.yahoo.com/';
var str3 = 'http://www.google.com/foo?bar=quux';
var str4 = 'HTTPS://yahoo.co.uk/hello/world/example.jpeg';
var str5 = 'hTtPs://WwW.iMaGeS.sOmEtHiNg.YaHoO.cO.uK/hello/world/example.jpeg';
var str6 = 'ftp://user@foo:ftp.foo.bar/files/';
var str7 = 'ftp://ftp.me.you//';

alert(str1.getHostname()); // www.google.com
alert(str2.getHostname()); // www.yahoo.com
alert(str3.getHostname()); // www.google.com
alert(str4.getHostname()); // yahoo.co.uk
alert(str5.getHostname()); // WwW.iMaGeS.sOmEtHiNg.YaHoO.cO.uK
alert(str6.getHostname()); // user@foo:ftp.foo.bar
alert(str7.getHostname()); // ftp.me.you
```
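For comparison, and only as an aside: runtimes that implement the WHATWG URL standard (current browsers and Node.js) expose a built-in URL constructor, which this entry does not rely on. The sketch below assumes such a runtime, and getHostnameViaUrl is a hypothetical helper name. It is not a drop-in replacement for the regular expression, since it normalises the hostname to lower case and rejects strings it cannot parse.

```javascript
// Assumes a runtime with the standard URL constructor (modern browsers, Node.js).
function getHostnameViaUrl(str) { // hypothetical helper name
    try {
        return new URL(str).hostname;
    } catch (e) {
        return null; // not a parseable absolute URL
    }
}

var a = getHostnameViaUrl('http://www.google.com/foo?bar=quux');
var b = getHostnameViaUrl('HTTPS://yahoo.co.uk/hello/world/example.jpeg');
var c = getHostnameViaUrl('not a url');
// a is 'www.google.com', b is 'yahoo.co.uk', c is null
```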
If you’ve got any other useful URL handling functions, please let me know.