As much as JavaScript offers to the Web application developer, it has a rather big hole when it comes to URI handling functions. It’s something that’s always seemed a little strange to me: a language, now rather tied to the Web, missing functionality to handle the main resource identifiers of its environment.

Nonetheless, I recently found myself in need of a function to extract the hostname from a URL string, simply for presentational purposes. Unfortunately my framework of choice, MooTools, didn’t have a ready-made solution to hand. Since I could be sure the incoming string would always be a complete URL, and wouldn’t be part of a longer string, the only requirement was to extract the “www.google.com” from “http://www.google.com/something?foo=bar”. No need to be too clever about it; a quick’n’dirty regular expression does the trick:

#Code
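function getHostname(str) {
    // Match ftp://, http:// or https:// (case-insensitive) and capture up to the next slash
    var re = /^(?:f|ht)tp(?:s)?:\/\/([^\/]+)/i;
    var match = str.match(re);
    // match() returns null when the string is not a URL of this form
    return match ? match[1] : null;
}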


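This function supports URLs with ftp://, http://, and https:// prefixes (the pattern also lets ftps:// through), but nothing more. Note the use of (?:f|ht) and (?:s). These are non-capturing groups: each matches the expression inside, but doesn’t store the match in the output array. We’re only interested in the hostname, so we reduce the amount of cruft coming back in the result array. Strictly speaking, the capture is everything between the :// and the next slash, so any user or port information in the URL comes back too.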
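
To make the effect of the non-capturing groups concrete, here is a small sketch comparing the result arrays with and without them. The variable names are only for illustration and the URL is a variation on the example above.

```javascript
var url = 'https://www.google.com/something?foo=bar';

// Plain (capturing) groups: every group adds an entry to the result array
var capturing = url.match(/^(f|ht)tp(s)?:\/\/([^\/]+)/i);

// Non-capturing groups: only the hostname group is stored
var nonCapturing = url.match(/^(?:f|ht)tp(?:s)?:\/\/([^\/]+)/i);

console.log(capturing.length);    // 4: full match, 'ht', 's', 'www.google.com'
console.log(nonCapturing.length); // 2: full match, 'www.google.com'
console.log(nonCapturing[1]);     // www.google.com
```

Either form finds the hostname; the non-capturing version just means it always sits at index 1.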

Can it be improved?

Creating quick functions like this in the global namespace is rather messy (and somewhat lazy!). What happens if someone else comes along and writes their own getHostname() function after this – POP!… it’s gone.

Since this getHostname() function is acting on an input string, and returning a string, why not extend the native String object prototype? Well, we can, and it requires hardly any modification to the code for great benefit. Shocking: a feature of object-oriented programming in a language so often written procedurally!

#Code
String.prototype.getHostname = function() {
    // `this` is the string the method was called on
    var re = /^(?:f|ht)tp(?:s)?:\/\/([^\/]+)/i;
    var match = this.match(re);
    return match ? match[1] : null;
};

// now you can do things like mystring.getHostname();


All we needed to do was perform our regular expression match on the string instance itself (using the this keyword, which references the current string instance), instead of passing in a string as a function argument each time. Note that we’re not actually changing the value of the string, just returning the value of the getHostname() function on the string.
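
A quick way to convince yourself that the string is left untouched is sketched below. The method is repeated, with a guard for non-matching input, so the snippet runs on its own; the link and host variable names are only for illustration.

```javascript
// The prototype method from above, with a null guard for non-URL input
String.prototype.getHostname = function () {
    var match = this.match(/^(?:f|ht)tp(?:s)?:\/\/([^\/]+)/i);
    return match ? match[1] : null;
};

var link = 'http://www.google.com/something?foo=bar';
var host = link.getHostname();

console.log(host); // www.google.com
console.log(link); // http://www.google.com/something?foo=bar (unchanged)
```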

But wait, what about MooTools?

Indeed. I’m using MooTools, so can it be integrated into MooTools in a similar way? Yes, but the method is just a little different. The object-oriented nature of the MooTools framework design means that the String object is already extended and built upon into a “MooTools Native”. The concept of Natives in MooTools is explained over at the MooTools Blog. Suffice to say, MooTools makes it possible to do all kinds of magic with the native JavaScript objects that’s out of the scope of this entry.

By way of example, we extend this base “Native” through the implement() method that MooTools gives it, in the same style as Class.implement().

#Code
String.implement({
    getHostname: function() {
        var re = /^(?:f|ht)tp(?:s)?:\/\/([^\/]+)/i;
        var match = this.match(re);
        return match ? match[1] : null;
    }
});


Now, things are much tidier. To see it in action, check the Compendium image galleries and look for links in the lightbox popups. There’s also some test data to play about with:

#Code
var str1 = 'http://www.google.com';
var str2 = 'https://www.yahoo.com/';
var str3 = 'http://www.google.com/foo?bar=quux';
var str4 = 'HTTPS://yahoo.co.uk/hello/world/example.jpeg';
var str5 = 'hTtPs://WwW.iMaGeS.sOmEtHiNg.YaHoO.cO.uK/hello/world/example.jpeg';
var str6 = 'ftp://user@foo:ftp.foo.bar/files/';
var str7 = 'ftp://ftp.me.you//';

alert(str1.getHostname()); // www.google.com
alert(str2.getHostname()); // www.yahoo.com
alert(str3.getHostname()); // www.google.com
alert(str4.getHostname()); // yahoo.co.uk
alert(str5.getHostname()); // WwW.iMaGeS.sOmEtHiNg.YaHoO.cO.uK
alert(str6.getHostname()); // user@foo:ftp.foo.bar
alert(str7.getHostname()); // ftp.me.you
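
The str6 result shows that the capture is really everything before the first slash rather than just the host. If that matters, one possible refinement is sketched below. getHostOnly is a hypothetical name, not something from MooTools or the functions above, and it assumes that user info ends at an @ and that a port starts at a colon, so bracketed IPv6 hosts are not handled.

```javascript
// Hypothetical helper (not part of the original functions): skips an
// optional "user:password@" prefix and stops before any ":port" suffix.
function getHostOnly(str) {
    var re = /^(?:f|ht)tp(?:s)?:\/\/(?:[^\/@]*@)?([^\/:]+)/i;
    var match = str.match(re);
    return match ? match[1] : null;
}

console.log(getHostOnly('http://www.google.com/foo?bar=quux'));      // www.google.com
console.log(getHostOnly('ftp://user:secret@ftp.foo.bar:21/files/')); // ftp.foo.bar
console.log(getHostOnly('not a url'));                               // null
```

For the purely presentational use described here the simpler pattern is probably enough; this variant only earns its keep if URLs carrying credentials or ports are expected.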


If you’ve got any other useful URL handling functions, please let me know.


About

beardscratchers.com is a music-focused web experiment and creative-arts journal from London, England.


Journal content and design are © of Nick Skelton
