John Mitchell's (mostly DotNetNuke) Blog - Regex for replacing relative Url with absolute Url
 Monday, May 29, 2006

It took me a while to figure out how to do this one, so I thought I'd put it out here for future reference.

I wanted to replace all relative URLs in anchor tags with absolute ones. It was easy enough to find a way to capture all href= attributes. But I also wanted to make sure that any href=http:// attributes were not captured, since hrefs like those are probably already absolute.

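It took some googling, but I finally came across the magic phrase "negative look-ahead assertion".

That is the way to tell the regex engine that a match is allowed only if a given character sequence does NOT come next. The assertion itself consumes no characters, so whatever follows it in the pattern starts matching at the same position.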
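
To see the assertion on its own, here is a minimal sketch in Python (my choice for illustration, the post itself is not tied to a language). The simplified pattern and the sample strings are made up for the demo and are not the full regex from this post.

```python
import re

# (?!http) succeeds only where the next characters are NOT "http".
# It consumes nothing, so [^"']+ starts matching at the same position.
pattern = re.compile(r"""href=["'](?!http)([^"']+)""")

relative = '<a href="/about.aspx">About</a>'
absolute = '<a href="http://example.com/">Home</a>'

print(pattern.search(relative).group(1))  # /about.aspx
print(pattern.search(absolute))           # None
```

The second search finds nothing because the look-ahead fails right after the opening quote, and there is no other href in the string to try.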

So here it is, the regex for finding all <a href= values that do not start with http://. The negative look-ahead assertion is the (?!http) part. Because it only checks for the prefix http, it also skips https:// URLs.

(<\s*a\s+[^>]*href\s*=\s*[\"'])(?!http)([^\"'>]+)

and this is the replacement code:

$1http://www.snapsis.com$2
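
As a rough sketch of how the pattern and replacement fit together, here is the same idea in Python. The function name `absolutize_anchors` and the sample HTML are hypothetical, and Python uses `m.group(1)` where the replacement above uses `$1`. The pattern here stops before the closing quote, so the quote stays in the output untouched.

```python
import re

# Base URL from the post; swap in your own site root.
BASE_URL = "http://www.snapsis.com"

# Group 1: the tag up to and including the opening quote of href.
# Group 2: the URL itself, accepted only if it does not start with "http".
ANCHOR_HREF = re.compile(r"""(<\s*a\s+[^>]*href\s*=\s*["'])(?!http)([^"'>]+)""")

def absolutize_anchors(html: str) -> str:
    # Equivalent of the replacement "$1http://www.snapsis.com$2".
    return ANCHOR_HREF.sub(lambda m: m.group(1) + BASE_URL + m.group(2), html)

sample = '<a href="/Default.aspx">Home</a> <a href="http://www.example.com/">Ext</a>'
print(absolutize_anchors(sample))
```

Only the first link is rewritten. Note that the base URL is glued directly onto the captured value, so this assumes root-relative URLs that begin with a slash; something like href="page.aspx" would need a separator added.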

and if one wanted to take that a little further to also get img tags, then it can be done like this:

(<\s*(a|img)\s+[^>]*(href|src)\s*=\s*[\"'])(?!http)([^\"'>]+)

with the replacement code now being:

$1http://www.snapsis.com$4
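
The group number changes because the two new alternations, (a|img) and (href|src), become groups 2 and 3 and push the URL to group 4. A small Python sketch of the extended version, with a hypothetical function name and made-up sample HTML:

```python
import re

# Base URL from the post; swap in your own site root.
BASE_URL = "http://www.snapsis.com"

# Group 1: the tag up to the opening quote, group 2: tag name (a or img),
# group 3: attribute name (href or src), group 4: the non-http URL.
LINK_OR_IMAGE = re.compile(
    r"""(<\s*(a|img)\s+[^>]*(href|src)\s*=\s*["'])(?!http)([^"'>]+)"""
)

def absolutize(html: str) -> str:
    # Equivalent of the replacement "$1http://www.snapsis.com$4".
    return LINK_OR_IMAGE.sub(lambda m: m.group(1) + BASE_URL + m.group(4), html)

sample = (
    '<img src="/images/logo.gif" alt="logo"> '
    '<a href="/Home.aspx">Home</a> '
    '<img src="http://cdn.example.com/x.png">'
)
print(absolutize(sample))
```

The first image and the link get the base URL prepended, while the image that already has an http:// source is left alone.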

5/29/2006 9:40:09 PM (Central Standard Time, UTC-06:00)