

Thread: JavaScript filtering gone wrong

  1. #1
    Administrator weegillis's Avatar
    Join Date
    Oct 2003
    Posts
    5,789

    Post JavaScript filtering gone wrong

    For years now I've had a script running that polls the page for elements with a new-window class and rewrites the links they contain to include target="_blank" and the appropriate window.open method. Old school, but it worked flawlessly all these years. Until now.

    This site has never had any OBLs (outbound links) with a hash in the URL, so I was not prepared for the confusion I experienced today. I ran around plugging in inspector scripts all over the place to view the collections at every point, only to discover that the problem was my hash filter.
    Code:
    JavaScript
    
    // Treat any URL that contains "/#" as an in-page link.
    function inPage(x) { return x.match(/\/#/); }
    Nothing was broken. In fact, everything was working exactly as written. Sheesh...

    Wikimapia has a hash in the URL of their map locations. My filter was rejecting these links on that basis alone, and it took me two hours to figure this out, all because I was suspicious of my own code.
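To make the failure concrete, here is a minimal sketch of that filter run against a few links. Only inPage comes from the post; the sample URLs (including the exact shape of the Wikimapia one) and the samples/flagged names are assumptions for illustration.

```javascript
// The filter as posted: any URL containing "/#" counts as an in-page link.
function inPage(x) { return x.match(/\/#/); }

// Hypothetical sample links (assumed, not taken from the actual site).
var samples = [
  "http://example.com/page.html",                     // ordinary outbound link
  "http://example.com/#top",                          // genuine in-page anchor
  "http://wikimapia.org/#lat=49.25&lon=-123.1&z=12"   // outbound, but the hash is part of the resource
];

// Collect every URL the filter treats as in-page.
var flagged = samples.filter(function (url) { return inPage(url) !== null; });

console.log(flagged);
// The Wikimapia link is flagged alongside the real in-page anchor.
```

The filter does exactly what it was written to do; the Wikimapia link simply looks like an in-page link to it.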

    Had I actually made the connection when I first laid eyes on the URL in the location bar, it would have gone without a hitch. I would have said, "Gonna need a counter-effect for this one." and gone ahead and created one. But such is the realm of working with old code, whether we created it or not. If we don't study it well enough, it can come back and bite us. I was sending visitors to a 404 page that Wikimapia was gracious enough to make a pleasant experience, albeit futile. I didn't study this code well enough when I created this filter.

    This short-sighted filter can be sidestepped with a simple counter-measure encoded right in the URLs, namely '%23' in place of '#'. That alone is enough to get around my filter. But it slammed the door shut with a 404 at the other end. So I got around the filter but broke the link to the resource I was aiming at. Really smart. Scramble...

    Code:
    JavaScript
    
    // Despite the name, this restores the literal "#" from "%23" (first occurrence only).
    function hashEncode(x) { return x.replace(/%23/, "#"); }
    As long as this happens after the filter, there's no problem with the URL.
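The ordering matters, so here is a rough sketch of it. The prepareLink helper and the sample URLs are hypothetical and only illustrate "filter first, restore the hash second"; the real script is not shown in this thread.

```javascript
// Both functions as posted above.
function inPage(x) { return x.match(/\/#/); }
function hashEncode(x) { return x.replace(/%23/, "#"); }

// Hypothetical helper (assumed for illustration): returns the href to open
// in a new window, or null when the link is in-page and should be left alone.
function prepareLink(href) {
  if (inPage(href)) { return null; }  // step 1: the filter sees "%23", not "#"
  return hashEncode(href);            // step 2: put the real "#" back
}

// The link is authored with "%23" so that it slips past the filter.
var authored = "http://wikimapia.org/%23lat=49.25&lon=-123.1&z=12";

var outbound = prepareLink(authored);
var anchor = prepareLink("http://example.com/#top");

console.log(outbound); // http://wikimapia.org/#lat=49.25&lon=-123.1&z=12
console.log(anchor);   // null
```

Running hashEncode before the filter would put the "#" back too early and the link would be caught again.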

    Lesson learned: Never think one's code is bulletproof, and study the situation before you spend hours debugging something that already works. It's not always the real code; sometimes it is the MISSING code.

    <edit>

    Now the real problem is going to be how to give the unscripted client the correct URL. It just never ends, does it?

    </edit>
    Last edited by weegillis; 04-27-2012 at 01:21 AM. Reason: edit / 'it' removed / italics

  2. #2
    Administrator weegillis's Avatar
    Join Date
    Oct 2003
    Posts
    5,789
    Quote: Originally posted by weegillis
    <edit>

    Now the real problem is going to be how to give the unscripted client the correct URL. It just never ends, does it?

    </edit>
    After some wrestling, it finally came down to this simple solution:
    Code:
    JavaScript
    
    // In-page only if the URL contains "/#" and that is NOT Wikimapia's "/#lat=".
    function inPage(x) { return (x.match(/\/#/) && !x.match(/\/#lat=/)); }
    This is geared to match Wikimapia's URLs, which include 'lat=' as the first variable expression in the hash string. If a URL contains '/#' but not '/#lat=', the function returns true; otherwise it returns false, which allows the Wikimapia URL to pass through the filter. Works fine.

    Problem solved.

    ---------- Post added at 06:13 PM ---------- Previous post was at 05:51 PM ----------

    Oh, what the hey? I figured on bumping this before someone points out the obvious: the clunky double call to match(). Spoke too soon, I guess. I spotted it right off, so I ran out and investigated the correct way to combine the two conditions within a single regex. Here's what I concluded is the right expression for this case:
    Code:
    JavaScript
    
    // Negative lookahead: match "/#" only when it is not followed by "lat=".
    function inPage(x) { return x.match(/\/#(?!lat=)/); }
    I included the forward slash a long time ago because if I wanted an in-page link (or AJAX hash) to open in a new window (tab), all I had to do was include a resource name in the URL, such as index.php#, which would sneak past the filter. Had Wikimapia done this with their URL, I would never have encountered this issue (at least not in the present moment). The slash also pins down the exact starting point, but this is moot.
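As a sanity check, here is a small sketch comparing the two-test rule with a single-pattern version that uses a negative lookahead. The function names, the sample URLs, and the exact shape of the Wikimapia URL are assumptions for illustration; the point is only that both forms agree on these cases, including the index.php# trick.

```javascript
// Two-step form of the rule: contains "/#" but not "/#lat=".
function inPageTwoTests(x) { return /\/#/.test(x) && !/\/#lat=/.test(x); }

// Single-pattern form: "/#" not immediately followed by "lat=".
function inPageLookahead(x) { return /\/#(?!lat=)/.test(x); }

// Hypothetical sample links (assumed for illustration).
var samples = [
  "http://example.com/#top",                          // in-page anchor: filtered
  "http://wikimapia.org/#lat=49.25&lon=-123.1&z=12",  // Wikimapia: passes
  "http://example.com/index.php#top",                 // resource name before "#": passes
  "http://example.com/page.html"                      // no hash at all: passes
];

// Record what each form of the filter says about each link.
var results = samples.map(function (url) {
  return { url: url, twoTests: inPageTwoTests(url), lookahead: inPageLookahead(url) };
});

console.log(results);
```

The two forms agree on every sample here. They can differ on a URL that contains both a plain "/#" and a "/#lat=", which should not come up with ordinary links.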
    Last edited by weegillis; 04-27-2012 at 08:23 PM. Reason: corrections, and additional explanation
