08-08-2007, 01:43 PM
WebProWorld Member
Join Date: Nov 2006
Posts: 72
Javascript and those naughty characters...
Ah, yea....
Anyways, I'm trying to strip out bad characters from a block of text with JavaScript. This seems to work sometimes but not others, and I think it has to do with me not stripping out all the bad characters.
Of course, I could be completely wrong.
Here is my code so far, which truncates the resulting text:
<script type="text/JavaScript">
var outputstr = "!---TEXT---"
var textpreview = outputstr.replace(/[^a-zA-Z 0-9]+/g,'')
if (textpreview.length > 50)
{
document.write(textpreview.substr(0,50) + " ...")
}
else
{
document.write(textpreview)
}
</script>
The !---TEXT--- is a text file created by our shopping cart and holds product information. I thought I was only returning A-Z and 0-9 with my regex but I'm not sure if I'm doing it correctly.
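As a quick check of what that character class does once the text is already sitting in a string, here is a minimal sketch. The helper name `stripToAlphanumerics` and the sample description are made up for illustration; they are not from the real product file.

```javascript
// Hypothetical helper: keeps only ASCII letters, digits and spaces,
// using the same regular expression as the script above.
function stripToAlphanumerics(text) {
  return text.replace(/[^a-zA-Z 0-9]+/g, '');
}

// Made-up sample description (an assumption, not real cart output).
var sample = 'Solid <b>brass</b> rod, 48" long; $12.50!';

console.log(stripToAlphanumerics(sample));
// prints: Solid bbrassb rod 48 long 1250
```

Note that the angle brackets are removed but the tag names themselves ("b") survive, so the expression strips punctuation rather than HTML tags as a whole.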
Now, I said it works sometimes and not others. The major difference I see between the ones that work and the ones that don't is that there are quotes ("") in the text, or there may be some HTML in the text.
A. Would this cause the javascript to break?
B. If so, what can I do?
C. Am I completely wrong about why it's not working?
D. Just give up this javascript stuff because I don't know what the hell I'm doing.
Thanks in advance,
DaK

08-08-2007, 03:08 PM
Moderator
Join Date: Jun 2006
Location: United States
Posts: 1,722
Re: Javascript and those naughty characters...
Being client side, JavaScript should really never be used to do processing on data - different browsers have different JS implementations, and users could have JS turned off or be using a security product that changes the way JS works. What is going into the text file that you want to strip out? Bear in mind that a user can view anything the JS can see.

08-08-2007, 04:35 PM
WebProWorld Member
Join Date: Nov 2006
Posts: 72
Re: Javascript and those naughty characters...
Hey Wige,
I don't know if I have much of a choice other than to use JavaScript. The page is generated by our shopping cart's Perl/CGI, and since I can't get to the source code or use PHP, this looks to be about my only choice unless someone else has a suggestion.
The text file holds a product description. However (correct me if I'm wrong), I don't think JavaScript can process the text file if there are funky characters in it (HTML, ", etc.). This is the problem: some of our product descriptions are just one or two lines with no HTML or weird characters. These are processed fine by the script. It's the descriptions that have weird characters that are not showing up in the results.
It's part of a search results page, but I don't want there to be tons of text in the results, so I want to truncate the product description to only the first 50 characters.
DaK

08-08-2007, 05:11 PM
Moderator
Join Date: Jun 2006
Location: United States
Posts: 1,722
Re: Javascript and those naughty characters...
I take it from the description the javascript is opening the text file containing the data, then processing line by line, filtering each result?
If you know enough Perl or PHP and are allowed to run scripts, I would suggest creating a server-side script that opens the file, does the filtering for you, and passes the processed data to the JavaScript. That way you can use the more comprehensive filtering abilities of Perl, leave the browser with less of a workload, and eliminate many client-side issues. In that case you would simply point the JavaScript to the new server-side script.

08-08-2007, 08:09 PM
WebProWorld New Member
Join Date: Apr 2005
Posts: 19
Re: Javascript and those naughty characters...
More questions than answers there--
1. You want to allow the {space}?-- You said only "A-Z and 0-9"
2. Do you want to not-include nbsp?-- which get generated automatically in some html processing.
3. Do you later .toUpperCase() it?-- in which case NBSP in-caps fails...?
4. Would /[^\w ]+/ be simpler?-- except of course you may not want the "_" of \w
5. Would you want to convert unusable characters to space?
6. Do you need trim to single-spacing?
7. Why not simply, write textpreview.replace(/^(.{50}).+/,'$1...')
8. If you're seeing html, look for .innerHTML, htmlText in lieu of .innerText, text, data,...
9. NB. document.selection.createRange().text has empty-cells-of-zero-length for BR's ... like weapons of mass destruction they can be elusive in textonly.
10. [thinking... script type="text/JavaScript" might be choosing an old-version of javascript...]
11. [thinking... if you use RegExp.$1 you need make sure it matched something, else RegExp.$1 is old data from the last match... and would be anything]
12. " ..." should be "..." without the space because it may land in the middl... (And … is one-character for that.)
__
PS. Here's a ms-bug: Find-in-page "a b ™c" (when rendered) fails till you remove either the nbsp or the trademark ... I reported this to MS today....
Ray.
__________________
Mr. Raymond Kenneth Petry
Lanthus Corporation
Last edited by lanthus : 08-08-2007 at 09:06 PM.
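Several of the numbered points above (2, 4, 6, 7 and 12) can be combined into one small routine. The following is only a sketch under assumptions: the name `makePreview` is hypothetical, it treats a non-breaking space as an ordinary space, and it truncates with `slice` rather than the regular expression from point 7.

```javascript
// Hypothetical helper: clean a description and cut it to a preview length.
function makePreview(text, maxLength) {
  var cleaned = text
    .replace(/\u00A0/g, ' ')         // non-breaking space becomes a normal space
    .replace(/[^a-zA-Z0-9 ]+/g, '')  // drop everything but letters, digits, spaces
    .replace(/ {2,}/g, ' ')          // collapse runs of spaces to a single space
    .replace(/^ +| +$/g, '');        // trim leading and trailing spaces

  if (cleaned.length > maxLength) {
    // "..." is appended without a leading space, as point 12 suggests.
    return cleaned.slice(0, maxLength) + '...';
  }
  return cleaned;
}

console.log(makePreview('  Hello,\u00A0  world!  ', 50)); // prints: Hello world
console.log(makePreview('abcdefghij', 5));                // prints: abcde...
```

Using `[^\w ]` instead of the explicit character class would behave the same except that underscores would be kept.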

08-09-2007, 10:32 AM
WebProWorld Member
Join Date: Nov 2006
Posts: 72
Re: Javascript and those naughty characters...
Ok, you've given me a bit to chew on here...
1. You want to allow the {space}?-- You said only "A-Z and 0-9"
Yes, I want to allow the space. I basically just want to be left with text with no formatting or special characters. Although I should say I would like to keep the special characters like "", ; , etc... as that would make the text more understandable to the user. But I don't know if those are actually breaking the code.
2. Do you want to not-include nbsp?-- which get generated automatically in some html processing.
I suppose I would want to include it since it may represent a space between two words.
3. Do you later .toUpperCase() it?-- in which case NBSP in-caps fails...?
I would want it to be "natural" casing. If the letter in the text file is upper case, then keep it that way, etc...
4. Would /[^\w ]+/ be simpler?-- except of course you may not want the "_" of \w
I'll have to read about that, reg expressions are new to me.
5. Would you want to convert unusable characters to space?
No, just remove them.
6. Do you need trim to single-spacing?
Yes.
7. Why not simply, write textpreview.replace(/^(.{50}).+/,'$1...')
I don't know. I'll try it and see what it does. Regular expressions are new to me.
8. If you're seeing html, look for .innerHTML, htmlText in lieu of .innerText, text, data,...
If I'm seeing html where? The text file that is being processed may have html in it but I don't want it in the search results.
9. NB. document.selection.createRange().text has empty-cells-of-zero-length for BR's ... like weapons of mass destruction they can be elusive in textonly.
I'll have to read about this. You pretty much spoke Chinese to me there. LOL.
10. [thinking... script type="text/JavaScript" might be choosing an old-version of javascript...]
How should I be declaring it?
11. [thinking... if you use RegExp.$1 you need make sure it matched something, else RegExp.$1 is old data from the last match... and would be anything]
You lost me here...
12. " ..." should be "..." without the space because it may land in the middl... (And … is one-character for that.)
That can be easily corrected.
Thanks for all the questions? Hopefully you or someone can pin down my problem.
DaK

08-09-2007, 10:41 AM
WebProWorld Member
Join Date: Nov 2006
Posts: 72
Re: Javascript and those naughty characters...
Here are some tests you can run to maybe get a better understanding of what is going on.
Go to this URL:
Curtain Rods, Drapery Hardware, Blinds, Luxury Bedding, Static Cling Window Films and Tints
Just a test page and I know the top nav is broken...lol.
In the search box in the header, type "arizona" (without the quotes). You will see results how I would like them to appear. If you click on the "view product" link for the first item, you will see where it is pulling the text from. No special characters or formatting.
Hit your back button. Now type in "finial" (again, no quotes). Now you will see the text is not showing up as expected. Click the view product link. No html but there are some special characters.
Hit your back button. Now type in "etch art" (with the quotes). Again, no text. Click the view product button and you will see what text should be there. No special characters but there is html on the page.
Maybe this will shed some light on my situation.
Thanks everyone!
DaK

08-09-2007, 11:00 AM
Moderator
Join Date: Jun 2006
Location: United States
Posts: 1,722
Re: Javascript and those naughty characters...
I grabbed one of the search results from the finial search.
HTML Code:
<script type="text/JavaScript">
var outputstr = "4 1/4"H, 6 1/4"L, 4"P. Sold as each. "
/*var textpreview = "4 1/4"H, 6 1/4"L, 4"P. Sold as each. "*/
/*var textpreview = outputstr.replace(/[\/=;:.<>'&_,%`"@~#]/gi," ")*/
var textpreview = outputstr.replace(/[^a-zA-Z 0-9]+/g,'')
if (textpreview.length > 50)
{
document.write(textpreview.substr(0,50) + " ...")
}
else
{
document.write(textpreview)
}
</script>
How is the first line of the script generated? This is where the script breaks, before even reaching the regular expression.
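To see why that first line fails on its own: the double quotes inside the description end the string literal early, so the browser hits a syntax error and never runs the rest of the script. A minimal sketch of that check follows; the helper name `parses` is hypothetical, and it relies on `new Function()` compiling source text without running it.

```javascript
// Hypothetical helper: returns true if the given source text parses as JavaScript.
function parses(source) {
  try {
    new Function(source); // compiles the text, does not execute it
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// The line as the cart emits it, with the description's double quotes inside
// a double-quoted literal:
var broken = 'var outputstr = "4 1/4"H, 6 1/4"L, 4"P. Sold as each. "';

// The same description wrapped in single quotes instead:
var working = "var outputstr = '4 1/4\"H, 6 1/4\"L, 4\"P. Sold as each. '";

console.log(parses(broken));  // prints: false
console.log(parses(working)); // prints: true
```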

08-09-2007, 11:31 AM
WebProWorld Member
Join Date: Nov 2006
Posts: 72
Re: Javascript and those naughty characters...
It's generated by the shopping cart. We use !---TEXT--- which is a tag used by the shopping cart to represent the text description for the product. The shopping cart is written in perl but we can't access the source code.
DaK

08-09-2007, 11:54 AM
Moderator
Join Date: Jun 2006
Location: United States
Posts: 1,722
Re: Javascript and those naughty characters...
I see. The shopping cart looks like it is expecting the output to simply be embedded in the page. I would check the support for that cart software to see if they have an alternate code (they may have something like !--SCRIPT-- for JavaScript, for example) that sanitizes the output. In the meantime, a possible workaround is to change the first line of the script from:
HTML Code:
var outputstr = "!---TEXT---"
to:
HTML Code:
var outputstr = '!---TEXT---'
outputstr = outputstr.replace('"', '\\"')
Note the quotes. In the first line, you are using single quotes. In the second line you are enclosing a double quote in single quotes. I am not absolutely sure about how many backslashes you need in the second line, but worst case scenario you could use &quot;.
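On the backslash question, a small sketch of how the second line of the workaround behaves, using the finial description as the sample value. Two things are worth knowing here: `replace` with a string pattern only changes the first match, and the clean-up regular expression from the original script removes quotes and backslashes anyway, so the escaping step does not affect the final preview.

```javascript
// The finial description as it would sit in a single-quoted literal.
var outputstr = '4 1/4"H, 6 1/4"L, 4"P. Sold as each. ';

// A string pattern replaces only the first double quote...
var firstOnly = outputstr.replace('"', '\\"');
// ...while a regular expression with the g flag replaces all of them.
var all = outputstr.replace(/"/g, '\\"');

// Hypothetical helper for this sketch: count the backslashes in a string.
function countBackslashes(s) {
  return s.split('\\').length - 1;
}

console.log(countBackslashes(firstOnly)); // prints: 1
console.log(countBackslashes(all));       // prints: 3

// The original clean-up step gives the same preview with or without escaping.
var stripped = /[^a-zA-Z 0-9]+/g;
console.log(all.replace(stripped, '') === outputstr.replace(stripped, '')); // prints: true
```

The single-quote wrapper has the mirror-image weakness: a description containing an apostrophe or a line break would end the literal early in the same way double quotes did before.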

08-09-2007, 03:47 PM
WebProWorld Member
Join Date: Nov 2006
Posts: 72
Re: Javascript and those naughty characters...
Wige!!!!!!
That did it! Thanks for taking the time to help, everything seems to be working now.
DaK