| Web Programming Discussion Forum Working with an API? Developing a plugin? Writing a Mod or script for your favorite blog, Web 2.0 site or Forum? Welcome. |
Ah, yea....
Anyways, I'm trying to strip out bad characters from a block of text with JavaScript. This seems to work sometimes but not others, and I think it has to do with me not stripping out all the bad characters. Of course, I could be completely wrong. Here is my code so far, which truncates the resulting text:

<script type="text/JavaScript">
var outputstr = "!---TEXT---"
var textpreview = outputstr.replace(/[^a-zA-Z 0-9]+/g,'')
if (textpreview.length > 50) {
  document.write(textpreview.substr(0,50) + " ...")
} else {
  document.write(textpreview)
}
</script>

The !---TEXT--- is a text file created by our shopping cart that holds product information. I thought I was only returning A-Z and 0-9 with my regex, but I'm not sure if I'm doing it correctly.

Now, I said it works sometimes and not others. The major difference I see between the ones that work and the ones that don't is that there are quotes "" in the text, or there may be some html in the text.

A. Would this cause the javascript to break?
B. If so, what can I do?
C. Am I completely wrong about why it's not working?
D. Just give up this javascript stuff because I don't know what the hell I'm doing.

Thanks in advance,
DaK
Hey Wige,
I don't know if I have much of a choice other than to use javascript. The page is generated by our shopping cart (perl/cgi), and since I can't get to the source code or use PHP, this looks to be about my only choice unless someone else has a suggestion.

The text file holds a product description. However, (correct me if I'm wrong) I don't think that javascript can process the text file if there are funky characters in it (html, ", etc....). This is the problem: some of our product descriptions are just one or two lines with no html or weird characters. These are processed fine by the script. It's the descriptions that have weird characters that are not showing up in the results.

It's part of a search results page, but I don't want there to be tons of text in the results, so I want to truncate the product description to only the first 50 characters.

DaK
I take it from the description the javascript is opening the text file containing the data, then processing line by line, filtering each result?
If you know enough Perl or PHP and are allowed to run scripts, I would suggest creating a server-side script that opens the file, does the filtering for you, and passes the processed data to the javascript. That way you can use the more comprehensive filtering abilities of Perl, leave the browser with less of a workload, and eliminate many client-side issues. In that case you would simply point the javascript to the new server-side script.
__________________
The best way to learn anything, is to question everything.
More questions than questions there--
1. You want to allow the {space}?-- You said only "A-Z and 0-9"
2. Do you want to not-include nbsp?-- which get generated automatically in some html processing.
3. Do you later .toUpperCase() it?-- in which case NBSP in-caps fails...?
4. Would /[^\w ]+/ be simpler?-- except of course you may not want the "_" of \w
5. Would you want to convert unusable characters to space?
6. Do you need trim to single-spacing?
7. Why not simply, write textpreview.replace(/^(.{50}).+/,'$1...')
8. If you're seeing html, look for .innerHTML, htmlText in lieu of .innerText, text, data,...
9. NB. document.selection.createRange().text has empty-cells-of-zero-length for BR's ... like weapons of mass destruction they can be elusive in textonly.
10. [thinking... script type="text/JavaScript" might be choosing an old-version of javascript...]
11. [thinking... if you use RegExp.$1 you need make sure it matched something, else RegExp.$1 is old data from the last match... and would be anything]
12. " ..." should be "..." without the space because it may land in the middl... (And … is one-character for that.)
__
PS. Here's a ms-bug: Find-in-page "a b ™c" (when rendered) fails till you remove either the nbsp or the trademark ... I reported this to MS today....
Ray.
__________________
Mr. Raymond Kenneth Petry
Lanthus Corporation
Last edited by lanthus; 08-08-2007 at 10:06 PM.
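A few of the points in the list above (2, 4, 5, 6 and 7) are easier to compare side by side. The following is only a rough sketch: the sample string and the helper name truncateWithRegex are made up for illustration and are not taken from the cart or from anyone's post.

```javascript
// Hypothetical sample text; "\u00A0" is a non-breaking space (nbsp).
var sample = 'Brass_finial,\u00A0 4 1/4" high';

// The original character class: keeps only ASCII letters, digits and spaces.
var strict = sample.replace(/[^a-zA-Z 0-9]+/g, "");
// → "Brassfinial 4 14 high"

// Point 4: \w is shorter to write, but it also keeps the underscore.
var wordish = sample.replace(/[^\w ]+/g, "");
// → "Brass_finial 4 14 high"

// Points 2, 5 and 6: turn nbsp and unusable characters into spaces,
// then squeeze runs of spaces down to one.
var spaced = sample
  .replace(/\u00A0/g, " ")
  .replace(/[^a-zA-Z 0-9]+/g, " ")
  .replace(/ {2,}/g, " ");
// → "Brass finial 4 1 4 high"

// Point 7: truncate with a single replace. If the text has `limit`
// characters or fewer, the pattern does not match and the text is
// returned unchanged.
function truncateWithRegex(text, limit) {
  var pattern = new RegExp("^(.{" + limit + "}).+");
  return text.replace(pattern, "$1...");
}
```

Note how plain removal turns "1/4" into "14", which is one reason converting to a space (point 5) can read better in a preview.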
Ok, you've given me a bit to chew on here...
1. You want to allow the {space}?-- You said only "A-Z and 0-9"
   Yes, I want to allow the space. I basically just want to be left with text with no formatting or special characters. Although I should say I would like to keep the special characters like "", ; , etc... as that would make the text more understandable to the user. But I don't know if those are actually breaking the code.
2. Do you want to not-include nbsp?-- which get generated automatically in some html processing.
   I suppose I would want to include it since it may represent a space between two words.
3. Do you later .toUpperCase() it?-- in which case NBSP in-caps fails...?
   I would want it to be "natural" casing. If the letter in the text file is upper case, then keep it that way, etc...
4. Would /[^\w ]+/ be simpler?-- except of course you may not want the "_" of \w
   I'll have to read about that, reg expressions are new to me.
5. Would you want to convert unusable characters to space?
   No, just remove them.
6. Do you need trim to single-spacing?
   Yes.
7. Why not simply, write textpreview.replace(/^(.{50}).+/,'$1...')
   I don't know. I'll try it and see what it does. Regular expressions are new to me.
8. If you're seeing html, look for .innerHTML, htmlText in lieu of .innerText, text, data,...
   If I'm seeing html where? The text file that is being processed may have html in it but I don't want it in the search results.
9. NB. document.selection.createRange().text has empty-cells-of-zero-length for BR's ... like weapons of mass destruction they can be elusive in textonly.
   I'll have to read about this. You pretty much spoke Chinese to me there. LOL.
10. [thinking... script type="text/JavaScript" might be choosing an old-version of javascript...]
   How should I be declaring it?
11. [thinking... if you use RegExp.$1 you need make sure it matched something, else RegExp.$1 is old data from the last match... and would be anything]
   You lost me here...
12. " ..." should be "..." without the space because it may land in the middl... (And … is one-character for that.)
   That can be easily corrected.

Thanks for all the questions? Hopefully you or someone can pin down my problem.
DaK
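Taken together, the answers above describe a fairly specific filter: drop HTML, treat nbsp as a space, keep the original casing, remove other unusable characters, collapse to single spaces, and cut at 50 characters. A minimal sketch of that, assuming the description is already available as an ordinary JavaScript string (makePreview and its parameters are hypothetical names, not something the cart provides):

```javascript
// Sketch only: builds a plain-text preview from a product description.
function makePreview(text, limit) {
  var cleaned = text
    .replace(/<[^>]*>/g, " ")          // drop anything that looks like an HTML tag
    .replace(/&#?\w+;/g, " ")          // character entities such as &nbsp; become a space (not decoded)
    .replace(/[\s\u00A0]+/g, " ")      // line breaks, tabs and nbsp become a single space
    .replace(/[^a-zA-Z0-9 ]+/g, "")    // remove everything that is not a letter, digit or space
    .replace(/ {2,}/g, " ")            // collapse the gaps left by the removal
    .replace(/^ +| +$/g, "");          // trim leading and trailing spaces
  if (cleaned.length > limit) {
    return cleaned.slice(0, limit) + "...";
  }
  return cleaned;
}

// Example with the kind of markup a description might contain:
// makePreview('<b>Etch</b>&nbsp;Art  "window" film', 50)
// → "Etch Art window film"
```

The tag pattern is deliberately crude and only meant for short previews. Widening the character class (for example to keep quotes, commas and semicolons, as mentioned in answer 1) is a one-line change once the text reaches the script intact.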
Here are some tests you can run to maybe get a better understanding of what is going on.
Go to this URL: Curtain Rods, Drapery Hardware, Blinds, Luxury Bedding, Static Cling Window Films and Tints
It's just a test page and I know the top nav is broken...lol.

In the search box in the header, type "arizona" (without the quotes). You will see results how I would like them to appear. Click the "view product" link for the first item and you will see where it is pulling the text from. No special characters or formatting. Hit your back button.

Now type in "finial" (again, no quotes). Now you will see the text is not showing up as expected. Click the view product link. No html, but there are some special characters. Hit your back button.

Now type in "etch art" (with the quotes). Again, no text. Click the view product button and you will see what text should be there. No special characters, but there is html on the page.

Maybe this will shed some light on my situation. Thanks everyone!
DaK
I grabbed one of the search results from the finial search.
HTML Code:
<script type="text/JavaScript">
var outputstr = "4 1/4"H, 6 1/4"L, 4"P. Sold as each. "
/*var textpreview = "4 1/4"H, 6 1/4"L, 4"P. Sold as each. "*/
/*var textpreview = outputstr.replace(/[\/=;:.<>'&_,%`"@~#]/gi," ")*/
var textpreview = outputstr.replace(/[^a-zA-Z 0-9]+/g,'')
if (textpreview.length > 50) {
  document.write(textpreview.substr(0,50) + " ...")
} else {
  document.write(textpreview)
}
</script>
__________________
The best way to learn anything, is to question everything.
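The generated script above shows the likely cause: the description is pasted into a double-quoted string literal before the browser ever runs the regex, so the first " in the text ends the literal and the rest is a syntax error. The sketch below imitates that substitution in plain JavaScript to make the failure visible. The variable names and the parses helper are hypothetical, and the real substitution happens inside the cart, not in script.

```javascript
// Stand-ins for the page template and for what the cart substitutes into it.
var template = 'var outputstr = "!---TEXT---"';
var plainDescription = "Arizona curtain rod";
var quotedDescription = '4 1/4"H, 6 1/4"L, 4"P. Sold as each. ';

// Imitate the cart: paste the description into the template verbatim.
function generate(description) {
  return template.replace("!---TEXT---", function () {
    return description;
  });
}

// Returns true if the source compiles, false if it is a syntax error.
// new Function only compiles here; the generated code is never run.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    return false;
  }
}

var plainOk = parses(generate(plainDescription));   // true
var quotedOk = parses(generate(quotedDescription)); // false: the literal ends at 4 1/4"
```

HTML in a description would presumably fail the same way whenever it carries attribute quotes or line breaks, since a raw line break inside a quoted literal is also a syntax error. When the script block fails to compile, none of it runs, which matches the empty previews.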
It's generated by the shopping cart. We use !---TEXT--- which is a tag used by the shopping cart to represent the text description for the product. The shopping cart is written in perl but we can't access the source code.
DaK
I see. The shopping cart looks like it is expecting the output to simply be embedded in the page. I would check the support for that cart software to see if they have an alternate code (they may have something like !--SCRIPT-- for JavaScript, for example) that sanitizes the code. In the meantime, a possible workaround is to change the first line of the script from:
HTML Code:
var outputstr = "!--TEXT-- "
to:
HTML Code:
var outputstr = '!--TEXT--'
outputstr = outputstr.replace('"', '\\"')
__________________
The best way to learn anything, is to question everything.
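Two caveats on the workaround above are worth testing before relying on it. Single quotes only move the problem to descriptions that contain an apostrophe or a line break, and replace() with a string pattern changes only the first match. By the time replace() runs the literal has already been parsed, so adding backslashes does not help parsing and would show up in the output. The sketch below illustrates the replace() behaviour and then one alternative. The alternative assumes, without confirmation from the cart's documentation, that the cart will substitute its text tag inside ordinary page markup such as a hidden element; the element id and the readDescription helper are made-up names.

```javascript
var text = '4 1/4"H, 6 1/4"L';

// String pattern: only the first double quote is replaced.
var once = text.replace('"', '\\"');
// → '4 1/4\\"H, 6 1/4"L' when written as a JavaScript literal

// Regex with the g flag: every double quote is replaced.
var all = text.replace(/"/g, '\\"');

// Alternative sketch (browser only, assumption-based): let the cart write the
// description into a hidden element, for example
//   <div id="desc-1" style="display:none">...cart text tag here...</div>
// and read the text back, so the description never sits inside a string literal.
function readDescription(elementId) {
  var node = document.getElementById(elementId);
  if (!node) {
    return "";
  }
  // textContent (or innerText in older browsers) returns the text without tags.
  return node.textContent || node.innerText || "";
}
```

The returned string can then be filtered and truncated as before, since quotes and markup in the description no longer affect whether the script compiles.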
WebProWorld is an iEntry, Inc. ® site - © 2009 All Rights Reserved Privacy Policy and Legal iEntry, Inc. 2549 Richmond Rd. Lexington KY, 40509 |