AS400 and Odd Characters

    *Please feel free to move this post if it is in the wrong place.

    I've got an issue at work with an AS400 system. I'm not an AS400 person - my knowledge of them is simply entering information. The problem is.. I can't convince the AS400 team that there is a problem.

    The break down....

    Starting on June 11th of this year we started having problems with what I call "invalid characters" halting the display of information. The characters are plus signs, cent signs, exclamation marks, parentheses, etc. These characters display differently between the GUI version and the Green Screen version of the AS400 - the cent sign (green screen) appears as a parenthesis (GUI) even though it's made using shift + 6 ( ^ ) - this translates into dual question marks (??) in my program. The problem is, these symbols in orders created before 2009 don't play well with the XML Parser.

    The AS400 group is convinced it is my program, I'm convinced it is theirs... "Invalid" characters entered into my program don't appear in my program or when sent back to the AS400 system. The characters only appear when created in the AS400 system. Now... I could be crazy but to me that says it is the AS400 system that has the problem. Thing is, I don't know how to convince the AS400 team of this.

    Anyone have any suggestions? Or, am I wrong? Is it my program?
    Just to make sure I am understanding the issue properly: if data is entered through your software and stored in the AS400 database (Lotus Domino most likely?), then when that data is queried and displayed, either through your software or directly, the extended characters display correctly. However, when data is entered directly and then queried, extended characters are not displayed correctly, showing up one way when queried directly and another way when queried through an app?

    Honestly, it sounds to me like the data entry app that produces the mangled output is using a different character set than the database. The database is trying to reinterpret the passed data with its own character set, with the result that the data gets mangled, and displays differently on each output platform. So, the input is encoded with charsetA on the input station, the server mangles it, trying to parse it as charsetB, then sends it back to the terminals which try to display the damaged data incorrectly in various ways. Just an early theory at least, but I see similar issues with MySQL when cutting and pasting data from Word using a basic charset that doesn't understand smart quotes. Different browsers will display the output in different ways trying to interpret data the database didn't understand or store properly.
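
    To make the charsetA/charsetB idea concrete, here is a minimal Python sketch. It assumes EBCDIC code pages 37 and 500 purely as stand-ins; nothing in this thread says which code pages the AS400 or its clients actually use, and the function names are made up for the illustration. It encodes text under one code page, decodes the same bytes under the other, and lists which printable characters change.

```python
# Assumption: code pages 37 and 500 stand in for "charsetA" and "charsetB".
# The thread never says which code pages are really in use on either side.
INPUT_CODEPAGE = "cp037"    # hypothetical: code page of the data entry session
DISPLAY_CODEPAGE = "cp500"  # hypothetical: code page used when reading it back


def reinterpret(text, written_as=INPUT_CODEPAGE, read_as=DISPLAY_CODEPAGE):
    """Encode text under one code page, then decode the same bytes under another."""
    return text.encode(written_as).decode(read_as)


def mismatched_characters(written_as=INPUT_CODEPAGE, read_as=DISPLAY_CODEPAGE):
    """Map each printable ASCII character to what it turns into, if it changes."""
    changed = {}
    for code in range(0x20, 0x7F):
        original = chr(code)
        seen = reinterpret(original, written_as, read_as)
        if seen != original:
            changed[original] = seen
    return changed


if __name__ == "__main__":
    # Show every character that does not survive the round trip.
    for original, seen in mismatched_characters().items():
        print(f"typed {original!r} -> displayed {seen!r}")
    # Letters, digits and spaces are identical in both code pages.
    print(reinterpret("ADV OF TRIP CHG 10.00"))
```

    With this particular pair, letters and digits come through unchanged while a few punctuation marks such as ^ and ! do not. That only shows the general shape of a code page mismatch; it is not a diagnosis of this system.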
    The GUI and green screen are the same program - the AS400. I apologize for being ditzy on this but my knowledge of AS400 set up, programming, background, etc. is usage only. End user usage for that matter. I have considered the mismatch of charsets between the two programs - AS400 and MF (my program). The problem is - they're telling me that the characters were entered by the user. I can buy that to some extent but.. not when you have a set of characters that consistently appear together with a differing amount of space between them. For example - if you consistently have a cent sign appear, then a word (or several), then an exclamation mark.

    The first thing I'd question is what is the user typing in, right? That question becomes more complicated when it's 50 users doing it and not just one. How do 50 users know to type an exclamation mark somewhere after they type a cent sign?
    I take it the information is being entered through your application, hence the debate as to whether the problem is occurring in the program or in the database. Would it be possible to edit the program to mirror the input to a local data file, so that you can go back and see exactly what was typed in? Even better, would you be able to capture both the input as entered by the user, and the query as submitted to the database?
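
    A rough sketch of what that mirroring could look like, in Python. The function name, the log file name and the record layout are all invented for the illustration, and it assumes you can get at the raw bytes before anything converts them. It appends one JSON line per input with a hex dump, so the exact bytes can be compared later against what the database returns.

```python
import datetime
import json


def mirror_input(raw_bytes, source, log_path="input_mirror.log"):
    """Append one JSON line recording exactly which bytes were received.

    raw_bytes: the input as bytes, before any character set conversion.
    source:    free-text label such as a user or screen name (hypothetical).
    """
    record = {
        "when": datetime.datetime.now().isoformat(timespec="seconds"),
        "source": source,
        "length": len(raw_bytes),
        # Space-separated hex makes stray control bytes easy to spot.
        "hex": raw_bytes.hex(" "),
        # latin-1 maps every byte to one character, so this never raises.
        "as_latin1": raw_bytes.decode("latin-1"),
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record


if __name__ == "__main__":
    # Hypothetical input containing a NUL byte between two words.
    print(mirror_input(b"ADV OF TRIP\x00CHG", source="demo"))
```

    Logging the hex alongside the text matters because two clients may render the same byte differently, which is exactly the disagreement being argued about here.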

    Better yet, do you have a sample you would be able to share? Some string of text as entered that will come up incorrectly when retrieved? Knowing the set of characters involved, even if the damage is different each time the data is queried, can be helpful.

    Also, do you have command line access to the database where you can run SQL queries?
    The information is being entered into the AS400 system - it is then passed to my program which says, "Yes, I'll play well with you today." Or, "No, I will not play nice until you learn to speak Greek!"

    As for samples... I have a bazillion... here's a couple...

    1. ADV OF TRIP CHG $ 10.00, WILL BE HOME
    3. DOG IN BASEMENT! REQ DATE 01/22/08 HS 07/11/05 PLEASE NOTE AS

    These comments are from a query run against the AS400 - they appear in the query the same way they appear in Green screen.
    Ok... that has gotta be an input issue. If that output is the same whether it is output to your program or the terminal, it has to be the input. It almost looks like a terminator (tab/enter/newline/etc) is not being parsed correctly. Each line of text that you showed is from a single database field, correct?

    The one that really stands out to me is #1. 2 and 3 look almost where you would expect to see line breaks or tabs. #1 is different in that the special character is in the middle of the word. Honestly, this looks like something I used to see when I worked in a tech support call center. We had a three line box to enter a description of the problem, but in the back end each line of the box actually corresponded to a separate database field (desc1, desc2, desc3). Just guessing from the way they cut off (PLEASE NOT AS, REQ DATE), is that the case here? In the system I worked with before, users might hit Enter, which in the client would put them on the next line, but the database would record it as a single line with a weird character in the middle. Bad input verification maybe?
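
    One way to test the "weird character in the middle" idea is to scan the comment text for characters an XML 1.0 parser will reject. The sketch below is Python, uses the character ranges from the XML 1.0 specification, and the function names and the sample string are hypothetical. It reports the position and code point of each offending character.

```python
def is_xml10_char(ch):
    """True if XML 1.0 allows this character in a document."""
    cp = ord(ch)
    return (
        cp in (0x09, 0x0A, 0x0D)        # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_invalid_xml_chars(text):
    """List (position, code point) for each character XML 1.0 rejects."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if not is_xml10_char(ch)
    ]


if __name__ == "__main__":
    # Hypothetical comment field with a NUL where a line break was expected.
    sample = "DOG IN BASEMENT\x00REQ DATE 01/22/08"
    print(find_invalid_xml_chars(sample))
```

    If a scan like this turns up control characters at the same offsets in many comments, that points at how the fields are stored or joined rather than at what users typed.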

    You know what... this might even be, um, is this a terminal application where each keystroke is sent to the server, then echoed to the screen? Could this issue be caused by the server not processing the backslash character (used to escape newlines, tabs, and other special characters) properly? Worse, is this a VARCHAR type field?
    You have the call center part right. My examples are only snippets of the full comments. Each line has a 65 character limit. 5 (or 6) lines per comment screen. Continuous typing and/or tab takes you to the next line.

    One of my initial thoughts was spaces. A couple of the comment errors were found on orders I had created ages ago - I have a tendency to space twice between words - my mind goes faster than my fingers... Some of these characters are found in "empty" space - where it looks like someone may have hit the space bar two or three times. Your suggestion of it being a terminator sounds valid - I can see it being a strong possibility. It still doesn't explain the cent sign and exclamation mark.

    Thank you for your help Wige! It gave me a new direction to go.
