Hi all,
The problem I'm about to describe gives me a real headache!
Here is the situation.
I have a Notes/web application in which an applet runs in one particular form. This applet uses an XML representation of a set of data from the database. The XML code is stored in a rich text field, and the way the applet reads this field differs between the Notes client and the web.
In the Notes client, what we’ve done is write the content of the rich text field (the XML code) to a file during the QueryOpen event. This file is opened in LotusScript, and we use the “Charset=‘UTF-8’” clause of the “Open” statement to make sure all characters are supported. The same process is used to save changes made in the applet back to the Notes document, and in this case everything works fine.
As it is also possible to use our application in a web browser (IE 5.5+), we also have to handle character encoding on the web. And that is where my headache starts.
On the web, it is not possible to write the XML code to a file, so we pass a URL to the applet as a parameter. The URL points to the document itself, but rendered through a different form in a special view so that only the rich text field is accessible. This solution works well… except for one thing: the character encoding.
If nothing particular is set concerning the character encoding, the document containing the applet appears to be in “iso-8859-1” (the JS property “document.charset” tells me so during the onLoad event). The header of the XML code read by the applet declares the standard “utf-8” encoding. There is also no meta tag specifying the encoding of the page.
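Since nothing on the page forces the encoding, the browser falls back to its default. One thing that might be worth trying (this is only a suggestion, not something from our current setup) is declaring UTF-8 explicitly in the HTML head content of the form that serves the page, so the browser interprets the page bytes with the same encoding the XML header declares:

```html
<!-- Hypothetical addition to the form's HTML Head Content:
     tells the browser to decode the page as UTF-8, matching
     the encoding declared in the XML header. -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
```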
Reading the XML code does not seem to be a problem; the real problem is writing the modified XML code from the applet back to the rich text field in the document. The XML code is processed by the applet, then a JS script calls a Java method of the applet (“getXmlOutstream()”) to get the XML code. Finally, this JS script writes the string it gets to the rich text field (“XMLCode”):
function xmlToRT() {
  if (document.applets[0] != null) { // In case the applet is hidden
    document.forms[0].XMLCode.value = "";
    var strXmlOutstream = document.applets[0].getXmlOutstream();
    // Line breaks (CR and LF, in either order) are removed.
    var rExpCR = new RegExp("[\\r\\n]", "g");
    document.forms[0].XMLCode.value = strXmlOutstream.replace(rExpCR, "");
  }
}
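As a side note (a guess about the intent, since I don’t know exactly which line endings the applet emits): a pattern like "\n\r" only matches a line feed followed by a carriage return, so Windows-style "\r\n" pairs slip through it. A character class removes both characters regardless of order:

```javascript
// Sample string with a Windows-style (\r\n) and a reversed (\n\r) break.
var sample = "line1\r\nline2\n\rline3";

// LF-then-CR pattern: only the reversed pair after "line2" matches,
// so the "\r\n" after "line1" survives in the output.
var onlyLfCr = sample.replace(new RegExp("\\n\\r", "g"), "");

// Character class: strips every CR and LF individually.
var anyBreak = sample.replace(new RegExp("[\\r\\n]", "g"), "");
// anyBreak === "line1line2line3"
```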
The process itself works. But if the XML code contains special characters like “é”, “ö”, “ň” or “š”, these characters get converted and end up looking like “é”, “ö”, “Å□” and “Å¡”. I have tried a UTF-8 conversion in the JS script above, but then the result is “é”, “ö”, “ņ” and “□”!!!
The conversion seems a good idea, except that it doesn’t work for all characters, and that is THE problem I have. There is probably something missing in my procedure, but I’m unable to find what…
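My best guess at what is going on (an assumption, since I can only infer the browser’s behaviour): pages declared as iso-8859-1 are actually decoded as windows-1252, which remaps the bytes 0x80–0x9F to other Unicode characters. So for those bytes, charCodeAt() no longer returns the raw UTF-8 byte value, and any manual decoder masks the wrong number. “ň” (UTF-8 bytes 0xC5 0x88) illustrates it: 0x88 becomes U+02C6, and masking that with 63 yields exactly the wrong “ņ” I see:

```javascript
// UTF-8 bytes of "ň" (U+0148) are 0xC5 0x88.
// With a true iso-8859-1 reading, charCodeAt would return the raw bytes:
var rawBytes = String.fromCharCode(0xC5, 0x88);
var ok = String.fromCharCode(((rawBytes.charCodeAt(0) & 31) << 6) |
                             (rawBytes.charCodeAt(1) & 63));
// ok === "ň" (U+0148): the two-byte decode works on raw bytes.

// But windows-1252 maps byte 0x88 to U+02C6 (modifier circumflex),
// so the decoder actually sees this instead:
var remapped = String.fromCharCode(0xC5, 0x02C6);
var bad = String.fromCharCode(((remapped.charCodeAt(0) & 31) << 6) |
                              (remapped.charCodeAt(1) & 63));
// bad === "ņ" (U+0146): 0x2C6 & 63 is 6 instead of 8, one code point off.
```

If this is the cause, no JS-side conversion can recover the lost bytes; the page itself has to be served as UTF-8.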
For a better understanding, and in case someone wants to try it themselves, here is the JS function I use to convert the string from the applet:
function decode_utf8(utftext) {
  var plaintext = "";
  var i = 0;
  var c = 0, c2 = 0, c3 = 0;
  while (i < utftext.length) {
    c = utftext.charCodeAt(i);
    if (c < 128) { // 1 byte: 0xxxxxxx
      plaintext += String.fromCharCode(c);
      i++;
    } else if ((c > 191) && (c < 224)) { // 2 bytes: 110xxxxx 10xxxxxx
      c2 = utftext.charCodeAt(i + 1);
      plaintext += String.fromCharCode(((c & 31) << 6) | (c2 & 63));
      i += 2;
    } else { // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      c2 = utftext.charCodeAt(i + 1);
      c3 = utftext.charCodeAt(i + 2);
      plaintext += String.fromCharCode(((c & 15) << 12) | ((c2 & 63) << 6) | (c3 & 63));
      i += 3;
    }
  }
  return plaintext;
}
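For what it’s worth, when the string really does hold the raw UTF-8 bytes as character codes 0–255, the same decoding can be done with a one-liner built from functions available since JScript 5.5 (so IE 5.5+). It is just a compact alternative to the loop above, with the same limitation: it cannot recover bytes the browser has already remapped, and it throws on malformed sequences where the loop silently produces garbage:

```javascript
// escape() turns each code unit below 0x100 into a %XX byte escape,
// and decodeURIComponent() then interprets the %XX run as UTF-8.
function decodeUtf8Compact(byteString) {
  return decodeURIComponent(escape(byteString));
}

// "é" stored as UTF-8 bytes 0xC3 0xA9 reads as "Ã©" in iso-8859-1:
var decoded = decodeUtf8Compact("\u00C3\u00A9");
// decoded === "é"
```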
Any help in finding a way to support all characters would be very, very appreciated.
Thanks
Phil