I have developed a web agent acting as a RESTful service. The agent accepts POST requests to receive blocks of JSON data.
The JSON is UTF-8 encoded, and the HTTP request states the encoding used.
However, the (JSON) text received in the Request_content field of the agent context document is somehow encoded differently or is plainly wrong.
In my case, I have difficulties with the German umlauts and ß: üÜöÖäÄß
For test purposes, I created an HTTP request with a Chrome extension. I am not sending JSON here, but that doesn't matter for the effect:
POST /SOAPGATEQ_5.NSF/REST4Documents?openagent HTTP/1.1
Host: domino.flexdomino.net
Content-Type: application/json; charset=UTF-8
Content-Length: 7

Müller

(I have also used text/html as the content type, but the result is the same. "Müller" is sent as the body.)
Response:

HTTP/1.1 200 OK
Server: Lotus-Domino
Date: Wed, 15 Jul 2015 20:31:13 GMT
Connection: close
Content-Type: application/json; charset=utf-8
Content-Length: 266

{"error":{"text":"[Error.REST4Documents] 27, CLASS:JSONREADER<PARSE: line 131> ERROR: 1000: Invalid JSON format. (Block character mismatch ASCII(77,114) M├╝ller… Context: Current character = 'M'; Previous character = 'M'; Remaining string = 'M├╝ller'."}}
Ignore the fact that the response is an error; quite naturally the JSON parser complains about the single word "Müller".
However, please note what the context document received: "M├╝ller"
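
The garbled string is consistent with the two UTF-8 bytes of "ü" (0xC3 0xBC) being read in an old DOS code page rather than as UTF-8. Here is a minimal Python sketch of that effect. It assumes code page 437, which is only a guess: code page 850 maps these two bytes to the same characters, and which code page Domino actually applies is not confirmed by anything above.

```python
# Sketch: reproduce the garbling seen in Request_content.
# Assumption (not confirmed): the UTF-8 body is decoded as code page 437.
original = "M\u00fcller"                  # "Müller"
body = original.encode("utf-8")           # the bytes sent as the request body
garbled = body.decode("cp437")            # what a CP437 reader makes of them

# Reversing the wrong decoding step recovers the original text.
repaired = garbled.encode("cp437").decode("utf-8")

print(len(body))    # byte length of the body, i.e. the Content-Length
print(garbled)
print(repaired)
```

If this is what happens, the bytes themselves arrive intact and only their interpretation is wrong, so re-encoding the received string in the same code page and decoding it as UTF-8 would be one possible repair.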
Whatever I change on the sending side, whether content type or charset, the result is the same or similar:
I get garbled characters instead of the German umlauts.
The only way I get a result I can possibly work with is to use application/x-www-form-urlencoded,
in which case every character that takes more than one byte in UTF-8 is encoded as a %hex%hex sequence.
I have not yet tried whether I can work around the problem this way, using @URLDecode to recover my proper UTF-8 JSON data with correct German umlauts,
but then again, it is quite some overhead for something that should work without any decoding step.
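
To illustrate why the URL-encoded variant survives the transfer, here is a small Python sketch of percent-encoding a JSON payload and decoding it again. The payload and variable names are made up for illustration, and `quote` is used as a stand-in for what a browser does with form data (real form encoding writes a space as `+` rather than `%20`). Whether @URLDecode reverses this identically on Domino is still untested.

```python
from urllib.parse import quote, unquote

# Hypothetical JSON payload containing a German umlaut.
payload = '{"name": "M\u00fcller"}'

# Percent-encode the UTF-8 bytes; the result is pure ASCII,
# so no code page on the receiving side can distort it.
encoded = quote(payload, safe="")

# Decoding interprets the %hex sequences as UTF-8 bytes again.
decoded = unquote(encoded)

print(encoded)
print(decoded == payload)
```

The "ü" travels as `%C3%BC`, which are exactly its two UTF-8 bytes written out in hex.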
Is there something I'm missing, such as a server-side setting or a specific character set to be used other than UTF-8? Or is this simply a bug in Domino?