Have I done something wrong or what? I have a DTD stating that
But when I try to enter &auml; or &ouml; in any #PCDATA field, the SAXParser throws an error: "Entity 'auml' not found". Why?
If I try to use any international character, it gets converted into some strange character, usually a question mark… And I hoped I could avoid that by escaping the characters the same way I do when I code HTML…
Subject: XML/DTD problem; SAXParser not able to read #PCDATA with ä or ö
XML has only five predefined named entities: &lt;, &gt;, &amp;, &apos; and &quot;. For anything else, you need to use a numeric character reference, which gives the character's Unicode code point in decimal (&#NNN;) or hexadecimal (&#xHHHH;), unless you define your own entities in your DTD (so you can use a named entity instead of a numbered one).
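To make that concrete, here is a minimal sketch using the JAXP SAX parser that ships with Java. The class name CharRefDemo, the helper textOf and the sample element name are made up for illustration and are not from the original poster's system. It feeds the parser numeric character references for ä (code point 228, hex E4) and ö (code point 246, hex F6) plus one predefined entity, and collects the resulting character data.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class CharRefDemo {
    /** Parses the XML string and returns all character data, concatenated. */
    static String textOf(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element, so append each chunk.
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // &#228; is decimal, &#xF6; is hexadecimal; &amp; is one of the five predefined entities.
        String parsed = textOf("<name>B&#228;r &amp; L&#xF6;we</name>");
        System.out.println(parsed.equals("B\u00e4r & L\u00f6we"));
    }
}
```

With no DTD at all, the same helper should reject `<name>B&auml;r</name>`, because auml is not one of the five predefined names.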
Subject: RE: XML/DTD problem; SAXParser not able to read #PCDATA with ä or ö
Ok, I see. I thought that there were more predefined entities… I will instruct the developers of the external system that they have to translate international characters to UTF-8/16.
By the way, how do I define my own entity in the DTD? For example, &auml;?
Subject: RE: XML/DTD problem; SAXParser not able to read #PCDATA with ä or ö
The pattern looks like this:
<!ENTITY auml "&#xx;">
where auml is the named entity you are defining and the quoted string is the numeric character reference that means the same thing. If you have a declaration like that in the DTD and you use a validating parser, the parser SHOULD substitute &#xx; wherever it sees &auml; if it does any translation to literal characters, and should treat the entity as a legal character when parsing.
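As a rough sketch of how that plays out, the following puts the declarations in an internal DTD subset and runs the document through the JAXP SAX parser bundled with Java. The class name DeclaredEntityDemo, the helper textOf and the element name are hypothetical, and I am assuming the parser's default settings, which allow a DOCTYPE. The values 228 and 246 are the code points of ä and ö.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DeclaredEntityDemo {
    // Sample document: the internal DTD subset declares auml and ouml.
    static final String XML =
            "<?xml version=\"1.0\"?>\n"
          + "<!DOCTYPE name [\n"
          + "  <!ELEMENT name (#PCDATA)>\n"
          + "  <!ENTITY auml \"&#228;\">\n"
          + "  <!ENTITY ouml \"&#246;\">\n"
          + "]>\n"
          + "<name>B&auml;r, L&ouml;we</name>";

    /** Parses the XML string and returns all character data, concatenated. */
    static String textOf(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Entity replacement text arrives here as ordinary characters.
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textOf(XML).equals("B\u00e4r, L\u00f6we"));
    }
}
```

If the declarations live in an external DTD file instead, the parser has to be able to locate and read that file before it can resolve &auml;.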