XML/DTD problem; SAXParser not able to read #PCDATA with ä or ö

Have I done something wrong or what? I have a DTD stating that

But when I try to enter &auml; or &ouml; in any #PCDATA field, the SAXParser throws an error: "Entity 'auml' not found". Why?

If I try to use any international character directly, it gets converted into some strange character, usually a question mark… I hoped I could avoid that by escaping the characters the same way as I do when I code HTML…

What have I misunderstood?

Subject: XML/DTD problem; SAXParser not able to read #PCDATA with ä or ö

XML has only five predefined named entities: &lt;, &gt;, &amp;, &apos; and &quot;. For anything else, you need to use a numeric character reference, i.e. the character's Unicode code point (&#XX; or &#XXXX;), unless you define your own entities in your DTD (so you can use a named entity instead of a numbered one).
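As a rough illustration of the numeric route, here is a minimal Java sketch that replaces every non-ASCII character with a decimal character reference before the text is written into an XML document. The class name NumericRefs and the method toNumericRefs are made up for this example, and it assumes Java 8 or later. It does not escape &, < or >, which would still need separate handling.

```java
public class NumericRefs {
    // Hypothetical helper: replaces every non-ASCII code point with a
    // decimal numeric character reference such as &#228; (for U+00E4).
    static String toNumericRefs(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.appendCodePoint(cp);               // plain ASCII is copied as-is
            } else {
                out.append("&#").append(cp).append(';'); // everything else becomes &#NNN;
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e4 is the character a-umlaut; prints J&#228;rvenp&#228;&#228;
        System.out.println(toNumericRefs("J\u00e4rvenp\u00e4\u00e4"));
    }
}
```

Any XML parser should read &#228; back as the literal character without needing a DTD, since numeric character references are part of XML itself.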

Subject: RE: XML/DTD problem; SAXParser not able to read #PCDATA with ä or ö

Ok, I see. I thought that there were more predefined entities… I will instruct the developers of the external system that they have to translate international characters to UTF-8/16.

By the way, how do I define my own entity in the DTD? For example, for ä?

Subject: RE: XML/DTD problem; SAXParser not able to read #PCDATA with ä or ö

The pattern looks like this:

<!ENTITY auml "&#xx;">

where auml is the named entity you are defining and the quoted string is the numeric character reference that means the same thing. If you have a declaration like that in the DTD and you use a validating parser, the parser SHOULD substitute &#xx; wherever it sees &auml; if it does any translation to literal characters, and should treat the entity as a legal character when parsing.
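To make this concrete, here is a small Java sketch, assuming the JDK's default SAXParser and a DTD placed in the document's internal subset. The class name EntityDemo, the helper textOf and the sample element "name" are invented for illustration. The entity values 228 and 246 are the decimal Unicode code points for ä and ö.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EntityDemo {
    // Hypothetical helper: parses the XML and returns all character data
    // reported by the parser, concatenated into one string.
    static String textOf(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        StringBuilder text = new StringBuilder();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called several times per element
            }
        });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // The internal DTD subset declares auml and ouml, so the
        // references in the element content can be resolved.
        String xml =
              "<!DOCTYPE name [\n"
            + "  <!ELEMENT name (#PCDATA)>\n"
            + "  <!ENTITY auml \"&#228;\">\n"
            + "  <!ENTITY ouml \"&#246;\">\n"
            + "]>\n"
            + "<name>J&auml;rvenp&auml;&auml;, K&ouml;ln</name>";
        System.out.println(textOf(xml));
    }
}
```

If the two ENTITY lines are removed, the same parse should fail with an error along the lines of the "Entity 'auml' not found" message from the original question. Whether the printed characters display correctly depends on the console encoding, which is a separate issue from parsing.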