123459p8fnwSee Foreign Language Representation, for full details on the
implementation of these character sets.
huse8bNote that brackets encoding only applies to program text. Within comments, brackets are considered to be normal graphic characters, and bracket sequences are never recognized as wide characters.
If no -gnatW? parameter is present, then the default representation is normally Brackets encoding only. However, if the first three characters of the file are 16#EF# 16#BB# 16#BF# (the standard byte order mark or BOM for UTF-8), then these three characters are skipped and the default representation for the file is set to UTF-8.
Note that the wide character representation that is specified (explicitly or by default) for the main program also acts as the default encoding used for Wide_Text_IO files if not specifically overridden by a WCEM form parameter.
When no -gnatW? is specified, then characters (other than wide characters represented using brackets notation) are treated as 8-bit Latin-1 codes. The codes recognized are the Latin-1 graphic characters, and ASCII format effectors (CR, LF, HT, VT). Other lower half control characters in the range 16#00#..16#1F# are not accepted in program text or in comments. Upper half control characters (16#80#..16#9F#) are rejected in program text, but allowed and ignored in comments. Note in particular that the Next Line (NEL) character whose encoding is 16#85# is not recognized as an end of line in this default mode. If your source program contains instances of the NEL character used as a line terminator, you must use UTF-8 encoding for the whole source program. In default mode, all lines must be ended by a standard end of line sequence (CR, CR/LF, or LF).
Note that the convention of simply accepting all upper half characters in comments means that programs that use standard ASCII for program text, but UTF-8 encoding for comments are accepted in default mode, providing that the comments are ended by an appropriate (CR, or CR/LF, or LF) line terminator. This is a common mode for many programs with foreign language comments.