Encodings - Guide to GNU gcj


1.3 Encodings

The Java programming language uses Unicode throughout. In an effort to integrate well with other locales, gcj allows .java files to be written using almost any encoding. gcj knows how to convert these encodings into its internal encoding at compile time.

You can use the --encoding=NAME option to specify the character encoding of your source files. If this option is not given, the default encoding comes from your current locale. If your host system has insufficient locale support, then gcj assumes the default encoding to be the ‘UTF-8’ encoding of Unicode.
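To see why the source encoding matters, the following minimal sketch decodes the same two bytes under two different encodings. It uses the standard Java charset API for illustration only and does not show how gcj itself performs the conversion; the class name EncodingDemo and the helper decode are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Hypothetical helper: turn raw file bytes into Unicode text under the
    // given charset, as any compiler must do before it can read the source.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // The letter e-acute (U+00E9) as stored in a UTF-8 file: two bytes.
        byte[] raw = {(byte) 0xC3, (byte) 0xA9};

        String asUtf8 = decode(raw, StandardCharsets.UTF_8);
        String asLatin1 = decode(raw, StandardCharsets.ISO_8859_1);

        // Read as UTF-8, the bytes form one character; read as ISO-8859-1,
        // each byte becomes its own character and the text is garbled.
        System.out.println(asUtf8.length());   // prints 1
        System.out.println(asLatin1.length()); // prints 2
    }
}
```

A mismatch of this kind between the file's actual encoding and the one the compiler assumes is what --encoding is meant to prevent.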

To implement --encoding, gcj simply uses the host platform's iconv conversion routine. This means that in practice gcj can handle only those encodings that the host platform's iconv supports.

The names allowed for the argument to --encoding vary from platform to platform (since they are not standardized anywhere). However, gcj implements the encoding named ‘UTF-8’ internally, so if you choose to use this for your source files you can be assured that it will work on every host.
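If you settle on ‘UTF-8’ for portability, it can help to confirm that a file really is valid UTF-8 before compiling it. The sketch below shows one way to do that with the standard Java charset API; it is an illustration under that assumption, not part of gcj, and the names Utf8Check and isValidUtf8 are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {
    // Hypothetical helper: returns true only if the bytes decode cleanly
    // as UTF-8, with malformed sequences reported instead of replaced.
    static boolean isValidUtf8(byte[] raw) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // e-acute encoded as UTF-8 (two bytes) and as ISO-8859-1 (one byte).
        byte[] utf8Bytes = {(byte) 0xC3, (byte) 0xA9};
        byte[] latin1Bytes = {(byte) 0xE9};

        System.out.println(isValidUtf8(utf8Bytes));   // prints true
        System.out.println(isValidUtf8(latin1Bytes)); // prints false
    }
}
```

In practice the byte array would come from reading the .java file itself; the inline arrays here only keep the example self-contained.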