http://www.dabsoft.ch/dicom/3/C.12.1.1.2/

C.12.1.1.2 Specific Character Set

Specific Character Set (0008,0005) identifies the Character Set that expands or replaces the Basic Graphic Set (ISO 646) for values of Data Elements that have Value Representation of SH, LO, ST, PN, LT or UT. See PS 3.5.

If the Attribute Specific Character Set (0008,0005) is not present or has only a single value, Code Extension techniques are not used. Defined terms for the Attribute Specific Character Set (0008,0005), when single valued, are derived from the International Registration Number as per ISO 2375 (e.g., ISO_IR 100 for Latin alphabet No. 1). See Table C.12-2.

Table C.12-2 DEFINED TERMS FOR SINGLE-BYTE CHARACTER SETS WITHOUT CODE EXTENSIONS

Character Set Description	Defined Term	ISO registration number	Number of characters	Code element	Character Set
Default repertoire	none	ISO-IR 6	94	G0	ISO 646
Latin alphabet No. 1	ISO_IR 100	ISO-IR 100	96	G1	Supplementary set of ISO 8859
		ISO-IR 6	94	G0	ISO 646
Latin alphabet No. 2	ISO_IR 101	ISO-IR 101	96	G1	Supplementary set of ISO 8859
		ISO-IR 6	94	G0	ISO 646
Latin alphabet No. 3	ISO_IR 109	ISO-IR 109	96	G1	Supplementary set of ISO 8859
		ISO-IR 6	94	G0	ISO 646
Latin alphabet No. 4	ISO_IR 110	ISO-IR 110	96	G1	Supplementary set of ISO 8859
		ISO-IR 6	94	G0	ISO 646
Cyrillic	ISO_IR 144	ISO-IR 144	96	G1	Supplementary set of ISO 8859
		ISO-IR 6	94	G0	ISO 646
Arabic	ISO_IR 127	ISO-IR 127	96	G1	Supplementary set of ISO 8859
		ISO-IR 6	94	G0	ISO 646
Greek	ISO_IR 126	ISO-IR 126	96	G1	Supplementary set of ISO 8859
		ISO-IR 6	94	G0	ISO 646
Hebrew	ISO_IR 138	ISO-IR 138	96	G1	Supplementary set of ISO 8859
		ISO-IR 6	94	G0	ISO 646
Latin alphabet No. 5	ISO_IR 148	ISO-IR 148	96	G1	Supplementary set of ISO 8859
		ISO-IR 6	94	G0	ISO 646
Japanese	ISO_IR 13	ISO-IR 13	94	G1	JIS X 0201: Katakana
		ISO-IR 14	94	G0	JIS X 0201: Romaji
Thai	ISO_IR 166	ISO-IR 166	88	G1	TIS 620-2533 (1990)
		ISO-IR 6	94	G0	ISO 646

Note: To use the single-byte code table of JIS X 0201, value 1 of the attribute Specific Character Set (0008,0005) should be ISO_IR 13. This means that ISO-IR 13 is designated as the G1 code element, which is invoked in the GR area. In addition, ISO-IR 14 is designated as the G0 code element, which is invoked in the GL area.
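As a rough illustration of the single-valued case, the Defined Terms in Table C.12-2 can be mapped to decoders. The sketch below uses Python standard-library codec names; the mapping and the function name `decode_single_valued` are assumptions made for this example and are not part of the standard. In particular, `shift_jis` is only an approximation for ISO_IR 13: it decodes the JIS X 0201 Katakana bytes as expected, but it is not a pure JIS X 0201 codec.

```python
# Hypothetical mapping from Defined Terms (Table C.12-2) to Python codec names.
# An absent or empty attribute means the default repertoire (ISO-IR 6).
PYTHON_CODECS = {
    "": "ascii",
    "ISO_IR 100": "latin_1",
    "ISO_IR 101": "iso8859_2",
    "ISO_IR 109": "iso8859_3",
    "ISO_IR 110": "iso8859_4",
    "ISO_IR 144": "iso8859_5",
    "ISO_IR 127": "iso8859_6",
    "ISO_IR 126": "iso8859_7",
    "ISO_IR 138": "iso8859_8",
    "ISO_IR 148": "iso8859_9",
    "ISO_IR 13": "shift_jis",   # approximation of JIS X 0201, see lead-in
    "ISO_IR 166": "tis_620",
}


def decode_single_valued(raw: bytes, specific_character_set: str = "") -> str:
    """Decode a text value when (0008,0005) is absent or has one value."""
    term = specific_character_set.strip()
    try:
        codec = PYTHON_CODECS[term]
    except KeyError:
        raise ValueError(f"unsupported Defined Term: {term!r}") from None
    return raw.decode(codec)


print(decode_single_valued(b"J\xe9r\xf4me", "ISO_IR 100"))
```

Without Code Extension techniques no escape sequences are expected in the value, so a single codec lookup is enough here.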

If the attribute Specific Character Set (0008,0005) has more than one value, Code Extension techniques are used and Escape Sequences may be encountered in all character sets. Requirements for the use of Code Extension techniques are specified in PS 3.5. In order to indicate the presence of Code Extension, the Defined Terms for the repertoires have the prefix “ISO 2022”, e.g., ISO 2022 IR 100 for the Latin Alphabet No. 1. See Table C.12-3 and Table C.12-4. Table C.12-3 describes single-byte character sets for value 1 to value n of the attribute Specific Character Set (0008,0005), and Table C.12-4 describes multi-byte character sets for value 2 to value n of the attribute Specific Character Set (0008,0005).

Note: A prefix other than “ISO 2022” may be needed in the future if other Code Extension techniques are adopted.
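When Code Extension techniques are in use, a reader first has to locate the escape sequences inside a value. The following is a minimal sketch of that step, assuming the general ISO 2022 escape sequence shape (ESC, then zero or more intermediate bytes in the range 0x20 to 0x2F, then one final byte in the range 0x30 to 0x7E). The function name `split_on_escapes` is hypothetical, and the sketch does not decode the segments or apply the reset rules of PS 3.5.

```python
ESC = 0x1B


def split_on_escapes(data: bytes) -> list[tuple[bytes, bytes]]:
    """Split a value into (escape_sequence, following_bytes) pairs.

    The first pair has an empty escape sequence if the value does not
    start with ESC.
    """
    segments = []
    current_escape = b""
    start = 0
    i = 0
    while i < len(data):
        if data[i] != ESC:
            i += 1
            continue
        # Close the segment that precedes this escape sequence.
        if i > start or current_escape:
            segments.append((current_escape, data[start:i]))
        # Skip intermediate bytes, then expect exactly one final byte.
        j = i + 1
        while j < len(data) and 0x20 <= data[j] <= 0x2F:
            j += 1
        if j >= len(data) or not 0x30 <= data[j] <= 0x7E:
            raise ValueError("malformed or truncated escape sequence")
        current_escape = data[i:j + 1]
        i = start = j + 1
    if start < len(data) or current_escape:
        segments.append((current_escape, data[start:]))
    return segments


print(split_on_escapes(b"Yamada\x1b$B;3ED\x1b(B"))
```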

The same character set shall not be used more than once in Specific Character Set (0008,0005).

Note: For example, the values “ISO 2022 IR 100\ISO 2022 IR 100” or “ISO_IR 100\ISO 2022 IR 100” are redundant and not permitted.
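The rule above can be checked mechanically. The sketch below treats `ISO_IR n` and `ISO 2022 IR n` as the same character set by comparing registration numbers, which matches the two examples in the note. The helper names are hypothetical, and an empty value 1 is not given special treatment here.

```python
import re

# Matches both "ISO_IR 100" and "ISO 2022 IR 100" and captures the number.
_TERM = re.compile(r"^ISO(?:_IR| 2022 IR) (\d+)$")


def character_set_key(term: str) -> str:
    """Reduce a Defined Term to a key identifying the character set."""
    term = term.strip()
    match = _TERM.match(term)
    return f"IR {match.group(1)}" if match else term


def has_repeated_character_set(value: str) -> bool:
    """Return True if a backslash-delimited (0008,0005) value repeats a set."""
    keys = [character_set_key(v) for v in value.split("\\")]
    return len(keys) != len(set(keys))


print(has_repeated_character_set("ISO_IR 100\\ISO 2022 IR 100"))
```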

Table C.12-3 DEFINED TERMS FOR SINGLE-BYTE CHARACTER SETS WITH CODE EXTENSIONS

Character Set Description	Defined Term	Standard for Code Extension	ESC sequence	ISO registration number	Number of characters	Code element	Character Set
Default repertoire	ISO 2022 IR 6	ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646
Latin alphabet No. 1	ISO 2022 IR 100	ISO 2022	ESC 02/13 04/01	ISO-IR 100	96	G1	Supplementary set of ISO 8859
		ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646
Latin alphabet No. 2	ISO 2022 IR 101	ISO 2022	ESC 02/13 04/02	ISO-IR 101	96	G1	Supplementary set of ISO 8859
		ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646
Latin alphabet No. 3	ISO 2022 IR 109	ISO 2022	ESC 02/13 04/03	ISO-IR 109	96	G1	Supplementary set of ISO 8859
		ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646
Latin alphabet No. 4	ISO 2022 IR 110	ISO 2022	ESC 02/13 04/04	ISO-IR 110	96	G1	Supplementary set of ISO 8859
		ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646
Cyrillic	ISO 2022 IR 144	ISO 2022	ESC 02/13 04/12	ISO-IR 144	96	G1	Supplementary set of ISO 8859
		ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646
Arabic	ISO 2022 IR 127	ISO 2022	ESC 02/13 04/07	ISO-IR 127	96	G1	Supplementary set of ISO 8859
		ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646
Greek	ISO 2022 IR 126	ISO 2022	ESC 02/13 04/06	ISO-IR 126	96	G1	Supplementary set of ISO 8859
		ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646
Hebrew	ISO 2022 IR 138	ISO 2022	ESC 02/13 04/08	ISO-IR 138	96	G1	Supplementary set of ISO 8859
		ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646
Latin alphabet No. 5	ISO 2022 IR 148	ISO 2022	ESC 02/13 04/13	ISO-IR 148	96	G1	Supplementary set of ISO 8859
		ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646
Japanese	ISO 2022 IR 13	ISO 2022	ESC 02/09 04/09	ISO-IR 13	94	G1	JIS X 0201: Katakana
		ISO 2022	ESC 02/08 04/10	ISO-IR 14	94	G0	JIS X 0201: Romaji
Thai	ISO 2022 IR 166	ISO 2022	ESC 02/13 05/04	ISO-IR 166	88	G1	TIS 620-2533 (1990)
		ISO 2022	ESC 02/08 04/02	ISO-IR 6	94	G0	ISO 646

Note: If the attribute Specific Character Set (0008,0005) has more than one value and value 1 is empty, it is assumed that value 1 is ISO 2022 IR 6.
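A parser can apply this default while splitting the attribute into its values. The following is a small sketch, assuming the attribute has already been read as a string with the backslash value delimiter intact; the function name `split_specific_character_set` is hypothetical.

```python
def split_specific_character_set(value: str) -> list[str]:
    """Split (0008,0005) into its values, defaulting an empty value 1."""
    values = [v.strip() for v in value.split("\\")]
    # Only a multi-valued attribute triggers the ISO 2022 IR 6 assumption.
    if len(values) > 1 and values[0] == "":
        values[0] = "ISO 2022 IR 6"
    return values


print(split_specific_character_set("\\ISO 2022 IR 87"))
```

A single empty value is left untouched, because the note only covers the multi-valued case.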

Table C.12-4 DEFINED TERMS FOR MULTI-BYTE CHARACTER SETS WITH CODE EXTENSIONS

Character Set Description	Defined Term	Standard for Code Extension	ESC sequence	ISO registration number	Number of characters	Code element	Character Set
Japanese	ISO 2022 IR 87	ISO 2022	ESC 02/04 04/02	ISO-IR 87	94²	G0	JIS X 0208: Kanji
	ISO 2022 IR 159	ISO 2022	ESC 02/04 02/08 04/04	ISO-IR 159	94²	G0	JIS X 0212: Supplementary Kanji set
Korean	ISO 2022 IR 149	ISO 2022	ESC 02/04 02/09 04/03	ISO-IR 149	94²	G1	KS X 1001: Hangul and Hanja
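The ESC sequence columns in the code extension tables use ISO 2022 column/row notation, where `cc/rr` denotes the byte `16 * cc + rr`. The helper below is a small sketch of that conversion; its name `esc_to_bytes` is hypothetical.

```python
def esc_to_bytes(notation: str) -> bytes:
    """Convert column/row notation such as 'ESC 02/04 04/02' to raw bytes."""
    tokens = notation.split()
    if not tokens or tokens[0] != "ESC":
        raise ValueError("notation must start with 'ESC'")
    out = bytearray([0x1B])  # the ESC control character itself
    for token in tokens[1:]:
        column, row = token.split("/")
        out.append(int(column) * 16 + int(row))
    return bytes(out)


# ISO 2022 IR 87 (JIS X 0208) is designated with ESC $ B.
print(esc_to_bytes("ESC 02/04 04/02"))
```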

There are multi-byte character sets that prohibit the use of Code Extension Techniques. The Unicode character set used in ISO 10646, when encoded in UTF-8, and the GB18030 character set, encoded per the rules of GB18030, both prohibit the use of Code Extension Techniques. These character sets may only be specified as value 1 in the Specific Character Set (0008,0005) attribute and there shall only be one value. The minimal length UTF-8 encoding shall always be used for ISO 10646.

Notes: 1. The ISO standards for 10646 now prohibit the use of anything but the minimum length encoding for UTF-8. UTF-8 permits multiple different encodings, but when used to encode Unicode characters in accordance with ISO 10646-1 and 10646-2 (with extensions) only the minimal encodings are legal.

2. The representation for the characters in the DICOM Default Character Repertoire is the same single byte value for the Default Character Repertoire, ISO 10646 in UTF-8, and GB18030. It is also the 7-bit US-ASCII encoding.
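Both notes can be observed with Python's built-in codecs, used here only as an illustration. The `utf-8` decoder rejects non-minimal (overlong) encodings, and bytes from the default repertoire decode to the same text as ASCII, UTF-8, and GB18030. The function name `is_valid_utf8` is hypothetical; note that it rejects any invalid UTF-8, not only overlong forms.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 (overlong forms are rejected)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# 0xC0 0xAF is an overlong two-byte encoding of "/" (U+002F).
overlong_slash = b"\xc0\xaf"
print(is_valid_utf8(overlong_slash))

# Default repertoire bytes give the same text under all three decoders.
name = b"SMITH^JOHN"
same = name.decode("ascii") == name.decode("utf-8") == name.decode("gb18030")
print(same)
```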

Table C.12-5 DEFINED TERMS FOR MULTI-BYTE CHARACTER SETS WITHOUT CODE EXTENSIONS

Character Set Description	Defined Term
Unicode in UTF-8	ISO_IR 192
GB18030	GB18030
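Because the terms in Table C.12-5 may only appear as the sole value of Specific Character Set (0008,0005), a validator can reject any multi-valued attribute that contains one of them. The sketch below assumes the attribute has already been split into a list of values; the names `NO_CODE_EXTENSION_TERMS` and `check_sole_value_terms` are hypothetical.

```python
# Defined Terms from Table C.12-5; these prohibit Code Extension techniques.
NO_CODE_EXTENSION_TERMS = {"ISO_IR 192", "GB18030"}


def check_sole_value_terms(values: list[str]) -> None:
    """Raise ValueError if a Table C.12-5 term is combined with other values."""
    if len(values) <= 1:
        return
    for term in values:
        if term.strip() in NO_CODE_EXTENSION_TERMS:
            raise ValueError(
                f"{term.strip()!r} must be the only value of (0008,0005)"
            )


check_sole_value_terms(["ISO_IR 192"])  # accepted: single value
```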
