File No.:

Registration of MSARG Collection and Sequences

To: Unicode Consortium

From: Government of Macao Special Administrative Region, China

Action: Request for review

Date: 2016-05-13

Contents

1. Background

To facilitate more effective electronic communication among government units, the Macao Special Administrative Region Government (MSARG) plans to establish and implement the Macao SAR Information Systems Character Set (MISCS), tentatively named MISCS-2016. This character set will include all the Chinese characters and symbols used in MSARG's computer systems. It contains 11 Chinese character variants, each unified with a character already encoded in ISO/IEC 10646, whose glyphs as used in Macao differ from those of the encoded characters. To make these glyphs available in Macao's computer systems, it is necessary to register Ideographic Variation Sequences (IVSes) for 21 variants, 10 of which correspond to base characters whose representative glyphs are the same as in the code charts.

2. Request for Review and Comments

MSARG will submit an Ideographic Variation Database (IVD) collection named MSARG Collection, together with 21 proposed sequences in this collection, to the registrar, the Unicode Consortium. MSARG requests all concerned experts to kindly review this collection and the sequences.

3. Introduction of MISCS-2016

MISCS-2016, as a complete named character set under the ISO/IEC 10646 international encoding standard, covers all approved Chinese characters and symbols used in Macao's computer systems. MISCS-2016 will include: (1) the Big-5 character set, (2) HKSCS-2008, (3) Macao's Vertical Extension to ISO/IEC 10646, (4) Macao's Horizontal Extension to ISO/IEC 10646, and (5) Macao's variants (excluding base characters) with registered IVSes. Under the ISO/IEC 10646 international encoding standard, the coding scheme for MISCS-2016 source references is as follows:

  1. MB-hhhh is used to refer to all characters in the Big-5 character set, in which "hhhh" is the hexadecimal Big-5 code.
  2. MA-hhhh is used to refer to all characters already encoded in HKSCS-2008, in which "hhhh" is the corresponding hexadecimal Big-5 code in HKSCS-2008.
  3. MC-nnnnn is used for characters vertically extended to ISO/IEC 10646, in which "nnnnn" is a MISCS-assigned source reference number between 00001 and 99999, assigned in sequence.
  4. MD-hhhh[h] is used for characters horizontally extended to ISO/IEC 10646, in which "hhhh[h]" is the four- or five-digit hexadecimal code of the character in the ISO/IEC 10646 international standard. For characters in the Basic Multilingual Plane (BMP or Plane 0), the code points contain four hexadecimal digits. For characters in other planes, the code points contain five hexadecimal digits.
  5. ME-hhhh[h]-nnn is used for character variants with registered IVSes, in which "hhhh[h]" is the four- or five-digit hexadecimal code of the base character in the ISO/IEC 10646 international standard, and "nnn" is an MISCS-assigned number between 001 and 999. For character variants corresponding to the same base character, "nnn" is assigned in sequence.
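
The MC, MD and ME forms above can be sketched in code. The following is a minimal illustration, not part of the MISCS specification: the function names are hypothetical, and the code points used in the usage note (U+4E00, U+20000) were chosen only to show the four- and five-digit cases and are not claimed to be MISCS-2016 entries.

```python
def _hex_digits(code_point: int) -> str:
    """Format a code point as four (BMP) or five (other planes) hex digits."""
    # Assumption: the scheme allows at most five hexadecimal digits.
    if not 0 <= code_point <= 0xFFFFF:
        raise ValueError("code point does not fit in five hexadecimal digits")
    return f"{code_point:04X}"


def mc_reference(serial: int) -> str:
    """MC-nnnnn: vertically extended character, serial 00001 to 99999."""
    if not 1 <= serial <= 99999:
        raise ValueError("serial must be between 1 and 99999")
    return f"MC-{serial:05d}"


def md_reference(code_point: int) -> str:
    """MD-hhhh[h]: horizontally extended character."""
    return f"MD-{_hex_digits(code_point)}"


def me_reference(base_code_point: int, variant_number: int) -> str:
    """ME-hhhh[h]-nnn: variant of a base character, number 001 to 999."""
    if not 1 <= variant_number <= 999:
        raise ValueError("variant number must be between 1 and 999")
    return f"ME-{_hex_digits(base_code_point)}-{variant_number:03d}"
```

For example, md_reference(0x4E00) yields "MD-4E00", md_reference(0x20000) yields "MD-20000", and me_reference(0x4E00, 1) yields "ME-4E00-001".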

Since sequence identifiers cannot contain hyphens, underscores are used in their place, and "M([AB]_[0-9A-F]{4}|C_[0-9]{5}|D_[0-9A-F]{4,5}|E_[0-9A-F]{4,5}_[0-9]{3})" is used as the pattern of the sequence identifiers.
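
As an illustration of how the pattern relates to the source references, the sketch below converts a hyphenated source reference into a sequence identifier and checks it against the pattern. The function name is hypothetical and the sample references are made up for demonstration; they are not claimed to be actual MISCS-2016 entries.

```python
import re

# The pattern quoted in this document, compiled unchanged.
SEQUENCE_ID_PATTERN = re.compile(
    r"M([AB]_[0-9A-F]{4}|C_[0-9]{5}|D_[0-9A-F]{4,5}|E_[0-9A-F]{4,5}_[0-9]{3})"
)


def to_sequence_identifier(source_reference: str) -> str:
    """Replace hyphens with underscores and validate the result."""
    identifier = source_reference.replace("-", "_")
    # fullmatch requires the whole identifier to match, not just a prefix.
    if SEQUENCE_ID_PATTERN.fullmatch(identifier) is None:
        raise ValueError(f"not a valid sequence identifier: {identifier!r}")
    return identifier
```

For example, to_sequence_identifier("ME-4E00-001") returns "ME_4E00_001", while a malformed reference such as "MC-123" raises ValueError. Note that the pattern checks only the shape of an identifier; range constraints such as "between 001 and 999" are not enforced by it.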

4. Information of MSARG Collection

4.1 Registration Form

Name and address of the registrant:
  Public Administration and Civil Service Bureau (SAFP)
  Macao Special Administrative Region, China
  Rua do Campo, no. 162, Edificio Administracao Publica, 21-27 Andares, Macau
Name and email address of the representative:
  Chau Cheuk Kwan, Clement: cchau@safp.gov.mo
  Lam Sok Chi: sokchil@safp.gov.mo
  Prof. Lu Qin: csluqin@comp.polyu.edu.hk
URL of the web site describing the collection: http://www.iso10646hk.net/ivd/MSARG/
  (This is a temporary web site and it will be changed to another web site in the future.)
Suggested identifier for the collection: MSARG
Pattern for the sequence identifiers: M([AB]_[0-9A-F]{4}|C_[0-9]{5}|D_[0-9A-F]{4,5}|E_[0-9A-F]{4,5}_[0-9]{3})

4.2 Data Files

Three data files are provided:

  1. Description of the collection: IVD_Collections_MSARG.txt
     The format of this file conforms to the requirements specified in Section 3 of Unicode Technical Standard #37.

  2. List of proposed sequences: IVD_Sequences_MSARG.txt
     The format of this file conforms to the requirements specified in Sections 3 and 4.2 of Unicode Technical Standard #37.

  3. List of representative glyphs for the proposed sequences: Glyphs_List_MSARG.pdf
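
To show how an entry in the list of proposed sequences might be read, here is a minimal parsing sketch. It assumes the three-field, semicolon-separated line layout used by IVD sequence files (variation sequence; collection identifier; sequence identifier), with comment lines starting with "#". The names SequenceEntry and parse_sequence_line are hypothetical, and the sample line in the usage note is made up for illustration; it is not taken from IVD_Sequences_MSARG.txt.

```python
from typing import NamedTuple, Optional


class SequenceEntry(NamedTuple):
    base: int          # code point of the base ideograph
    selector: int      # code point of the variation selector
    collection: str    # collection identifier, e.g. "MSARG"
    identifier: str    # sequence identifier within the collection


def parse_sequence_line(line: str) -> Optional[SequenceEntry]:
    """Parse one line; return None for blank lines and comments."""
    content = line.strip()
    if not content or content.startswith("#"):
        return None
    sequence, collection, identifier = (f.strip() for f in content.split(";"))
    base_hex, selector_hex = sequence.split()
    base, selector = int(base_hex, 16), int(selector_hex, 16)
    # IVSes use the supplementary variation selectors U+E0100..U+E01EF.
    if not 0xE0100 <= selector <= 0xE01EF:
        raise ValueError(f"not an IVS variation selector: {selector_hex}")
    return SequenceEntry(base, selector, collection, identifier)
```

For a hypothetical line such as "4E00 E0100; MSARG; ME_4E00_001", the function returns an entry whose base is 0x4E00 and whose selector is 0xE0100; the corresponding text is the two-code-point string chr(entry.base) + chr(entry.selector).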

5. History

2016-05-13 First publication.

2016-07-25 The last sentence in Section 1 was revised based on the comments received.