VNI
VNI Software Company is a developer of various education, entertainment, office, and utility software packages. They are known for developing an encoding and input method for Vietnamese.
The company is family-owned and based in Westminster, California.
History
VNI was founded in 1987 by Hồ Thành Việt to develop software that eases Vietnamese language use on computers. Among their products were the VNI Encoding and VNI Input Method.
VNI vs. Microsoft
In the 1990s, Microsoft recognized the potential of VNI's products and incorporated the VNI Input Method into Windows 95 Vietnamese Edition and MSDN, which were in use worldwide.
VNI took Microsoft to court over this unauthorized use of its technology. Microsoft settled the case out of court, withdrew the input method from its entire product line, and developed its own input method. Microsoft's input method, although virtually unknown, has appeared in every Windows release since Windows 98.
Starting with Windows 10 version 1903, the VNI Input Method (as "Vietnamese Number Key-based") is natively supported, along with the Telex input method.[1]
Unicode
Despite the growing popularity of Unicode in computing, the VNI Encoding (see below) is still in wide use by Vietnamese speakers both in Vietnam and abroad. All professional printing facilities in the Little Saigon neighborhood of Orange County, California continue to use the VNI Encoding when processing Vietnamese text. For this reason, print jobs submitted using the VNI Character Set are compatible with local printers.
Input methods
VNI invented, popularized, and commercialized an input method and an encoding, the VNI Character Set, to assist computer users entering Vietnamese on their computers. The user can type using only ASCII characters found on standard computer keyboard layouts. Because the Vietnamese alphabet uses a complex system of diacritical marks, a keyboard would otherwise need 133 alphanumeric keys and a Shift key to cover all possible characters.
VNI Input Method
Originally, VNI's input method utilized function keys (F1, F2, ...) to enter the tone marks, which later turned out to be problematic, as the operating system used those keys for other purposes. VNI then turned to the numerical keys along the top of the keyboard (as opposed to the numpad) for entering tone marks. This arrangement survives today, but users also have the option of customizing the keys used for tone marks.
With VNI Tan Ky mode on, the user can type in diacritical marks anywhere within a word, and the marks will appear at their proper locations. For example, the word trường, which means "school", can be typed in the following ways:
- truong-7-2 → trường (most formal way)
- 72truong → trường
- t72ruong → trường
- tr72uong → trường
- tru7o72ng → trường
- truo72ng → trường
- truo7ng2 → trường
The first way is the most conventional: it follows handwriting and spelling convention, in which the base letters are written first (truong) and the diacritical marks are then added one by one.
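The position-independent behaviour shown in the list above can be sketched in a few lines of Python. This is only an illustration, not VNI's actual algorithm: the digit assignments (1 to 5 for tones, 6 circumflex, 7 horn, 8 breve, 9 for đ) are the commonly documented VNI keys and are an assumption here, the function name `vni_compose` is hypothetical, and the tone-placement rule is deliberately simplified (it ignores cases such as "qu" and "gi").

```python
import unicodedata

# Commonly documented VNI key assignments (an assumption; not stated in the text above).
TONES = {"1": "\u0301", "2": "\u0300", "3": "\u0309", "4": "\u0303", "5": "\u0323"}
SHAPES = {"6": ("aeo", "\u0302"), "7": ("uo", "\u031b"), "8": ("a", "\u0306")}
VOWELS = "aeiouy"


def vni_compose(keys: str) -> str:
    """Turn one word typed with VNI digit keys into composed Unicode text."""
    letters = [c for c in keys if c.isalpha()]   # base letters, in typed order
    digits = [c for c in keys if c.isdigit()]    # mark keys, wherever they were typed
    base = "".join(letters).lower()
    shape = [""] * len(letters)                  # circumflex, horn, or breve per letter
    tone = ""
    for d in digits:
        if d in TONES:
            tone = TONES[d]
        elif d == "9" and "d" in base:
            i = base.index("d")
            letters[i] = "\u0110" if letters[i].isupper() else "\u0111"
        elif d == "7" and "uo" in base:
            i = base.index("uo")                 # "uo" takes the horn on both letters
            shape[i] = shape[i + 1] = "\u031b"
        elif d in SHAPES:
            targets, mark = SHAPES[d]
            hit = next((i for i, c in enumerate(base) if c in targets), None)
            if hit is not None:
                shape[hit] = mark
    # Simplified tone placement: prefer a vowel that already carries a shape mark.
    vowels = [i for i, c in enumerate(base) if c in VOWELS]
    target = None
    if tone and vowels:
        shaped = [i for i in vowels if shape[i]]
        if shaped:
            target = shaped[-1]
        elif len(vowels) == 1 or vowels[-1] < len(base) - 1:
            target = vowels[-1]
        else:
            target = vowels[-2]
    out = [
        letter + shape[i] + (tone if i == target else "")
        for i, letter in enumerate(letters)
    ]
    # NFC reorders the combining marks and merges them into precomposed letters.
    return unicodedata.normalize("NFC", "".join(out))


print(vni_compose("truo72ng"))  # trường
```

Because the sketch separates letters from digit keys before applying any mark, every spelling in the list above yields the same result.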
VNI Tan Ky
With the release of VNI Tan Ky 4 in the 1990s, VNI freed users from having to remember where to correctly insert tone marks within a word, because, as long as the user enters all the required characters and tone marks, the software will group them correctly. This feature is especially useful for newcomers to the language.
VNI Auto Accent
VNI Auto Accent is the company's most recent software release (2006), with the purpose of alleviating repetitive strain injury (RSI) caused by prolonged use of computer keyboards. Auto Accent helps reduce the number of keystrokes needed to type each word by automatically adding diacritical marks for the user. The user must still enter every base letter in the word.
Character encodings
VNI Encoding (Windows/Unix)
The VNI Encoding uses up to two bytes to represent one Vietnamese vowel character, with the second byte supplying additional diacritical marks. This removes the need to replace control characters with Vietnamese characters, a problematic system found in TCVN1 (VSCII-1) and in VISCII, or to use two different fonts, one containing lowercase characters and the other uppercase characters, as is sometimes done for TCVN3 (VSCII-3). A similar approach is taken by Windows-1258 and VSCII-2.
This solution is more portable between different versions of Windows and between different platforms. However, using multiple characters in a file to represent one written character increases the file size. The increase can usually be offset by compressing the data into a file format such as ZIP.
The VNI encoding was used extensively in the south of Vietnam, and sometimes used overseas, while TCVN 5712 was dominant in the north.[2]
Points 0x00 through 0x7F follow ASCII.
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8_ | | | | | | | | | | | | | | | | |
| 9_ | | | | | | | | | | | | | | | | |
| A_ | | | | | | | | | | | | | | | | |
| B_ | | | | | | | | | | | | | | | | |
| C_ | ̂̀ 0302 0300 | ̂́ 0302 0301 | ̂ 0302 | ̂̃ 0302 0303 | ̣̂ 0323 0302 | ̂̉ 0302 0309 | Ỉ 1EC8 | | ̆̀ 0306 0300 | ̆́ 0306 0301 | ̆ 0306 | ̣̆ 0323 0306 | Ì 00CC | Í 00CD | Ỵ 1EF4 | ̣ 0323 |
| D_ | | Đ 0110 | Ị 1ECA | Ĩ 0128 | Ơ 01A0 | ̃ 0303 | Ư 01AF | | ̀ 0300 | ́ 0301 | ̆̉ 0306 0309 | ̉ 0309 | ̆̃ 0306 0303 | | | |
| E_ | ̂̀ 0302 0300 | ̂́ 0302 0301 | ̂ 0302 | ̂̃ 0302 0303 | ̣̂ 0323 0302 | ̂̉ 0302 0309 | ỉ 1EC9 | | ̆̀ 0306 0300 | ̆́ 0306 0301 | ̆ 0306 | ̣̆ 0323 0306 | ì 00EC | í 00ED | ỵ 1EF5 | ̣ 0323 |
| F_ | | đ 0111 | ị 1ECB | ĩ 0129 | ơ 01A1 | ̃ 0303 | ư 01B0 | | ̀ 0300 | ́ 0301 | ̆̉ 0306 0309 | ̉ 0309 | ̆̃ 0306 0303 | | | |
- Combining marks in the C_ and D_ rows are used with uppercase letters.
- Combining marks in the E_ and F_ rows are used with lowercase letters.
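The two-byte scheme can be illustrated with a small decoder. The sketch below is a rough illustration under stated assumptions rather than a reference implementation: the byte positions assume the chart is aligned with Windows-1252 (so that, for example, 0xF1 is đ and 0xF8 is the grave accent), the uppercase rows are derived by mirroring the lowercase rows, and the names `VNI_TABLE` and `decode_vni` are hypothetical.

```python
import unicodedata

# Rows E_ and F_ of the chart (bytes 0xE0-0xFC); "" marks an unassigned byte.
# The exact byte positions are an assumption (Windows-1252-aligned layout).
_LOWER_CELLS = [
    "0302 0300", "0302 0301", "0302", "0302 0303", "0323 0302", "0302 0309", "1EC9", "",
    "0306 0300", "0306 0301", "0306", "0323 0306", "00EC", "00ED", "1EF5", "0323",
    "", "0111", "1ECB", "0129", "01A1", "0303", "01B0", "",
    "0300", "0301", "0306 0309", "0309", "0306 0303",
]

VNI_TABLE = {}
for offset, cell in enumerate(_LOWER_CELLS):
    if cell:
        text = "".join(chr(int(cp, 16)) for cp in cell.split())
        VNI_TABLE[0xE0 + offset] = text          # lowercase rows E_ and F_
        VNI_TABLE[0xC0 + offset] = text.upper()  # rows C_ and D_ mirror them


def decode_vni(data: bytes) -> str:
    """Decode VNI (Windows) bytes: ASCII passes through, high bytes add marks."""
    chars = []
    for byte in data:
        if byte < 0x80:
            chars.append(chr(byte))
        else:
            chars.append(VNI_TABLE.get(byte, "\ufffd"))  # unknown byte
    # NFC merges each base letter with the combining marks that follow it.
    return unicodedata.normalize("NFC", "".join(chars))


print(decode_vni(b"Vie\xe4t Nam"))       # Việt Nam
print(decode_vni(b"tr\xf6\xf4\xf8ng"))   # trường
```

In this sketch the word trường occupies seven bytes, one more than its six written letters, which is the file-size cost described above.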
VNI Encoding for Macintosh
This version is intended for use on Macintosh systems and has a different arrangement, corresponding to the difference in arrangement between Windows-1252 and Mac OS Roman. Diacritics used with uppercase vowels are marked (U); those used with lowercase vowels are unmarked.
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8_ | ̣̂ 0323 0302 (U) | ̂̉ 0302 0309 (U) | | ̆́ 0306 0301 (U) | Đ 0110 | Ư 01AF | ̆̃ 0306 0303 (U) | ̂́ 0302 0301 | ̂̀ 0302 0300 | ̂ 0302 | ̣̂ 0323 0302 | ̂̃ 0302 0303 | ̂̉ 0302 0309 | | ̆́ 0306 0301 | ̆̀ 0306 0300 |
| 9_ | ̆ 0306 | ̣̆ 0323 0306 | í 00ED | ì 00EC | | ̣ 0323 | đ 0111 | ĩ 0129 | ị 1ECB | ơ 01A1 | ư 01B0 | ̃ 0303 | ̆̉ 0306 0309 | ́ 0301 | ̉ 0309 | ̆̃ 0306 0303 |
| A_ | | | | | | | | | | | | | | | Ỉ 1EC8 | ̀ 0300 (U) |
| B_ | | | | | | | | | | | | | | | ỉ 1EC9 | ̀ 0300 |
| C_ | | | | | | | | | | | | ̂̀ 0302 0300 (U) | ̂̃ 0302 0303 (U) | ̃ 0303 (U) | | |
| D_ | | | | | | | | | | | | | | | | |
| E_ | | | | | | ̂ 0302 (U) | ̆ 0306 (U) | ̂́ 0302 0301 (U) | ̣̆ 0323 0306 (U) | ̆̀ 0306 0300 (U) | Í 00CD | | ̣ 0323 (U) | Ì 00CC | Ĩ 0128 | Ơ 01A0 |
| F_ | | Ị 1ECA | ̆̉ 0306 0309 (U) | ̉ 0309 (U) | ́ 0301 (U) | | | | | | | | | | | |
VNI Encoding for DOS
The VNI encoding for use on DOS does not use separate characters for diacritics, instead replacing certain ASCII punctuation characters with tone-marked uppercase letters (compare ISO 646).
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0_ | NUL 0000 | SOH 0001 | STX 0002 | ETX 0003 | EOT 0004 | ENQ 0005 | ACK 0006 | BEL 0007 | BS 0008 | HT 0009 | LF 000A | VT 000B | FF 000C | CR 000D | SO 000E | SI 000F |
| 1_ | DLE 0010 | DC1 0011 | DC2 0012 | DC3 0013 | DC4 0014 | NAK 0015 | SYN 0016 | ETB 0017 | CAN 0018 | EM 0019 | SUB 001A | ESC 001B | FS 001C | GS 001D | RS 001E | US 001F |
| 2_ | SP 0020 | ! 0021 | " 0022 | # 0023 | $ 0024 | % 0025 | & 0026 | ' 0027 | ( 0028 | ) 0029 | * 002A | + 002B | , 002C | - 002D | . 002E | / 002F |
| 3_ | 0 0030 | 1 0031 | 2 0032 | 3 0033 | 4 0034 | 5 0035 | 6 0036 | 7 0037 | 8 0038 | 9 0039 | : 003A | ; 003B | < 003C | = 003D | > 003E | ? 003F |
| 4_ | Ỵ 1EF4 | A 0041 | B 0042 | C 0043 | D 0044 | E 0045 | F 0046 | G 0047 | H 0048 | I 0049 | J 004A | K 004B | L 004C | M 004D | N 004E | O 004F |
| 5_ | P 0050 | Q 0051 | R 0052 | S 0053 | T 0054 | U 0055 | V 0056 | W 0057 | X 0058 | Y 0059 | Z 005A | [ 005B | \ 005C | ] 005D | Á 00C1 | _ 005F |
| 6_ | À 00C0 | a 0061 | b 0062 | c 0063 | d 0064 | e 0065 | f 0066 | g 0067 | h 0068 | i 0069 | j 006A | k 006B | l 006C | m 006D | n 006E | o 006F |
| 7_ | p 0070 | q 0071 | r 0072 | s 0073 | t 0074 | u 0075 | v 0076 | w 0077 | x 0078 | y 0079 | z 007A | Ặ 1EB6 | Ả 1EA2 | Ã 00C3 | Ạ 1EA0 | DEL 007F |
| 8_ | Ấ 1EA4 | ẻ 1EBB | é 00E9 | â 00E2 | ẽ 1EBD | à 00E0 | ẹ 1EB9 | Ầ 1EA6 | ê 00EA | ế 1EBF | è 00E8 | ề 1EC1 | Ẩ 1EA8 | ì 00EC | ể 1EC3 | ễ 1EC5 |
| 9_ | Ẫ 1EAA | ỏ 1ECF | õ 00F5 | ô 00F4 | ọ 1ECD | ò 00F2 | ố 1ED1 | ù 00F9 | ồ 1ED3 | ổ 1ED5 | ỗ 1ED7 | ộ 1ED9 | ủ 1EE7 | ũ 0169 | ụ 1EE5 | ư 01B0 |
| A_ | á 00E1 | í 00ED | ó 00F3 | ú 00FA | ứ 1EE9 | ừ 1EEB | ử 1EED | ữ 1EEF | ự 1EF1 | ỉ 1EC9 | ĩ 0129 | ị 1ECB | ệ 1EC7 | đ 0111 | Đ 0110 | Ậ 1EAC |
| B_ | Ắ 1EAE | Ằ 1EB0 | Ẳ 1EB2 | Ẵ 1EB4 | É 00C9 | È 00C8 | Ẻ 1EBA | Ẽ 1EBC | Ẹ 1EB8 | Ế 1EBE | Ề 1EC0 | Ể 1EC2 | Ễ 1EC4 | Ệ 1EC6 | Í 00CD | Ì 00CC |
| C_ | Ỉ 1EC8 | Ĩ 0128 | Ị 1ECA | Ó 00D3 | Ò 00D2 | Ỏ 1ECE | Õ 00D5 | Ọ 1ECC | Ố 1ED0 | Ồ 1ED2 | Ổ 1ED4 | Ỗ 1ED6 | Ộ 1ED8 | Ớ 1EDA | Ờ 1EDC | Ở 1EDE |
| D_ | Ỡ 1EE0 | Ợ 1EE2 | Ú 00DA | Ù 00D9 | Ủ 1EE6 | Ũ 0168 | Ụ 1EE4 | Ứ 1EE8 | Ừ 1EEA | Ử 1EEC | Ữ 1EEE | Ự 1EF0 | Ý 00DD | Ỳ 1EF2 | Ỷ 1EF6 | Ỹ 1EF8 |
| E_ | ả 1EA3 | ã 00E3 | ạ 1EA1 | ấ 1EA5 | ầ 1EA7 | ẩ 1EA9 | ẫ 1EAB | ậ 1EAD | ă 0103 | ắ 1EAF | ằ 1EB1 | ẳ 1EB3 | ẵ 1EB5 | ặ 1EB7 | ý 00FD | ỳ 1EF3 |
| F_ | ỷ 1EF7 | ỹ 1EF9 | ỵ 1EF5 | ơ 01A1 | ớ 1EDB | ờ 1EDD | ở 1EDF | ỡ 1EE1 | ợ 1EE3 | Ô 00D4 | Ơ 01A0 | Ư 01AF | Ă 0102 | Â 00C2 | Ê 00CA | á 00E1 |
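The ISO 646-style substitution in the lower half of the DOS chart can be shown with a short lookup. This is a minimal sketch that covers only the seven positions below 0x80 where the chart above differs from ASCII; the names `DOS_ASCII_OVERRIDES` and `decode_low_half` are hypothetical, and bytes from 0x80 upward would need the remaining rows of the chart.

```python
# Positions below 0x80 where the DOS chart differs from ASCII.
DOS_ASCII_OVERRIDES = {
    0x40: "\u1ef4",  # Ỵ replaces @
    0x5E: "\u00c1",  # Á replaces ^
    0x60: "\u00c0",  # À replaces `
    0x7B: "\u1eb6",  # Ặ replaces {
    0x7C: "\u1ea2",  # Ả replaces |
    0x7D: "\u00c3",  # Ã replaces }
    0x7E: "\u1ea0",  # Ạ replaces ~
}


def decode_low_half(data: bytes) -> str:
    """Decode bytes below 0x80 using the DOS chart's ASCII substitutions."""
    out = []
    for byte in data:
        if byte >= 0x80:
            raise ValueError(f"byte 0x{byte:02X} needs the upper half of the chart")
        out.append(DOS_ASCII_OVERRIDES.get(byte, chr(byte)))
    return "".join(out)


print(decode_low_half(b"B^C"))  # BÁC
```

One consequence of this design is that text containing the replaced punctuation, such as an e-mail address with @, cannot be represented unchanged in this encoding.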
VIQR and VNI-Internet Mail
The use of Vietnamese Quoted-Readable (VIQR), a convention for writing in Vietnamese using ASCII characters, began during the Vietnam War, when typewriters were the main tool for word processing. Because the U.S. military required a way to represent Vietnamese scripts accurately on official documents, VIQR was invented for the military. Due to its longstanding use, VIQR was a natural choice for computer word processing, prior to the appearance of VNI, VPSKeys, VSCII, VISCII, and Unicode. It is still widely used for information exchange on computers, but is not desirable for design and layout, due to its cryptic appearance.
VIQR's main issue was the difficulty of reading VIQR text, especially for inexperienced computer users. VNI created and released a free font called VNI-Internet Mail, which utilized a variant of the VIQR notation and VNI's combining character technique to give VIQR text a more natural appearance by replacing certain ASCII punctuation with combining characters.
The following table compares VNI-Internet Mail to other codified VIQR or VIQR-like conventions.
Diacritical mark | RFC 1456 VIQR notation[5] | VSCII-MNEM notation[6] | VNI Internet Mail notation[4] | Example |
---|---|---|---|---|
Breve | ( | < | | | displayed as Ă |
Circumflex | ^ | > | ^ | E^ displayed as Ê |
Horn | + | * | * | U* displayed as Ư |
Acute | ' | ' | ' | O' displayed as Ó |
Grave | ` | ! | ` | O` displayed as Ò |
Hook above | ? | ? | { | O{ displayed as Ỏ |
Tilde | ~ | " | ~ | O~ displayed as Õ |
Dot below | . | . | } | O} displayed as Ọ |
Barred D | DD | DD | D_ | D_ displayed as Đ |
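As an illustration of how such a notation maps onto composed Vietnamese text, the following Python sketch converts the RFC 1456 VIQR column of the table to Unicode. It is a simplified sketch, not a full RFC 1456 implementation: the function name `viqr_to_unicode` is hypothetical, a mark character is treated as a diacritic only when it directly follows a vowel or another mark, and the backslash escaping defined by the RFC is not handled, so ordinary punctuation after a vowel (for example a sentence-final period) would be misread.

```python
import unicodedata

# RFC 1456 VIQR mark characters and the combining characters they stand for.
VIQR_MARKS = {
    "(": "\u0306",  # breve
    "^": "\u0302",  # circumflex
    "+": "\u031b",  # horn
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "?": "\u0309",  # hook above
    "~": "\u0303",  # tilde
    ".": "\u0323",  # dot below
}
VOWELS = set("aeiouyAEIOUY")


def viqr_to_unicode(text: str) -> str:
    """Convert simple VIQR text to composed (NFC) Unicode."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "DD":
            out.append("\u0110")  # Đ
            i += 2
            continue
        if pair == "dd":
            out.append("\u0111")  # đ
            i += 2
            continue
        ch = text[i]
        follows_letter = bool(out) and (
            out[-1] in VOWELS or unicodedata.combining(out[-1]) != 0
        )
        if ch in VIQR_MARKS and follows_letter:
            out.append(VIQR_MARKS[ch])  # attach the mark to the previous vowel
        else:
            out.append(ch)
        i += 1
    return unicodedata.normalize("NFC", "".join(out))


print(viqr_to_unicode("Vie^.t Nam"))   # Việt Nam
print(viqr_to_unicode("tru+o+`ng"))    # trường
```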
References
1. "Hãy thử gõ tiếng Việt với bộ gõ Telex và Number-key based mới nào!". 2018-10-25.
2. Ngo, Hoc Dinh; Tran, TuBinh. "5. Why Having Vietnamese Charset (Character Set – Encoding) Conversion?". Some special functions of WinVNKey.
3. "Unicode & Vietnamese Legacy Character Encodings". Vietnamese Unicode FAQs.
4. "VNI Character Sets". Vietnamese Unicode FAQs.
5. Vietnamese Standardization Working Group. "RFC 1456: Conventions for Encoding the Vietnamese Language". IETF.
6. Lunde, Ken (2009). CJKV Information Processing (2nd ed.). O'Reilly Media. pp. 47–49. ISBN 978-0-596-51447-1.