openanolis / dragonwell8_jdk
Commit 3fcbfb5f
Authored on Mar 09, 2008
Author: martin
4499288: (cs spec) Charset terminology problems
Reviewed-by: mr, iris
Parent: 8a716bdc
Showing 1 changed file with 32 additions and 21 deletions
src/share/classes/java/nio/charset/Charset.java
@@ -212,36 +212,47 @@ import sun.security.action.GetPropertyAction;
  *
  * <h4>Terminology</h4>
  *
- * <p> The name of this class is taken from the terms used in <a
- * href="http://www.ietf.org/rfc/rfc2278.txt""><i>RFC 2278</i></a>. In that
- * document a <i>charset</i> is defined as the combination of a coded character
- * set and a character-encoding scheme.
+ * <p> The name of this class is taken from the terms used in
+ * <a href="http://www.ietf.org/rfc/rfc2278.txt"><i>RFC 2278</i></a>.
+ * In that document a <i>charset</i> is defined as the combination of
+ * one or more coded character sets and a character-encoding scheme.
+ * (This definition is confusing; some other software systems define
+ * <i>charset</i> as a synonym for <i>coded character set</i>.)
  *
  * <p> A <i>coded character set</i> is a mapping between a set of abstract
  * characters and a set of integers. US-ASCII, ISO 8859-1,
- * JIS X 0201, and full Unicode, which is the same as
- * ISO 10646-1, are examples of coded character sets.
- *
- * <p> A <i>character-encoding scheme</i> is a mapping between a coded
- * character set and a set of octet (eight-bit byte) sequences. UTF-8, UCS-2,
- * UTF-16, ISO 2022, and EUC are examples of character-encoding schemes.
- * Encoding schemes are often associated with a particular coded character set;
- * UTF-8, for example, is used only to encode Unicode. Some schemes, however,
- * are associated with multiple character sets; EUC, for example, can be used
- * to encode characters in a variety of Asian character sets.
+ * JIS X 0201, and Unicode are examples of coded character sets.
+ *
+ * <p> Some standards have defined a <i>character set</i> to be simply a
+ * set of abstract characters without an associated assigned numbering.
+ * An alphabet is an example of such a character set. However, the subtle
+ * distinction between <i>character set</i> and <i>coded character set</i>
+ * is rarely used in practice; the former has become a short form for the
+ * latter, including in the Java API specification.
+ *
+ * <p> A <i>character-encoding scheme</i> is a mapping between one or more
+ * coded character sets and a set of octet (eight-bit byte) sequences.
+ * UTF-8, UTF-16, ISO 2022, and EUC are examples of
+ * character-encoding schemes. Encoding schemes are often associated with
+ * a particular coded character set; UTF-8, for example, is used only to
+ * encode Unicode. Some schemes, however, are associated with multiple
+ * coded character sets; EUC, for example, can be used to encode
+ * characters in a variety of Asian coded character sets.
  *
  * <p> When a coded character set is used exclusively with a single
- * character-encoding scheme then the corresponding charset is usually named
- * for the character set; otherwise a charset is usually named for the encoding
- * scheme and, possibly, the locale of the character sets that it supports.
- * Hence <tt>US-ASCII</tt> is the name of the charset for US-ASCII while
+ * character-encoding scheme then the corresponding charset is usually
+ * named for the coded character set; otherwise a charset is usually named
+ * for the encoding scheme and, possibly, the locale of the coded
+ * character sets that it supports. Hence <tt>US-ASCII</tt> is both the
+ * name of a coded character set and of the charset that encodes it, while
  * <tt>EUC-JP</tt> is the name of the charset that encodes the
  * JIS X 0201, JIS X 0208, and JIS X 0212
- * character sets.
+ * coded character sets for the Japanese language.
  *
  * <p> The native character encoding of the Java programming language is
- * UTF-16. A charset in the Java platform therefore defines a mapping between
- * sequences of sixteen-bit UTF-16 code units and sequences of bytes. </p>
+ * UTF-16. A charset in the Java platform therefore defines a mapping
+ * between sequences of sixteen-bit UTF-16 code units (that is, sequences
+ * of chars) and sequences of bytes. </p>
  *
  *
  * @author Mark Reinhold
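
The revised terminology can be illustrated with a small sketch. It is not part of the commit. The class name CharsetTerminologyDemo and the hex helper are hypothetical, and it assumes java.nio.charset.StandardCharsets is available (Java 7 or later). The sketch shows that the coded character set (Unicode) assigns one integer to a character, that different character-encoding schemes map that same integer to different octet sequences, and that a Java String is a sequence of sixteen-bit UTF-16 code units.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical demo class; not part of the JDK or of this commit.
final class CharsetTerminologyDemo {

    // Render bytes as space-separated uppercase hex, e.g. "C3 A9".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format(Locale.ROOT, "%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LATIN SMALL LETTER E WITH ACUTE, a single abstract character.
        String e = "\u00E9";

        // Coded character set: Unicode maps the character to the integer 0xE9.
        System.out.println(Integer.toHexString(e.codePointAt(0)));      // e9

        // Character-encoding schemes: the same integer becomes different octets.
        System.out.println(hex(e.getBytes(StandardCharsets.UTF_8)));      // C3 A9
        System.out.println(hex(e.getBytes(StandardCharsets.UTF_16BE)));   // 00 E9
        System.out.println(hex(e.getBytes(StandardCharsets.ISO_8859_1))); // E9

        // A supplementary character (U+1D11E) needs two UTF-16 code units,
        // so the String holds two chars for one code point.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(clef.length());                                 // 2
        System.out.println(clef.codePointCount(0, clef.length()));         // 1
        System.out.println(hex(clef.getBytes(StandardCharsets.UTF_8)));    // F0 9D 84 9E
    }
}
```

In each getBytes call the charset does exactly what the last paragraph of the Javadoc describes: it maps a sequence of chars to a sequence of bytes.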