<?xml version="1.0" encoding="utf-8"?><!-- generator="wordpress/1.5.1.3" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Unicode in Java: not so fast (but XML is better)!</title>
	<link>http://www.orbeon.com/blog/2006/06/10/unicode-in-java-not-so-fast/</link>
	<description>XForms Everywhere</description>
	<pubDate>Fri, 29 Aug 2008 23:45:46 +0000</pubDate>
	<generator>http://wordpress.org/?v=1.5.1.3</generator>

	<item>
		<title>by: Erik Bruchez</title>
		<link>http://www.orbeon.com/blog/2006/06/10/unicode-in-java-not-so-fast/#comment-537</link>
		<pubDate>Mon, 12 Jun 2006 23:07:00 +0000</pubDate>
		<guid>http://www.orbeon.com/blog/2006/06/10/unicode-in-java-not-so-fast/#comment-537</guid>
					<description>Alex,

I have updated the post a little bit to clarify some things, but here are more details.

A &quot;surrogate code point&quot; is a &quot;Unicode code point in the range U+D800 through U+DFFF.&quot; Those are shown in gray in the image, and they are reserved for UTF-16 encoding.

A surrogate pair is a sequence of two 16-bit code units, a high surrogate (U+D800 through U+DBFF) followed by a low surrogate (U+DC00 through U+DFFF), that together identify a single character outside the BMP.

There is no such thing as a &quot;surrogate count&quot;: the sentence just means that a &quot;surrogate [pair]&quot; should be counted as a single character in XPath.

Finally, a plane is &quot;a range of 65,536 [...] contiguous Unicode code points&quot;. There are 17 planes in Unicode. Plane 0 is the Basic Multilingual Plane (BMP).
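
To make these definitions concrete, here is a minimal Java sketch (not part of the original post). It assumes Java 5 or later; the class name SurrogateDemo and its helper methods are hypothetical names chosen only for illustration. The example character is U+1D11E (MUSICAL SYMBOL G CLEF), which lies in plane 1 and therefore needs a surrogate pair in UTF-16.

```java
// Illustrative sketch only: SurrogateDemo and its helpers are hypothetical names.
final class SurrogateDemo {

    // Number of 16-bit UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts as one.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Plane index (0 to 16); each plane holds 65,536 code points.
    static int plane(int codePoint) {
        return codePoint >>> 16;
    }

    public static void main(String[] args) {
        // U+1D11E lies outside the BMP, so UTF-16 needs a surrogate pair for it.
        String clef = new String(Character.toChars(0x1D11E));
        char high = clef.charAt(0);
        char low = clef.charAt(1);

        System.out.println(codeUnits(clef));                       // 2
        System.out.println(codePoints(clef));                      // 1
        System.out.println(Integer.toHexString(high));             // d834
        System.out.println(Integer.toHexString(low));              // dd1e
        System.out.println(Character.isSurrogatePair(high, low));  // true
        System.out.println(plane(clef.codePointAt(0)));            // 1
    }
}
```

The first two values differ because String.length() counts code units, whereas counting the way XPath does treats the surrogate pair as a single character.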

The Unicode glossary (http://www.unicode.org/glossary/) is quite useful for answering these questions.

-Erik</description>
		<content:encoded><![CDATA[	<p>Alex,</p>
	<p>I have updated the post a little bit to clarify some things, but here are more details.</p>
	<p>A &#8220;surrogate code point&#8221; is a &#8220;Unicode code point in the range U+D800 through U+DFFF.&#8221; Those are shown in gray in the image, and they are reserved for UTF-16 encoding.</p>
	<p>A surrogate pair is a sequence of two 16-bit code units, a high surrogate (U+D800 through U+DBFF) followed by a low surrogate (U+DC00 through U+DFFF), that together identify a single character outside the BMP.</p>
	<p>There is no such thing as a &#8220;surrogate count&#8221;: the sentence just means that a &#8220;surrogate [pair]&#8221; should be counted as a single character in XPath.</p>
	<p>Finally, a plane is &#8220;a range of 65,536 [&#8230;] contiguous Unicode code points&#8221;. There are 17 planes in Unicode. Plane 0 is the Basic Multilingual Plane (BMP).</p>
	<p>The Unicode glossary (http://www.unicode.org/glossary/) is quite useful for answering these questions.</p>
	<p>-Erik
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Alessandro Vernet</title>
		<link>http://www.orbeon.com/blog/2006/06/10/unicode-in-java-not-so-fast/#comment-536</link>
		<pubDate>Mon, 12 Jun 2006 20:22:40 +0000</pubDate>
		<guid>http://www.orbeon.com/blog/2006/06/10/unicode-in-java-not-so-fast/#comment-536</guid>
					<description>And Erik, what are surrogate code points, surrogate pairs, and surrogate counts?

Alex</description>
		<content:encoded><![CDATA[	<p>And Erik, what are surrogate code points, surrogate pairs, and surrogate counts?</p>
	<p>Alex
</p>
]]></content:encoded>
				</item>
</channel>
</rss>
