XForms Everywhere

6/23/2006

Automatically adding namespace declarations with IntelliJ

Filed under: General — Alessandro Vernet @ 3:23 pm

People often ask us which XML editor we use. When I say I use IntelliJ from JetBrains, the next question is inevitably “uh? I thought IntelliJ was a Java IDE”.

Yes, IntelliJ is primarily a Java IDE, but it also has excellent support for JavaScript, HTML, and XML. The XML editor is schema-aware and supports compound documents where you mix elements from multiple schemas, like XHTML with XForms. When you provide IntelliJ with the schema of the document you are editing, code completion and validation are schema-aware, as you would expect. Say you nest a paragraph inside another paragraph in an XHTML document, which is not valid in (X)HTML: the error is highlighted right away.

Showing how IntelliJ highlights invalid XML based on the schema

For those of us who write Java code, IntelliJ has, for as long as I can remember, provided an auto-import feature: say you use the class ArrayList without importing java.util.ArrayList. IntelliJ then offers to add an import of java.util.ArrayList for you, and you can accept the suggestion by just pressing Alt-Enter.

IntelliJ auto-import in Java

Now IntelliJ has the same feature for namespace declarations in XML. Say you start writing <xforms:model> but you don’t have any namespace declaration for the xforms prefix. After you press Alt-Enter, IntelliJ will ask if you want to create a namespace declaration. If you accept, it will add xmlns:xforms="http://www.w3.org/2002/xforms" for you where it is needed.

IntelliJ adding a namespace declaration for XForms
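To see why the missing declaration matters outside the editor, here is a minimal Java sketch using the JDK's namespace-aware DOM parser. It has nothing to do with how IntelliJ works internally; the class name NamespaceCheck and the method rootNamespace are made up for illustration. Without the declaration the document is rejected, and with it the root element resolves to the XForms namespace.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
class NamespaceCheck {

    // Returns the namespace URI of the root element,
    // or null if the parser rejects the document (e.g. unbound prefix).
    static String rootNamespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // prefixes must be declared
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNamespaceURI();
        } catch (SAXException e) {
            return null; // not namespace-well-formed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // No declaration for the xforms prefix: rejected
        System.out.println(rootNamespace("<xforms:model/>"));
        // With the declaration IntelliJ would add
        System.out.println(rootNamespace(
            "<xforms:model xmlns:xforms=\"http://www.w3.org/2002/xforms\"/>"));
    }
}
```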

IntelliJ seems to be looking at all your schemas, and picking the namespace of those that define an element with a matching name. When multiple namespaces could apply, IntelliJ will give you a choice. For instance, in the example below I have a config element, which could either be the start of an XPL pipeline or a page flow controller document. Hence IntelliJ gives me the choice between the two:

IntelliJ adding a namespace declaration for XForms

Want to know more about IntelliJ? Download it, check out their blog, and if you are doing open source development you can even get a free license.

6/16/2006

XForms Tip: Differentiating Between Submit Errors

Filed under: General — Erik Bruchez @ 7:43 am



With XForms 1.0, it is not possible to differentiate between a submission error due to an invalid form (or with required-but-empty values) and a submission error due to, for example, a network issue, an HTTP 404 error response, etc.

XForms 1.1 is adding many submission-related improvements that will for example allow an XForms page to directly speak the Atom Publishing Protocol, a fully REST-based API.

We will talk more about this support in the future, but already a new event, xforms-submit-serialize, allows you to determine at a high level why a submission has failed. This event is fired just after the instance has passed validation and just before the instance data is serialized.

If you set a control flag upon receiving this event, you can determine, upon receiving an xforms-submit-error event, whether the instance has already passed validation or not. If it has not, the error is a validation error. If it has, then the error has occurred during serialization, actual submission, or as a result of processing the response. Here is how you set the status flag in an instance called submission-status:

<xforms:instance id="submission-status">
    <submission-status xmlns=""/>
</xforms:instance>
<xforms:submission ...>
    <xforms:setvalue ev:event="xforms-submit-serialize"
                     ref="instance('submission-status')">validated</xforms:setvalue>
</xforms:submission>

Now when a submission error occurs, you want to be able to display an appropriate message, here stored in an instance called message. You can do this using the XForms 1.1 conditional action feature by providing the if attribute on the xforms:setvalue action:

<xforms:action ev:event="xforms-submit-error">
    <xforms:setvalue if="not(instance('submission-status') = 'validated')"
      ref="instance('message')">Please check for errors in your form.</xforms:setvalue>
    <xforms:setvalue if="instance('submission-status') = 'validated'"
      ref="instance('message')">Please contact your system administrator.</xforms:setvalue>
    <xforms:setvalue ref="instance('submission-status')"/>
</xforms:action>

Note that XForms 1.1 is still a working draft and its content may change until it is an actual recommendation.

6/14/2006

How Browsers Have Been Saving Us from Incorrect Encodings

Filed under: General — Alessandro Vernet @ 6:35 pm

Windows 1252 code page

Erik had an entry just a few days ago about Unicode, and we’ll now be looking at two encodings: ISO-8859-1 and windows-1252. It is not that character encoding is the most exciting thing around, but it is one we need to get right.

ISO-8859-1, also called Latin 1, is the default encoding on most UNIX operating systems. The default encoding on Windows is not ISO-8859-1, but Windows-1252. The Windows encoding is a superset of ISO-8859-1 and only differs by using printable characters instead of control characters in the 0x80 to 0x9F range. Some relatively common characters, like the euro sign (€) and trade mark sign (™), are mapped to characters in this range (shown in yellow in the image here).

It is very common for people to mistakenly specify that the encoding for a document is ISO-8859-1, while in fact the encoding is windows-1252. So what happens with all those documents incorrectly marked as ISO-8859-1? Are they rendered incorrectly? Well, no, in most cases they are rendered correctly, as if the windows-1252 encoding had been specified. The reason is that the control characters of ISO-8859-1 that map to printable characters in windows-1252 are not valid in HTML. So when those appear in an ISO-8859-1 document, the browser could either decide to consider the whole document invalid, or show the corresponding printable character from windows-1252. Most browsers go with the second option.
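The difference between the two encodings is easy to observe from Java. The following is only a small sketch (the class name EncodingDemo and the method decode are made up for illustration): it decodes the same two bytes, 0x80 and 0x99, once as windows-1252 and once as ISO-8859-1. The first decoding yields the euro and trade mark signs, the second yields two control characters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class EncodingDemo {

    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] bytes = { (byte) 0x80, (byte) 0x99 };

        String asWindows = decode(bytes, Charset.forName("windows-1252"));
        String asLatin1 = decode(bytes, StandardCharsets.ISO_8859_1);

        // Print code points rather than glyphs, so the output
        // does not depend on the console encoding.
        for (char c : asWindows.toCharArray()) {
            System.out.printf("windows-1252: U+%04X%n", (int) c); // U+20AC, U+2122
        }
        for (char c : asLatin1.toCharArray()) {
            System.out.printf("ISO-8859-1:   U+%04X%n", (int) c); // U+0080, U+0099
        }
    }
}
```

A browser that treats a page labeled ISO-8859-1 as windows-1252 is in effect choosing the first decoding over the second.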

It looks like in quite a few cases our browsers have been saving us, maybe without us even knowing about it.

6/13/2006

FireBug: A Must-Have Firefox Extension for Web Developers

Filed under: General — Alessandro Vernet @ 2:17 pm

FireBug Debugger

It looks like software developers in particular love to hate software, any software. Well, almost any: FireBug seems to be loved by everyone.

FireBug is an extension to Firefox that, among other things, lets you explore the DOM of a web page, monitor Ajax requests, evaluate JavaScript expressions, and see all kinds of errors and warnings related to the page. And people have been describing FireBug as “an absolute godsend”, the “greatest web developer extension out there”, an “awesome”, “phenomenal”, and “absolutely, completely brilliant” extension.

The latest version, released in May 2006, even includes a JavaScript debugger. It does most of what the Venkman debugger does, but without the bloat. One note about the debugger: if the source page stays desperately empty for you, it may be because of a conflict with another extension. In particular, I have noticed that the A9 Toolbar is causing this problem in my environment. If you are running the A9 Toolbar and have this problem, solving it might be as simple as disabling the A9 Toolbar.

XInclude at last gets rid of xml:base

Filed under: General — Erik Bruchez @ 5:15 am

I had heard the rumor already, but by checking the XInclude errata it turns out this was right: XInclude now officially allows implementations to provide an option to disable xml:base fixup, in other words not to produce or update xml:base attributes in the result.

This was a long-awaited feature. Adding xml:base attributes can cause issues such as broken schema validation (you may have to modify your schema to handle those extra xml:base attributes, or pre-process your document before validating it) or undesired URL resolution in certain cases.

For example, with the OPS XForms engine, let’s assume you want to externalize an XForms model, and include it into your XHTML page as follows:

<xhtml:head>
    <xi:include href="oxf:/my-app/my-xforms-model.xml"/>
</xhtml:head>

The result of the inclusion, by default, looks like this:

<xhtml:head>
    <xforms:model xml:base="oxf:/my-app/my-xforms-model.xml">
        ...
        <xforms:submission action="services/save" method="post"/>
    </xforms:model>
</xhtml:head>

Now what you intended was that the relative “services/save” URL would be relative to the base URL of your XHTML page. If you loaded your page from “http://www.example.org/my-app/home”, you would submit to “http://www.example.org/my-app/services/save”. But because xml:base is inserted, you actually save to “oxf:/my-app/services/save”, which does not make any sense. And this is messed up just because you decided to externalize your XForms model for convenience.

So the OPS XInclude processor now includes an extension attribute, xxi:omit-xml-base, which you can place on xi:include elements to indicate that you don’t want the inclusion to produce or update xml:base attributes. The example above becomes:

<xhtml:head>
    <xi:include href="oxf:/my-app/my-xforms-model.xml" xxi:omit-xml-base="true"/>
</xhtml:head>

This way, all is well and your URL resolution works as expected!
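The two resolution outcomes described above can be reproduced with plain java.net.URI. This is just a sketch of generic relative URI resolution, not of the OPS implementation; the class name BaseUriDemo and the method resolve are made up for illustration.

```java
import java.net.URI;

// Hypothetical demo class, for illustration only.
class BaseUriDemo {

    // Resolves a relative reference against a base URI.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String action = "services/save";

        // Base is the URL of the XHTML page: the intended behavior
        System.out.println(resolve("http://www.example.org/my-app/home", action));
        // prints http://www.example.org/my-app/services/save

        // Base is the xml:base inserted by the inclusion
        System.out.println(resolve("oxf:/my-app/my-xforms-model.xml", action));
        // prints oxf:/my-app/services/save
    }
}
```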

6/10/2006

Unicode in Java: not so fast (but XML is better)!

Filed under: General — Erik Bruchez @ 9:06 pm

Unicode BMP Mapping

If I asked you whether Java supports Unicode, you would likely say yes, and you would be right. But did you know that this is not the end of the story? Did you know for example that a Java char (or its wrapper class Character) does not actually represent a Unicode character? Did you know that String.length() may not, in fact, always return the correct number of characters present in a string, and that String.charAt(12) does not always return the character actually at position 12? If so, you probably know more than I do and you can stop reading here!

The reason is quite simple: Unicode is able to address over one million characters (1,114,112 code points to be exact, that is 17 planes of 65,536 code points each), and clearly this doesn’t fit in the 16 bits that a char represents. You need 21 bits to represent that number of characters.

In reality, a Java char represents a Basic Multilingual Plane (BMP) code point (see the image of the BMP, where each square represents 256 characters, for a total of 65,536 characters, each identified by a number or “code point”), including surrogate code points (the special code points in gray that indicate that another 16-bit value will follow), and a Java String represents a string in UTF-16 format. This is usually all right, because most modern languages are fully represented using the BMP, and each code point in the BMP fits in 16 bits. In other words, in general one Unicode character fits in a Java char and all is well.

However, there are several planes of supplementary characters that do not fit in the BMP. Such Unicode characters are represented in UTF-16 (and thus in Java strings) with two 16-bit char values, called a surrogate pair.
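To make the surrogate pair mechanism concrete, here is a small sketch of the UTF-16 arithmetic for a supplementary code point. The class name SurrogatePairDemo and the method toSurrogates are made up for illustration; in real code you would simply call Character.toChars(), which the example uses as a cross-check. U+1F600 is an arbitrary supplementary code point picked as the example.

```java
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
class SurrogatePairDemo {

    // Splits a supplementary code point (U+10000..U+10FFFF)
    // into its high and low UTF-16 surrogates.
    static char[] toSurrogates(int codePoint) {
        int offset = codePoint - 0x10000;               // 20 significant bits
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        int codePoint = 0x1F600; // outside the BMP
        char[] pair = toSurrogates(codePoint);
        System.out.printf("U+%X -> %04X %04X%n",
            codePoint, (int) pair[0], (int) pair[1]); // U+1F600 -> D83D DE00

        // Same result as the JDK's own conversion
        System.out.println(Arrays.equals(pair, Character.toChars(codePoint))); // true
    }
}
```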

Amazingly enough, the standard Java library did not handle Unicode code points correctly at all until JDK 1.5 (AKA Java 5), released in late 2004. In that version, the Character and String classes have been augmented with some helper methods to handle Unicode code points using int values. However, for backward compatibility reasons, the semantics of methods such as String.length() and String.charAt() have not been modified and they are still “wrong”.

If you really want to handle Unicode, including supplementary character planes, in your Java application, you have to be really careful and aware of the existence of surrogate pairs. While there are historical (Unicode initially addressed 16 bits only) and convenience (most characters fit in 16 bits, so why use more memory by default?) reasons for this situation, it is a shame that the char type and Character class do not, actually, always represent characters, and that the String class’s methods do not do what you think they do.
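Here is a short sketch contrasting the char-based methods with the code point methods added in Java 5. The class name CodePointDemo and its two helper methods are made up for illustration; the String methods themselves (codePointCount, codePointAt, offsetByCodePoints) are the standard ones. The sample string contains three characters, the middle one being the supplementary code point U+1F600 written as a surrogate pair.

```java
// Hypothetical demo class, for illustration only.
class CodePointDemo {

    // Number of Unicode characters (code points), not of 16-bit chars.
    static int characterCount(String s) {
        return s.codePointCount(0, s.length());
    }

    // Code point at a character position (0-based), not at a char index.
    static int characterAt(String s, int position) {
        return s.codePointAt(s.offsetByCodePoints(0, position));
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b"; // 'a', U+1F600, 'b'

        System.out.println(s.length());          // 4: counts 16-bit chars
        System.out.println(characterCount(s));   // 3: counts code points

        // charAt(1) returns half of the character: a high surrogate
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // true
        System.out.printf("U+%X%n", characterAt(s, 1));             // U+1F600

        // charAt(2) is the low surrogate, not 'b'
        System.out.println((char) characterAt(s, 2));               // b
    }
}
```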

The good news is that when working directly with XML technologies such as XPath and XSLT, you are shielded from such issues. For example, the XPath 1.0 recommendation explicitly comments: “In many programming languages, a string is represented by a sequence of 16-bit Unicode code values; implementations of XPath in such languages must take care to ensure that a surrogate pair is correctly treated as a single XPath character.” XQuery 1.0 and XPath 2.0 Functions and Operators comments as well: “A surrogate [meaning a surrogate pair, or two 16-bit values in UTF-16] counts as one character, not two.”

6/2/2006

About JSON and poor marketing strategies

Filed under: General — Erik Bruchez @ 4:34 am

OPS 3.x Logo

I am writing up a bit about JSON for our upcoming Web 2.0 book. I had heard of JSON for quite a while, but I have now taken the time to read the short spec at http://www.json.org/.

In fact, I quite like the idea of JSON. It appears quite elegant, simple, concise, readable, easy to parse, and as an alternative to XML for Ajax, I am ready to seriously consider it in some situations.

But I have to say that the way JSON is compared to XML on the JSON web site leaves a lot to be desired. You understand the gut feeling of the author (Is it Doug Crockford, the inventor of JSON? The page doesn’t say, and that’s probably better for its author.) who doesn’t appear to like XML as a data exchange format, but he proceeds to a rationalization of that feeling which leaves me with my mouth gaping. For example, that page claims:

  • JSON is much easier for human to read than XML. Well, that’s certainly a bold claim, with which I don’t agree. You have brackets in XML, you have brackets in JSON (and two types of those, one type for objects and one type for arrays). At best, my opinion is that a nicely formatted JSON document is about as readable (or unreadable) as a nicely formatted XML document, and I am comforted by these examples comparing JSON and XML. XML is going to be more verbose (elements’ end-tags), but that’s not an argument against readability, on the contrary. Claiming better readability for JSON is little more than wishful thinking.

  • JSON is not extensible because it does not need to be. Here there is a misconception about the extensibility of XML. XML is extensible because it lets you create your own element and attribute names and document structure. The same goes for JSON, since you can name your objects arbitrarily and create your own structure as well. The edge that XML would have over JSON here is that it supports namespaces, which JSON doesn’t support. Now, whether JSON really doesn’t need namespaces, I don’t know. Again, that’s a bold claim. Remember, XML started without namespaces too!

  • JSON is a better data exchange format. Says who? The inventor of JSON. Based on what exactly? I am not sure, and that’s just another bold claim.

  • JSON does not provide any display capabilities because it is not a document markup language. Who said that XML provides “display capabilities”? I don’t know of any.

  • JSON does not have a <[CDATA[]]> feature, so it is not well suited to act as a carrier of sounds or images or other large binary payloads. Here you have to be puzzled. CDATA sections in XML have nothing to do with binary data. In XML like in JSON, binary data has to be encoded into text, using Base64 encoding, for example. You also learn from the JSON web site that XML can contain Java applets and ActiveX! I believe we are officially in a dream (or nightmare) now.
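On the binary data point, the following sketch shows what “encoded into text” means in practice. It uses the JDK's java.util.Base64 class; the class name BinaryAsText and the methods encode and decode are made up for illustration. The resulting string can be carried equally well as a JSON string value or as XML element content, and no CDATA section is involved.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class, for illustration only.
class BinaryAsText {

    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // First four bytes of a PNG file, as a stand-in for a binary payload
        byte[] payload = { (byte) 0x89, 'P', 'N', 'G' };

        String text = encode(payload);
        System.out.println(text); // iVBORw==

        // The text round-trips to the original bytes
        System.out.println(Arrays.equals(payload, decode(text))); // true
    }
}
```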

There is more, but I will stop for now. There is of course little chance that such a load of junk will make people think less of JSON. But certainly, I think less of the author of this comparison page because he doesn’t know anything about XML and talks about it anyway. My advice to him would be to simply take that page down, or at least remove the downright ridiculous points. As a marketing strategy, you don’t have to flame down XML to promote JSON, because JSON is in fact a pretty good alternative to XML in certain circumstances. Let JSON live off its actual merits, not off misconceptions about XML.

6/1/2006

What’s next for OPS?

Filed under: General — Erik Bruchez @ 7:16 am

OPS 3.x Logo

Back in January, we released OPS 3.0, with a 3.0.1 update in February. But since then there hasn’t been any new release, and we are regularly asked about what’s next for OPS. We thought that a little update about our plans would be welcome.

For one thing, tons of bugs have been fixed in the XForms NG engine. The current “unstable” builds are quite an improvement over 3.0.1, and we routinely recommend in ops-users that people use those builds. We have also added much-needed and often requested features, ranging from improvements to xforms:submission to XForms 1.1 features to improved XForms controls, including an autocomplete field and HTML support in textarea.

To complete this picture, we have improved caching in the XForms engine, with promises for enhanced performance, and we are working on support for the Liferay portal (with the fixes we are implementing for Liferay likely to work with other JSR-168 portals as well). Finally, we have updated some key external components such as Saxon and eXist.

We claim we like the “release early, release often” philosophy, so what are we waiting for before we release 3.1? Here are a few of the tasks we have in the pipeline:

  • Fix a couple more high-priority bugs and performance issues.

  • Greatly improve the performance and architecture of the examples portal. The examples portal has been unnecessarily slow for a very long time, and we want to fix that.

  • A few baby steps towards making it easier for developers to get started with OPS, including providing a basic, clean WAR file with just the necessary resources to get started with XForms and a nice README file. It is likely that the full-fledged tutorial will have to wait a little bit more though.

  • Some clean-up of the OPS distribution and examples.

  • And of course, make sure all the new features are documented.

We have been busy lately and there is no guarantee that all this will make it into 3.1, but that’s the current plan for the next few weeks. When that’s done, we can start working on the great plans we have for OPS post 3.1! Rest assured that XForms and a lower barrier to entry for developers will be important parts of those plans.

Finally, we are of course always open to the great feedback we receive from the user community, so please let us know how we can make OPS even better in the future.
