Friday, April 21, 2006

StAX: The new Streaming API for XML

I was reading up on JAXB, which was pretty long-winded, when I came across "StAX", the "Streaming API for XML".

Its design differs from both SAX and DOM, but it is closer to SAX.

It is a parser, so it can be used to read XML documents, but like DOM it can also be used to write XML.

It is a streaming parser like SAX: it works through a document from start to finish and cannot move backwards from a given point in an XML file.

SAX uses a push style of streaming, i.e. the parser drives the reading of the XML file from start to finish and pushes events to callback methods (a handler) that the user supplies.
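For contrast with what follows, here is a minimal sketch of the push style using the standard SAX classes that ship with the JDK. The class name SaxPushSketch and the helper elementNames are made up for illustration. Note that once parse() is called, the parser is in charge and the application only gets called back.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not taken from any tutorial.
class SaxPushSketch {

    // Collects element names in document order via SAX callbacks.
    static List<String> elementNames(String xml) throws Exception {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // The parser calls us; we cannot ask it to pause or skip ahead.
                names.add(qName);
            }
        };
        // parse() returns only after the whole document has been pushed through.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<order><item/><item/></order>"));
    }
}
```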

StAX uses a pull style of streaming, i.e. it works like an iterator: just as an iterator checks hasNext() and then uses next() to get the next object, a StAX reader checks hasNext() and then uses next() to get the next parsing event (start of an element, character data, end of an element, and so on).
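The hasNext()/next() loop can be sketched roughly like this with the javax.xml.stream classes. The class name StaxPullSketch and the helper elementNames are my own names for illustration, and the input is just a string to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class showing the pull loop.
class StaxPullSketch {

    // Collects element names in document order by pulling events.
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            // The application asks for each event; the parser never calls back.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<order><item/><item/></order>"));
    }
}
```

Because the loop belongs to the application, it can stop early, skip sections, or keep its own state in ordinary local variables.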

Why do we need a new style of parser?

I guess the main reason is performance.

1) Threads
With SAX-style parsers, control of the application thread lies with the SAX parser, so one application thread can read only one document at a time.

With StAX-style parsers, control of the application thread lies with the application code that calls the parser, so one application thread can read multiple documents at a time.
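As a rough sketch of the threading point, a single thread can hold two readers and alternate between them, pulling one event from each in turn. The class name InterleaveSketch, the method interleave and the "A:"/"B:" labels are assumptions made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo: one thread, two documents, events pulled alternately.
class InterleaveSketch {

    static List<String> interleave(String xmlA, String xmlB) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader a = factory.createXMLStreamReader(new StringReader(xmlA));
        XMLStreamReader b = factory.createXMLStreamReader(new StringReader(xmlB));
        List<String> seen = new ArrayList<>();
        try {
            while (a.hasNext() || b.hasNext()) {
                // Pull at most one event from each document per round.
                if (a.hasNext() && a.next() == XMLStreamConstants.START_ELEMENT) {
                    seen.add("A:" + a.getLocalName());
                }
                if (b.hasNext() && b.next() == XMLStreamConstants.START_ELEMENT) {
                    seen.add("B:" + b.getLocalName());
                }
            }
        } finally {
            a.close();
            b.close();
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(interleave("<a><x/></a>", "<b><y/></b>"));
    }
}
```

With a push parser the same thing would need two threads, since each parse() call holds its thread until the document is finished.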

2) Smaller libraries
The Sun tutorial claims that pull parsing libraries can be much smaller than push libraries.

3) Filters
StAX supports building filters, so XML content that is not needed can be ignored before it reaches the application code.
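A filter might look something like the sketch below. It uses the event-based variant (EventFilter with XMLInputFactory.createFilteredReader); there is also a StreamFilter for the cursor-style reader. The class name StaxFilterSketch, the helper itemIds, and the "item"/"id" names in the sample XML are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo: only <item> start tags get past the filter.
class StaxFilterSketch {

    static List<String> itemIds(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLEventReader raw = factory.createXMLEventReader(new StringReader(xml));

        // Accept start-element events named "item"; everything else is dropped.
        EventFilter onlyItems = new EventFilter() {
            @Override
            public boolean accept(XMLEvent event) {
                return event.isStartElement()
                        && "item".equals(event.asStartElement().getName().getLocalPart());
            }
        };

        XMLEventReader filtered = factory.createFilteredReader(raw, onlyItems);
        List<String> ids = new ArrayList<>();
        while (filtered.hasNext()) {
            StartElement item = filtered.nextEvent().asStartElement();
            Attribute id = item.getAttributeByName(new QName("id"));
            ids.add(id == null ? "" : id.getValue());
        }
        filtered.close();
        return ids;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(itemIds("<order><item id='1'/><note/><item id='2'/></order>"));
    }
}
```

The reading loop never sees the note element or any end tags, so it needs no checks of its own.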

Here is a quote from the Sun tutorial:

"Why StAX?

The StAX project was spearheaded by BEA with support from Sun Microsystems, and the JSR 173 specification passed the Java Community Process final approval ballot in March, 2004 (http://jcp.org/en/jsr/detail?id=173). The primary goal of the StAX API is to give “parsing control to the programmer by exposing a simple iterator based API. This allows the programmer to ask for the next event (pull the event) and allows state to be stored in procedural fashion.”

StAX was created to address limitations in the two most prevalent parsing APIs, SAX and DOM."

Note that BEA and Sun are both involved in this, so I guess StAX parsers are here to stay for the future of Java XML in general and Java web services in particular.

There are two kinds of StAX API:

Cursor API
This is a lightweight and fast API, suited to environments like J2ME and to projects where performance is essential.

It is exposed through the XMLStreamReader and XMLStreamWriter interfaces.
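Since the cursor API also covers writing, here is a small sketch of XMLStreamWriter producing a document into a string. The class name StaxWriteSketch, the method writeOrder and the element names are invented for the example. The exact form of the XML declaration can vary between StAX implementations, so I would not rely on it.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo: writing a tiny document with the cursor API.
class StaxWriteSketch {

    static String writeOrder() throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartDocument();
        writer.writeStartElement("order");
        writer.writeStartElement("item");
        writer.writeAttribute("id", "1");   // attributes must follow the start tag directly
        writer.writeCharacters("book");
        writer.writeEndElement();           // closes <item>
        writer.writeEndElement();           // closes <order>
        writer.writeEndDocument();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(writeOrder());
    }
}
```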

Iterator API
In this API, the events generated can be used by the application even after the parser has moved on to subsequent events. It is much more flexible when you want to add or remove events, extend events, etc.

It is exposed through the XMLEventReader and XMLEventWriter interfaces.
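To illustrate events outliving the parser position, the sketch below reads every event into a list first and only inspects them afterwards, once the reader is closed. The class name StaxEventSketch and the helpers readAll and startNames are my own names, not part of the API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo: XMLEvent objects stay usable after the reader has moved on.
class StaxEventSketch {

    // Pulls every event from the document and keeps it.
    static List<XMLEvent> readAll(String xml) throws Exception {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        List<XMLEvent> events = new ArrayList<>();
        while (reader.hasNext()) {
            events.add(reader.nextEvent());
        }
        reader.close();
        return events;
    }

    // Works purely on stored events; no parser is involved at this point.
    static List<String> startNames(List<XMLEvent> events) {
        List<String> names = new ArrayList<>();
        for (XMLEvent event : events) {
            if (event.isStartElement()) {
                names.add(event.asStartElement().getName().getLocalPart());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(startNames(readAll("<a><b/></a>")));
    }
}
```

With the cursor API the same trick would not work, because the reader only exposes the state of the current position.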

Please see these links for more information:

http://www.oracle.com/technology/oramag/oracle/03-sep/o53devxml.html
http://java.sun.com/webservices/docs/2.0/tutorial/doc/StAX.html#wp69937

Now back to JAXB.....

1 Comment:

Blogger anon_anon said...

you might also want to look at vtd-xml, the latest and most advanced XML processing API available today

http://vtd-xml.sf.net

4:31 PM  
