02/28/2005
See:
Package Summary
| Interface Summary | |
| Destination | General structure representing the three segments of the destination. |
| MultiBodyDestination | A specialization of the Destination interface that lets the calling code create, enable, disable, and destroy body segments. |
| Parser | Parses text markup from a reader and creates the relevant tokens. |
| ProcessingContext | This is a general structure to enable passing of processing context to various parts of the processing stack. |
| Segment | A basic object to write to. |
| Segmenter | Interface to be implemented by a class supporting segmentation. |
| TextProcessor | Manages the process of converting characters into tokens, segmenting, and transforming the tokens. |
| Token | Represents an atomic unit of text built from a stream or reader and representing a meaningful unit of text for a particular MIME type. |
| TokenFactory | Interface for creating and pooling tokens of a particular MIME type. |
| TokenMutator | Base interface for token mutators. |
| UltimateDestination | Structure returned to the client when processing is done. |
| Class Summary | |
| TextProcessorFactory | Factory to get text processors. |
| Exception Summary | |
| ProcessingException | This exception is thrown by various entities in the Segmentation and Transformation API to indicate that a problem has occurred while processing text. |
Allows a portal developer to inspect, alter, and augment markup and other types of text based on MIME type. The initial implementation supports text/html, text/javascript, and text/css. Markup text is parsed and tokenized by MIME type-specific parser implementations. These parsers create meaningful tokens for the MIME type, which are then passed to implementations of Segmenter and TokenMutator for redirection, inspection, and alteration in series. The tokens are then placed in segments contained in the Destination object and made available to client code.
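The flow described above (tokenize, route each token to a segment, run the mutators in series, collect the result) can be illustrated with a small self-contained sketch. Nothing in it belongs to this package: PipelineSketch, tokenize, process, and the portal.example.com host are hypothetical names chosen for illustration, tokens are reduced to plain strings, and the order of segmenting before mutating is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; these are not the package's real types.
class PipelineSketch {

    // Simplified tokenizer: each tag and each run of text becomes one token.
    static List<String> tokenize(String markup) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("<[^>]*>|[^<]+").matcher(markup);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // The segmenter picks a segment key per token; mutators then run in series.
    static Map<Integer, String> process(String markup,
                                        Function<String, Integer> segmenter,
                                        List<UnaryOperator<String>> mutators) {
        Map<Integer, StringBuilder> segments = new LinkedHashMap<>();
        for (String token : tokenize(markup)) {
            Integer key = segmenter.apply(token);
            String current = token;
            for (UnaryOperator<String> mutator : mutators) {
                current = mutator.apply(current);
            }
            segments.computeIfAbsent(key, k -> new StringBuilder()).append(current);
        }
        // Convert the builders to plain strings for the caller.
        Map<Integer, String> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, StringBuilder> e : segments.entrySet()) {
            result.put(e.getKey(), e.getValue().toString());
        }
        return result;
    }

    // Routes every token to segment 0 and makes root-relative hrefs absolute.
    static String demo() {
        List<UnaryOperator<String>> mutators = new ArrayList<>();
        mutators.add(t -> t.replace("href=\"/", "href=\"http://portal.example.com/"));
        Map<Integer, String> out =
            process("<a href=\"/rootfolder/resource/\">link</a>", t -> 0, mutators);
        return out.get(0);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The real API works with Token objects rather than strings, so this sketch only mirrors the control flow, not the actual interfaces.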
The primary interface used is the TextProcessor. This interface and its implementations serve as a controller for the entire process of parsing, tokenization, and alteration. Instances of TextProcessor are created from a factory:
    Reader reader =
        new StringReader("<html>\n<body>\n<a href=\"/rootfolder/resource/\">link</a>\n</body>\n</html>");
    TextProcessorFactory factory = TextProcessorFactory.getInstance();
    TextProcessor processor = factory.getTextProcessor("default");
A TextProcessor instance holds information needed while parsing. A Segmenter must be set using the setSegmenter(Segmenter segmenter) method for parsing to proceed:
    Integer key = new Integer(0010); // 0010 is an octal literal, so the key value is 8
    processor.setSegmenter(new TestSegmenter(key)); // NOTE: TestSegmenter is not an actual segmenter class
The segmenter receives tokens as they are created, and can make decisions as to where subsequent tokens go by opening and closing segments on the MultiBodyDestination object it is passed. Optionally, the processor may be set to use one or more TokenMutator implementations to inspect and alter tokens.
    processor.addTokenMutator("com.mycompany.segmenters.AnchorAbsoluter");
    processor.addTokenMutator("com.mycompany.segmenters.ImageAbsoluter");
    UltimateDestination destination = null;
    try {
        destination = processor.process("text/html", reader);
    } catch (ProcessingException e) {
        e.printStackTrace(System.err);
    } catch (IOException e) {
        e.printStackTrace(System.err);
    }
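To make the segmenter's role more concrete, the following self-contained sketch shows one way a segmenter might open and close a body segment as tokens arrive. It does not use the real Segmenter or MultiBodyDestination interfaces, whose method signatures are not shown on this page; SketchDestination, BodyOnlySegmenter, and all of their methods are hypothetical stand-ins, and tokens are again plain strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; these are not the package's real types.
class SegmenterSketch {

    // Stand-in for a destination with several keyed body segments.
    static class SketchDestination {
        private final Map<Integer, StringBuilder> bodies = new LinkedHashMap<>();
        private Integer current; // key of the open segment, or null if none is open

        void openBody(Integer key) {
            bodies.computeIfAbsent(key, k -> new StringBuilder());
            current = key;
        }

        void closeBody() {
            current = null;
        }

        // Tokens written while no segment is open are discarded.
        void write(String token) {
            if (current != null) {
                bodies.get(current).append(token);
            }
        }

        String body(Integer key) {
            StringBuilder sb = bodies.get(key);
            return sb == null ? null : sb.toString();
        }
    }

    // Stand-in segmenter: keeps only the tokens between <body> and </body>.
    static class BodyOnlySegmenter {
        private final Integer key;

        BodyOnlySegmenter(Integer key) {
            this.key = key;
        }

        void onToken(String token, SketchDestination dest) {
            if (token.equalsIgnoreCase("<body>")) {
                dest.openBody(key);   // subsequent tokens go to this segment
            } else if (token.equalsIgnoreCase("</body>")) {
                dest.closeBody();     // stop collecting
            } else {
                dest.write(token);
            }
        }
    }

    static String demo() {
        String[] tokens = {
            "<html>", "<body>", "<a href=\"/rootfolder/resource/\">", "link", "</a>", "</body>", "</html>"
        };
        Integer key = Integer.valueOf(8);
        SketchDestination dest = new SketchDestination();
        BodyOnlySegmenter segmenter = new BodyOnlySegmenter(key);
        for (String token : tokens) {
            segmenter.onToken(token, dest);
        }
        return dest.body(key);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the segment key plays the same role as the key handed to TestSegmenter above: it identifies which body segment the client later reads back.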
The process operation produces an instance of UltimateDestination, which holds the processed tokens and their string representations:
    Map map = destination.getBodyContentStrings();
    Object str0 = map.get(key); // get the segment that was written to
    System.out.println((String) str0);
Segmenter