02/28/2005
 

Package com.vignette.portal.text.processor

Allows a portal developer to inspect, alter, and augment markup and other types of text based on MIME type.


Interface Summary
Destination           General structure representing the three segments of the destination.
MultiBodyDestination  A specialization of the Destination interface that lets the calling API create, enable, disable, and destroy body segments.
Parser                Parses text markup from a reader and creates the relevant tokens.
ProcessingContext     A general structure for passing processing context to the various parts of the processing stack.
Segment               A basic object to write to.
Segmenter             Interface to be implemented by a class supporting segmentation.
TextProcessor         Manages the process of converting characters into tokens, segmenting, and transforming the tokens.
Token                 Represents an atomic unit of text, built from a stream or reader, that is meaningful for a particular MIME type.
TokenFactory          Interface for creating and pooling tokens of a particular MIME type.
TokenMutator          Base interface for token mutators.
UltimateDestination   Structure returned to the client when processing is done.
 

Class Summary
TextProcessorFactory  Factory to get text processors.
 

Exception Summary
ProcessingException   Thrown by components of the Segmentation and Transformation API to indicate that a problem occurred while processing text.
 

Package com.vignette.portal.text.processor Description

Allows a portal developer to inspect, alter, and augment markup and other types of text based on MIME type. The initial implementation supports text/html, text/javascript, and text/css. Markup text is parsed and tokenized by MIME type-specific parser implementations. These parsers create meaningful tokens for the MIME type, which are then passed to implementations of Segmenter and TokenMutator for redirection, inspection, and alteration in series. The tokens are then placed in segments contained in the Destination object and made available to client code.
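
The phrase "in series" can be illustrated with a small, self-contained sketch. It does not use the Vignette interfaces: tokens are modeled as plain strings, mutators as java.util.function.UnaryOperator instances, and the names MutatorChainSketch, mutateAll, and absoluter are hypothetical, as is the base URL. It assumes only what the paragraph above states, namely that each token passes through every registered mutator in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical model of a mutator chain; none of these names are Vignette API. */
class MutatorChainSketch {

    /** Passes every token through each mutator, in registration order. */
    static List<String> mutateAll(List<String> tokens, List<UnaryOperator<String>> mutators) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String current = token;
            for (UnaryOperator<String> mutator : mutators) {
                current = mutator.apply(current); // output of one mutator feeds the next
            }
            out.add(current);
        }
        return out;
    }

    /**
     * Hypothetical mutator: for tokens starting with tagStart, rewrites a
     * root-relative attribute value (attr="/...") to start with base.
     */
    static UnaryOperator<String> absoluter(String tagStart, String attr, String base) {
        String needle = attr + "=\"/";
        return token -> token.startsWith(tagStart)
                ? token.replace(needle, attr + "=\"" + base + "/")
                : token; // tokens of other kinds pass through unchanged
    }

    public static void main(String[] args) {
        String base = "http://example.com"; // assumed base URL, for illustration only
        List<String> tokens = Arrays.asList(
                "<a href=\"/rootfolder/resource/\">", "link", "</a>", "<img src=\"/logo.png\">");
        List<UnaryOperator<String>> mutators = Arrays.asList(
                absoluter("<a ", "href", base),
                absoluter("<img ", "src", base));
        for (String token : mutateAll(tokens, mutators)) {
            System.out.println(token);
        }
    }
}
```

Because each mutator sees the output of the one before it, the order of registration matters whenever two mutators touch the same token.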

Usage:

The primary interface used is the TextProcessor. This interface and its implementations serve as a controller for the entire process of parsing, tokenization, and alteration. Instances of TextProcessor are created from a factory:

import java.io.Reader;
import java.io.StringReader;

Reader reader =
  new StringReader("<html>\n<body>\n<a href=\"/rootfolder/resource/\">link</a>\n</body>\n</html>");
TextProcessorFactory factory = TextProcessorFactory.getInstance();
TextProcessor processor = factory.getTextProcessor("default");


A TextProcessor instance holds information needed while parsing. A Segmenter must be set using the setSegmenter(Segmenter segmenter) method for parsing to proceed:

Integer key = Integer.valueOf(8); // key used later to look up the segment that was written to

processor.setSegmenter(new TestSegmenter(key)); // NOTE: TestSegmenter is not an actual segmenter class


The segmenter receives tokens as they are created and decides where subsequent tokens go by opening and closing segments on the MultiBodyDestination object passed to it. Optionally, the processor can be configured with one or more TokenMutator implementations to inspect and alter tokens:

processor.addTokenMutator("com.mycompany.segmenters.AnchorAbsoluter");
processor.addTokenMutator("com.mycompany.segmenters.ImageAbsoluter");

UltimateDestination destination = null;
try {
  destination = processor.process("text/html", reader);
} catch (ProcessingException e) {
  e.printStackTrace(System.err);
} catch (IOException e) {
  e.printStackTrace(System.err);
}

The process operation produces an instance of UltimateDestination, which holds the processed tokens and their string representations:

if (destination != null) { // remains null if process() threw an exception
  Map map = destination.getBodyContentStrings();
  Object str0 = map.get(key); // get the segment that was written to
  System.out.println((String) str0);
}
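
Since TestSegmenter is only a placeholder, the following self-contained sketch shows one way a segmenter could route tokens into a keyed body segment by opening and closing it, so that the segment can later be looked up by key as above. It is a model under stated assumptions, not the Vignette Segmenter or MultiBodyDestination interface: the class SegmentRoutingSketch, the helper KeyedSegments, and the method segmentBody are hypothetical, tokens are plain strings, and only body segments are modeled.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of segment routing; not the Vignette Segmenter interface. */
class SegmentRoutingSketch {

    /** Collects tokens into keyed body segments; at most one segment is open at a time. */
    static final class KeyedSegments {
        private final Map<Integer, StringBuilder> bodies = new LinkedHashMap<>();
        private Integer open; // key of the currently open segment, or null if none

        void openSegment(Integer key) {
            bodies.computeIfAbsent(key, k -> new StringBuilder());
            open = key;
        }

        void closeSegment() {
            open = null;
        }

        void write(String token) {
            if (open != null) {
                bodies.get(open).append(token);
            }
            // In this sketch, tokens that arrive while no segment is open are dropped.
        }

        Map<Integer, String> toStrings() {
            Map<Integer, String> out = new LinkedHashMap<>();
            for (Map.Entry<Integer, StringBuilder> e : bodies.entrySet()) {
                out.put(e.getKey(), e.getValue().toString());
            }
            return out;
        }
    }

    /** Routes the tokens between <body> and </body> into the segment stored under key. */
    static Map<Integer, String> segmentBody(List<String> tokens, Integer key) {
        KeyedSegments destination = new KeyedSegments();
        for (String token : tokens) {
            if (token.equals("</body>")) {
                destination.closeSegment(); // close before writing so the end tag is excluded
            }
            destination.write(token);
            if (token.equals("<body>")) {
                destination.openSegment(key); // open after writing so the start tag is excluded
            }
        }
        return destination.toStrings();
    }

    public static void main(String[] args) {
        Integer key = Integer.valueOf(8);
        List<String> tokens = Arrays.asList(
                "<html>", "<body>", "<a href=\"/rootfolder/resource/\">", "link", "</a>",
                "</body>", "</html>");
        Map<Integer, String> map = segmentBody(tokens, key);
        System.out.println(map.get(key));
    }
}
```

The point of the design is that the segmenter sees each token before deciding where the following ones go, which is why opening and closing happen around the write call rather than inside it.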

See Also:
Segmenter
