The European Project ‘SHAMAN’ aims at developing a long-term digital preservation framework and tools to analyse, ingest, manage, access and reuse digital objects. Within this project, XRCE has created ‘Xeproc’ to model document processing pipelines.
‘Xeproc’ preserves not only production processes, but also instrumented specifications of a document processing project. It is based on XML specifications and has an associated designer using EMF/GMF frameworks.
Xeproc is a dedicated DSL for modeling document processing as XML pipelines in Eclipse 3.5.1 or later. The Xeproc© DSL is defined by a dedicated XML schema available here. The Xeproc EMF model and Xeproc Designer are available under the Eclipse Public License.
- Eclipse 3.5.1 and modeling tools (or later version),
- Optional: Ajax Toolkit Framework.
Xeproc Designer is delivered with user documentation included in the download.
Join the Xeproc community and download new software from your Eclipse at the following update site: http://xeproc.xrce.xerox.com/designer/update/site.xml
If you have not already installed Eclipse, the latest version is available here.
The 4 key concepts encapsulated in Xeproc are:
- XML document
- Analysis or Transformation component
Xeproc reflects years of practice in tuning document processing pipelines at Xerox’s European Research Centre.
The Eclipse integration of the Xeproc Designer means it can coexist with any specialized XML Eclipse toolkit. The project folder contains all the necessary resources to set up a Document Processing pipeline. Each of these resources may be manipulated with your preferred toolbox, but the resource itself will then be inserted in the Xeproc to assemble a pipeline. Xeproc is one more resource type in your project, with its associated editor: Xeproc Designer.
Easy to use, focused on instrumentation and tuning.
Xeproc makes it easy to set up and tune XML document processing. Use it for that purpose, then apply EMF-based generation technologies to turn the Xeproc into something else. This is the magic of the Xeproc metamodel: it is adaptable, functional and focussed.
The Xeproc Designer is extensible:
- In the design phase, you may want to load some analysis or transformation components into the palette. This is done through the ‘paletteComponent’ extension point. Any active plug-in extending paletteComponent in Eclipse 3.5 will then be offered in the Xeproc Designer palette.
- Because Xeproc Designer offers UI-led interpretation, interpretation engines can be plugged in at runtime for the various resources referenced in the Xeproc under construction. The first is the Xeproc engine itself, registered through the player extension point. A basic one is already part of the designer, but it can be replaced by a more specific one.
- In Xeproc, components are local or remote resources to be interpreted by a stepPlayer. For example, an XSLT stylesheet needs an XSLT processor as its engine, and a WSDL needs a parser/caller interpretation engine. These specific players must be registered through the stepPlayer extension point, with a stepPlayerTypeID to be associated with any component that needs such interpretation at runtime. Xeproc Designer provides a basic stepPlayer for XSLT and for pdftoxml components.
- The principle described for step players also applies to the following engines. The first is the Validator, which interprets the validation resources associated with a component at a given step in the Xeproc. Such a validation resource (e.g. sthg.xsd) is typed with the desired validator (e.g. xsd). Any validator registered with the expected type is then responsible for validating the output of the step against the validation resource. Xeproc Designer provides xsd, rnc and rng validators.
- Viewer. This does not validate, but computes a view resource from a component output and a view resource definition. The same mechanism applies as for the validator. A viewer registers with one additional attribute: the kind of renderingViewer needed to display the computed view. Xeproc Designer provides an abstract framework to create viewers based on XSLT resources. This makes it easy to publish HTML or SVG viewers. A raw XML viewer is implicitly part of the UI.
- RenderingViewer is responsible for rendering a computed view. Two renderingViewers are part of Xeproc Designer: the external browser and the Mozilla browser provided by ATF, if ATF has been installed in your Eclipse. You can then debug the viewer through the DOM inspector and associated tools.
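The type-based dispatch described above for step players and validators can be illustrated with a small, self-contained Java sketch. It is not the Xeproc Designer API: the `StepPlayer` and `Validator` interfaces, the two maps standing in for the Eclipse extension registry, the `runStep` helper and the sample stylesheet and schema are all hypothetical names introduced for illustration. Only the standard `javax.xml.transform` and `javax.xml.validation` classes are real.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical sketch: none of these types belong to Xeproc Designer.
class XeprocSketch {

    /** Hypothetical: interprets a component resource against an input document. */
    interface StepPlayer { String play(String componentResource, String inputXml); }

    /** Hypothetical: checks a step output against a validation resource. */
    interface Validator { boolean validate(String validationResource, String outputXml); }

    // Plain maps keyed by type ID, standing in for the extension points.
    static final Map<String, StepPlayer> STEP_PLAYERS = new HashMap<>();
    static final Map<String, Validator> VALIDATORS = new HashMap<>();

    static {
        // An "xslt" step player: the component resource is a stylesheet.
        STEP_PLAYERS.put("xslt", (xslt, xml) -> {
            try {
                Transformer t = TransformerFactory.newInstance()
                        .newTransformer(new StreamSource(new StringReader(xslt)));
                StringWriter out = new StringWriter();
                t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
                return out.toString();
            } catch (TransformerException e) {
                throw new IllegalStateException("XSLT step failed", e);
            }
        });
        // An "xsd" validator: the validation resource is an XML Schema.
        VALIDATORS.put("xsd", (xsd, xml) -> {
            try {
                Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                        .newSchema(new StreamSource(new StringReader(xsd)));
                schema.newValidator().validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException | IOException e) {
                return false; // invalid output or unreadable resource
            }
        });
    }

    /** Plays one step with the player of the given type, then validates its output. */
    static String runStep(String stepPlayerTypeId, String component, String input,
                          String validatorType, String validationResource) {
        StepPlayer player = STEP_PLAYERS.get(stepPlayerTypeId);
        Validator validator = VALIDATORS.get(validatorType);
        if (player == null || validator == null) {
            throw new IllegalArgumentException("No engine registered for the requested type");
        }
        String output = player.play(component, input);
        if (!validator.validate(validationResource, output)) {
            throw new IllegalStateException("Step output failed validation");
        }
        return output;
    }

    // Sample resources (illustrative only).
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output omit-xml-declaration='yes'/>"
        + "<xsl:template match='/doc'>"
        + "<page><heading><xsl:value-of select='title'/></heading></page>"
        + "</xsl:template></xsl:stylesheet>";

    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='page'><xs:complexType><xs:sequence>"
        + "<xs:element name='heading' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    public static void main(String[] args) {
        String input = "<doc><title>Hello</title></doc>";
        System.out.println(runStep("xslt", XSLT, input, "xsd", XSD));
    }
}
```

In this sketch a step carries only type IDs and resources; the engine that interprets each resource is looked up at runtime, so a more specific XSLT processor or an additional validator type could be swapped in without changing the step definition. In an actual Eclipse plug-in the lookup would go through the extension registry rather than static maps.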