UpCast: Word -> XML transformation

randomman
Champ in-the-making
Not sure if I'm posting this in the right place, so apologies if I'm wrong.

Has anyone ever integrated the UpCast Java API with Alfresco? I'm interested in exploring the possibilities of transforming Word documents into DocBook XML inside Alfresco.
1 REPLY

rdanner
Champ in-the-making
This may not be the best forum. In the future you might want to post this kind of question in Alfresco Discussion so that others can find it more easily. Anyway, no big deal.

For those wondering, it looks like UpCast is a product that can take an MS Word document and convert it to XML.
———————
Features are:
    # Fully recreates the document structure with automatic section nesting (customizable) and support for Word sections
    # supports paragraph and character styles
    # powerful table translation (HTML 4 or Oasis Exchange Table model - CALS), incl. nested tables, row/column spans, cell properties, borders and backgrounds
    # processes footnotes, hyperlinks, references, forms, index entries, annotations, page headers and footers
    # supports any combination of nested lists, tables and combination of layout elements possible in RTF documents
    # support for document properties (incl. user properties), document template reference and document variables
    # Unicode and many two-byte encodings supported
    # includes a WMF renderer and image rewriting capabilities
    # API for creating custom export filters
    # new inline nesting optimization with intelligent, customizable property hoisting to surrounding container element
    # translation of most style and layout information into CSS2
    # highlighting and support for non properly nested target regions
    # pass-through import filter
    # extracting embedded object binary data
    # handles large files (only subject to available memory)
    # improved support for textbox, image, TOC and fields
There is also a product by the same company called downcast, which can take an XML document that conforms to a given schema and convert it to an MS Word document.


It looks like an interesting product. It is not open source (tsk, tsk).

That certainly doesn't exclude the possibility of having a project in the forge that connects the two products. 

I don't know of any existing integration (which doesn't mean it doesn't exist). I check the forge for new goodies pretty often and I haven't seen one.


Are you using UpCast? How do you like it? How strong are the transformation capabilities? Have you done anything with its Java API? Is it difficult?