Info required on Word (docx) parsing

ram
Champ in-the-making
Hi

Could you please let me know whether Alfresco provides utility classes for parsing Word documents (.docx)?

We have a requirement in which users upload content based on given templates. Once a document is uploaded, our logic needs to parse it and display its contents in the UI (we need to map specific UI fields to parts of the document). The contents can also be edited in the UI and must be written back to the document, so that the master copy can be downloaded at any time.

I would appreciate your help in finding out whether Alfresco provides APIs for the above.

Ram
1 REPLY

afaust
Legendary Innovator
Hello,

Alfresco includes libraries such as Apache Tika, which can extract data from documents, for example the metadata and custom XML in Office documents. As far as I know, Apache Tika also provides components to embed data back into documents, although I am not sure whether the necessary Tika version is already included in Alfresco or whether this covers Office metadata and custom XML yet.
Since Alfresco is an open platform, you can always add other libraries such as docx4j, which we use in several customer projects to read and generate Office (OOXML) documents.

Regards
Axel
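
To give a feel for what "parsing a .docx" involves, here is a minimal sketch that uses only the JDK, not Tika or docx4j. It relies on the fact that a .docx file is a ZIP package, and it assumes the main text lives in the part named word/document.xml (the usual location, though strictly the package relationships decide this). The class name DocxTextSketch and the method readParagraphs are made up for illustration and are not part of Alfresco or any library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not an Alfresco, Tika or docx4j API.
class DocxTextSketch {
    // WordprocessingML namespace used by the w:p (paragraph) and w:t (text) elements.
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    // Assumption: the main document part sits at its conventional path.
    private static final String MAIN_PART = "word/document.xml";

    /** Returns the plain text of every paragraph found in the main document part. */
    static List<String> readParagraphs(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (MAIN_PART.equals(entry.getName())) {
                    // readAllBytes() stops at the end of the current ZIP entry.
                    return parseParagraphs(zip.readAllBytes());
                }
            }
        }
        throw new IOException(MAIN_PART + " not found in package");
    }

    private static List<String> parseParagraphs(byte[] xmlBytes) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document xml = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xmlBytes));

        List<String> paragraphs = new ArrayList<>();
        NodeList paragraphNodes = xml.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphNodes.getLength(); i++) {
            Element paragraph = (Element) paragraphNodes.item(i);
            // A paragraph's text is split across runs; join all w:t fragments.
            NodeList textNodes = paragraph.getElementsByTagNameNS(W_NS, "t");
            StringBuilder text = new StringBuilder();
            for (int j = 0; j < textNodes.getLength(); j++) {
                text.append(textNodes.item(j).getTextContent());
            }
            paragraphs.add(text.toString());
        }
        return paragraphs;
    }
}
```

This only covers reading. Mapping template fields to UI fields and writing edited values back means modifying the XML and repackaging the ZIP, which is where a library such as docx4j is likely to save a lot of effort compared with hand-rolled code like the above.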