Office 2007 (docx...) Extract Common Metadata

kenneth_thorman
Champ in-the-making
Hi

Does the current version of Alfresco (3.0 Stable), with the POI version it ships, support extracting docx document properties (including custom ones)?

I have been struggling with this for a while. It seems to work for doc files but not for docx.

Anyone?

Regards
Kenneth Thorman
3 REPLIES

kenneth_thorman
Champ in-the-making
The current release of Alfresco, 3.0 Stable, uses version 3.1 of the POI library.

If this is changed to 3.5 (beta), we get a detailed exception that says: "The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)"

So I guess a bit of the code will have to be changed.

Any pointers, anyone?

Regards
Kenneth Thorman
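
The exception quoted above is raised because the OLE2 code path is handed a ZIP-based Office 2007 file. One way to avoid that is to look at the first bytes of the content before choosing an extractor. The following is a minimal sketch using only the JDK; the class and method names (`OfficeContainerSniffer`, `sniff`) are hypothetical and are not part of POI or Alfresco. Note that a ZIP signature only says the file might be an OOXML package, not that it is one.

```java
import java.util.Arrays;

// Hypothetical helper (not a POI or Alfresco class): guesses the container
// type of an Office file from its leading bytes.
final class OfficeContainerSniffer {
    enum Container { OLE2, ZIP, UNKNOWN }

    // Signature of OLE2 compound documents (.doc, .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1 };

    // Local file header signature of a ZIP archive ("PK\3\4"),
    // which is how .docx, .xlsx and .pptx packages start.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // head: the first bytes of the file (8 or more is enough).
    static Container sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return Container.OLE2;
        if (startsWith(head, ZIP_MAGIC)) return Container.ZIP;
        return Container.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

A caller would route `OLE2` content to the existing extractor and `ZIP` content to an OOXML-aware one.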

jpfi
Champ in-the-making
Hi,
Yup, .doc is a binary file format, while .docx is a ZIP-packaged XML format.
You'll have to write your own metadata extractor, map it to the docx file extension, and use the OOXML part of Apache POI (XWPF for .docx; XSSF is the .xlsx counterpart).
I'm not sure whether the old extractors still work with a 3.5 POI jar…
Cheers, Jan
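
To illustrate what such an extractor has to do: in a .docx package, custom properties are stored in the ZIP entry `docProps/custom.xml`. The sketch below reads them with the JDK only (Java 9+), deliberately avoiding POI so that it makes no claims about the POI 3.5 API. The class name `DocxCustomProperties` is hypothetical, values are returned as plain strings regardless of their declared type, and wiring the result into an Alfresco metadata extractor is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not a POI or Alfresco class): reads custom document
// properties straight from the OOXML package.
final class DocxCustomProperties {
    private static final String CUSTOM_PART = "docProps/custom.xml";
    private static final String CUSTOM_NS =
        "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties";

    // Returns property name -> value as text; empty if the part is absent.
    static Map<String, String> read(InputStream docx) {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (CUSTOM_PART.equals(e.getName())) {
                    // readAllBytes stops at the end of the current entry.
                    return parse(zip.readAllBytes());
                }
            }
            return new LinkedHashMap<>();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    private static Map<String, String> parse(byte[] xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations to avoid XXE on untrusted uploads.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

            // Each <property name="..."> wraps one typed child such as <vt:lpwstr>.
            NodeList props = doc.getElementsByTagNameNS(CUSTOM_NS, "property");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                result.put(p.getAttribute("name"), p.getTextContent().trim());
            }
            return result;
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse " + CUSTOM_PART, ex);
        }
    }
}
```

The standard properties (title, author and so on) live in a separate part, `docProps/core.xml`, and could be read the same way with a different namespace.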

kenneth_thorman
Champ in-the-making
We have been waiting for this feature, as we did not want to implement it ourselves if Alfresco was going to ship this functionality.

I have been searching high and low since POI 3.5 came out.

Is it correct that this feature (Office 2007 file format metadata awareness) is still not available?

Regards
Kenneth Thorman