configure openoffice-document-formats.xml file for .docx

javauser007
Hi, I want to configure the openoffice-document-formats.xml file so that the content of MS Word 2007 files can be read. I'm using Alfresco 2.9, which does not support searching the content of MS Word 2007 files.

Any help is appreciated.

Thanks
9 REPLIES

zaizi
Add the following lines to openoffice-document-formats.xml:


  <document-format><name>Microsoft Word 2007</name>
    <family>Text</family>
    <mime-type>application/vnd.openxmlformats-officedocument.wordprocessingml.document</mime-type>
    <file-extension>docx</file-extension>
    <export-filters>
      <entry><family>Text</family><string>MS Word 97</string></entry>
    </export-filters>
  </document-format>

  <document-format><name>Microsoft Excel 2007</name>
    <family>Spreadsheet</family>
    <mime-type>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</mime-type>
    <file-extension>xlsx</file-extension>
    <export-filters>
      <entry><family>Spreadsheet</family><string>MS Excel 97</string></entry>
    </export-filters>
  </document-format>

  <document-format><name>Microsoft Powerpoint 2007</name>
    <family>Presentation</family>
    <mime-type>application/vnd.openxmlformats-officedocument.presentationml.presentation</mime-type>
    <file-extension>pptx</file-extension>
    <export-filters>
      <entry><family>Presentation</family><string>MS PowerPoint 97</string></entry>
    </export-filters>
  </document-format>

You can of course include templates and other mime types as well. The full list of Office 2007 mimetypes is:


.docm,application/vnd.ms-word.document.macroEnabled.12
.docx,application/vnd.openxmlformats-officedocument.wordprocessingml.document
.dotm,application/vnd.ms-word.template.macroEnabled.12
.dotx,application/vnd.openxmlformats-officedocument.wordprocessingml.template
.potm,application/vnd.ms-powerpoint.template.macroEnabled.12
.potx,application/vnd.openxmlformats-officedocument.presentationml.template
.ppam,application/vnd.ms-powerpoint.addin.macroEnabled.12
.ppsm,application/vnd.ms-powerpoint.slideshow.macroEnabled.12
.ppsx,application/vnd.openxmlformats-officedocument.presentationml.slideshow
.pptm,application/vnd.ms-powerpoint.presentation.macroEnabled.12
.pptx,application/vnd.openxmlformats-officedocument.presentationml.presentation
.xlam,application/vnd.ms-excel.addin.macroEnabled.12
.xlsb,application/vnd.ms-excel.sheet.binary.macroEnabled.12
.xlsm,application/vnd.ms-excel.sheet.macroEnabled.12
.xlsx,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
.xltm,application/vnd.ms-excel.template.macroEnabled.12
.xltx,application/vnd.openxmlformats-officedocument.spreadsheetml.template
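As a rough sketch of how one of the template types from this list could be added in the same pattern, here is an entry for .dotx. The display name "Microsoft Word 2007 Template" is only an illustrative label, and the export filter string is copied unchanged from the docx entry as an assumption; neither has been checked against a particular OpenOffice or Alfresco version.

```xml
  <!-- Sketch only: modelled on the docx entry above, not a verified configuration. -->
  <!-- The name is a hypothetical label; the export filter is assumed to match the docx entry. -->
  <document-format><name>Microsoft Word 2007 Template</name>
    <family>Text</family>
    <mime-type>application/vnd.openxmlformats-officedocument.wordprocessingml.template</mime-type>
    <file-extension>dotx</file-extension>
    <export-filters>
      <entry><family>Text</family><string>MS Word 97</string></entry>
    </export-filters>
  </document-format>
```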

ganesh_boil
Hi zaizi, I'm facing the same kind of problem.
In my case I'm using the Alfresco Labs stable version (which, as you know, comes with OpenOffice 3.0).
I configured the openoffice-document-formats.xml file for .docx according to your suggestion, but still no luck.
It is also unable to display the content of docx/doc files in a custom dashlet: it shows some "?" characters that are not in the content.
Any help is appreciated.

dwinfield
I have an svn checkout of Alfresco, and those suggestions were already implemented in it.

However, I too could not get indexing of the contents of a docx file to work. When I view the document through Share the preview works, and my server's installation of OpenOffice can open the file.

Was anyone able to get this to work?

ganesh_boil
It is working fine with Labs 3.0.
Just follow the steps specified by zaizi.

It does not work for 2.9 or earlier versions (at least for me), even if the docx formats are configured.
But with the 3.0 Labs stable version it works like a champ.

Thanks, zaizi.

rliu
This is merely a comment, not a solution to your problem. The "?" you are referring to is a character set issue: depending on the library you use to read the content, the character in question is returned as "?". Locate that character and see if there is a workaround for it.

itbeb
Which version of OpenOffice does this apply to? We currently have version 2.3, and I believe this only works from 3.0 onwards?

ganesh_boil
Yes itbeb,
It works from OpenOffice 3.0 and Alfresco 3.0 onwards.

inspiron82
Hi guys,
I understand that the lines in the first box of zaizi's reply (the document-format entries) must be written in openoffice-document-formats.xml, but where do I have to write the lines of the second box (the mimetype list)?

Is this procedure needed only to make Alfresco search the content of files with those extensions, or for something else as well?

Thanks a lot

javauser007
Hi inspiron82,

You need to put the second box's entries in mimetype-map.xml, following the syntax below.

 
<mimetype mimetype="application/vnd.openxmlformats-officedocument.presentationml.presentation" display="Microsoft PowerPoint 2007">
  <extension>pptx</extension>
</mimetype>

NOTE: Some mimetypes are already mapped in that file. You only need to add the ones that are not there yet.
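For illustration, the .docx and .xlsx lines from the list would follow the same syntax. This is a sketch rather than a tested configuration: the display strings are just example labels, and the entries are assumed to sit alongside the existing mimetype elements in mimetype-map.xml.

```xml
<!-- Sketch only: same syntax as the pptx entry above; display labels are illustrative. -->
<mimetype mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" display="Microsoft Word 2007">
  <extension>docx</extension>
</mimetype>
<mimetype mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" display="Microsoft Excel 2007">
  <extension>xlsx</extension>
</mimetype>
```

Before adding an entry, check whether the file already contains a mimetype element with the same mimetype attribute, so that the mapping is not defined twice.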