Pulling meta data from other file formats?

joel_moore
Champ in-the-making
If we wanted to add files to the repository that aren't OpenOffice-compatible documents (for example, SolidWorks or AutoCAD documents), is there a mechanism for extracting properties from them? Perhaps some sort of middle-man service? A primary use we'd have for this software is to make our CAD database more sane and searchable.
7 REPLIES

joel_moore
Champ in-the-making
Should I assume this can't be done and that Alfresco only works with OpenOffice- or ImageMagick-compatible file types?

And yes, I realize I could potentially take the available source code and program it in myself, but that's a bit beyond what I'm prepared (or able) to do. I guess I was hoping there was a generic interface or plugin architecture that would allow custom file types.

mrogers
Star Contributor
Have you looked at articles like this one?
http://wiki.alfresco.com/wiki/Metadata_Extraction

I'd expect someone already has a working AutoCAD extractor.
You just need to hope that they read your thread.

mrogers
Star Contributor
A little bit of googling shows that it should be fairly easy to write a metadata extractor for the DXF format used by AutoCAD.
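
To illustrate why this might be fairly easy: an ASCII DXF file is a sequence of line pairs (a numeric group code, then a value), and drawing-level variables sit in the HEADER section, each introduced by group code 9. The following is only a rough sketch in plain Java with no Alfresco dependencies. The class name DxfHeaderSketch is made up for illustration, it keeps only the first value of each variable, and it does not handle binary DXF or DWG files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Alfresco or any CAD library.
public class DxfHeaderSketch {

    /** Collects $VARIABLE entries from the HEADER section of an ASCII DXF document. */
    public static Map<String, String> parseHeader(String dxfText) throws IOException {
        Map<String, String> vars = new LinkedHashMap<>();
        BufferedReader in = new BufferedReader(new StringReader(dxfText));
        boolean sectionStart = false; // previous pair was "0 / SECTION"
        boolean inHeader = false;
        String currentVar = null;
        String codeLine;
        while ((codeLine = in.readLine()) != null) {
            String valueLine = in.readLine(); // DXF data always comes in code/value pairs
            if (valueLine == null) {
                break;
            }
            String code = codeLine.trim();
            String value = valueLine.trim();
            if (!inHeader) {
                // A section is opened by "0 / SECTION" followed by "2 / <name>".
                if (sectionStart && code.equals("2") && value.equals("HEADER")) {
                    inHeader = true;
                }
                sectionStart = code.equals("0") && value.equals("SECTION");
                continue;
            }
            if (code.equals("0") && value.equals("ENDSEC")) {
                break; // end of the HEADER section
            }
            if (code.equals("9")) {
                currentVar = value; // group code 9 names the next variable
            } else if (currentVar != null && !vars.containsKey(currentVar)) {
                vars.put(currentVar, value); // keep only the first value (e.g. X of a point)
            }
        }
        return vars;
    }
}
```

Feeding it the text of a small DXF file should give a map keyed by header variable names such as $ACADVER. Which of those variables are worth storing as repository metadata is a separate decision.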

joel_moore
Champ in-the-making
mrogers wrote:
"Have you looked at articles like this one?
http://wiki.alfresco.com/wiki/Metadata_Extraction

I'd expect someone already has a working AutoCAD extractor.
You just need to hope that they read your thread."

I had been to that wiki page but it pretty much confused me.  I couldn't understand how a few XML configuration files were able to instruct Alfresco on how to pull metadata from a new file type.  I figured at some point a new library or executable had to be involved that knows how to read the new binary format.  I guess that's what "beans" are.  Which means I probably need to learn some Java if there isn't already an extractor available.

mrogers
Star Contributor
The configuration is bolting a new "extracter" into Alfresco such that files of the new type can be understood.
The configuration sections on that page are describing how to configure the adapters which are already part of Alfresco.

So the steps to implement an AutoCAD extracter would be:
a) Find a library that understands the AutoCAD format (or, failing that, write your own). The good news here is that the format is published and seems fairly simple.
b) Write an extracter that takes content from the above library and presents it in the format Alfresco expects. This will have to be Java. Use the existing extracters as a model of what to do.
c) Plug in your new extracter.
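
The division of labour in steps a) and b) can be sketched roughly as follows. This is plain Java and deliberately does not use the real Alfresco classes: CadLibrary, CadExtracter, and the cad:units property are hypothetical names chosen for illustration, and a real extracter would follow the existing extracters in the Alfresco source instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class CadExtracterSketch {

    /** Step a): hypothetical stand-in for a library that can read the CAD format. */
    interface CadLibrary {
        Map<String, String> readProperties(InputStream content) throws IOException;
    }

    /** Step b): adapts the library's raw output to named repository properties. */
    static class CadExtracter {
        private final CadLibrary library;
        private final Map<String, String> mapping; // raw key -> repository property name

        CadExtracter(CadLibrary library, Map<String, String> mapping) {
            this.library = library;
            this.mapping = mapping;
        }

        Map<String, Serializable> extract(InputStream content) throws IOException {
            Map<String, String> raw = library.readProperties(content);
            Map<String, Serializable> result = new HashMap<>();
            for (Map.Entry<String, String> entry : mapping.entrySet()) {
                String value = raw.get(entry.getKey());
                if (value != null) {
                    result.put(entry.getValue(), value); // unmapped or missing keys are dropped
                }
            }
            return result;
        }
    }

    /** Runs the extracter against a fake library that returns made-up values. */
    public static Map<String, Serializable> demo() throws IOException {
        CadLibrary fakeLibrary = content -> {
            Map<String, String> raw = new HashMap<>();
            raw.put("author", "example author");
            raw.put("units", "mm");
            raw.put("internalFlag", "ignored");
            return raw;
        };
        Map<String, String> mapping = new HashMap<>();
        mapping.put("author", "cm:author");
        mapping.put("units", "cad:units"); // hypothetical custom model property
        return new CadExtracter(fakeLibrary, mapping).extract(new ByteArrayInputStream(new byte[0]));
    }
}
```

Keeping the format-reading code behind its own interface means the part that depends on a CAD library can be swapped or tested separately from the part that talks to the repository. Step c), the plugging in, is a configuration matter and is not shown here.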

Let's hope someone out there has already done this.

chen_shaopeng
Champ in-the-making
The problem with an AutoCAD metadata extracter (and with other engineering file formats, as a matter of fact) is the licensing of the parsing library.

Every CAD software vendor guards its library or API as one of its most precious assets, and there is no open source or free library for parsing the files.

If you already have an AutoCAD license, you would have to code the extracter yourself or hire someone to do so.

Plugging a metadata extracter into Alfresco is not difficult in itself; the difficulty is in parsing the file metadata. In this case you will also certainly have to extend the data model to hold the metadata that you extract from your CAD files.

We have done quite a bit of metadata extraction for other file formats, such as audio, video, Visio, etc.

baby77
Champ in-the-making
Hi there,
I have a question about a metadata extracter for the DWG format.
I have an application that converts DWG files to XML, and now I want to extract metadata from that XML.
In the source I found XmlMetadataExtracter.java and XPathMetadataExtracter.java, which can extract metadata from XML. Are those two classes useful here, and how?
What configuration is needed? Can someone please advise me?
Some Java expert, please.
Thanks, and have a nice day.
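
For readers with the same question, the general idea of XPath-based extraction can be sketched with the standard javax.xml.xpath API alone: each metadata property is paired with an XPath expression that selects its value from the XML. This is not the Alfresco XPathMetadataExtracter or its configuration; the class name XmlMetadataSketch, the element names, and the property names are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Alfresco.
public class XmlMetadataSketch {

    /**
     * Evaluates one XPath expression per property against the XML text.
     * Properties whose expression selects nothing are left out of the result.
     */
    public static Map<String, String> extract(String xml, Map<String, String> xpathByProperty)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : xpathByProperty.entrySet()) {
            // The XML is re-parsed for every expression; acceptable for a small sketch.
            String value = xpath.evaluate(entry.getValue(), new InputSource(new StringReader(xml)));
            if (!value.isEmpty()) {
                result.put(entry.getKey(), value);
            }
        }
        return result;
    }
}
```

With XML produced by a DWG converter, the map of expressions would be written against whatever element names that converter emits, so it has to be adapted to the actual output.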