Connecting doc and xml

digitx
Champ in-the-making
Hi,

As I am quite a newbie, the following question might be a basic one:

Does Alfresco support content indexing using a pair of files, where one file is the document itself and the other is an XML file describing the contents of the document to index?
The files will have the same name but a different extension, for instance:

aDocumentFile.pdf
aDocumentFile.xml

Thank you very much.

ingo
6 REPLIES

andy
Champ on-the-rise
Hi

There is no per-document configuration of indexing.

PDF docs are converted to a text representation for indexing.
Some meta-data may be extracted into attributes of the document.

Why do you want this?

Regards

Andy

digitx
Champ in-the-making
Hi and thanks for your reply.

I'd like to add meta info to the document at scan time.

Our MFP is able to:

a.) Encode meta-data into the file name, e.g. date, time, user (20060821_author_invoiceNo.doc), as well as

b.) Add a configurable, dynamically filled meta-data XML file to a scanned document. There will be 2 files after scanning, e.g. document123.doc and document123.xml.

This is to decentralize scanning documents into the DMS, as eCopy does, for instance. So what I'd like to do is parse that XML file for buzzwords and categorise the document accordingly.
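To make the two conventions concrete, here is a minimal Python sketch of what the scan output looks like from a script's point of view. The three-field layout (date, user, invoice number) is only an assumption read off the example name 20060821_author_invoiceNo.doc, and the function names `parse_scan_name` and `pair_with_sidecars` are made up for illustration; nothing here is Alfresco API.

```python
from datetime import datetime
from pathlib import Path


def parse_scan_name(filename):
    """Split a name like 20060821_author_invoiceNo.doc into its fields.

    Assumes exactly the layout <YYYYMMDD>_<user>_<invoiceNo>.<ext>.
    """
    stem = Path(filename).stem
    date_part, user, invoice_no = stem.split("_", 2)
    return {
        "date": datetime.strptime(date_part, "%Y%m%d").date(),
        "user": user,
        "invoice_no": invoice_no,
    }


def pair_with_sidecars(filenames):
    """Map each document to the .xml file that shares its base name.

    Documents without a matching .xml file map to None.
    """
    paths = [Path(name) for name in filenames]
    sidecars = {p.stem: p.name for p in paths if p.suffix.lower() == ".xml"}
    return {
        p.name: sidecars.get(p.stem)
        for p in paths
        if p.suffix.lower() != ".xml"
    }
```

A file name that does not follow the assumed layout raises a `ValueError`, which is probably what you want for a scan inbox: better to reject an odd name than to file it under the wrong user.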

Regards,

Ingo

dschmalz
Champ in-the-making
b.) adding a configurable and dynamically filled metaData.xml and file to a scanned document (There will be 2 files after scanning. e.g.: document123.doc, document123.xml)

Ingo, did you consider using an Alfresco Content Package (.acp), which is a zip file containing both the file and an XML document describing the meta-data? Then, you could use an Alfresco rule to automatically import acp files into a specific space.

It would of course require the XML meta-data generated by your tool to be transformed into the Alfresco DTD. To see what that looks like, put a file in a repository, update its meta-data, export the space that contains it through the admin console, and have a look at the resulting ACP file.
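The packaging step itself is just zipping. A rough Python sketch follows, assuming the descriptor XML has already been transformed into whatever Alfresco expects. The function name `build_package` and the `content_dir` folder name are hypothetical, and the layout inside the archive is a guess; compare it against an ACP exported from your own repository before relying on it.

```python
import zipfile
from pathlib import Path


def build_package(doc_path, descriptor_path, out_path, content_dir="content"):
    """Zip one document and its descriptor XML into a single archive.

    The descriptor goes at the archive root and the document into
    content_dir. This layout is an assumption, not a documented ACP rule.
    """
    doc_path = Path(doc_path)
    descriptor_path = Path(descriptor_path)
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as package:
        package.write(descriptor_path, arcname=descriptor_path.name)
        package.write(doc_path, arcname=f"{content_dir}/{doc_path.name}")
    return out_path
```

For example, `build_package("document123.doc", "document123.xml", "document123.acp")` writes one archive per scanned pair, which an import rule on the target space could then pick up.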

Hope this helps,
David

digitx
Champ in-the-making
Seems as if this might be a possible solution for my problem.
Do you know which part of the Alfresco documentation covers this?

thanx, ingo

dschmalz
Champ in-the-making
You can have a look at:

http://wiki.alfresco.com/wiki/Export_and_Import#Alfresco_Content_Package_.28ACP.29_File_Format

It seems that Alfresco has not released any XML schema that covers the meta-data. I would suggest that you manually run an export of a space containing a few files, unzip the ACP and have a look at the content :-).
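If you would rather not unzip by hand, a few lines of Python can list what is inside an exported file and pull out any XML entries for reading. This is only a sketch: it relies on the ACP being a plain zip file, and the function name `describe_package` is made up for illustration.

```python
import zipfile


def describe_package(acp_path):
    """Return the entry names of a zip/ACP file and the text of its XML entries."""
    with zipfile.ZipFile(acp_path) as package:
        names = package.namelist()
        xml_texts = {
            name: package.read(name).decode("utf-8", errors="replace")
            for name in names
            if name.lower().endswith(".xml")
        }
    return names, xml_texts
```

Printing `names` shows how the export lays out its files, and the values in `xml_texts` show the meta-data format your own XML would have to be transformed into.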

David

digitx
Champ in-the-making
:lol:

thx