How can I import (index) documents into Nuxeo without moving or uploading them?

threecars_
Champ in-the-making

Hi

This might be a newbie question, but I cannot seem to find the solution by searching the docs, communities, or questions. I also cannot work out how to do this on my own with the Nuxeo admin interface.

I have an absolutely massive document archive, at least by my own frame of reference, standing at 3000 gigabytes. I would like to see how Nuxeo can handle this mishmash of zip files, images, PDFs, HTML files, MS Office documents, and LibreOffice files.

The structure of the file hierarchy and the physical location of the files cannot change, as they are used by other systems. Therefore I would like to index the files right where they are located.

Is this possible in Nuxeo? (I would think this is quite a common case for a lot of people.)

All the best -- TJAF

3 REPLIES

bruce_Grant
Elite Collaborator

AFAIK this is not possible with Nuxeo out of the box. Nuxeo expects to ingest the binary, drop it into its own file structure, and do the necessary metadata capture and indexing. You could override this default functionality with a custom component.

Benjamin_Jalon1
Elite Collaborator

Migrating a huge volume of data is something to handle with care.

Nuxeo provides several tools for that:

  • nuxeo-platform-importer
  • Automation
  • Nuxeo shell

Clearly, doing that from a browser is not good practice. I think the best way is to do it from the server side. In your case, a direct SQL script may be the best approach, with document indexing enabled after the import.

You can find documentation about that here.

Sorry, I misread your question. You mean you want to use Nuxeo like Exalead or the Google Search Appliance. Bruce is right: this is not a standard way to use Nuxeo. If you really need that, you can look at Solr/Lucene, but you will need to build the UI yourself. Or pay a lot of money for one of the solutions above.
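
To give a rough idea of what indexing files in place involves, here is a minimal sketch in plain Python. It is not Nuxeo code and not a Solr/Lucene client. It only walks a directory tree read-only and collects one metadata record per file, which is the kind of record you would then feed to whatever search index you choose. The function name `scan_in_place` and the record field names are made up for this illustration.

```python
import mimetypes
import os
from pathlib import Path
from typing import Dict, Iterator


def scan_in_place(root: str) -> Iterator[Dict[str, object]]:
    """Yield one metadata record per file under root.

    Files are only inspected with stat(); nothing is moved, copied,
    or opened for writing, so other systems using the tree are unaffected.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = Path(dirpath) / name
            info = path.stat()
            # Guess the type from the file extension only (no content sniffing).
            mime, _encoding = mimetypes.guess_type(name)
            yield {
                "path": str(path),
                "size_bytes": info.st_size,
                "modified": info.st_mtime,
                "mime_type": mime or "application/octet-stream",
            }


if __name__ == "__main__":
    # Hypothetical usage: scan the current directory and print each record.
    for record in scan_in_place("."):
        print(record)
```

Full-text extraction from the PDFs and office documents is a separate step that this sketch does not cover. For an archive of this size you would probably also want to process the records in batches rather than hold them all in memory, which is why the function is a generator.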