Speed up import

abc_xyz
Champ in-the-making

We have developed our own importer tool based on the nuxeo-platform-importer package. For the first million documents, the average import speed is about 250 docs/sec. Beyond that, importing further documents takes more and more time, and the average speed drops to 50 docs/sec or less.

We followed all the performance-related instructions for the PostgreSQL database described here. Does anyone know of further measures to speed up the import? We have to import several million documents.

2 REPLIES

ben_
Confirmed Champ

Hi

Yes, importing a few million documents is a matter of days.

A much faster way is to generate an ad hoc SQL dump and populate the database with the PostgreSQL COPY command. This is possible if the layout of the data to import is simple.
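As a rough illustration of the dump approach, here is a minimal sketch that writes rows in PostgreSQL's COPY text format (tab-separated, `\N` for NULL, backslash escapes for tabs and newlines). The table name `docs_staging` and its columns are hypothetical placeholders, not the actual Nuxeo schema; a real import would have to target whatever tables and columns your repository uses.

```python
import io


def copy_escape(value):
    """Encode one value for the PostgreSQL COPY text format."""
    if value is None:
        return r"\N"  # COPY's default NULL marker in text format
    text = str(value)
    # Escape the backslash first so later escapes are not doubled.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def write_copy_block(out, table, columns, rows):
    """Write a COPY ... FROM stdin block that psql can replay."""
    out.write("COPY {} ({}) FROM stdin;\n".format(table, ", ".join(columns)))
    for row in rows:
        out.write("\t".join(copy_escape(v) for v in row) + "\n")
    out.write("\\.\n")  # end-of-data marker


# Hypothetical table, columns, and sample rows, for illustration only.
buf = io.StringIO()
write_copy_block(
    buf,
    "docs_staging",
    ["id", "title", "description"],
    [("1", "First\tdoc", None), ("2", "Second", "line1\nline2")],
)
dump = buf.getvalue()
```

The resulting text can be written to a file and replayed with `psql -f`, which typically loads far faster than creating documents one by one through the application layer.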

ben

bruce_Grant
Elite Collaborator

The biggest help I found for mass import tuning was batch size, that is, the number of documents created before a commit (save) is performed. Too small, and the per-transaction overhead is large. Too big, and Postgres complains (not to mention that a hiccup risks losing all the documents in the commit).
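To make the batching idea concrete, here is a minimal sketch of a commit-every-N loop. The `create` and `commit` callables are hypothetical stand-ins for whatever your importer uses to create a document and save the session; nothing here is the Nuxeo API itself.

```python
def import_in_batches(docs, create, commit, batch_size):
    """Create documents and commit once per batch_size documents.

    Returns the number of commits performed.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pending = 0
    commits = 0
    for doc in docs:
        create(doc)
        pending += 1
        if pending >= batch_size:
            commit()
            commits += 1
            pending = 0
    if pending:
        # Flush the final partial batch so no documents are left uncommitted.
        commit()
        commits += 1
    return commits


# Stand-in callables that only record what happened.
created = []
commit_log = []
commit_count = import_in_batches(
    range(10),
    created.append,
    lambda: commit_log.append(len(created)),
    batch_size=4,
)
```

With a helper like this, the batch size becomes a single parameter, so you can time a few runs at different sizes and keep the one that gives the best throughput without upsetting the database.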
