Hi,
I want to import several million files into Alfresco. The file names are numerical values like 1000001, 1000002, etc. The files are later accessed by file name only, so the directory in which they are stored in Alfresco is irrelevant for later access from outside Alfresco. Alfresco is accessed via CMIS.
Is there a performance difference between the following two configurations?
(1) all files are stored in one directory e.g. /base_dir/1000001, /base_dir/1000002, …
(2) the files are stored in an additional intermediate subdirectory derived from the last digit of the file name (zero-padded to two characters):
/base_dir/00/1000000, /base_dir/00/1000010, /base_dir/00/1000020, …
/base_dir/01/1000001, /base_dir/01/1000011, /base_dir/01/1000021, …
/base_dir/02/1000002, /base_dir/02/1000012, /base_dir/02/1000022, …
…
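To make the mapping in configuration (2) concrete, here is a minimal Python sketch. The function name `target_path` is hypothetical, and the zero-padding of the last digit to two characters is an assumption read off the example paths above; it only builds the path string and does not talk to Alfresco or CMIS.

```python
def target_path(file_name: str, base_dir: str = "/base_dir") -> str:
    """Build the configuration (2) path for a numeric file name.

    Assumption: the subdirectory is the last digit of the file name,
    zero-padded to two characters (e.g. "1" -> "01").
    """
    if not (file_name.isascii() and file_name.isdigit()):
        raise ValueError(f"expected a numeric file name, got {file_name!r}")
    subdir = file_name[-1].zfill(2)
    return f"{base_dir}/{subdir}/{file_name}"


if __name__ == "__main__":
    for name in ("1000000", "1000001", "1000012"):
        print(target_path(name))
```

With this scheme the files are spread over ten subdirectories, /base_dir/00 to /base_dir/09.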
Does the creation of files in an Alfresco directory (like /base_dir) work sequentially or concurrently?
If creation is sequential within a single directory, would creation in different directories (e.g. /base_dir/00 and /base_dir/01) run concurrently?
Would configuration (2) be faster than configuration (1) when filled by multiple threads?
Regards
U.Straub