Question about upload Issue in Alfresco 6.2 Content Services

ujkim
Confirmed Champ

We installed Alfresco 6.2.0-ga (via the acs-deployment Docker setup).

We are currently running the Community edition with the following resources:

alfresco 12G, share 12G, Tika 4G, solr 4G, proxy 128MB.

We're having an issue with uploads.

When uploading a file of 200MB or more, a 504 Gateway Timeout occurs.

Despite the error, the file is uploaded.

Why am I getting this error? Should I check at the proxy level?

The proxy currently uses the alfresco/acs-community-nginx:1.0.0 docker image.

The related logs are as follows.

[Image: screen where the upload issue occurred]

alfresco_1               | 2020-10-28T02:45:11.806311828Z 2020-10-28 02:45:11,805  WARN  [content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-26] Metadata extraction failed (turn on DEBUG for full error):    Extracter: org.alfresco.repo.content.metadata.PoiMetadataExtracter@6dca4305   Content:   ContentAccessor[ contentUrl=store://2020/10/28/2/44/8d00f12b-53c1-4bbe-831e-8127039d9abf.bin, mimetype=application/vnd.openxmlformats-officedocument.wordprocessingml.document, size=301445085, encoding=UTF-8, locale=ko_KR]   Failure:   nullnull

Can I get any advice on this?

(For reference, I found a comment on a similar issue in JIRA: "504 Gateway Timeout Error". Could this have been addressed in the latest version?)

I would like to add that Alfresco is a good service.

Thanks

12 REPLIES

abhinavmishra14
World-Class Innovator

Can you attach the full log? Also, enable the following debug logging, try uploading the file again, and share the full log.

log4j.logger.org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter=DEBUG

or

log4j.logger.org.alfresco.repo.content.metadata=DEBUG

Yes, a browser- or proxy-level timeout could be involved as well. First check and confirm what is happening on the repository side. Depending on what you find there, you can adjust the timeouts on the proxy side.

You could try updating the timeout settings in the nginx config.

Example:

http{
   proxy_read_timeout 1000;
   proxy_connect_timeout 1000;
   proxy_send_timeout 1000;
}

Or, specific to the upload API call:

location /api/upload {
   proxy_read_timeout 1000;
   proxy_connect_timeout 1000;
   proxy_send_timeout 1000; 
}

Depending on the type of file, there could be limits. For example, PDF has the following limits by default, which can be changed via alfresco-global.properties:

The limits configured for Alfresco Content Services are:

Timeout configured for all extractors and all mimetypes
content.metadataExtracter.default.timeoutMs=20000

Maximum size of a document to process (configured for PdfBoxMetadataExtracter, PDF files)
content.metadataExtracter.pdf.maxDocumentSizeMB=10

Maximum number of concurrent extractions (configured for PdfBoxMetadataExtracter, PDF files)
content.metadataExtracter.pdf.maxConcurrentExtractionsCount=5
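
As a rough illustration of how a local override relates to the shipped defaults listed above, here is a minimal Java sketch. It is not Alfresco's property-loading code; the class name, the method name, and the 60000 override value are hypothetical and exist only for this example.

```java
import java.util.Properties;

final class PropertyOverrideSketch {

    /** Returns a view where local overrides win and shipped defaults fill the gaps. */
    static Properties effective(Properties shippedDefaults, Properties localOverrides) {
        // Properties(defaults): getProperty falls back to the defaults when a key is absent.
        Properties merged = new Properties(shippedDefaults);
        for (String key : localOverrides.stringPropertyNames()) {
            merged.setProperty(key, localOverrides.getProperty(key));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Default values as quoted in this thread.
        Properties defaults = new Properties();
        defaults.setProperty("content.metadataExtracter.default.timeoutMs", "20000");
        defaults.setProperty("content.metadataExtracter.pdf.maxDocumentSizeMB", "10");

        // Hypothetical local override, standing in for an entry in alfresco-global.properties.
        Properties overrides = new Properties();
        overrides.setProperty("content.metadataExtracter.default.timeoutMs", "60000");

        Properties result = effective(defaults, overrides);
        System.out.println(result.getProperty("content.metadataExtracter.default.timeoutMs"));     // 60000
        System.out.println(result.getProperty("content.metadataExtracter.pdf.maxDocumentSizeMB")); // 10
    }
}
```

Only the key that is overridden changes; the other limits keep their default values.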
~Abhinav
(ACSCE, AWS SAA, Azure Admin)

Thanks for your response!

I tried the upload again, and the full debug log is below. (This time the 504 Gateway Timeout did not appear in the UI, but a timeout does appear in the debug log. Can you tell what the issue is from this?)

2020-10-29 02:54:21,152 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] Starting metadata extraction:    reader: ContentAccessor[ contentUrl=store://2020/10/29/2/53/5b2f49f9-7b8c-42f0-8fb9-9ca6adee161b.bin, mimetype=application/vnd.openxmlformats-officedocument.wordprocessingml.document, size=301445085, encoding=UTF-8, locale=ko_KR]   extracter: org.alfresco.repo.content.metadata.PoiMetadataExtracter@25a51c06
2020-10-29 02:54:21,154 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] Concurrent extractions : 0
2020-10-29 02:54:21,155 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] New extraction accepted. Concurrent extractions : 1
2020-10-29 02:54:41,207 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] Extraction finalized. Remaining concurrent extraction : 0
2020-10-29 02:54:41,950 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] Metadata extraction failed:    Extracter: org.alfresco.repo.content.metadata.PoiMetadataExtracter@25a51c06   Content:   ContentAccessor[ contentUrl=store://2020/10/29/2/53/5b2f49f9-7b8c-42f0-8fb9-9ca6adee161b.bin, mimetype=application/vnd.openxmlformats-officedocument.wordprocessingml.document, size=301445085, encoding=UTF-8, locale=ko_KR]null
java.util.concurrent.TimeoutException
        at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:204)
        at org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter.extractRaw(AbstractMappingMetadataExtracter.java:2093)
        at org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter.extract(AbstractMappingMetadataExtracter.java:1185)
        at org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter.extract(AbstractMappingMetadataExtracter.java:1135)
        at org.alfresco.repo.action.executer.ContentMetadataExtracter.executeImpl(ContentMetadataExtracter.java:374)
        at org.alfresco.repo.action.executer.ActionExecuterAbstractBase.execute(ActionExecuterAbstractBase.java:273)
        at org.alfresco.repo.action.ActionServiceImpl.directActionExecution(ActionServiceImpl.java:856)
        at org.alfresco.repo.action.ActionServiceImpl.executeActionImpl(ActionServiceImpl.java:757)
        at org.alfresco.repo.action.ActionServiceImpl.executeAction(ActionServiceImpl.java:581)
        at org.alfresco.repo.action.ActionServiceImpl.executeAction(ActionServiceImpl.java:567)
        at org.alfresco.repo.action.ActionServiceImpl.executeAction(ActionServiceImpl.java:865)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.alfresco.repo.security.permissions.impl.AlwaysProceedMethodInterceptor.invoke(AlwaysProceedMethodInterceptor.java:41)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.alfresco.repo.security.permissions.impl.ExceptionTranslatorMethodInterceptor.invoke(ExceptionTranslatorMethodInterceptor.java:53)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.alfresco.repo.audit.AuditMethodInterceptor.invoke(AuditMethodInterceptor.java:166)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:295)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
        at com.sun.proxy.$Proxy54.executeAction(Unknown Source)
        at org.alfresco.repo.jscript.ScriptAction.executeImpl(ScriptAction.java:173)
        at org.alfresco.repo.jscript.ScriptAction$1.execute(ScriptAction.java:199)
        at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:450)
        at org.alfresco.repo.jscript.ScriptAction.execute(ScriptAction.java:203)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.mozilla.javascript.MemberBox.invoke(MemberBox.java:138)
        at org.mozilla.javascript.NativeJavaMethod.call(NativeJavaMethod.java:226)
        at org.mozilla.javascript.optimizer.OptRuntime.callN(OptRuntime.java:65)
        at org.mozilla.javascript.gen.classpath__alfresco_templates_webscripts_org_alfresco_repository_upload_upload_post_js_13._c_extractMetadata_1(classpath*:alfresco/templates/webscripts/org/alfresco/repository/upload/upload.post.js:9)
        at org.mozilla.javascript.gen.classpath__alfresco_templates_webscripts_org_alfresco_repository_upload_upload_post_js_13.call(classpath*:alfresco/templates/webscripts/org/alfresco/repository/upload/upload.post.js)
        at org.mozilla.javascript.optimizer.OptRuntime.callName(OptRuntime.java:76)
        at org.mozilla.javascript.gen.classpath__alfresco_templates_webscripts_org_alfresco_repository_upload_upload_post_js_13._c_main_6(classpath*:alfresco/templates/webscripts/org/alfresco/repository/upload/upload.post.js:477)
        at org.mozilla.javascript.gen.classpath__alfresco_templates_webscripts_org_alfresco_repository_upload_upload_post_js_13.call(classpath*:alfresco/templates/webscripts/org/alfresco/repository/upload/upload.post.js)
        at org.mozilla.javascript.optimizer.OptRuntime.callName0(OptRuntime.java:87)
        at org.mozilla.javascript.gen.classpath__alfresco_templates_webscripts_org_alfresco_repository_upload_upload_post_js_13._c_script_0(classpath*:alfresco/templates/webscripts/org/alfresco/repository/upload/upload.post.js:523)
        at org.mozilla.javascript.gen.classpath__alfresco_templates_webscripts_org_alfresco_repository_upload_upload_post_js_13.call(classpath*:alfresco/templates/webscripts/org/alfresco/repository/upload/upload.post.js)
        at org.mozilla.javascript.ContextFactory.doTopCall(ContextFactory.java:409)
        at org.mozilla.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3566)
        at org.mozilla.javascript.gen.classpath__alfresco_templates_webscripts_org_alfresco_repository_upload_upload_post_js_13.call(classpath*:alfresco/templates/webscripts/org/alfresco/repository/upload/upload.post.js)
        at org.mozilla.javascript.gen.classpath__alfresco_templates_webscripts_org_alfresco_repository_upload_upload_post_js_13.exec(classpath*:alfresco/templates/webscripts/org/alfresco/repository/upload/upload.post.js)
        at org.alfresco.repo.jscript.RhinoScriptProcessor.executeScriptImpl(RhinoScriptProcessor.java:509)
        at org.alfresco.repo.jscript.RhinoScriptProcessor.execute(RhinoScriptProcessor.java:207)
        at org.alfresco.repo.processor.ScriptServiceImpl.execute(ScriptServiceImpl.java:219)
        at org.alfresco.repo.processor.ScriptServiceImpl.executeScript(ScriptServiceImpl.java:181)
        at org.alfresco.repo.web.scripts.RepositoryScriptProcessor.executeScript(RepositoryScriptProcessor.java:109)
        at org.springframework.extensions.webscripts.AbstractWebScript.executeScript(AbstractWebScript.java:1376)
        at org.springframework.extensions.webscripts.DeclarativeWebScript.execute(DeclarativeWebScript.java:86)
        at org.alfresco.repo.web.scripts.RepositoryContainer$3.execute(RepositoryContainer.java:527)
        at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:450)
        at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecute(RepositoryContainer.java:595)
        at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecuteAs(RepositoryContainer.java:664)
        at org.alfresco.repo.web.scripts.RepositoryContainer.executeScriptInternal(RepositoryContainer.java:435)
        at org.alfresco.repo.web.scripts.RepositoryContainer.executeScript(RepositoryContainer.java:315)
        at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:399)
        at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:210)
        at org.springframework.extensions.webscripts.servlet.WebScriptServlet.service(WebScriptServlet.java:132)
        at org.alfresco.repo.web.scripts.AlfrescoWebScriptServlet.service(AlfrescoWebScriptServlet.java:43)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
        at jdk.internal.reflect.GeneratedMethodAccessor437.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:282)
        at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:279)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/javax.security.auth.Subject.doAsPrivileged(Subject.java:550)
        at org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:314)
        at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:170)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:225)
        at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:47)
        at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:149)
        at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:145)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
        at jdk.internal.reflect.GeneratedMethodAccessor433.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:282)
        at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:279)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/javax.security.auth.Subject.doAsPrivileged(Subject.java:550)
        at org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:314)
        at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:253)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:191)
        at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:47)
        at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:149)
        at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:145)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)
        at org.alfresco.module.aosmodule.service.ContextRootFilter.doFilter(ContextRootFilter.java:93)
        at jdk.internal.reflect.GeneratedMethodAccessor433.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:282)
        at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:279)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/javax.security.auth.Subject.doAsPrivileged(Subject.java:550)
        at org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:314)
        at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:253)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:191)
        at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:47)
        at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:149)
        at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:145)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)
        at org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(GlobalLocalizationFilter.java:68)
        at jdk.internal.reflect.GeneratedMethodAccessor433.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:282)
        at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:279)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/javax.security.auth.Subject.doAsPrivileged(Subject.java:550)
        at org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:314)
        at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:253)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:191)
        at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:47)
        at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:149)
        at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:145)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)
        at org.alfresco.web.app.servlet.ClearSecurityContextFilter.doFilter(ClearSecurityContextFilter.java:53)
        at jdk.internal.reflect.GeneratedMethodAccessor433.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:282)
        at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:279)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/javax.security.auth.Subject.doAsPrivileged(Subject.java:550)
        at org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:314)
        at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:253)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:191)
        at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:47)
        at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:149)
        at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:145)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:199)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:137)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
        at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:660)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:798)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:808)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.base/java.lang.Thread.run(Thread.java:834)
2020-10-29 02:54:41,955 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] Completed metadata extraction:    reader:    ContentAccessor[ contentUrl=store://2020/10/29/2/53/5b2f49f9-7b8c-42f0-8fb9-9ca6adee161b.bin, mimetype=application/vnd.openxmlformats-officedocument.wordprocessingml.document, size=301445085, encoding=UTF-8, locale=ko_KR]   extracter: org.alfresco.repo.content.metadata.PoiMetadataExtracter@25a51c06   changed:   {}

Also, I am using the ACS docker-compose deployment, and I would like to change the nginx proxy settings as you recommended.

In that case, can I get any advice on how to change the nginx settings?

This is the compose file I'm using now: (https://github.com/Alfresco/acs-community-deployment/blob/3.0.1/docker-compose/docker-compose.yml)

Thanks for your help!

abhinavmishra14
World-Class Innovator

2020-10-29 02:54:21,152 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] Starting metadata extraction:    reader: ContentAccessor[ contentUrl=store://2020/10/29/2/53/5b2f49f9-7b8c-42f0-8fb9-9ca6adee161b.bin, mimetype=application/vnd.openxmlformats-officedocument.wordprocessingml.document, size=301445085, encoding=UTF-8, locale=ko_KR]   extracter: org.alfresco.repo.content.metadata.PoiMetadataExtracter@25a51c06
2020-10-29 02:54:21,154 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] Concurrent extractions : 0
2020-10-29 02:54:21,155 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] New extraction accepted. Concurrent extractions : 1
2020-10-29 02:54:41,207 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] Extraction finalized. Remaining concurrent extraction : 0
2020-10-29 02:54:41,950 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] [http-nio-8080-exec-9] Metadata extraction failed:    Extracter: org.alfresco.repo.content.metadata.PoiMetadataExtracter@25a51c06   Content:   ContentAccessor[ contentUrl=store://2020/10/29/2/53/5b2f49f9-7b8c-42f0-8fb9-9ca6adee161b.bin, mimetype=application/vnd.openxmlformats-officedocument.wordprocessingml.document, size=301445085, encoding=UTF-8, locale=ko_KR]null
java.util.concurrent.TimeoutException
        at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:204)
        at org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter.extractRaw(AbstractMappingMetadataExtracter.java:2093)
        at org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter.extract(AbstractMappingMetadataExtracter.java:1185)
        at org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter.extract(AbstractMappingMetadataExtracter.java:1135)
        at org.alfresco.repo.action.executer.ContentMetadataExtracter.executeImpl(ContentMetadataExtracter.java:374)
       .......

It seems to be timing out during metadata extraction. The default timeout for metadata extractors in general is 20000 ms. As the error log suggests, the operation is timing out in the repository.
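
The timestamps in the quoted log fit that default: the extraction was accepted at 02:54:21,155 and finalized at 02:54:41,207. Here is a small Java sketch to compute the gap between two such log timestamps (the class and method names are made up for illustration):

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class ExtractionElapsed {

    // Timestamp layout used by the repository log lines quoted in this thread.
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Milliseconds between two log timestamps. */
    static long elapsedMillis(String start, String end) {
        LocalDateTime from = LocalDateTime.parse(start, LOG_TS);
        LocalDateTime to = LocalDateTime.parse(end, LOG_TS);
        return Duration.between(from, to).toMillis();
    }

    public static void main(String[] args) {
        long ms = elapsedMillis("2020-10-29 02:54:21,155", "2020-10-29 02:54:41,207");
        System.out.println(ms + " ms"); // prints "20052 ms"
    }
}
```

That is just over 20000 ms, which is what you would expect if the extractor was stopped by the default timeout.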

See here:

https://github.com/Alfresco/alfresco-community-repo/blob/master/repository/src/main/resources/alfres...

# The default timeout for metadata mapping extracters
content.metadataExtracter.default.timeoutMs=20000

There are limits on timeout and size too: 

See here, the size limit check: https://github.com/Alfresco/alfresco-community-repo/blob/master/repository/src/main/java/org/alfresc...

Timeout check limit setting: https://github.com/Alfresco/alfresco-community-repo/blob/master/repository/src/main/java/org/alfresc...

Timeout check and error : https://github.com/Alfresco/alfresco-community-repo/blob/master/repository/src/main/java/org/alfresc...
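
The stack trace shows the TimeoutException coming from FutureTask.get, which is the standard Java way to wait for a task with a time limit. Below is a minimal, self-contained sketch of that general pattern. It is not the Alfresco implementation; the class name, method name, and the sleep that stands in for parsing a large document are all assumptions made for the example.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ExtractionTimeoutSketch {

    /** Runs a stand-in "extraction" lasting workMillis, waiting at most timeoutMillis for it. */
    static String extractWithTimeout(long workMillis, long timeoutMillis) {
        FutureTask<String> task = new FutureTask<>(() -> {
            Thread.sleep(workMillis); // stand-in for parsing a large document
            return "metadata";
        });
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            executor.execute(task);
            // Throws TimeoutException if the task has not finished in time.
            return task.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            task.cancel(true); // interrupt the worker so it does not keep running
            return "timed out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } catch (ExecutionException e) {
            return "failed: " + e.getCause();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(extractWithTimeout(50, 1000)); // finishes within the limit
        System.out.println(extractWithTimeout(1000, 50)); // exceeds the limit
    }
}
```

In this sketch the caller simply gets no metadata when the limit is exceeded, which matches the "changed: {}" line at the end of the debug log: the upload itself completes, only the extraction is abandoned.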

Try increasing the value, retry the upload, and see if it works for you.

Also, do you really need metadata extraction for these types of documents, considering their size? If you don't need the metadata extracted, simply disable the extractors.

#Disable metadata extraction
extracter.Poi.enabled=false
extracter.TikaAuto.enabled=false
extracter.PDFBox.enabled=false
extracter.Office.doc.enabled=false
extracter.Office.xls.enabled=false
extracter.Office.ppt.enabled=false


~Abhinav
(ACSCE, AWS SAA, Azure Admin)

Thank you for your detailed explanation!

The default value is 20000 ms; is there a recommended time limit? (Where is the repository.properties file in Alfresco?)

I don't think increasing it is the solution. I will consider excluding unnecessary items from metadata extraction.

Could you tell me what the metadata extraction entries mean? Also, where is the config file located?

Could the metadata extraction information be related to the Solr search engine? I want to keep Alfresco's search function working.


There seems to be no relevant information here. (https://docs.alfresco.com/6.0/references/dev-extension-points-custom-metadata-extractor.html)

I think your advice is right for the Tika crypto issue. Do you know why an error occurs when uploading an encrypted file? (What are the criteria for that error? Does it always happen when an encrypted file is uploaded?)

Thanks for the detailed explanation!!

abhinavmishra14
World-Class Innovator

The default value is 20000 ms; is there a recommended time limit? (Where is the repository.properties file in Alfresco?)

Could you tell me what the metadata extraction entries mean? Also, where is the config file located?

Could the metadata extraction information be related to the Solr search engine? I want to keep Alfresco's search function working.

I think your advice is right for the Tika crypto issue. Do you know why an error occurs when uploading an encrypted file? (What are the criteria for that error? Does it always happen when an encrypted file is uploaded?)

-> There is no hard and fast rule for timeouts; however, the default is a balanced limit that keeps the upload process from taking too long to complete. You can adjust it by testing, depending on your use case.

-> The link you shared has enough details about metadata extractors. These are backend processes that extract any or selected metadata from incoming documents and map it to content model properties, which could be custom properties as well. In the log you will see what metadata is extracted and what metadata is applied to your document after a successful extraction.

Example log:

Mapped and Accepted: {
   {http://www.alfresco.org/model/content/1.0}title={en_GB=SomeSubject}, 
   {http://www.alfresco.org/model/content/1.0}author=Xyz}

All out-of-the-box metadata extractor properties can be found here for reference:

https://github.com/Alfresco/alfresco-community-repo/tree/master/repository/src/main/resources/alfres...

All default system properties including basic metadata extractor configs can be found here: https://github.com/Alfresco/alfresco-community-repo/blob/master/repository/src/main/resources/alfres...

-> No, there is no effect on the search side at all. Metadata extraction extracts common properties from the file, such as author, and sets the corresponding content model property accordingly.

-> When accessing encrypted files, you are likely to see this type of error, and reading encrypted documents via extractors may not be allowed. I have not worked with encrypted documents and their metadata extraction so far, so I can't suggest anything at the moment. Someone else might be able to help.

~Abhinav
(ACSCE, AWS SAA, Azure Admin)

Sorry for the late reply; I had a scheduling conflict.

Thank you so much for your detailed response!!

There is one thing I'm curious about.

Where is the repository.properties file mentioned here?

I looked in the alfresco-content-repository-community:6.2.0-ga docker image but couldn't find it.
I want to apply what you told me about the metadata extraction configuration. How can I apply it?

All of the other answers have been very helpful. Thank you @abhinavmishra14 ! 

abhinavmishra14
World-Class Innovator

Where is the repository.properties file mentioned here?


The repository.properties file is part of the alfresco-repository-xxx.jar file. The link takes you to the source code. If you want to look at the file, find alfresco-repository-xxx.jar on your classpath and extract it. You should be able to see the properties file.
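
If you would rather read a value without unpacking the whole jar, a small Java sketch like the one below can load a properties file straight from a jar. The class name is made up, and both the jar path and the entry name inside the jar are left as arguments because they depend on your installation (you can list the entries with "jar tf <jar>" to find the right one). Treat it as an illustration, not a supported tool.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

final class JarPropertiesReader {

    /** Loads a .properties entry from inside a jar file. */
    static Properties load(String jarPath, String entryName) {
        try (JarFile jar = new JarFile(jarPath)) {
            JarEntry entry = jar.getJarEntry(entryName);
            if (entry == null) {
                throw new IllegalArgumentException(entryName + " not found in " + jarPath);
            }
            try (InputStream in = jar.getInputStream(entry)) {
                Properties props = new Properties();
                props.load(in);
                return props;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: JarPropertiesReader <jar-path> <entry-name>");
            return;
        }
        Properties props = load(args[0], args[1]);
        // Property name as quoted earlier in this thread.
        System.out.println(props.getProperty("content.metadataExtracter.default.timeoutMs"));
    }
}
```

Reading the shipped file this way is only for looking up defaults; any changes belong in your own override (for example alfresco-global.properties), not in the jar.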

~Abhinav
(ACSCE, AWS SAA, Azure Admin)

I understand what you said now!

I'll look into a custom timeout limit and the unnecessary fields in metadata extraction!

Thanks for explaining the parts I didn't know :)

Have a great day!  Thanks @abhinavmishra14 !

abhinavmishra14
World-Class Innovator

Glad to hear that @ujkim 

~Abhinav
(ACSCE, AWS SAA, Azure Admin)