<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich) in Alfresco Forum</title>
    <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55561#M20294</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;To make this work on Alfresco/Share CE 6.1.2-ga/6.1.0,&amp;nbsp;I&amp;nbsp;created a shared volume between the alfresco and ocrmypdf containers. I replaced&lt;EM&gt;&amp;nbsp;/ocr_input&lt;/EM&gt; and&amp;nbsp;&lt;EM&gt;/ocr_output&lt;/EM&gt; with a single directory,&lt;EM&gt; /ocr&lt;/EM&gt;, and mapped it as a volume in both containers.&amp;nbsp;&lt;/P&gt;&lt;P&gt;The only problem is that&amp;nbsp;asynchronous mode for the rule gives me an error, so I turned it off.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks, Angel!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;docker-compose.yml&lt;/EM&gt;&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;&lt;/P&gt;&lt;PRE class="" style="color: #000000; background: #f5f2f0; border: 0px; margin: 0.5em 0px; padding: 1em 1em 1em 3.8em;"&gt;&lt;CODE style="border: 0px; font-weight: inherit;"&gt;...&lt;BR /&gt;services:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;alfresco:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;volumes:&lt;BR /&gt; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;- ocr:/ocr&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;ocrmypdf:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;volumes:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;- ocr:/ocr&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;volumes:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;ocr:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;driver: local&lt;BR /&gt;...&lt;/CODE&gt;&lt;/PRE&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 
0px;"&gt;&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;&lt;EM style="border: 0px; font-weight: inherit;"&gt;bin/ocrmypdf.sh&lt;/EM&gt;&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;(and remove {} from $OUTPUT_FILE_PARAM in copy output file command)&amp;nbsp;&lt;/P&gt;&lt;PRE class="" style="color: #000000; background: #f5f2f0; border: 0px; margin: 0.5em 0px; padding: 1em 1em 1em 3.8em;"&gt;&lt;CODE style="border: 0px; font-weight: inherit;"&gt;#!/bin/bash&lt;BR /&gt;&lt;BR /&gt;INPUT_DIR=&lt;STRONG style="color: #ff0000; "&gt;/ocr&lt;/STRONG&gt;&lt;BR /&gt;OUTPUT_DIR=&lt;STRONG style="color: #ff0000; "&gt;/ocr&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;# ocrmypdf hostname&lt;BR /&gt;OCRMYPDF_SERVER="ocrmypdf"&lt;BR /&gt;&lt;BR /&gt;# identify parameters, input and output file&lt;BR /&gt;array=( "$@" )&lt;BR /&gt;len=${#array[@]}&lt;BR /&gt;ARGS=${array[@]:0:$len-2}&lt;BR /&gt;&lt;BR /&gt;LAST_ARGS="${@: -2}"&lt;BR /&gt;INPUT_FILE_PARAM=`echo "$LAST_ARGS" | cut -d ' ' -f 1`&lt;BR /&gt;OUTPUT_FILE_PARAM=`echo "$LAST_ARGS" | cut -d ' ' -f 2`&lt;BR /&gt;&lt;BR /&gt;# extract filenames&lt;BR /&gt;INPUT_FILE=$(basename "$INPUT_FILE_PARAM")&lt;BR /&gt;OUTPUT_FILE=$(basename "$OUTPUT_FILE_PARAM")&lt;BR /&gt;&lt;BR /&gt;# SSH parameters&lt;BR /&gt;SCP=cp&lt;BR /&gt;SSH=ssh&lt;BR /&gt;USER=root&lt;BR /&gt;&lt;BR /&gt;# copy original pdf to ocrmypdf server&lt;BR /&gt;$SCP $INPUT_FILE_PARAM $INPUT_DIR&lt;BR /&gt;&lt;BR /&gt;# execute ocrmypdf program&lt;BR /&gt;$SSH $USER@$OCRMYPDF_SERVER "/usr/bin/ocr.sh $ARGS $INPUT_DIR/$INPUT_FILE $OUTPUT_DIR/$OUTPUT_FILE"&lt;BR /&gt;&lt;BR /&gt;# copy transformed pdf back to alfresco path&lt;BR /&gt;$SCP $OUTPUT_DIR/$OUTPUT_FILE &lt;STRONG style="color: #ff0000; "&gt;$OUTPUT_FILE_PARAM&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;# remove temporal files&lt;BR /&gt;rm -f $INPUT_DIR/$INPUT_FILE $OUTPUT_DIR/$OUTPUT_FILE&lt;SPAN class="" style="border-width: 0px 1px 0px 0px; 
border-style: initial solid initial initial; border-color: initial #999999 initial initial; font-weight: inherit;"&gt;&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 02 Jul 2019 13:20:07 GMT</pubDate>
    <dc:creator>fedorow</dc:creator>
    <dc:date>2019-07-02T13:20:07Z</dc:date>
    <item>
      <title>"OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55547#M20280</link>
      <description>Hello, I'm using Alfresco 5.2 Community Edition on CentOS 7.5, and it works well by itself. Now I'm trying to add OCR to Alfresco, so I installed alfresco-simple-ocr (simple-ocr-repo-2.3.1.jar) and pdfsandwich. When I install pdfsandwich version 1.4, the rule-triggered "Extract OCR" action does work</description>
      <pubDate>Thu, 16 Aug 2018 02:51:03 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55547#M20280</guid>
      <dc:creator>hisayo-s</dc:creator>
      <dc:date>2018-08-16T02:51:03Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55548#M20281</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Check the following link &lt;A class="link-titled" href="https://github.com/keensoft/alfresco-simple-ocr/wiki/FAQ#ive-installed-alfresco-in-linux-by-using-the-wizard-and-ocr-software-does-not-work-properly" title="https://github.com/keensoft/alfresco-simple-ocr/wiki/FAQ#ive-installed-alfresco-in-linux-by-using-the-wizard-and-ocr-software-does-not-work-properly" rel="nofollow noopener noreferrer"&gt;FAQ · keensoft/alfresco-simple-ocr Wiki · GitHub&lt;/A&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Maybe that can help you.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 17 Aug 2018 17:30:42 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55548#M20281</guid>
      <dc:creator>douglascrp</dc:creator>
      <dc:date>2018-08-17T17:30:42Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55549#M20282</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you for the information.&lt;/P&gt;&lt;P&gt;I hadn't read the FAQ page, so I will read it carefully and try to fix my environment.&lt;/P&gt;&lt;P&gt;I hope for a good result.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 21 Aug 2018 00:33:55 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55549#M20282</guid>
      <dc:creator>hisayo-s</dc:creator>
      <dc:date>2018-08-21T00:33:55Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55550#M20283</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My problem has been partly solved.&lt;/P&gt;&lt;P&gt;After reading the FAQ, I installed&amp;nbsp;2 jar files (simple-ocr-repo-2.3.1.jar and&amp;nbsp;simple-ocr-share-2.3.1.jar) instead of&amp;nbsp;simple-ocr-repo.amp. (I had been using the amp file.)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;After restarting Alfresco, the "Extract PDF action" sometimes works and sometimes doesn't.&lt;/P&gt;&lt;P&gt;When the action (conversion) succeeds, the "tesseract", ".convers.b+" and "unpaper" processes are visible in "top".&lt;/P&gt;&lt;P&gt;When the action (conversion) fails,&amp;nbsp;those processes appear briefly and then disappear.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The failures do not seem to be related to&amp;nbsp;file size or number of pages.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have no idea how to solve this problem.&lt;/P&gt;&lt;P&gt;If anyone knows the solution, please let me know!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;------&lt;/P&gt;&lt;P&gt;- CentOS 7.5&lt;/P&gt;&lt;P&gt;- Alfresco 5.2 - community edition&lt;/P&gt;&lt;P&gt;- alfresco-simple-ocr 2.3.1&lt;/P&gt;&lt;P&gt;- pdfsandwich 1.6 (*1)&lt;/P&gt;&lt;P&gt;-&amp;nbsp;tesseract 3.04&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;(*1)&lt;/P&gt;&lt;P&gt;When I tested version 1.7 again, pdfsandwich printed the following message, as before.&lt;/P&gt;&lt;P&gt;&amp;gt; "Fatal error: exception Unix.Unix_error(Unix.ENOTEMPTY, "rmdir", "/tmp/pdfsandwich_tmp2d3ca3")"&lt;/P&gt;&lt;P&gt;For that reason, I use version 1.7 with the "-debug" option to avoid the error. (Temp files have to be erased manually...)&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 21 Aug 2018 05:28:45 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55550#M20283</guid>
      <dc:creator>hisayo-s</dc:creator>
      <dc:date>2018-08-21T05:28:45Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55551#M20284</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Try using OCRmyPDF (&lt;A href="https://github.com/jbarlow83/OCRmyPDF" rel="nofollow noopener noreferrer"&gt;https://github.com/jbarlow83/OCRmyPDF&lt;/A&gt;) instead of pdfsandwich.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Both pdfsandwich and OCRmyPDF have some issues on CentOS (they are developed for Ubuntu), but you can use the Docker Image for OCRmyPDF available at&amp;nbsp;&lt;A href="https://ocrmypdf.readthedocs.io/en/latest/installation.html#installing-the-docker-image" rel="nofollow noopener noreferrer"&gt;https://ocrmypdf.readthedocs.io/en/latest/installation.html#installing-the-docker-image&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 22 Aug 2018 10:04:28 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55551#M20284</guid>
      <dc:creator>angelborroy</dc:creator>
      <dc:date>2018-08-22T10:04:28Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55552#M20285</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you for your suggestion.&lt;/P&gt;&lt;P&gt;Unfortunately I'm not familiar with Docker.&lt;/P&gt;&lt;P&gt;I tried to install OCRmyPDF, but I couldn't.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So, I'm going to keep trying to get pdfsandwich working.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 23 Aug 2018 01:48:16 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55552#M20285</guid>
      <dc:creator>hisayo-s</dc:creator>
      <dc:date>2018-08-23T01:48:16Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55553#M20286</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P class=""&gt;I don’t know if this still works, as I haven’t tested it recently, but you can find a reference for installing pdfsandwich on CentOS 7 at&amp;nbsp;&lt;A href="https://github.com/keensoft/alfresco-simple-ocr/blob/master/docker/pdfsandwich-1.6-centos-7/Dockerfile" rel="nofollow noopener noreferrer"&gt;https://github.com/keensoft/alfresco-simple-ocr/blob/master/docker/pdfsandwich-1.6-centos-7/Dockerfile&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 23 Aug 2018 05:23:24 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55553#M20286</guid>
      <dc:creator>angelborroy</dc:creator>
      <dc:date>2018-08-23T05:23:24Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55554#M20287</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I tried to install "pdfsandwich" and "OCRmyPDF" (with Docker), but I couldn't set them up properly.&lt;/P&gt;&lt;P&gt;It is a pity, but I'm giving up.&lt;/P&gt;&lt;P&gt;Thank you very much for your suggestions and information.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 28 Aug 2018 07:07:20 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55554#M20287</guid>
      <dc:creator>hisayo-s</dc:creator>
      <dc:date>2018-08-28T07:07:20Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55555#M20288</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you for the information.&lt;/P&gt;&lt;P&gt;I tried to install "OCRmyPDF" using Docker and partly succeeded.&lt;/P&gt;&lt;P&gt;On the command line, conversion works when I run it from the directory that contains the input file.&lt;/P&gt;&lt;P&gt;From any other directory, however, it does not work.&lt;/P&gt;&lt;P&gt;&amp;gt; ERROR - File not found - /home/hisayo-s/AAAAA.pdf&lt;/P&gt;&lt;P&gt;I'm giving up on this attempt.&lt;/P&gt;&lt;P&gt;Thanks a lot.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 28 Aug 2018 07:53:25 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55555#M20288</guid>
      <dc:creator>hisayo-s</dc:creator>
      <dc:date>2018-08-28T07:53:25Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55556#M20289</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I'm currently using Docker Compose as base for my installations, so I only can give you some tips on how to configure the whole thing with Docker.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;OCRmyPDF Dockerfile&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;FROM jbarlow83/ocrmypdf:v7.0.0&lt;BR /&gt;USER root&lt;BR /&gt;&lt;BR /&gt;RUN apt-get update &amp;amp;&amp;amp; apt-get install -y openssh-server&lt;BR /&gt;RUN mkdir /var/run/sshd&lt;BR /&gt;RUN echo 'root:screencast' | chpasswd&lt;BR /&gt;RUN sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config&lt;BR /&gt;&lt;BR /&gt;# SSH login fix. Otherwise user is kicked off after login&lt;BR /&gt;RUN sed 's@session\s*required\s*pam_loginuid.so@session optional pam_loginuid.so@g' -i /etc/pam.d/sshd&lt;BR /&gt;&lt;BR /&gt;ENV NOTVISIBLE "in users profile"&lt;BR /&gt;RUN echo "export VISIBLE=now" &amp;gt;&amp;gt; /etc/profile&lt;BR /&gt;&lt;BR /&gt;COPY assets/ssh/id_rsa.pub /root/.ssh/id_rsa.pub&lt;BR /&gt;COPY assets/ocr.sh /usr/bin/ocr.sh&lt;BR /&gt;RUN cat /root/.ssh/id_rsa.pub &amp;gt;&amp;gt; /root/.ssh/authorized_keys \&lt;BR /&gt; &amp;amp;&amp;amp; chmod 0600 /root/.ssh/authorized_keys \&lt;BR /&gt; &amp;amp;&amp;amp; chmod +x /usr/bin/ocr.sh&lt;BR /&gt;&lt;BR /&gt;EXPOSE 22&lt;BR /&gt;ENTRYPOINT ["/usr/sbin/sshd", "-D"]&lt;SPAN 
class="line-numbers-rows"&gt;&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;assets/ocr.sh&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;#!/bin/bash&lt;BR /&gt;&lt;BR /&gt;export LC_ALL=C.UTF-8&lt;BR /&gt;export LANG=C.UTF-8&lt;BR /&gt;&lt;BR /&gt;/usr/bin/ocrmypdf $@&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Alfresco Dockerfile&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;FROM alfresco/alfresco-content-repository-community:6.0.7-ga&lt;BR /&gt;&lt;BR /&gt;ENV LC_ALL C.UTF-8&lt;BR /&gt;ENV LANG C.UTF-8&lt;BR /&gt;&lt;BR /&gt;# Extra software&lt;BR /&gt;RUN set -x \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;amp;&amp;amp; yum install -y \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;wget \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;unzip \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;amp;&amp;amp; yum clean all&lt;BR /&gt;&lt;BR /&gt;# Install api-explorer webapp for REST API&lt;BR /&gt;RUN set -x \&lt;BR 
/&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;amp;&amp;amp; wget https://artifacts.alfresco.com/nexus/service/local/repositories/releases/content/org/alfresco/api-explorer/6.0.7-ga/api-explorer-6.0.7-ga.war -O /usr/local/tomcat/webapps/api-explorer.war&lt;BR /&gt;&lt;BR /&gt;ARG TOMCAT_DIR=/usr/local/tomcat&lt;BR /&gt;&lt;BR /&gt;RUN mkdir -p $TOMCAT_DIR/amps&lt;BR /&gt;&lt;BR /&gt;# Install AOS&lt;BR /&gt;RUN set -x \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;amp;&amp;amp; mkdir /tmp/aos \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;amp;&amp;amp; wget --no-check-certificate https://download.alfresco.com/cloudfront/release/community/201806-GA-build-00113/alfresco-aos-module-distributionzip-1.2.0.zip \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;amp;&amp;amp; unzip alfresco-aos-module-distributionzip-1.2.0.zip -d /tmp/aos \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;amp;&amp;amp; mv /tmp/aos/extension/* /usr/local/tomcat/shared/classes/alfresco/extension \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;amp;&amp;amp; mv /tmp/aos/alfresco-aos-module-1.2.0.amp amps \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;amp;&amp;amp; mv /tmp/aos/aos-module-license.txt licenses \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;amp;&amp;amp; mv /tmp/aos/_vti_bin.war /usr/local/tomcat/webapps \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;amp;&amp;amp; rm -rf /tmp/aos alfresco-aos-module-distributionzip-1.2.0.zip&lt;BR /&gt;&lt;BR /&gt;# SSH keys for ocrmypdf&lt;BR /&gt;COPY ssh/ /root/.ssh/&lt;BR /&gt;&lt;BR /&gt;# Install OCR&lt;BR /&gt;COPY bin/ /opt/alfresco/bin/&lt;BR /&gt;&lt;BR /&gt;# Configure SSH Client&lt;BR /&gt;RUN set -x 
&amp;amp;&amp;amp; \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; chmod +x /opt/alfresco/bin/ocrmypdf.sh &amp;amp;&amp;amp; \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; # Configure ssh&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; yum install -y openssh-clients &amp;amp;&amp;amp; \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; echo "StrictHostKeyChecking no" &amp;gt;&amp;gt; /etc/ssh/ssh_config &amp;amp;&amp;amp; \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; # Alfresco Image is using POSIX as Locale (!)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; sed -i '/^\s*SendEnv/ d' /etc/ssh/ssh_config &amp;amp;&amp;amp; \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; chmod 600 /root/.ssh/id_rsa&lt;BR /&gt;&lt;BR /&gt;# Install modules and addons&lt;BR /&gt;COPY modules/amps $TOMCAT_DIR/amps&lt;BR /&gt;COPY modules/jars $TOMCAT_DIR/webapps/alfresco/WEB-INF/lib&lt;BR /&gt;&lt;BR /&gt;RUN java -jar $TOMCAT_DIR/alfresco-mmt/alfresco-mmt*.jar install \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; $TOMCAT_DIR/amps $TOMCAT_DIR/webapps/alfresco -directory -nobackup -force&lt;BR /&gt;&lt;BR /&gt;# Add services configuration to alfresco-global.properties&lt;BR /&gt;COPY conf/alfresco-global.properties /usr/local/tomcat/shared/classes/alfresco-global.properties&lt;BR /&gt;&lt;BR /&gt;EXPOSE 21 143 25 445 137/udp 138/udp 139&lt;SPAN 
class="line-numbers-rows"&gt;&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;bin/ocrmypdf.sh&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;#!/bin/bash&lt;BR /&gt;&lt;BR /&gt;INPUT_DIR=/ocr_input&lt;BR /&gt;OUTPUT_DIR=/ocr_output&lt;BR /&gt;&lt;BR /&gt;# ocrmypdf hostname&lt;BR /&gt;OCRMYPDF_SERVER="ocrmypdf"&lt;BR /&gt;&lt;BR /&gt;# identify parameters, input and output file&lt;BR /&gt;array=( "$@" )&lt;BR 
/&gt;len=${#array[@]}&lt;BR /&gt;ARGS=${array[@]:0:$len-2}&lt;BR /&gt;&lt;BR /&gt;LAST_ARGS="${@: -2}"&lt;BR /&gt;INPUT_FILE_PARAM=`echo "$LAST_ARGS" | cut -d ' ' -f 1`&lt;BR /&gt;OUTPUT_FILE_PARAM=`echo "$LAST_ARGS" | cut -d ' ' -f 2`&lt;BR /&gt;&lt;BR /&gt;# extract filenames&lt;BR /&gt;INPUT_FILE=$(basename "$INPUT_FILE_PARAM")&lt;BR /&gt;OUTPUT_FILE=$(basename "$OUTPUT_FILE_PARAM")&lt;BR /&gt;&lt;BR /&gt;# SSH parameters&lt;BR /&gt;SCP=cp&lt;BR /&gt;SSH=ssh&lt;BR /&gt;USER=root&lt;BR /&gt;&lt;BR /&gt;# copy original pdf to ocrmypdf server&lt;BR /&gt;$SCP $INPUT_FILE_PARAM $INPUT_DIR&lt;BR /&gt;&lt;BR /&gt;# execute ocrmypdf program&lt;BR /&gt;$SSH $USER@$OCRMYPDF_SERVER "/usr/bin/ocr.sh $ARGS $INPUT_DIR/$INPUT_FILE $OUTPUT_DIR/$OUTPUT_FILE"&lt;BR /&gt;&lt;BR /&gt;# copy transformed pdf back to alfresco path&lt;BR /&gt;$SCP $OUTPUT_DIR/$OUTPUT_FILE ${OUTPUT_FILE_PARAM}&lt;BR /&gt;&lt;BR /&gt;# remove temporal files&lt;BR /&gt;rm -f $INPUT_DIR/$INPUT_FILE $OUTPUT_DIR/$OUTPUT_FILE&lt;SPAN 
class="line-numbers-rows"&gt;&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;conf/alfresco-global.properties&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;(Only OCRmyPDF section)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE class="language-none line-numbers"&gt;&lt;CODE&gt;## simple-ocr&lt;BR /&gt;# https://github.com/keensoft/alfresco-simple-ocr&lt;BR /&gt;ocr.command=/opt/alfresco/bin/ocrmypdf.sh&lt;BR /&gt;ocr.output.verbose=true&lt;BR /&gt;ocr.output.file.prefix.command=&lt;BR /&gt;# https://github.com/jbarlow83/OCRmyPDF/issues/124&lt;BR /&gt;ocr.extra.commands=-j1 --author keensoft --rotate-pages -l spa+eng+fra --deskew --clean --skip-text&lt;BR /&gt;ocr.server.os=linux&lt;/CODE&gt;&lt;/PRE&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 28 Aug 2018 08:17:37 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55556#M20289</guid>
      <dc:creator>angelborroy</dc:creator>
      <dc:date>2018-08-28T08:17:37Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55557#M20290</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you for your kindness.&lt;/P&gt;&lt;P&gt;However, my environment&amp;nbsp;consists of&amp;nbsp;CentOS 7,&amp;nbsp;Alfresco 5.2 and OCRmyPDF (Docker).&lt;/P&gt;&lt;P&gt;The&amp;nbsp;scripts&amp;nbsp;you posted don't match my environment.&lt;/P&gt;&lt;P&gt;As I am very new to Docker, I&amp;nbsp;don't know how to&amp;nbsp;change the scripts.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 30 Aug 2018 00:20:18 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55557#M20290</guid>
      <dc:creator>hisayo-s</dc:creator>
      <dc:date>2018-08-30T00:20:18Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55558#M20291</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Comparing pdfsandwich with OCRmyPDF, pdfsandwich's character recognition quality for Japanese is better.&lt;/P&gt;&lt;P&gt;So I will focus on using pdfsandwich.&lt;/P&gt;&lt;P&gt;Thank you very much for your help.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 30 Aug 2018 01:25:29 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55558#M20291</guid>
      <dc:creator>hisayo-s</dc:creator>
      <dc:date>2018-08-30T01:25:29Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55559#M20292</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Did you test with these instructions?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://github.com/keensoft/alfresco-simple-ocr/blob/master/docker/pdfsandwich-1.6-centos-7/Dockerfile" rel="nofollow noopener noreferrer"&gt;https://github.com/keensoft/alfresco-simple-ocr/blob/master/docker/pdfsandwich-1.6-centos-7/Dockerfile&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I don't know if they still work with the latest CentOS releases, but they can be a starting point.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 30 Aug 2018 06:53:25 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55559#M20292</guid>
      <dc:creator>angelborroy</dc:creator>
      <dc:date>2018-08-30T06:53:25Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55560#M20293</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I tried to follow this solution. My deployment:&lt;/P&gt;&lt;P&gt;Alfresco 6.1.2-ga / Share 6.1.0&lt;/P&gt;&lt;P&gt;jbarlow83/ocrmypdf:v8.2.3 or v7.0.0&lt;/P&gt;&lt;P&gt;api-explorer-6.1.0-ea.war or&amp;nbsp;6.0.7-ga&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;And I got "failed to copy".&lt;/P&gt;&lt;P&gt;The file&amp;nbsp;&lt;EM&gt;/usr/local/tomcat/temp/Alfresco/OCRTransformWorker_source_5503547424193883468.pdf&lt;/EM&gt;&amp;nbsp;exists, but&amp;nbsp;&lt;EM&gt;/usr/local/tomcat/temp/Alfresco/OCRTransformWorker_source_5503547424193883468&lt;STRONG&gt;_ocr&lt;/STRONG&gt;.pdf&lt;/EM&gt; does not.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I think I should change&amp;nbsp;&lt;/P&gt;&lt;PRE class="" style="color: #000000; background: #f5f2f0; border: 0px; margin: 0.5em 0px; padding: 1em 1em 1em 3.8em;"&gt;&lt;CODE style="border: 0px; font-weight: inherit;"&gt;INPUT_DIR=/ocr_input&lt;BR /&gt;OUTPUT_DIR=/ocr_output&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;but I don't understand how. 
"ocrmypdf" container don't contain this directories.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Log:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;alfresco_1 | Exception in thread "defaultAsyncAction1" java.lang.RuntimeException: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 05270018 Failed to copy content from file: &lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | writer: ContentAccessor[ contentUrl=store://2019/6/27/18/13/0081dc19-8750-4ddb-ac3c-396b4ba1a859.bin, mimetype=application/pdf, size=0, encoding=UTF-8, locale=en_US]&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | file: /usr/local/tomcat/temp/Alfresco/OCRTransformWorker_source_5503547424193883468_ocr.pdf&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:183)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at es.keensoft.alfresco.ocr.OCRExtractAction.access$200(OCRExtractAction.java:38)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at es.keensoft.alfresco.ocr.OCRExtractAction$1.execute(OCRExtractAction.java:164)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at es.keensoft.alfresco.ocr.OCRExtractAction$1.execute(OCRExtractAction.java:161)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:450)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at es.keensoft.alfresco.ocr.OCRExtractAction.executeInNewTransaction(OCRExtractAction.java:169)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at es.keensoft.alfresco.ocr.OCRExtractAction.access$100(OCRExtractAction.java:38)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at es.keensoft.alfresco.ocr.OCRExtractAction$ExtractOCRTask.run(OCRExtractAction.java:151)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at java.base/java.lang.Thread.run(Thread.java:834)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | Caused by: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 05270018 Failed to copy content from file: &lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | writer: ContentAccessor[ contentUrl=store://2019/6/27/18/13/0081dc19-8750-4ddb-ac3c-396b4ba1a859.bin, mimetype=application/pdf, size=0, encoding=UTF-8, locale=en_US]&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | file: /usr/local/tomcat/temp/Alfresco/OCRTransformWorker_source_5503547424193883468_ocr.pdf&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:86)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:181)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | ... 10 more&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | Caused by: org.alfresco.service.cmr.repository.ContentIOException: 05270018 Failed to copy content from file: &lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | writer: ContentAccessor[ contentUrl=store://2019/6/27/18/13/0081dc19-8750-4ddb-ac3c-396b4ba1a859.bin, mimetype=application/pdf, size=0, encoding=UTF-8, locale=en_US]&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | file: /usr/local/tomcat/temp/Alfresco/OCRTransformWorker_source_5503547424193883468_ocr.pdf&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at org.alfresco.repo.content.AbstractContentWriter.putContent(AbstractContentWriter.java:491)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:83)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | ... 
11 more&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | Caused by: java.io.FileNotFoundException: /usr/local/tomcat/temp/Alfresco/OCRTransformWorker_source_5503547424193883468_ocr.pdf (No such file or directory)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at java.base/java.io.FileInputStream.open0(Native Method)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at java.base/java.io.FileInputStream.open(FileInputStream.java:219)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at java.base/java.io.FileInputStream.&amp;lt;init&amp;gt;(FileInputStream.java:157)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | at org.alfresco.repo.content.AbstractContentWriter.putContent(AbstractContentWriter.java:485)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;alfresco_1 | ... 12 more&lt;/EM&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 27 Jun 2019 15:33:25 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55560#M20293</guid>
      <dc:creator>fedorow</dc:creator>
      <dc:date>2019-06-27T15:33:25Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55561#M20294</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;So, to make it work on Alfresco/Share CE 6.1.2-ga/6.1.0,&amp;nbsp;I&amp;nbsp;created a shared volume between the alfresco and ocrmypdf containers. I replaced&lt;EM&gt;&amp;nbsp;/ocr_input&lt;/EM&gt; and&amp;nbsp;&lt;EM&gt;/ocr_output&lt;/EM&gt; with a single directory,&lt;EM&gt; /ocr&lt;/EM&gt;, and mapped it as a volume in both containers.&amp;nbsp;&lt;/P&gt;&lt;P&gt;The only problem:&amp;nbsp;asynchronous mode for the rule gives me an error, so I turned it off.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks, Angel!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;docker-compose.yml&lt;/EM&gt;&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;&lt;/P&gt;&lt;PRE class="" style="color: #000000; background: #f5f2f0; border: 0px; margin: 0.5em 0px; padding: 1em 1em 1em 3.8em;"&gt;&lt;CODE style="border: 0px; font-weight: inherit;"&gt;...&lt;BR /&gt;services:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;alfresco:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;volumes:&lt;BR /&gt; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;- ocr:/ocr&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;ocrmypdf:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;volumes:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;- ocr:/ocr&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;volumes:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;ocr:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;driver: local&lt;BR /&gt;...&lt;/CODE&gt;&lt;/PRE&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 
0px;"&gt;&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;&lt;EM style="border: 0px; font-weight: inherit;"&gt;bin/ocrmypdf.sh&lt;/EM&gt;&lt;/P&gt;&lt;P style="color: #727174; background-color: #ffffff; border: 0px;"&gt;(and remove the {} from $OUTPUT_FILE_PARAM in the command that copies the output file)&amp;nbsp;&lt;/P&gt;&lt;PRE class="" style="color: #000000; background: #f5f2f0; border: 0px; margin: 0.5em 0px; padding: 1em 1em 1em 3.8em;"&gt;&lt;CODE style="border: 0px; font-weight: inherit;"&gt;#!/bin/bash&lt;BR /&gt;&lt;BR /&gt;INPUT_DIR=&lt;STRONG style="color: #ff0000; "&gt;/ocr&lt;/STRONG&gt;&lt;BR /&gt;OUTPUT_DIR=&lt;STRONG style="color: #ff0000; "&gt;/ocr&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;# ocrmypdf hostname&lt;BR /&gt;OCRMYPDF_SERVER="ocrmypdf"&lt;BR /&gt;&lt;BR /&gt;# identify parameters, input and output file&lt;BR /&gt;array=( "$@" )&lt;BR /&gt;len=${#array[@]}&lt;BR /&gt;ARGS=${array[@]:0:$len-2}&lt;BR /&gt;&lt;BR /&gt;LAST_ARGS="${@: -2}"&lt;BR /&gt;INPUT_FILE_PARAM=`echo "$LAST_ARGS" | cut -d ' ' -f 1`&lt;BR /&gt;OUTPUT_FILE_PARAM=`echo "$LAST_ARGS" | cut -d ' ' -f 2`&lt;BR /&gt;&lt;BR /&gt;# extract filenames&lt;BR /&gt;INPUT_FILE=$(basename "$INPUT_FILE_PARAM")&lt;BR /&gt;OUTPUT_FILE=$(basename "$OUTPUT_FILE_PARAM")&lt;BR /&gt;&lt;BR /&gt;# SSH parameters&lt;BR /&gt;SCP=cp&lt;BR /&gt;SSH=ssh&lt;BR /&gt;USER=root&lt;BR /&gt;&lt;BR /&gt;# copy original pdf to ocrmypdf server&lt;BR /&gt;$SCP "$INPUT_FILE_PARAM" "$INPUT_DIR"&lt;BR /&gt;&lt;BR /&gt;# execute ocrmypdf program&lt;BR /&gt;$SSH $USER@$OCRMYPDF_SERVER "/usr/bin/ocr.sh $ARGS $INPUT_DIR/$INPUT_FILE $OUTPUT_DIR/$OUTPUT_FILE"&lt;BR /&gt;&lt;BR /&gt;# copy transformed pdf back to alfresco path&lt;BR /&gt;$SCP "$OUTPUT_DIR/$OUTPUT_FILE" "&lt;STRONG style="color: #ff0000; "&gt;$OUTPUT_FILE_PARAM&lt;/STRONG&gt;"&lt;BR /&gt;&lt;BR /&gt;# remove temporary files&lt;BR /&gt;rm -f "$INPUT_DIR/$INPUT_FILE" "$OUTPUT_DIR/$OUTPUT_FILE"&lt;/CODE&gt;&lt;/PRE&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 02 Jul 2019 13:20:07 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55561#M20294</guid>
      <dc:creator>fedorow</dc:creator>
      <dc:date>2019-07-02T13:20:07Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55562#M20295</link>
      <description>&lt;P&gt;With the approach suggested by Fedorow, I was able to make OCR work with Alfresco 6.1.0. I updated&amp;nbsp;&lt;EM&gt;/ocr_input&lt;/EM&gt;&lt;SPAN&gt;&amp;nbsp;and&amp;nbsp;&lt;/SPAN&gt;&lt;EM&gt;/ocr_output&lt;/EM&gt;&lt;SPAN&gt;&amp;nbsp;to /usr/local/tomcat/ocr_input&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;and /usr/local/tomcat/ocr_output so that the alfresco container can access these folders without any access issues.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Thanks, Fedorow.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;docker-compose.yml&lt;/EM&gt;&lt;/P&gt;&lt;PRE&gt;...&lt;BR /&gt;services:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;alfresco:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;volumes:&lt;BR /&gt; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; - ocr-input:&lt;FONT color="#FF0000"&gt;/usr/local/tomcat/ocr_input&lt;/FONT&gt;&lt;BR /&gt;         - ocr-output:&lt;FONT color="#FF0000"&gt;/usr/local/tomcat/ocr_output&lt;/FONT&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;ocrmypdf:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; volumes:&lt;BR /&gt;             - ocr-input:&lt;FONT color="#FF0000"&gt;/usr/local/tomcat/ocr_input&lt;/FONT&gt;&lt;BR /&gt;             - ocr-output:/&lt;FONT color="#FF0000"&gt;usr/local/tomcat/ocr_output&lt;/FONT&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;volumes:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;...&lt;BR /&gt;&amp;nbsp; &lt;FONT color="#FF0000"&gt;ocr-input:&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;       external: true&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;  ocr-output:&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;       external: true&lt;/FONT&gt;&lt;BR /&gt;...&lt;BR /&gt;&lt;BR 
/&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;EM&gt;bin/ocrmypdf.sh&lt;/EM&gt;&lt;/P&gt;&lt;PRE&gt;#!/bin/bash&lt;BR /&gt;&lt;BR /&gt;INPUT_DIR=&lt;FONT color="#FF0000"&gt;/usr/local/tomcat/ocr_input&lt;/FONT&gt;&lt;BR /&gt;OUTPUT_DIR=&lt;FONT color="#FF0000"&gt;/usr/local/tomcat/ocr_output&lt;/FONT&gt;&lt;BR /&gt;&lt;BR /&gt;# ocrmypdf hostname&lt;BR /&gt;OCRMYPDF_SERVER="ocrmypdf"&lt;BR /&gt;&lt;BR /&gt;# identify parameters, input and output file&lt;BR /&gt;array=( "$@" )&lt;BR /&gt;len=${#array[@]}&lt;BR /&gt;ARGS=${array[@]:0:$len-2}&lt;BR /&gt;&lt;BR /&gt;LAST_ARGS="${@: -2}"&lt;BR /&gt;INPUT_FILE_PARAM=`echo "$LAST_ARGS" | cut -d ' ' -f 1`&lt;BR /&gt;OUTPUT_FILE_PARAM=`echo "$LAST_ARGS" | cut -d ' ' -f 2`&lt;BR /&gt;&lt;BR /&gt;# extract filenames&lt;BR /&gt;INPUT_FILE=$(basename "$INPUT_FILE_PARAM")&lt;BR /&gt;OUTPUT_FILE=$(basename "$OUTPUT_FILE_PARAM")&lt;BR /&gt;&lt;BR /&gt;# SSH parameters&lt;BR /&gt;SCP=cp&lt;BR /&gt;SSH=ssh&lt;BR /&gt;USER=root&lt;BR /&gt;&lt;BR /&gt;# copy original pdf to ocrmypdf server&lt;BR /&gt;$SCP "$INPUT_FILE_PARAM" "$INPUT_DIR"&lt;BR /&gt;&lt;BR /&gt;# execute ocrmypdf program&lt;BR /&gt;$SSH $USER@$OCRMYPDF_SERVER "/usr/bin/ocr.sh $ARGS $INPUT_DIR/$INPUT_FILE $OUTPUT_DIR/$OUTPUT_FILE"&lt;BR /&gt;&lt;BR /&gt;# copy transformed pdf back to alfresco path&lt;BR /&gt;$SCP "$OUTPUT_DIR/$OUTPUT_FILE" "$OUTPUT_FILE_PARAM"&lt;BR /&gt;&lt;BR /&gt;# remove temporary files&lt;BR /&gt;rm -f "$INPUT_DIR/$INPUT_FILE" "$OUTPUT_DIR/$OUTPUT_FILE"&lt;/PRE&gt;&lt;P&gt;After the above changes, I was able to run OCR successfully with Alfresco 6.1.&amp;nbsp;&lt;/P&gt;&lt;P&gt;We run our Alfresco instance on Kubernetes with a Helm deployment, so I need to configure the volumes in the values.yaml file, but I am not sure how. Does anyone have an idea how to make a similar configuration in Kubernetes?&lt;/P&gt;&lt;P&gt;Any help is appreciated.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Jun 2020 21:30:16 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55562#M20295</guid>
      <dc:creator>SriramG</dc:creator>
      <dc:date>2020-06-18T21:30:16Z</dc:date>
    </item>
    <item>
      <title>Re: "OCR Extract" action doesn't work well (alfresco-simple-ocr + pdfsandwich)</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55563#M20296</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;A href="https://migration33.stage.lithium.com/t5/user/viewprofilepage/user-id/82140"&gt;@SriramG&lt;/A&gt;,&lt;/P&gt;
&lt;P&gt;Thanks for updating us on how you resolved your issue - really helpful.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maybe start a new thread for your question about configuring volumes?&lt;/P&gt;
&lt;P&gt;Cheers,&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 22 Jun 2020 11:00:25 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/quot-ocr-extract-quot-action-doesn-t-work-well-alfresco-simple/m-p/55563#M20296</guid>
      <dc:creator>EddieMay</dc:creator>
      <dc:date>2020-06-22T11:00:25Z</dc:date>
    </item>
  </channel>
</rss>

