<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Nuxeo-Platform-OCR Question in Nuxeo Forum</title>
    <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325223#M12224</link>
    <description>&lt;P&gt;Here I did it using Squeeze's own Tesseract.&lt;/P&gt;</description>
    <pubDate>Tue, 10 Jan 2012 10:01:31 GMT</pubDate>
    <dc:creator>OlivierM_</dc:creator>
    <dc:date>2012-01-10T10:01:31Z</dc:date>
    <item>
      <title>Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325215#M12216</link>
      <description>&lt;P&gt;Hi:&lt;/P&gt;
&lt;P&gt;I'm trying to install 'nuxeo-platform-ocr' (https://github.com/nuxeo/nuxeo-platform-ocr), but I do not know where to find the 'content_in_doc' file that Nuxeo needs to run the analysis.&lt;/P&gt;
&lt;P&gt;I have followed the instructions at &lt;A href="https://github.com/nuxeo/nuxeo-platform-ocr" target="test_blank"&gt;https://github.com/nuxeo/nuxeo-platform-ocr&lt;/A&gt;, but it is not clear where the file is located.&lt;/P&gt;
&lt;P&gt;I'm using Ubuntu 10.11 + Tesseract 3 + Nuxeo + Olena (scribo)&lt;/P&gt;
&lt;P&gt;Could you tell me where I can find the 'content_in_doc' file?&lt;/P&gt;
&lt;P&gt;Thanks, and regards.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Dec 2011 17:24:09 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325215#M12216</guid>
      <dc:creator>Soni_</dc:creator>
      <dc:date>2011-12-12T17:24:09Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325216#M12217</link>
      <description>&lt;P&gt;I just tried to build against the latest stable version (2.0) of Olena and it seems to work fine. I have updated the &lt;A href="https://github.com/nuxeo/nuxeo-platform-ocr/blob/develop/README.md"&gt;README.md&lt;/A&gt; of &lt;CODE&gt;nuxeo-platform-ocr&lt;/CODE&gt; to point to the right source archive.&lt;/P&gt;
&lt;P&gt;Beware that the build of Olena has several steps and requires &lt;STRONG&gt;2 calls to make in 2 separate folders&lt;/STRONG&gt; (the build root and the &lt;CODE&gt;scribo/src&lt;/CODE&gt; subfolder):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;$ wget &lt;A href="http://www.lrde.epita.fr/dload/olena/2.0/olena-2.0.tar.bz2" target="test_blank"&gt;http://www.lrde.epita.fr/dload/olena/2.0/olena-2.0.tar.bz2&lt;/A&gt;
$ tar jxvf olena-*.tar.bz2
$ cd olena-2.0/
$ mkdir _build
$ cd _build
$ ../configure &amp;amp;&amp;amp; make
$ cd scribo/src
$ make
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The &lt;CODE&gt;scribo/src&lt;/CODE&gt; folder should then hold the &lt;CODE&gt;content_in_doc&lt;/CODE&gt; binary. If not, check for error messages in the build output. Maybe you are missing the development headers for Tesseract? Have you built Tesseract 3 from the source tarball and installed it system-wide using &lt;CODE&gt;sudo make install&lt;/CODE&gt;?&lt;/P&gt;</description>
      <pubDate>Wed, 28 Dec 2011 18:56:50 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325216#M12217</guid>
      <dc:creator>Olivier_Grisel</dc:creator>
      <dc:date>2011-12-28T18:56:50Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325217#M12218</link>
      <description>&lt;P&gt;I've compiled Olena 1.0 with Tesseract 3.0 with no problem&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jan 2012 05:00:30 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325217#M12218</guid>
      <dc:creator>rbahntje_Bahntj</dc:creator>
      <dc:date>2012-01-03T05:00:30Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325218#M12219</link>
      <description>&lt;P&gt;As written in the &lt;CODE&gt;README.md&lt;/CODE&gt; file, and as I already answered, you have to run &lt;CODE&gt;make&lt;/CODE&gt; in the &lt;CODE&gt;$SOURCE_ROOT/_build/scribo/src&lt;/CODE&gt; folder as well; the &lt;CODE&gt;content_in_doc&lt;/CODE&gt; binary will be created there.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jan 2012 14:32:37 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325218#M12219</guid>
      <dc:creator>Olivier_Grisel</dc:creator>
      <dc:date>2012-01-03T14:32:37Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325219#M12220</link>
      <description>&lt;P&gt;I am running make inside the $SOURCE_ROOT/_build/scribo/src folder&lt;/P&gt;</description>
      <pubDate>Wed, 04 Jan 2012 05:57:25 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325219#M12220</guid>
      <dc:creator>rbahntje_Bahntj</dc:creator>
      <dc:date>2012-01-04T05:57:25Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325220#M12221</link>
      <description>&lt;P&gt;I just tried from scratch, in a new empty folder, from the original tarball: the &lt;CODE&gt;content_in_doc&lt;/CODE&gt;-related lines in the Makefile are not commented out and the binary builds successfully. I suspect that in your case the &lt;CODE&gt;configure&lt;/CODE&gt; script failed to detect a required dependency&lt;/P&gt;</description>
      <pubDate>Thu, 05 Jan 2012 10:23:16 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325220#M12221</guid>
      <dc:creator>Olivier_Grisel</dc:creator>
      <dc:date>2012-01-05T10:23:16Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325221#M12222</link>
      <description>&lt;P&gt;Right now I'm trying to compile Olena/content_in_doc on Debian Squeeze. I had to install the following packages to get content_in_doc enabled in the Makefiles&lt;/P&gt;</description>
      <pubDate>Fri, 06 Jan 2012 16:49:28 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325221#M12222</guid>
      <dc:creator>OlivierM_</dc:creator>
      <dc:date>2012-01-06T16:49:28Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325222#M12223</link>
      <description>&lt;P&gt;In my case I built Tesseract 3 from the source tarball (it is not yet available in Ubuntu; I don't know about Debian). Tesseract 3 gives much better results than Tesseract 2 in practice.&lt;/P&gt;</description>
      <pubDate>Mon, 09 Jan 2012 12:11:09 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325222#M12223</guid>
      <dc:creator>Olivier_Grisel</dc:creator>
      <dc:date>2012-01-09T12:11:09Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325223#M12224</link>
      <description>&lt;P&gt;Here I did it using Squeeze's own Tesseract.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jan 2012 10:01:31 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325223#M12224</guid>
      <dc:creator>OlivierM_</dc:creator>
      <dc:date>2012-01-10T10:01:31Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325224#M12225</link>
      <description>&lt;P&gt;Yet another try. I did it using (hand-compiled) libleptonica and libtesseract (3). Apparently, Olena 2 only detects the latter when it's compiled with "--with-multiple-libraries" (so that it provides libtesseract_api.so and so on, and not just libtesseract.so).&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2012 16:34:09 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325224#M12225</guid>
      <dc:creator>OlivierM_</dc:creator>
      <dc:date>2012-02-09T16:34:09Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325225#M12226</link>
      <description>&lt;P&gt;Ok, finally managed to get every piece together (using Olena's git repository instead of release package, and still patching here and there).&lt;/P&gt;
&lt;P&gt;The first time I imported an image, I got an error about Tesseract being unable to find language data. Right (btw: how do we tell Nuxeo which language it should use for OCR?). Then I added the language data, and now I no longer get any OCR-related messages; it is completely silent. But no annotations are created.&lt;/P&gt;
&lt;P&gt;The only thing that could be related is:&lt;/P&gt;
&lt;P&gt;2012-02-09 17:02:36,993 WARN  [it.tidalwave.image.java2d.ImplementationFactoryJ2D] JAI not available: java.lang.ClassNotFoundException: javax.media.jai.PlanarImage&lt;/P&gt;
&lt;P&gt;Any idea?&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2012 18:08:55 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325225#M12226</guid>
      <dc:creator>OlivierM_</dc:creator>
      <dc:date>2012-02-09T18:08:55Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325226#M12227</link>
      <description>&lt;P&gt;Finally I did a fresh install from scratch on Oracle ELinux 5U7 and I now get the content_in_doc binary (I was missing the GDCM2 library), but I am having the same issue as OlivierM: when I upload an image, server.log shows this message&lt;/P&gt;</description>
      <pubDate>Fri, 24 Feb 2012 13:32:38 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325226#M12227</guid>
      <dc:creator>rbahntje_Bahntj</dc:creator>
      <dc:date>2012-02-24T13:32:38Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325227#M12228</link>
      <description>&lt;P&gt;I've installed the JAI package (&lt;A href="http://www.oracle.com/technetwork/java/javasebusiness/downloads/java-archive-downloads-java-client-419417.html#7341-JAI-1.1.2-oth-JPR" target="test_blank"&gt;http://www.oracle.com/technetwork/java/javasebusiness/downloads/java-archive-downloads-java-client-419417.html#7341-JAI-1.1.2-oth-JPR&lt;/A&gt;) and copied jai_codec.jar, jai_core.jar and mlibwrapper_jai.jar into my $NUXEO_HOME/nxserver/lib&lt;/P&gt;
&lt;P&gt;Now I no longer get any error messages, but nothing happens when I upload an image file to Nuxeo&lt;/P&gt;
&lt;P&gt;How can I debug what is happening?&lt;/P&gt;</description>
      <pubDate>Fri, 24 Feb 2012 15:03:57 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325227#M12228</guid>
      <dc:creator>rbahntje_Bahntj</dc:creator>
      <dc:date>2012-02-24T15:03:57Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325228#M12229</link>
      <description>&lt;P&gt;Same here. The JAI warnings disappeared (thanks for the hint!), but nothing is happening.&lt;/P&gt;</description>
      <pubDate>Mon, 27 Feb 2012 09:57:44 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325228#M12229</guid>
      <dc:creator>OlivierM_</dc:creator>
      <dc:date>2012-02-27T09:57:44Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325229#M12230</link>
      <description>&lt;P&gt;Oliver, did you find a solution?&lt;/P&gt;</description>
      <pubDate>Thu, 22 Mar 2012 18:36:23 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325229#M12230</guid>
      <dc:creator>rbahntje_Bahntj</dc:creator>
      <dc:date>2012-03-22T18:36:23Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325230#M12231</link>
      <description>&lt;P&gt;Sadly no, I'm still stuck on this, and without time to investigate it further for now.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Mar 2012 12:00:01 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325230#M12231</guid>
      <dc:creator>OlivierM_</dc:creator>
      <dc:date>2012-03-23T12:00:01Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325231#M12232</link>
      <description>&lt;P&gt;Oliver&lt;/P&gt;
&lt;P&gt;The content_in_doc command is working fine. I tried converting an image from the command line and it works.&lt;/P&gt;
&lt;P&gt;When I upload an image to Nuxeo, I can see a process like this running:&lt;/P&gt;
&lt;P&gt;root     25994 25991 97 19:25 pts/0    00:00:15 content_in_doc /opt/nuxeo-cap-5.5-tomcat/tmp/cmdLineBasedConverter22108.jpg /opt/nuxeo-cap-5.5-tomcat/tmp/ocr_olena_1333236340089.xml&lt;/P&gt;
&lt;P&gt;And the file &lt;CODE&gt;ocr_olena_xxxxxxx.xml&lt;/CODE&gt; is created under $NUXEO_HOME/tmp&lt;/P&gt;
&lt;P&gt;But no annotations are generated for the document in Nuxeo.
I will try recompiling everything again.&lt;/P&gt;</description>
      <pubDate>Sun, 01 Apr 2012 01:30:17 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325231#M12232</guid>
      <dc:creator>rbahntje_Bahntj</dc:creator>
      <dc:date>2012-04-01T01:30:17Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325232#M12233</link>
      <description>&lt;P&gt;Thanks to you, I just discovered ocr_olena_XX.xml files are also created in my tmp directory. Good to know.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Apr 2012 10:37:25 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325232#M12233</guid>
      <dc:creator>OlivierM_</dc:creator>
      <dc:date>2012-04-02T10:37:25Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325233#M12234</link>
      <description>&lt;P&gt;Ok, just a little thing&lt;/P&gt;</description>
      <pubDate>Mon, 02 Apr 2012 11:28:00 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325233#M12234</guid>
      <dc:creator>OlivierM_</dc:creator>
      <dc:date>2012-04-02T11:28:00Z</dc:date>
    </item>
    <item>
      <title>Re: Nuxeo-Platform-OCR Question</title>
      <link>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325234#M12235</link>
      <description>&lt;P&gt;I tried changing the UserPrincipal to an existing user and the baseURL to my server's, but it doesn't work any better.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Apr 2012 17:23:58 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/nuxeo-forum/nuxeo-platform-ocr-question/m-p/325234#M12235</guid>
      <dc:creator>OlivierM_</dc:creator>
      <dc:date>2012-04-02T17:23:58Z</dc:date>
    </item>
  </channel>
</rss>

