<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: OCR for images, pdfs etc in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297041#M250171</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;There are many &lt;/SPAN&gt;&lt;A href="http://www.rasteredge.com/dotnet-imaging/addon-ocr-sdk/" rel="nofollow noopener noreferrer"&gt;OCR software&lt;/A&gt;&lt;SPAN&gt; packages that can do the work.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I generally use RE.OCR.SDK.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 18 Feb 2014 09:02:00 GMT</pubDate>
    <dc:creator>susannamoore</dc:creator>
    <dc:date>2014-02-18T09:02:00Z</dc:date>
    <item>
      <title>OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297035#M250165</link>
      <description>Hi guys, does anyone have any advice on how to integrate an OCR service into Alfresco? I understand that OCR is normally done by apps like Kofax, but our client would like to be able to upload an image or scanned PDF and let Alfresco handle the OCR step so that the documents can be found during search.</description>
      <pubDate>Fri, 31 Jan 2014 06:51:47 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297035#M250165</guid>
      <dc:creator>boneill</dc:creator>
      <dc:date>2014-01-31T06:51:47Z</dc:date>
    </item>
    <item>
      <title>Re: OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297036#M250166</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Alfresco doesn't provide OCR capabilities out-of-the-box. You might take a look at &lt;/SPAN&gt;&lt;A href="http://www.ephesoft.com/" rel="nofollow noopener noreferrer"&gt;http://www.ephesoft.com/&lt;/A&gt;&lt;SPAN&gt; and see if that can be of assistance.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The Add-Ons directory also has a number of OCR solutions: &lt;/SPAN&gt;&lt;A href="http://addons.alfresco.com/search/node/ocr" rel="nofollow noopener noreferrer"&gt;http://addons.alfresco.com/search/node/ocr&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;If you want to roll up your sleeves and do your own integration without relying on an integration that's already been built, you can find various OCR libraries out there. Here's one: &lt;/SPAN&gt;&lt;A href="http://code.google.com/p/tesseract-ocr/" rel="nofollow noopener noreferrer"&gt;http://code.google.com/p/tesseract-ocr/&lt;/A&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Jeff&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 31 Jan 2014 16:43:17 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297036#M250166</guid>
      <dc:creator>jpotts</dc:creator>
      <dc:date>2014-01-31T16:43:17Z</dc:date>
    </item>
    <item>
      <title>Re: OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297037#M250167</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi Jeff,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks for the response.&amp;nbsp; This is exactly the information I needed.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Brian&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 03 Feb 2014 04:48:31 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297037#M250167</guid>
      <dc:creator>boneill</dc:creator>
      <dc:date>2014-02-03T04:48:31Z</dc:date>
    </item>
    <item>
      <title>Re: OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297038#M250168</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Have you tried any of those solutions?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;What is the best (even non-free) solution for scanning documents and saving them in Alfresco?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;(I think it must be compatible with Alfresco so that metadata/tags can be added to every document that is added to Alfresco, for search and …)&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 05 Feb 2014 07:41:54 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297038#M250168</guid>
      <dc:creator>djnemo2</dc:creator>
      <dc:date>2014-02-05T07:41:54Z</dc:date>
    </item>
    <item>
      <title>Re: OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297039#M250169</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Metadata extraction is available out-of-the-box. But if you are uploading an image of the document, there is no metadata to extract. You need something to convert the image to machine-readable text. That's OCR, and it is not available out-of-the-box.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Jeff&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 05 Feb 2014 19:34:27 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297039#M250169</guid>
      <dc:creator>jpotts</dc:creator>
      <dc:date>2014-02-05T19:34:27Z</dc:date>
    </item>
    <item>
      <title>Re: OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297040#M250170</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Is there any third-party software that someone has already used for this?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Something that scans the document, saves it in the appropriate directory on the server based on its contents, and reports which document is where?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thank you&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 06 Feb 2014 08:32:55 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297040#M250170</guid>
      <dc:creator>djnemo2</dc:creator>
      <dc:date>2014-02-06T08:32:55Z</dc:date>
    </item>
    <item>
      <title>Re: OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297041#M250171</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;There are many &lt;/SPAN&gt;&lt;A href="http://www.rasteredge.com/dotnet-imaging/addon-ocr-sdk/" rel="nofollow noopener noreferrer"&gt;OCR software&lt;/A&gt;&lt;SPAN&gt; packages that can do the work.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I generally use RE.OCR.SDK.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 18 Feb 2014 09:02:00 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297041#M250171</guid>
      <dc:creator>susannamoore</dc:creator>
      <dc:date>2014-02-18T09:02:00Z</dc:date>
    </item>
    <item>
      <title>Re: OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297042#M250172</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;The page you linked is for .NET.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Is there a Java integration as well, or was this just a spambot?&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 18 Feb 2014 09:18:41 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297042#M250172</guid>
      <dc:creator>scouil</dc:creator>
      <dc:date>2014-02-18T09:18:41Z</dc:date>
    </item>
    <item>
      <title>Re: OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297043#M250173</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi, scouil&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I just tried this, but I am not sure whether this site provides a version for Java integration.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Maybe it is just an image processing library for Java.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 19 Feb 2014 02:04:30 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297043#M250173</guid>
      <dc:creator>susannamoore</dc:creator>
      <dc:date>2014-02-19T02:04:30Z</dc:date>
    </item>
    <item>
      <title>Re: OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297044#M250174</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;There was recently some discussion on IRC about this project:&lt;/SPAN&gt;&lt;BR /&gt;&lt;A href="https://code.google.com/p/alfresco-tesseract-search/" rel="nofollow noopener noreferrer"&gt;https://code.google.com/p/alfresco-tesseract-search/&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Out-of-the-box it was not working with 4.2 but one of our community members did some quick repackaging and got it working on 4.2 in about 30 minutes.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;After doing that, he was able to take scanned images, check them in to Alfresco, and then do a full-text search against them. The tesseract OCR piece was responsible for extracting the text from the scanned images and making it available to the indexer.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Jeff&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 19 Feb 2014 22:37:10 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297044#M250174</guid>
      <dc:creator>jpotts</dc:creator>
      <dc:date>2014-02-19T22:37:10Z</dc:date>
    </item>
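    <!--
    Editorial sketch for the reply above, which describes Tesseract extracting text from
    scanned images so that it can be indexed. The snippet below is a minimal illustration
    of that OCR step only, assuming the tesseract command-line tool is installed and on
    PATH. The function names build_tesseract_command and extract_text are hypothetical,
    and this is not the code used by the alfresco-tesseract-search project. Handing the
    extracted text to the Alfresco indexer is outside the scope of the sketch.

    ```python
    import shutil
    import subprocess


    def build_tesseract_command(image_path, language="eng"):
        """Build the argument list for a Tesseract CLI call (hypothetical helper)."""
        # Passing "stdout" as the output base makes Tesseract print the
        # recognized text to standard output instead of writing a .txt file.
        return ["tesseract", image_path, "stdout", "-l", language]


    def extract_text(image_path, language="eng"):
        """Run Tesseract on one image file and return the recognized text."""
        # Assumption: the tesseract executable is installed and on PATH.
        if shutil.which("tesseract") is None:
            raise RuntimeError("tesseract executable not found on PATH")
        result = subprocess.run(
            build_tesseract_command(image_path, language),
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode the output as str
            check=True,           # raise CalledProcessError on a non-zero exit
        )
        return result.stdout
    ```

    Calling extract_text("scan.png") on a hypothetical scanned image would return whatever
    text Tesseract recognizes; an integration would then need to make that text available
    to the repository's full-text index.
    -->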
    <item>
      <title>Re: OCR for images, pdfs etc</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297045#M250175</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Adding more information on OCR in Alfresco.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://www.krutikjayswal.com/2016/07/ocr-on-pdf-file-in-alfresco.html" rel="nofollow noopener noreferrer"&gt;http://www.krutikjayswal.com/2016/07/ocr-on-pdf-file-in-alfresco.html&lt;/A&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 31 Jul 2016 18:10:36 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/ocr-for-images-pdfs-etc/m-p/297045#M250175</guid>
      <dc:creator>krutik_jayswal</dc:creator>
      <dc:date>2016-07-31T18:10:36Z</dc:date>
    </item>
  </channel>
</rss>

