<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Can Alfresco be used to manage scanned paper documents? in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14989#M6531</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;But wouldn't it be nice to integrate open source OCR software into alfresco? That would be completely into the Alfresco philosophy and would save a lot of money for medium sized business without large volume production scanners. I can imagine flat bed scanners at every medium sized department.&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;&lt;SPAN&gt;Sorry to resurrect an old thread, but I am trying to achieve this, and it does not look very difficult.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;I wrote a few lines of code to add invisible text to an existing PDF (using the open source Java PDFBox library).&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;So now I guess I have all of the pieces, and it becomes an Alfresco question: how best to architect this?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Maybe an Alfresco action that calls Tesseract via the command line and then inserts the OCR'd text into the PDF?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Or the same thing as a transformer?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks for any feedback!&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Nicolas Raoul&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 28 Feb 2011 08:38:37 GMT</pubDate>
    <dc:creator>nicolasraoul</dc:creator>
    <dc:date>2011-02-28T08:38:37Z</dc:date>
    <item>
      <title>Can Alfresco be used to manage scanned paper documents?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14980#M6522</link>
      <description>We are a small accounting firm looking at a document management solution to move towards the 'less paper' office. Alfresco looks great as a broader CMS. Could you outline why Alfresco would be suitable for managing scanned-in documents, along the normal lines as required for any office's genera</description>
      <pubDate>Tue, 02 May 2006 04:20:43 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14980#M6522</guid>
      <dc:creator>evolve2k</dc:creator>
      <dc:date>2006-05-02T04:20:43Z</dc:date>
    </item>
    <item>
      <title>Re: Can Alfresco be used to manage scanned paper documents?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14981#M6523</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I, too, need this functionality. Is this currently possible?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks,&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Sean&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 18 Jul 2006 22:03:18 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14981#M6523</guid>
      <dc:creator>seanh</dc:creator>
      <dc:date>2006-07-18T22:03:18Z</dc:date>
    </item>
    <item>
      <title>Re: Can Alfresco be used to manage scanned paper documents?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14982#M6524</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Alfresco integrates with Kofax and eCopy, leading scanning and capture solutions.&amp;nbsp; This means that scanned documents can be added to Alfresco automatically.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Alfresco can then categorise and file those documents according to user-defined "Rules".&amp;nbsp; These are like Inbox rules in MS Office.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Search can be performed against the content, content meta-data, folder location and category.&amp;nbsp; The scanning solutions can extract important values from scanned documents, which may be used as content meta-data for advanced Alfresco searches, e.g. find the tax office letter with reference number 12456.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Workflows or notifications (such as an e-mail) may be triggered on addition of new content, or an RSS feed may be subscribed to.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;All of this is available by configuring scanning integration and rules.&amp;nbsp; No coding is required.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&lt;SPAN&gt;I suggest you send an e-mail to &lt;/SPAN&gt;&lt;A class="jive-link-email-small" href="mailto:info@alfresco.com" rel="nofollow noopener noreferrer"&gt;info@alfresco.com&lt;/A&gt;&lt;SPAN&gt; with your requirements, so that more information about how to get the scanning integration can be made available.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 19 Jul 2006 22:29:58 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14982#M6524</guid>
      <dc:creator>davidc</dc:creator>
      <dc:date>2006-07-19T22:29:58Z</dc:date>
    </item>
    <item>
      <title>Re: Can Alfresco be used to manage scanned paper documents?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14983#M6525</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi David&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Are you aware of anything similar to the Kofax release script planned or available for Nuance's OmniPage Professional 15?&amp;nbsp; I see that Nuance are not on your partners list.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;thanks&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Jason&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 04 Aug 2006 08:13:27 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14983#M6525</guid>
      <dc:creator>jharrop</dc:creator>
      <dc:date>2006-08-04T08:13:27Z</dc:date>
    </item>
    <item>
      <title>Re: Can Alfresco be used to manage scanned paper documents?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14984#M6526</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Can anybody tell us how a Kofax Ascent Capture integration would affect the cost of an Alfresco implementation?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Is it licensed by scanning workstation?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;************************&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Is Alfresco going to consider implementing capture features in the near future?&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Apr 2007 03:16:28 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14984#M6526</guid>
      <dc:creator>othni</dc:creator>
      <dc:date>2007-04-24T03:16:28Z</dc:date>
    </item>
    <item>
      <title>Re: Can Alfresco be used to manage scanned paper documents?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14985#M6527</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I work for a Kofax partner. Kofax licenses their software by a combination of workstations required and pages scanned. The software cost is determined by those variables. The cost of installation and professional services really depends on how much configuration is required. You could look at 5K+ for an install. The release script is normally supplied by Kofax, if it is an official Kofax supported release script. I think that the 2.0 version of Alfresco includes the release scripts as part of the download. &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;This should give you some idea of how the cost would be affected.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Apr 2007 13:37:14 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14985#M6527</guid>
      <dc:creator>rbelisle</dc:creator>
      <dc:date>2007-04-24T13:37:14Z</dc:date>
    </item>
    <item>
      <title>Re: Can Alfresco be used to manage scanned paper documents?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14986#M6528</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I recently purchased the "Alfresco book" by Munwar Shariff, and there is a chapter (13) dedicated to Implementing Imaging and Forms Processing, which runs through an example showing how a French bank scans and processes 20,000 documents / hour… nice. &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I recommend you have a read of this book.&amp;nbsp; I cannot comment any further because I haven't implemented any scanning yet and I do not want to infringe on the book's copyright protection. &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Cheers,&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Bradley&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 25 Apr 2007 03:32:39 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14986#M6528</guid>
      <dc:creator>bsawler</dc:creator>
      <dc:date>2007-04-25T03:32:39Z</dc:date>
    </item>
    <item>
      <title>Re: Can Alfresco be used to manage scanned paper documents?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14987#M6529</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;But wouldn't it be nice to integrate open source OCR software into alfresco? That would be completely into the Alfresco philosophy and would save a lot of money for medium sized business without large volume production scanners. I can imagine flat bed scanners at every medium sized department.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;eg:&lt;/SPAN&gt;&lt;BR /&gt;&lt;A href="http://code.google.com/p/tesseract-ocr/" rel="nofollow noopener noreferrer"&gt;http://code.google.com/p/tesseract-ocr/&lt;/A&gt;&lt;BR /&gt;&lt;A href="http://code.google.com/p/ocropus/" rel="nofollow noopener noreferrer"&gt;http://code.google.com/p/ocropus/&lt;/A&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 26 Dec 2007 15:27:08 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14987#M6529</guid>
      <dc:creator>rscheele</dc:creator>
      <dc:date>2007-12-26T15:27:08Z</dc:date>
    </item>
    <item>
      <title>Re: Can Alfresco be used to manage scanned paper documents?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14988#M6530</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;But wouldn't it be nice to integrate open source OCR software into alfresco? That would be completely into the Alfresco philosophy and would save a lot of money for medium sized business without large volume production scanners. I can imagine flat bed scanners at every medium sized department.&lt;BR /&gt;&lt;BR /&gt;eg:&lt;BR /&gt;&lt;A href="http://code.google.com/p/tesseract-ocr/" rel="nofollow noopener noreferrer"&gt;http://code.google.com/p/tesseract-ocr/&lt;/A&gt;&lt;BR /&gt;&lt;A href="http://code.google.com/p/ocropus/" rel="nofollow noopener noreferrer"&gt;http://code.google.com/p/ocropus/&lt;/A&gt;&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; I don't think there's yet a Tesseract-based package to create a searchable PDF, or at least not one that's free.&amp;nbsp; (I have my suspicions that ScanWiz may be Tesseract-based).&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; I guess Alfresco could index PDFs without making them searchable, using Tesseract in the way that DocMGR does: &lt;/SPAN&gt;&lt;A href="http://docmgr.sourceforge.net/install.php" rel="nofollow noopener noreferrer"&gt;http://docmgr.sourceforge.net/install.php&lt;/A&gt;&lt;SPAN&gt; .&amp;nbsp; Still, the real solution is to make the PDFs searchable in the first place, and then Alfresco would index them quite happily.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; As an aside, it would be nice if Alfresco could pass search words to Acrobat Reader so that PDFs open with the search words already highlighted.&amp;nbsp; This can be done through Acrobat Reader's "Open Parameters" via a URL.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 29 Dec 2007 15:30:56 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14988#M6530</guid>
      <dc:creator>pav5088</dc:creator>
      <dc:date>2007-12-29T15:30:56Z</dc:date>
    </item>
    <item>
      <title>Re: Can Alfresco be used to manage scanned paper documents?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14989#M6531</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;But wouldn't it be nice to integrate open source OCR software into alfresco? That would be completely into the Alfresco philosophy and would save a lot of money for medium sized business without large volume production scanners. I can imagine flat bed scanners at every medium sized department.&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;&lt;SPAN&gt;Sorry to resurrect an old thread, but I am trying to achieve this, and it does not look very difficult.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;I wrote a few lines of code to add invisible text to an existing PDF (using the open source Java PDFBox library).&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;So now I guess I have all of the pieces, and it becomes an Alfresco question: how best to architect this?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Maybe an Alfresco action that calls Tesseract via the command line and then inserts the OCR'd text into the PDF?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Or the same thing as a transformer?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks for any feedback!&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Nicolas Raoul&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 28 Feb 2011 08:38:37 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/can-alfresco-be-used-to-manage-scanned-paper-documents/m-p/14989#M6531</guid>
      <dc:creator>nicolasraoul</dc:creator>
      <dc:date>2011-02-28T08:38:37Z</dc:date>
    </item>
  </channel>
</rss>

