<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Is there any limit to text properties? in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231369#M184499</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I'm not sure you have to add a new property to your model. If your model inherits from the Alfresco out-of-the-box cm:content model, there is already a cm:content property. &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;How do you add your files to the repository? If you use the "add content" action, then the content of your PDF file is automatically added to the cm:content property. If you use a Java class to programmatically add new content, please see the &lt;/SPAN&gt;&lt;A href="http://wiki.alfresco.com/wiki/Introducing_the_Alfresco_Java_Content_Repository_API#Adding_Content" rel="nofollow noopener noreferrer"&gt;Introduction to the Alfresco Java Content Repository API&lt;/A&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 11 Oct 2010 13:06:12 GMT</pubDate>
    <dc:creator>ethan</dc:creator>
    <dc:date>2010-10-11T13:06:12Z</dc:date>
    <item>
      <title>Is there any limit to text properties?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231365#M184495</link>
      <description>I have PDF containing (multi-page) scanned text document and XML containing its OCR output. Obviously I need the text data to be searchable. With my near-zero experience with Alfresco it would be much easier to store the text-data in a property of that PDF's aspect. (And to throw the XML away comple</description>
      <pubDate>Mon, 11 Oct 2010 08:51:40 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231365#M184495</guid>
      <dc:creator>jzaruba</dc:creator>
      <dc:date>2010-10-11T08:51:40Z</dc:date>
    </item>
    <item>
      <title>Re: Is there any limit to text properties?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231366#M184496</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi &lt;img id="smileyhappy" class="emoticon emoticon-smileyhappy" src="https://connect.hyland.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I tried the same approach, but there actually is a limit on text properties: one cannot contain more than ~65 000 characters. The best way to make a document's text searchable is to put it into the cm:content property as a stream.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 11 Oct 2010 10:37:47 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231366#M184496</guid>
      <dc:creator>ethan</dc:creator>
      <dc:date>2010-10-11T10:37:47Z</dc:date>
    </item>
    <item>
      <title>Re: Is there any limit to text properties?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231367#M184497</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Thank you for the reply…&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;…I will look more into the cm:content type, I did not know about it, or that it can provide a stream.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks!&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 11 Oct 2010 10:59:32 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231367#M184497</guid>
      <dc:creator>jzaruba</dc:creator>
      <dc:date>2010-10-11T10:59:32Z</dc:date>
    </item>
    <item>
      <title>Re: Is there any limit to text properties?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231368#M184498</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;May I ask what the proper way of populating such a property in Java is?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;I've defined the property in my aspect…&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;–&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;lt;property name="com:textContent"&amp;gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;type&amp;gt;cm:content&amp;lt;/type&amp;gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;lt;/property&amp;gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;–&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Should I store (and possibly hide) a text file/document somewhere and then pass its NodeRef as the property value?&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 11 Oct 2010 12:35:21 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231368#M184498</guid>
      <dc:creator>jzaruba</dc:creator>
      <dc:date>2010-10-11T12:35:21Z</dc:date>
    </item>
    <item>
      <title>Re: Is there any limit to text properties?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231369#M184499</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I'm not sure you have to add a new property to your model. If your model inherits from the Alfresco out-of-the-box cm:content model, there is already a cm:content property. &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;How do you add your files to the repository? If you use the "add content" action, then the content of your PDF file is automatically added to the cm:content property. If you use a Java class to programmatically add new content, please see the &lt;/SPAN&gt;&lt;A href="http://wiki.alfresco.com/wiki/Introducing_the_Alfresco_Java_Content_Repository_API#Adding_Content" rel="nofollow noopener noreferrer"&gt;Introduction to the Alfresco Java Content Repository API&lt;/A&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 11 Oct 2010 13:06:12 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231369#M184499</guid>
      <dc:creator>ethan</dc:creator>
      <dc:date>2010-10-11T13:06:12Z</dc:date>
    </item>
    <item>
      <title>Re: Is there any limit to text properties?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231370#M184500</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Thanks for your time (and patience), ethan.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;I'm not sure you have to add a new property to your model. If your model inherits from the Alfresco out-of-the-box cm:content model, there is already a cm:content property.&lt;BR /&gt;&lt;BR /&gt;How do you add your files to the repository? If you use the "add content" action,&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;&lt;SPAN&gt;That's the case at the moment. I'm adding the PDF files via the Web Client UI.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;then the content of your PDF file is automatically added to the cm:content property.&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;&lt;SPAN&gt;Wouldn't assigning the text content to the cm:content property then result in &lt;/SPAN&gt;&lt;STRONG&gt;loss of the PDF binary data&lt;/STRONG&gt;&lt;SPAN&gt;? (Or is the data in the cm:content property a mere (quite useless) copy of the PDF file that Alfresco keeps in its filesystem?)&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;if you use a Java class to programmatically add new content, please see the &lt;A href="http://wiki.alfresco.com/wiki/Introducing_the_Alfresco_Java_Content_Repository_API#Adding_Content" rel="nofollow noopener noreferrer"&gt;Introduction to the Alfresco Java Content Repository API&lt;/A&gt;.&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;&lt;SPAN&gt;As far as I can see, the examples show two ways of assigning a value to the cm:content property: either by passing a string value (which I guess bears the 65k limitation) or by passing a stream, as you mentioned earlier. If my understanding is correct, I somehow need to obtain a stream for an existing NodeRef (the OCR output, which I'd have to extract out of the XML) and assign it to cm:content…&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 11 Oct 2010 14:25:02 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231370#M184500</guid>
      <dc:creator>jzaruba</dc:creator>
      <dc:date>2010-10-11T14:25:02Z</dc:date>
    </item>
    <item>
      <title>Re: Is there any limit to text properties?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231371#M184501</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;The binary of the PDF file is inside the cm:content property and Alfresco can search within it. So if you just add your file with the "add content" action, you should be able to search for the text which is inside the XML in your PDF file. Did you try it? &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;With the Java Content Repository API, you can indeed add content with the method Node.setProperty("cm:content", "my new content"), but I think you also need to specify a MIME type. I'm not sure you can put a simple string inside the cm:content property and then search for it with Alfresco.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 11 Oct 2010 15:45:15 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231371#M184501</guid>
      <dc:creator>ethan</dc:creator>
      <dc:date>2010-10-11T15:45:15Z</dc:date>
    </item>
    <item>
      <title>Re: Is there any limit to text properties?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231372#M184502</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;The binary of the PDF file is inside the cm:content property and Alfresco can search within it. So if you just add your file with the "add content" action, you should be able to search for &lt;SPAN style="color:#FF0000;"&gt;the text which is inside the XML in your PDF file&lt;/SPAN&gt;. Did you try it?&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;&lt;SPAN&gt;But how do I link the XML into my PDF? (I also took a look at the actions that are available for the already uploaded PDF file, but I don't see anything that would let me create such a link.)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;I need to be able to do this using the API anyway, but I guess I must be missing something important here…&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Just to be sure which "Add content" action you mean, this is where I upload the PDF:&lt;/SPAN&gt;&lt;BR /&gt;&lt;A href="http://dl.dropbox.com/u/219075/Alfresco2.PNG" rel="nofollow noopener noreferrer"&gt;http://dl.dropbox.com/u/219075/Alfresco2.PNG&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;With the Java Content Repository API, you can indeed add content with the method Node.setProperty("cm:content", "my new content"), but I think you also need to specify a MIME type. I'm not sure you can put a simple string inside the cm:content property and then search for it with Alfresco.&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;&lt;SPAN&gt;My understanding was you were actually assigning the stream into cm:content. 
Weren't you?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Anyways, I guess you're right about the mime-type, this is what I suppose should go into cm:content:&lt;/SPAN&gt;&lt;BR /&gt;&lt;A href="http://wiki.alfresco.com/wiki/Data_Dictionary_Guide#Data_Types" rel="nofollow noopener noreferrer"&gt;http://wiki.alfresco.com/wiki/Data_Dictionary_Guide#Data_Types&lt;/A&gt;&lt;BR /&gt;&lt;STRONG&gt;ContentData(java.lang.String contentUrl, java.lang.String mimetype, long size, java.lang.String encoding)&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Oh and BTW, this attempt of mine did not pass anyways &lt;img id="smileyhappy" class="emoticon emoticon-smileyhappy" src="https://connect.hyland.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;–&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;lt;property name="com:textContent"&amp;gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;type&amp;gt;cm:content&amp;lt;/type&amp;gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;lt;/property&amp;gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;–&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 12 Oct 2010 07:38:28 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231372#M184502</guid>
      <dc:creator>jzaruba</dc:creator>
      <dc:date>2010-10-12T07:38:28Z</dc:date>
    </item>
    <item>
      <title>Re: Is there any limit to text properties?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231373#M184503</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;–&lt;BR /&gt;&amp;lt;property name="com:textContent"&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;type&amp;gt;cm:content&amp;lt;/type&amp;gt;&lt;BR /&gt;&amp;lt;/property&amp;gt;&lt;BR /&gt;–&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;&lt;SPAN&gt;This is because the type must be d:content, not cm:content &lt;img id="smileywink" class="emoticon emoticon-smileywink" src="https://connect.hyland.com/i/smilies/16x16_smiley-wink.png" alt="Smiley Wink" title="Smiley Wink" /&gt; (Look at the /alfresco/WEB-INF/classes/alfresco/model/contentModel.xml file).&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Sorry, my mistake, I thought your XML was already inside the PDF file u_u. You could take a look at the &lt;/SPAN&gt;&lt;A href="http://wiki.alfresco.com/wiki/Metadata_Extraction" rel="nofollow noopener noreferrer"&gt;metadata extractors&lt;/A&gt;&lt;SPAN&gt;, which are called after the file is uploaded to Alfresco. Maybe you could implement your own extractor to parse the XML file associated with your PDF file and modify the cm:content property of your PDF file node.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;You could also check &lt;/SPAN&gt;&lt;A href="http://wiki.alfresco.com/wiki/Content_Transformations" rel="nofollow noopener noreferrer"&gt;Content transformation&lt;/A&gt;&lt;SPAN&gt; and OCR integration (&lt;/SPAN&gt;&lt;A href="http://wiki.alfresco.com/wiki/Tiger_OCR_integration" rel="nofollow noopener noreferrer"&gt;here&lt;/A&gt;&lt;SPAN&gt; and &lt;/SPAN&gt;&lt;A href="http://www.intelliant.fr/en/alfresco-ocr-bundle.php" rel="nofollow noopener noreferrer"&gt;there&lt;/A&gt;&lt;SPAN&gt;). &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;As I'm not skilled enough with this part of the Alfresco process, I can't provide more precise information =( Hope it'll help though.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 12 Oct 2010 08:53:58 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231373#M184503</guid>
      <dc:creator>ethan</dc:creator>
      <dc:date>2010-10-12T08:53:58Z</dc:date>
    </item>
    <item>
      <title>Re: Is there any limit to text properties?</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231374#M184504</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;&lt;BLOCKQUOTE class="jive-quote"&gt;As I'm not skilled enough with this part of the Alfresco process, I can't provide more precise information =( Hope it'll help though.&lt;/BLOCKQUOTE&gt;&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks for your time &amp;amp; effort, I'm gonna look into it.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Cheers&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; JZ&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 12 Oct 2010 09:03:46 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/is-there-any-limit-to-text-properties/m-p/231374#M184504</guid>
      <dc:creator>jzaruba</dc:creator>
      <dc:date>2010-10-12T09:03:46Z</dc:date>
    </item>
  </channel>
</rss>

