<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Find a NodeRef from File name in Alfresco Forum</title>
    <link>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115028#M31931</link>
    <description>&lt;P&gt;Deleting Alfresco files directly in the contetstore folder is very bad practice. You should handle this directory like a black box and access files thru the Alfresco API only or would you suggest to run a virus scanner directly on database files?&lt;/P&gt;&lt;P&gt;We developed an Alfresco module (&lt;A href="https://www.ecm-market.de/ecm4u-antivirus-alfresco.html?___store=english" target="_blank" rel="noopener nofollow noreferrer"&gt;ecm4u Antivirus&lt;/A&gt;) which hooks in Alfresco's IO to scan files on write and to prevent read access on content read until successful scan finished using virus scanner API (in most cases CLI). Infected nodes gets an aspect which handles quarantine function inside Alfresco and all the required processing (alarming, notifcation, batch API for cleanup, rescanning, handling false alarming).&lt;/P&gt;&lt;P&gt;Since you already may have deleted bin files you then have an inconsistent repository and need to fix that in an unsupported way. Please take into account that only binary content for nodes in workspace://Spacestore/ can be easily handled. For the other stores and for other binary data you may need to code low level java to get your data fixed (e.g. delete a single document version). Here is the sql we use in our consistency check tool to identify nodes in a specific workspace from bin file name. Since only the content_url_short column is indexed you need to cut the last 12 chars from the filename. Additionally we created an db index. The query shouldn't run longer than a second - even in big systems:&lt;/P&gt;&lt;PRE&gt;select
    an.id, concat('$WORKSPACE_PROT',an.uuid)
from
    alf_content_url cu
    join alf_content_data cd ON (cd.content_url_id=cu.id)
    join alf_node_properties as anp on ( anp.long_value=cd.id AND anp.qname_id = $CONTENT_QN_ID)
    join alf_node an ON (anp.node_id = an.id AND an.store_id=$WORKSPACE_ID)
where cu.content_url_short='$CONTENTURL_SHORT'&lt;/PRE&gt;&lt;P&gt;Hope that helps&lt;/P&gt;</description>
    <pubDate>Mon, 03 Feb 2020 19:24:16 GMT</pubDate>
    <dc:creator>heiko_robert</dc:creator>
    <dc:date>2020-02-03T19:24:16Z</dc:date>
    <item>
      <title>Find a NodeRef from File name</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115027#M31930</link>
      <description>&lt;P&gt;&lt;SPAN&gt;I am&amp;nbsp;facing&amp;nbsp;one&amp;nbsp;issue&amp;nbsp;getting&amp;nbsp;the&amp;nbsp;nodeRef&amp;nbsp;from bin file name for some further&amp;nbsp;actions&amp;nbsp;.&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp; I&amp;nbsp;know&amp;nbsp;the&amp;nbsp;database&amp;nbsp;SQL&amp;nbsp;statement&amp;nbsp;to&amp;nbsp;find&amp;nbsp;the&amp;nbsp;nodeRef&amp;nbsp;based on&amp;nbsp;file&amp;nbsp;path/name&amp;nbsp;(.Bin)&amp;nbsp;from&amp;nbsp;content store&amp;nbsp;which&amp;nbsp;takes&amp;nbsp;around 5&amp;nbsp;minutes&amp;nbsp;to&amp;nbsp;get&amp;nbsp;one&amp;nbsp;&lt;/SPAN&gt;NodeRef.&lt;BR /&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Reason&amp;nbsp;for NodeRef&amp;nbsp;-&amp;nbsp;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;I have a anti-virus software installed on content store which scan all the contents on the disk, if it finds any infected files, it will delete the files from the disk.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;As per organisation policy, It must get&amp;nbsp;deleted, but&amp;nbsp;it&amp;nbsp;will&amp;nbsp;update&amp;nbsp;the&amp;nbsp;application/alfresco repo that this&amp;nbsp;file&amp;nbsp;have been deleted, I have&amp;nbsp;developed&amp;nbsp;the&amp;nbsp;endpoint/web script&amp;nbsp;which will be&amp;nbsp;called&amp;nbsp;from&amp;nbsp;anti-virus&amp;nbsp;software with&amp;nbsp;&lt;SPAN class="s-rg-t"&gt;*&lt;/SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;Bin file name from content store.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Does anybody have any idea to do it in a better way or suggestions?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;-D&lt;/P&gt;</description>
      <pubDate>Mon, 03 Feb 2020 14:34:11 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115027#M31930</guid>
      <dc:creator>dharmendra_pan3</dc:creator>
      <dc:date>2020-02-03T14:34:11Z</dc:date>
    </item>
    <item>
      <title>Re: Find a NodeRef from File name</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115028#M31931</link>
      <description>&lt;P&gt;Deleting Alfresco files directly in the contetstore folder is very bad practice. You should handle this directory like a black box and access files thru the Alfresco API only or would you suggest to run a virus scanner directly on database files?&lt;/P&gt;&lt;P&gt;We developed an Alfresco module (&lt;A href="https://www.ecm-market.de/ecm4u-antivirus-alfresco.html?___store=english" target="_blank" rel="noopener nofollow noreferrer"&gt;ecm4u Antivirus&lt;/A&gt;) which hooks in Alfresco's IO to scan files on write and to prevent read access on content read until successful scan finished using virus scanner API (in most cases CLI). Infected nodes gets an aspect which handles quarantine function inside Alfresco and all the required processing (alarming, notifcation, batch API for cleanup, rescanning, handling false alarming).&lt;/P&gt;&lt;P&gt;Since you already may have deleted bin files you then have an inconsistent repository and need to fix that in an unsupported way. Please take into account that only binary content for nodes in workspace://Spacestore/ can be easily handled. For the other stores and for other binary data you may need to code low level java to get your data fixed (e.g. delete a single document version). Here is the sql we use in our consistency check tool to identify nodes in a specific workspace from bin file name. Since only the content_url_short column is indexed you need to cut the last 12 chars from the filename. Additionally we created an db index. The query shouldn't run longer than a second - even in big systems:&lt;/P&gt;&lt;PRE&gt;select
    an.id, concat('$WORKSPACE_PROT',an.uuid)
from
    alf_content_url cu
    join alf_content_data cd ON (cd.content_url_id=cu.id)
    join alf_node_properties as anp on ( anp.long_value=cd.id AND anp.qname_id = $CONTENT_QN_ID)
    join alf_node an ON (anp.node_id = an.id AND an.store_id=$WORKSPACE_ID)
where cu.content_url_short='$CONTENTURL_SHORT'&lt;/PRE&gt;&lt;P&gt;Hope that helps&lt;/P&gt;</description>
      <pubDate>Mon, 03 Feb 2020 19:24:16 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115028#M31931</guid>
      <dc:creator>heiko_robert</dc:creator>
      <dc:date>2020-02-03T19:24:16Z</dc:date>
    </item>
    <item>
      <title>Re: Find a NodeRef from File name</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115029#M31932</link>
      <description>&lt;P&gt;Thanks for your quick reply and SQL,&lt;/P&gt;&lt;P&gt;We have already have the custom module developed for virus scan which trigger the policy on create or update of the contents.&lt;/P&gt;&lt;P&gt;&lt;A href="https://github.com/Redpill-Linpro/alfresco-clamav" target="_blank" rel="nofollow noopener noreferrer"&gt;https://github.com/Redpill-Linpro/alfresco-clamav&lt;/A&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;From policy, making call to anti-virus API to scan the contents of node and based on status you can remove/keep the noderef.&lt;/P&gt;&lt;P&gt;We have a different business requirements, just to empty the contents or do some more operations further.&lt;/P&gt;&lt;P&gt;-D&lt;/P&gt;</description>
      <pubDate>Tue, 04 Feb 2020 08:20:39 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115029#M31932</guid>
      <dc:creator>dharmendra_pan3</dc:creator>
      <dc:date>2020-02-04T08:20:39Z</dc:date>
    </item>
    <item>
      <title>Re: Find a NodeRef from File name</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115030#M31933</link>
      <description>&lt;P&gt;Hello again,&lt;/P&gt;&lt;P&gt;I try to run the SQL that you provided but having problem finding&amp;nbsp;$CONTENT_QN_ID, Could you please help here.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;select an.id, concat('$WORKSPACE_PROT',an.uuid)&lt;BR /&gt;from alf_content_url cu&lt;BR /&gt;join alf_content_data cd ON (cd.content_url_id=cu.id)&lt;BR /&gt;join alf_node_properties as anp on ( anp.long_value=cd.id AND anp.qname_id = $CONTENT_QN_ID)&lt;BR /&gt;join alf_node an ON (anp.node_id = an.id AND an.store_id=(select id from alf_store where protocol='workspace' AND identifier='SpacesStore'))&lt;BR /&gt;where cu.content_url=(SELECT SUBSTRING ('91a6f583-5530-48e0-a14f-f04e1fb83dd5.bin', length('91a6f583-5530-48e0-a14f-f04e1fb83dd5.bin')-11))&lt;/P&gt;</description>
      <pubDate>Tue, 04 Feb 2020 09:30:46 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115030#M31933</guid>
      <dc:creator>dharmendra_pan3</dc:creator>
      <dc:date>2020-02-04T09:30:46Z</dc:date>
    </item>
    <item>
      <title>Re: Find a NodeRef from File name</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115031#M31934</link>
      <description>&lt;P&gt;sorry - that part was missing:&lt;/P&gt;&lt;P&gt;We preselct the qname id for the content property to filter on. To speed up the query it is queried only once and is not part of the join&lt;/P&gt;&lt;PRE&gt;SQL_CONTENT_QN_ID="SELECT qn.id
FROM alf_qname qn
JOIN alf_namespace ns ON (ns.uri='http://www.alfresco.org/model/content/1.0' AND qn.ns_id=ns.id)
WHERE qn.local_name='content';"&lt;/PRE&gt;</description>
      <pubDate>Wed, 05 Feb 2020 20:40:07 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115031#M31934</guid>
      <dc:creator>heiko_robert</dc:creator>
      <dc:date>2020-02-05T20:40:07Z</dc:date>
    </item>
    <item>
      <title>Re: Find a NodeRef from File name</title>
      <link>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115032#M31935</link>
      <description>&lt;P&gt;But you or your AV solution should never ever directly modify/remove the binary files. Your "different business requirements" should be implemented in the mentioned policy code and/or be implemented using the Alfresco APIs.&lt;/P&gt;&lt;P&gt;BTW clamAV has a very bad &lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;detection&lt;/SPAN&gt; &lt;SPAN class=""&gt;rate&lt;/SPAN&gt;&lt;/SPAN&gt; and you should rely on better AV scanners.&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2020 20:48:47 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-forum/find-a-noderef-from-file-name/m-p/115032#M31935</guid>
      <dc:creator>heiko_robert</dc:creator>
      <dc:date>2020-02-05T20:48:47Z</dc:date>
    </item>
  </channel>
</rss>

