Can I store only metadata in Alfresco?

dynamolalit
Champ on-the-rise
Hi,

I have some records like the ones below, and there are millions of them. They can be considered metadata fields.

TD|TID|Title
5869|47737|Usual Suspects
6406|47738|Fargo
6140|47739|Dead Man Walking
5019|47740|Four Weddings And A Funeral

Now I have a requirement to store all of these in the Alfresco repository. Usually metadata is attached to a file (a text file, for example), but here I have no file, only metadata. Is there any way to store this kind of item in the Alfresco repository?

Further, some of the records are interconnected, so my second question is: can I store an object (say, a POJO) as the value of a metadata attribute on a content item?

As I already mentioned, there are millions of such records, so I am looking for the most performance-efficient way to do this.

Any help is appreciated.

Regards.
8 REPLIES

afaust
Legendary Innovator
Hello,

Alfresco allows you to store metadata-only objects. These are technically no different from files, except that they do not have a content property. For example, the items in a Share data list, tags, categories, or even the identity of a specific user are usually just metadata-only objects. When you model a type that should never be considered a file, you can extend it from cm:cmobject instead of the common cm:content.

In terms of supporting millions of objects, it depends entirely on the organizational structure of your objects and on how you want to process or surface them in the UI. We have implemented metadata-only object collections for several customers, including multi-million object collections (the largest an estimated 5 million objects per object set).

In terms of interconnected records, you have the option of storing references via actual relation / association instances or by linking identities via properties. You can also include complex objects in properties, but this may require significant customization depending on the specific use case, actions and UI.
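
As a minimal sketch of the first two options, the fragment below shows what an association and an identity-linking property might look like inside a custom type definition. The prefix `my`, the type `my:record`, and the names `my:relatedRecords` and `my:relatedRecordIds` are hypothetical and only serve as illustration; the fragment assumes it is placed inside the `<type>` element of a model that declares the `my` namespace and imports the dictionary model as `d`.

```xml
<!-- Hypothetical fragment: both elements belong inside a <type name="my:record"> definition -->
<properties>
   <!-- Linking identities via a property: stores the business IDs of related records -->
   <property name="my:relatedRecordIds">
      <type>d:int</type>
      <mandatory>false</mandatory>
      <multiple>true</multiple>
   </property>
</properties>
<associations>
   <!-- An actual association: many records can point to many other records -->
   <association name="my:relatedRecords">
      <source>
         <mandatory>false</mandatory>
         <many>true</many>
      </source>
      <target>
         <class>my:record</class>
         <mandatory>false</mandatory>
         <many>true</many>
      </target>
   </association>
</associations>
```

Roughly speaking, an association is kept consistent by the repository itself, while a plain ID property is cheaper to write in bulk but leaves resolving the link to your own queries. Which one fits better depends on how the links will be created and read.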

Regards
Axel

dynamolalit
Champ on-the-rise
Thanks Axel, it was helpful.

I got the point about creating a model directly from cm:cmobject, without inheriting from the cm:content type. But as I have millions of records, I am nervous about performance. As far as I know, a single folder in the repository should ideally hold about 2000 content items, which does not fit millions of records. So what would be the best organizational structure here?

Also, how can I render these in the Share UI? Finally, for associations, can you suggest the best approach?

Can you help here?

Regards.

dynamolalit
Champ on-the-rise
Hi,


I have created a new model as below.


<?xml version="1.0" encoding="UTF-8"?>
<model name="ucs:ucsmodel" xmlns="http://www.alfresco.org/model/dictionary/1.0"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.alfresco.org/model/dictionary/1.0/modelSchema.xsd">

   <!-- Imports are required to allow references to definitions in other models -->
   <imports>
      <!-- Import Alfresco Dictionary Definitions -->
      <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d"/>
      <!-- Import Alfresco Content Domain Model Definitions -->
      <import uri="http://www.alfresco.org/model/content/1.0" prefix="cm"/>
      <!-- Import Alfresco Data List Model Definitions -->
      <import uri="http://www.alfresco.org/model/datalist/1.0" prefix="dl"/>
   </imports>

   <!-- Introduction of new namespaces defined by this model -->
   <namespaces>
      <namespace uri="http://www.lalit.com/model/ucs/1.0" prefix="ucs"/>
   </namespaces>

   <!--      U C S   T Y P E   D E F I N I T I O N S      -->
   <types>
      <!-- Definition of new Content Type: UCS Object Type -->
      <type name="ucs:object">
         <title>UCS Object</title>
         <parent>cm:cmobject</parent>
         <properties>
            <property name="ucs:IVAID">
               <description>UCS IVAID</description>
               <type>d:int</type>
               <mandatory>true</mandatory>
            </property>
            <property name="ucs:TitleID">
               <description>UCS TitleID</description>
               <type>d:int</type>
               <mandatory>true</mandatory>
            </property>
            <property name="ucs:Title">
               <description>UCS Title</description>
               <type>d:text</type>
               <mandatory>true</mandatory>
            </property>
         </properties>
      </type>
   </types>
</model>

And share-config-custom.xml as below


<alfresco-config>
  
   <!-- Repository Library config section -->
   <config evaluator="string-compare" condition="RepositoryLibrary" replace="true">
      <!--
         Whether the link to the Repository Library appears in the header component or not.
      -->
      <visible>true</visible>
   </config>

   <config evaluator="string-compare" condition="Remote">
      <remote>
         <endpoint>
            <id>alfresco-noauth</id>
            <name>Alfresco - unauthenticated access</name>
            <description>Access to Alfresco Repository WebScripts that do not require authentication</description>
            <connector-id>alfresco</connector-id>
            <endpoint-url>http://localhost:8080/alfresco/s</endpoint-url>
            <identity>none</identity>
         </endpoint>

         <endpoint>
            <id>alfresco</id>
            <name>Alfresco - user access</name>
            <description>Access to Alfresco Repository WebScripts that require user authentication</description>
            <connector-id>alfresco</connector-id>
            <endpoint-url>http://localhost:8080/alfresco/s</endpoint-url>
            <identity>user</identity>
         </endpoint>

         <endpoint>
            <id>alfresco-feed</id>
            <name>Alfresco Feed</name>
            <description>Alfresco Feed - supports basic HTTP authentication via the EndPointProxyServlet</description>
            <connector-id>http</connector-id>
            <endpoint-url>http://localhost:8080/alfresco/s</endpoint-url>
            <basic-auth>true</basic-auth>
            <identity>user</identity>
         </endpoint>
        
         <endpoint>
            <id>activiti-admin</id>
            <name>Activiti Admin UI - user access</name>
            <description>Access to Activiti Admin UI, that requires user authentication</description>
            <connector-id>activiti-admin-connector</connector-id>
            <endpoint-url>http://localhost:8080/alfresco/activiti-admin</endpoint-url>
            <identity>user</identity>
         </endpoint>
      </remote>
   </config>

   <!-- Global config section -->
   <config replace="true">
      <flags>
         <!--
            Developer debugging setting to turn on DEBUG mode for client scripts in the browser
         -->
         <client-debug>false</client-debug>

         <!--
            LOGGING can always be toggled at runtime when in DEBUG mode (Ctrl, Ctrl, Shift, Shift).
            This flag automatically activates logging on page load.
         -->
         <client-debug-autologging>false</client-debug-autologging>
      </flags>
   </config>

   <!-- Document Library config section -->
   <config evaluator="string-compare" condition="DocumentLibrary">

      <tree>
         <!--
            Whether the folder Tree component should enumerate child folders or not.
            This is a relatively expensive operation, so should be set to "false" for Repositories with broad folder structures.
         -->
         <evaluate-child-folders>false</evaluate-child-folders>

         <!--
            Optionally limit the number of folders shown in treeview throughout Share.
         -->
         <maximum-folder-count>-1</maximum-folder-count>
      </tree>

      <!--
         Used by the "Change Type" action

         Define valid subtypes using the following example:
            <type name="cm:content">
               <subtype name="cm:mysubtype" />
            </type>

         Remember to also add the relevant i18n string(s):
            cm_mysubtype=My SubType
      -->
      <types>
         <type name="cm:content">
            <subtype name="ta:ta" />
            <subtype name="ucs:object" />
         </type>

         <type name="cm:folder">
         </type>
      </types>

      <create-content>
         <!--<content mimetype="text/xml" icon="xml" label="Audit-Node" itemId="ta:ta"  />-->
         <content id="html" label="UCS Object" type="pagelink" index="40">
            <!-- <param name="page">create-content?destination={nodeRef}&amp;itemId=ta:ta&amp;mimeType=text/html</param> -->
            <param name="page">create-content?destination={nodeRef}&amp;itemId=ucs:object&amp;mimeType=text/html</param>
         </content>
      </create-content>

   </config>





   <config evaluator="node-type" condition="ucs:object">
      <forms>
         <form>
            <field-visibility>
               <!-- add the fields we want back in so that they are displayed -->
               <show id="cm:name" />
               <!-- <show id="cm:description" force="true" /> -->

               <!-- fields from our custom type -->
               <!-- <show id="ta:auditName" /> -->
               <show id="ucs:IVAID" />
               <show id="ucs:TitleID" />
               <show id="ucs:Title" />
            </field-visibility>
            <appearance>
               <set id="ucsPanel" appearance="fieldset" label="UCS Fields" />

               <field id="cm:name" set="ucsPanel" label="Object Name" />
               <field id="ucs:IVAID" set="ucsPanel" label="Object IVAID" />
               <field id="ucs:TitleID" set="ucsPanel" label="Object TitleId" />
               <field id="ucs:Title" set="ucsPanel" label="Object Title" />
            </appearance>
         </form>
      </forms>
   </config>



   <!-- create model -->
   <config evaluator="model-type" condition="ucs:object">
      <forms>
         <form>
            <field-visibility>
               <!-- add the fields we want back in so that they are displayed -->
               <show id="cm:name" />
               <show id="ucs:IVAID" />
               <show id="ucs:TitleID" />
               <show id="ucs:Title" />
            </field-visibility>
            <appearance>
               <set id="ucsPanel" appearance="fieldset" label="UCS Fields" />
               <field id="cm:name" set="ucsPanel" label="Object Name" />
               <field id="ucs:IVAID" set="ucsPanel" label="Object IVAID" />
               <field id="ucs:TitleID" set="ucsPanel" label="Object TitleId" />
               <field id="ucs:Title" set="ucsPanel" label="Object Title" />
            </appearance>
         </form>
      </forms>
   </config>

      


</alfresco-config>


The model was deployed successfully and I was able to create ucs type content, but when I save the content, I get the message below in Share:

The item cannot be found. Either you do not have permissions to view the item, it has been removed or it never existed.

And the errors below in the log:


16:56:15,966 WARN  [org.alfresco.web.bean.LoginBean] Security violation. Unable to redirect to external location:
16:58:01,562 ERROR [org.springframework.extensions.webscripts.AbstractRuntime] Exception from executeScript - redirecting to status template error: 11050003 Wrapped Exception (with status template): 11050018 Failed to execute script 'classpath*:alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js': null
org.springframework.extensions.webscripts.WebScriptException: 11050003 Wrapped Exception (with status template): 11050018 Failed to execute script 'classpath*:alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js': null
   at org.springframework.extensions.webscripts.AbstractWebScript.createStatusException(AbstractWebScript.java:1050)
   at org.springframework.extensions.webscripts.DeclarativeWebScript.execute(DeclarativeWebScript.java:171)
   at org.alfresco.repo.web.scripts.RepositoryContainer$3.execute(RepositoryContainer.java:422)
   at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:433)
   at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecute(RepositoryContainer.java:491)
   at org.alfresco.repo.web.scripts.RepositoryContainer.transactionedExecuteAs(RepositoryContainer.java:529)
   at org.alfresco.repo.web.scripts.RepositoryContainer.executeScript(RepositoryContainer.java:345)
   at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:377)
   at org.springframework.extensions.webscripts.AbstractRuntime.executeScript(AbstractRuntime.java:209)
   at org.springframework.extensions.webscripts.servlet.WebScriptServlet.service(WebScriptServlet.java:118)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
   at org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(GlobalLocalizationFilter.java:61)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
   at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
   at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:929)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
   at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1002)
   at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:585)
   at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1813)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
Caused by: org.alfresco.scripts.ScriptException: 11050018 Failed to execute script 'classpath*:alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js': null
   at org.alfresco.repo.jscript.RhinoScriptProcessor.execute(RhinoScriptProcessor.java:202)
   at org.alfresco.repo.processor.ScriptServiceImpl.execute(ScriptServiceImpl.java:212)
   at org.alfresco.repo.processor.ScriptServiceImpl.executeScript(ScriptServiceImpl.java:174)
   at org.alfresco.repo.web.scripts.RepositoryScriptProcessor.executeScript(RepositoryScriptProcessor.java:102)
   at org.springframework.extensions.webscripts.AbstractWebScript.executeScript(AbstractWebScript.java:1288)
   at org.springframework.extensions.webscripts.DeclarativeWebScript.execute(DeclarativeWebScript.java:86)
   … 28 more
Caused by: java.lang.NullPointerException
   at org.alfresco.repo.jscript.app.JSONConversionComponent.setRootValues(JSONConversionComponent.java:188)
   at org.alfresco.repo.jscript.app.JSONConversionComponent.toJSON(JSONConversionComponent.java:162)
   at org.alfresco.repo.jscript.ApplicationScriptUtils.toJSON(ApplicationScriptUtils.java:69)
   at sun.reflect.GeneratedMethodAccessor481.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at org.mozilla.javascript.MemberBox.invoke(MemberBox.java:155)
   at org.mozilla.javascript.NativeJavaMethod.call(NativeJavaMethod.java:243)
   at org.mozilla.javascript.optimizer.OptRuntime.call2(OptRuntime.java:76)
   at org.mozilla.javascript.gen.c10._c2(file:/C:/alfresco/tomcat/webapps/alfresco/WEB-INF/classes/alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js:108)
   at org.mozilla.javascript.gen.c10.call(file:/C:/alfresco/tomcat/webapps/alfresco/WEB-INF/classes/alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js)
   at org.mozilla.javascript.optimizer.OptRuntime.call1(OptRuntime.java:66)
   at org.mozilla.javascript.gen.c10._c12(file:/C:/alfresco/tomcat/webapps/alfresco/WEB-INF/classes/alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js:590)
   at org.mozilla.javascript.gen.c10.call(file:/C:/alfresco/tomcat/webapps/alfresco/WEB-INF/classes/alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js)
   at org.mozilla.javascript.optimizer.OptRuntime.callName0(OptRuntime.java:108)
   at org.mozilla.javascript.gen.c10._c0(file:/C:/alfresco/tomcat/webapps/alfresco/WEB-INF/classes/alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js:654)
   at org.mozilla.javascript.gen.c10.call(file:/C:/alfresco/tomcat/webapps/alfresco/WEB-INF/classes/alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js)
   at org.mozilla.javascript.ContextFactory.doTopCall(ContextFactory.java:393)
   at org.mozilla.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:2834)
   at org.mozilla.javascript.gen.c10.call(file:/C:/alfresco/tomcat/webapps/alfresco/WEB-INF/classes/alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js)
   at org.mozilla.javascript.gen.c10.exec(file:/C:/alfresco/tomcat/webapps/alfresco/WEB-INF/classes/alfresco/templates/webscripts/org/alfresco/slingshot/documentlibrary-v2/node.get.js)
   at org.alfresco.repo.jscript.RhinoScriptProcessor.executeScriptImpl(RhinoScriptProcessor.java:492)
   at org.alfresco.repo.jscript.RhinoScriptProcessor.execute(RhinoScriptProcessor.java:198)
   … 33 more

Please help.

afaust
Legendary Innovator
Hello,

The NullPointerException in your log is the relevant information. Unfortunately, it seems that changes in Alfresco 4.x make it impossible to use cm:cmobject as a parent type for elements in the document library. This was not the case in the 3.x version I used previously. You now actually do need to use cm:content as the parent type. This is due to the internals of the changes for SOLR, CannedQuery and Document Library Extension support, where the code uses the FileFolderService, which requires all elements to be derived from either cm:content or cm:folder.

So, you need to change your model to reflect this and use cm:content as parent type.
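
Applied to the ucs:object type posted earlier in this thread, the change would presumably look like the sketch below. Only the `<parent>` element differs; everything else is copied unchanged from the original model.

```xml
<type name="ucs:object">
   <title>UCS Object</title>
   <!-- was cm:cmobject; the 4.x document library expects cm:content or cm:folder derived types -->
   <parent>cm:content</parent>
   <properties>
      <property name="ucs:IVAID">
         <description>UCS IVAID</description>
         <type>d:int</type>
         <mandatory>true</mandatory>
      </property>
      <property name="ucs:TitleID">
         <description>UCS TitleID</description>
         <type>d:int</type>
         <mandatory>true</mandatory>
      </property>
      <property name="ucs:Title">
         <description>UCS Title</description>
         <type>d:text</type>
         <mandatory>true</mandatory>
      </property>
   </properties>
</type>
```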

Regards
Axel

dynamolalit
Champ on-the-rise
Thanks Axel,

As per your suggestion, I have updated my model and the error is gone.

Can you give some input on content organization and content association for this huge content set?

Regards.

afaust
Legendary Innovator
Hello,

What you want to do is partition your data set as well as possible, so that you end up with similarly sized chunks of objects, each a very small subset of your overall data. The partitioning criteria depend on your specific data set.
For example, if you want to store a couple of million emails that have accumulated over a year, you might choose the date of receipt as the partitioning criterion and create a directory structure based on year, month and day. This leaves you with a few thousand mails per lowest-level folder, which is fine. If you have a few hundred million mails in the same period, add another dimension to your hierarchy, say the hour of day, to further subdivide your data.

It is important to note that any structure you create is primarily technical / for partitioning. If possible, you may want to avoid exposing too much of this to the end user. This can be achieved by using SOLR path searches instead of DB navigation, but may require you to write your own set of navigation logic.

Regards
Axel

mildsteel
Champ in-the-making
Hello,

I was following the thread above as I am in a similar situation, i.e. creating "metadata-only" content. If we change the parent type to "cm:content" (as required in versions 4.x and above), are we not back to the same situation? In other words, how does this help us create metadata-only content?

Regards
ms

benswitzer
Champ in-the-making
Hi mildsteel,

If you don't require cm:content for your object model, you can simply hide the field from view in the form.

To do so, in your share-config-custom.xml, simply remove <show id="cm:content" /> from your model's form. Alternatively, you could use <hide id="cm:content" />.
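
As a rough sketch, reusing the ucs:object type from earlier in this thread, a form without the content field might look like the fragment below. It relies on an assumption about Share's form configuration: once any `<show>` element is present, only the listed fields are rendered, so leaving cm:content out of the list should be enough, and a standalone `<hide>` would only take effect in a form that has no `<show>` elements at all.

```xml
<config evaluator="node-type" condition="ucs:object">
   <forms>
      <form>
         <field-visibility>
            <!-- Only the fields listed here are shown; cm:content is deliberately left out -->
            <show id="cm:name" />
            <show id="ucs:IVAID" />
            <show id="ucs:TitleID" />
            <show id="ucs:Title" />
         </field-visibility>
      </form>
   </forms>
</config>
```

The node still derives from cm:content in the model; the form just never offers the content property to the user.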


Ben