XML metadata extracter

golden_eye
Champ in-the-making
Hello everybody,
I am a new member here and I need help.
Please give me some advice on how to start with metadata extraction for XML files.
I have read the wiki page on Metadata Extraction, but that didn't lead me to a successful implementation,
so I am turning to the community.
How do I write a metadata extracter for the custom properties defined in my aspect?
In the source code I found some XML metadata extracter Java classes.
Which classes do I need to implement in Java?
After that, the configuration is simple as I understand it. The next step is to register my extracter in the extension folder in custom-metadata-extractor-context.xml,
and add an extractor-mappings.properties file for the properties from my aspect?
Is that the right way?
Then I run the "Extract common metadata fields" action,
and it should work?

Sorry for my English.
Thanks a lot for any answer. :)
9 REPLIES

golden_eye
Champ in-the-making
Hello,
After I wrote a Java class for the XML extracter and included it as a bean in the extension folder, I got this error:
Error creating bean with name 'test.xml.XmlMetadataExtracter' defined in file [/opt/Alfresco/Alfresco/tomcat/shared/classes/alfresco/extension/custom-repository-context.xml]: Invocation of init method failed; nested exception is org.alfresco.error.AlfrescoRuntimeException: : Variable 'selectors' has not been settest.xml.XmlMetadataExtracter@d311da.
What does that mean? How can I fix it?
I appreciate any help.

golden_eye
Champ in-the-making
After some testing I am still in the same state: the XML extracter is not working.
When I run my Java classes I get the error below. I don't know if the problem is there, because I can compile the classes into a jar and put the jar in the lib folder.
I think the problem is in custom-metadata-extrators-context.xml in the extension folder.
There must be an example of how to modify the XML extracter correctly, or not?
Does anyone have the same problem?
And maybe a solution?
I would be very thankful. :D
java.lang.ExceptionInInitializerError
Caused by: org.springframework.beans.factory.BeanDefinitionStoreException: IOException parsing XML document from class path resource [model.xml]; nested exception is java.io.FileNotFoundException: class path resource [modelPravi.xml] cannot be opened because it does not exist
   at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:347)
   at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:317)
   at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:125)
   at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:141)
   at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:167)
   at org.springframework.context.support.AbstractXmlApplicationContext.loadBeanDefinitions(AbstractXmlApplicationContext.java:113)
   at org.springframework.context.support.AbstractXmlApplicationContext.loadBeanDefinitions(AbstractXmlApplicationContext.java:79)
   at org.springframework.context.support.AbstractRefreshableApplicationContext.refreshBeanFactory(AbstractRefreshableApplicationContext.java:94)
   at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:292)
   at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:92)
   at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:77)
   at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:68)
   at test.xml.XmlMetadataExtracterTest.<clinit>(XmlMetadataExtracterTest.java:76)
   … 13 more
Caused by: java.io.FileNotFoundException: class path resource [model.xml] cannot be opened because it does not exist
   at org.springframework.core.io.ClassPathResource.getInputStream(ClassPathResource.java:135)
   at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:334)
   … 25 more
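
The FileNotFoundException in the trace means Spring could not find the named resource on the classpath of the test run. A quick way to confirm that is to ask the class loader directly. This is only a minimal sketch: the class name ClasspathCheck is hypothetical, and the two resource names are the ones that appear in the trace above.

```java
import java.net.URL;

// Minimal sketch: report whether a resource name is visible on the classpath.
// ClasspathCheck is a made-up helper, not part of Alfresco or Spring.
class ClasspathCheck {

    // Returns true when the class loader can locate the named resource.
    static boolean isOnClasspath(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathCheck.class.getClassLoader();
        }
        URL url = loader.getResource(resourceName);
        return url != null;
    }

    public static void main(String[] args) {
        // Resource names taken from the stack trace; adjust to your own file.
        for (String name : new String[] {"model.xml", "modelPravi.xml"}) {
            System.out.println(name + " on classpath: " + isOnClasspath(name));
        }
    }
}
```

If this prints false when run with the same classpath as the test, the file is not in a directory or jar that the test can see, and the extracter code itself is not yet involved.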

golden_eye
Champ in-the-making
Hello again, I haven't had any response from you guys.
I can't believe this is so hard to do, or is it just me being stupid :( and not understanding the concept of Alfresco?
In the log I can see my property, which is set in custom-metadata-extrators-context.xml plus custom-xml-extractor-mappings.properties:
10:12:23,492 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] Added mapping from firstName to [{model}firstName]
But when I look at the default extracters, there is also a log line where the extracter loads its mapped properties, in this case for OpenOfficeMetadataExtracter, and that line is missing for my extracter. Could this be what is wrong?
10:18:58,748 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] Loaded mapping properties from resource: org/alfresco/repo/content/metadata/OpenOfficeMetadataExtracter.properties
And when I upload an XML file to a space nothing happens; metadata extraction doesn't start.
The default extractors work just fine.
Can someone please confirm whether my approach is right? Am I getting closer to a solution or not?
Please, people, I would like a response from those who have already implemented an XML extracter.
I have lost a lot of days and energy studying and testing this.

golden_eye
Champ in-the-making
Hi all again,
I have the next error. As far as I can figure out, the problem is that Alfresco doesn't use my Java class as the metadata extracter, so the extraction never starts.

The strange thing is that if I add the aspect and fill in the value by hand, I can see that the first name is stored with my value: {model}firstName=rihanna.
Now all I need is for this to happen automatically through the Java classes.
I tried every combination of adding my variable in the Java classes, like

private static final String KEY_FirstName = "FirstName";
putRawValue(KEY_FirstName, docInfo.getFirstName(), rawProperties);

Nothing seems to work!

I am going crazy here. Any response or advice is welcome.

Thanks

09:31:24,182 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] Added mapping from firstName to [{http://www.alfresco.org/alfresco/model}firstName]
09:31:24,203 DEBUG [org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter] Added mapping from model:firstName to /model/firstName/text()
09:31:24,402 DEBUG [org.alfresco.repo.content.metadata.MetadataExtracterRegistry] Registering metadata extracter: org.alfresco.repo.content.metadata.xml.XmlMetadataExtracter@473a14

09:39:10,526 DEBUG [org.alfresco.repo.content.selector.XPathContentWorkerSelector]
Chosen content worker for reader:
   Reader:       ContentAccessor[ contentUrl=store://2009/7/15/9/35/c66f1c39-4bcc-44ae-bef9-b8a31c652be3.bin, mimetype=text/xml, size=1108, encoding=UTF-8, locale=en_US]
   XPath:        null
   Worker:    null
09:39:10,526 DEBUG [org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter]
No working metadata extractor could be found:
   Document: ContentAccessor[ contentUrl=store://2009/7/15/9/35/c66f1c39-4bcc-44ae-bef9-b8a31c652be3.bin, mimetype=text/xml, size=1108, encoding=UTF-8, locale=en_US]
09:39:10,526 DEBUG [org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter]
XML metadata extractor redirected:
   Reader:    ContentAccessor[ contentUrl=store://2009/7/15/9/35/c66f1c39-4bcc-44ae-bef9-b8a31c652be3.bin, mimetype=text/xml, size=1108, encoding=UTF-8, locale=en_US]
   Extracter: null
   Metadata: {{http://www.alfresco.org/model/content/1.0}name=6.xml, {model}firstName=rihanna, {http://www.alfresco.org/model/system/1.0}node-dbid=204406784, {http://www.alfresco.org/model/system/1.0}store-identifier=SpacesStore, {http://www.alfresco.org/model/content/1.0}content=contentUrl=store://2009/7/15/9/35/c66f1c39-4bcc-44..., {http://www.alfresco.org/model/content/1.0}title=6.xml, {http://www.alfresco.org/model/system/1.0}node-uuid=00836780-daf6-4c3e-a706-d47e5db0c5bb, {http://www.alfresco.org/model/content/1.0}modified=Wed Jul 15 09:38:56 CEST 2009, {http://www.alfresco.org/model/content/1.0}author=, {http://www.alfresco.org/model/application/1.0}editInline=true, {http://www.alfresco.org/model/content/1.0}created=Wed Jul 15 09:35:43 CEST 2009, {http://www.alfresco.org/model/system/1.0}store-protocol=workspace, {http://www.alfresco.org/model/content/1.0}creator=admin, {http://www.alfresco.org/model/content/1.0}description=, {http://www.alfresco.org/model/content/1.0}modifier=admin}

golden_eye
Champ in-the-making
Hi all,
at this point I have registered my XML metadata extracter and metadata extraction starts, but the result is empty.
From the log I can see that it doesn't find the XPath. I tried every combination of XPath to write values into the aspect. I don't know what else to do.
My XPath configuration looks like this:
<property name="xpathMappingProperties">
   <bean class="org.springframework.beans.factory.config.PropertiesFactoryBean">
      <property name="properties">
         <props>
            <prop key="namespace.prefix.test">model</prop>
            <prop key="firstName">/model/firstName/text()</prop>
         </props>
      </property>
   </bean>
</property>
Does anyone have a similar problem?
Please, I think I am very close to a solution.
Thanks.

08:30:12,393 DEBUG [org.alfresco.repo.content.selector.XPathContentWorkerSelector]
Chosen content worker for reader:
   Reader:       ContentAccessor[ contentUrl=store://2009/7/29/8/18/de5f39da-24d5-4aa7-b7f2-991747c1e609.bin, mimetype=text/xml, size=90, encoding=UTF-8, locale=en_US]
   XPath:        null
   Worker:    org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter@1a3d191
08:30:12,393 DEBUG [org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter]
Found metadata extracter to process XML document:
   Selector: XPathContentWorkerSelector[ workers={/model=org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter@1a3d191}]
   Document: ContentAccessor[ contentUrl=store://2009/7/29/8/18/de5f39da-24d5-4aa7-b7f2-991747c1e609.bin, mimetype=text/xml, size=90, encoding=UTF-8, locale=en_US]
08:30:12,393 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] Starting metadata extraction:
   reader: ContentAccessor[ contentUrl=store://2009/7/29/8/18/de5f39da-24d5-4aa7-b7f2-991747c1e609.bin, mimetype=text/xml, size=90, encoding=UTF-8, locale=en_US]
   extracter: org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter@1a3d191
08:30:12,394 DEBUG [org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter]
Extracted XML metadata:
   Reader:  ContentAccessor[ contentUrl=store://2009/7/29/8/18/de5f39da-24d5-4aa7-b7f2-991747c1e609.bin, mimetype=text/xml, size=90, encoding=UTF-8, locale=en_US]
   Results: {}
08:30:12,394 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] Converted extracted raw values to system values:
   Raw Properties:    {}
   System Properties: {}
08:30:12,394 DEBUG [org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter] Completed metadata extraction:
   reader:    ContentAccessor[ contentUrl=store://2009/7/29/8/18/de5f39da-24d5-4aa7-b7f2-991747c1e609.bin, mimetype=text/xml, size=90, encoding=UTF-8, locale=en_US]
   extracter: org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter@1a3d191
   changed:   {}
08:30:12,394 DEBUG [org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter]
XML metadata extractor redirected:
   Reader:    ContentAccessor[ contentUrl=store://2009/7/29/8/18/de5f39da-24d5-4aa7-b7f2-991747c1e609.bin, mimetype=text/xml, size=90, encoding=UTF-8, locale=en_US]
   Extracter: org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter@1a3d191
   Metadata: {}
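
The log shows the extracter being selected and run, but "Results: {}" means the XPath expression matched nothing in the uploaded file. The expression can be tried outside Alfresco with the JDK's own XPath support. The sketch below is an illustration under assumptions: the uploaded document is not shown in this thread, so both sample documents are invented, and the namespaced one only demonstrates one possible cause of an empty result (an unprefixed XPath does not match elements that live in a default namespace). The class name XPathProbe is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Minimal sketch: evaluate an XPath expression against an XML string
// using only JDK classes. XPathProbe is a made-up helper name.
class XPathProbe {

    // Returns the string value of the expression, or "" when nothing matches.
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Invented sample documents; replace them with the real uploaded file.
        String plain = "<model><firstName>rihanna</firstName></model>";
        String namespaced = "<model xmlns=\"http://example.org/sample\">"
                + "<firstName>rihanna</firstName></model>";
        String expression = "/model/firstName/text()";

        System.out.println("plain:      [" + evaluate(plain, expression) + "]");
        System.out.println("namespaced: [" + evaluate(namespaced, expression) + "]");
    }
}
```

If the expression returns an empty string for the real file here as well, the problem is in the XPath or in the document, not in the bean wiring.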

golden_eye
Champ in-the-making
Still no response from anyone. I can't believe no one wants to help or at least share an opinion with me. :(
Is it so hard for those who have successfully implemented this to read one post and respond? :(

golden_eye
Champ in-the-making
Hello,
can someone please attach a working example (source, code, anything) of an XML metadata extracter?
After reading almost all the posts I still can't get it to work.
Only 3 files are needed: the custom model where I define the aspect, the appropriate XML file, and custom-metadata-extrators-context.xml.
My custom-metadata-extrators-context.xml looks like this:

<beans>
      <bean id="extracter.XML"
            class="org.alfresco.repo.content.metadata.xml.XmlMetadataExtracter"
            parent="baseMetadataExtracter" >
         <property name="selectors">
            <list>
               <ref bean="extracter.XMLXPathSelector" />
            </list>
         </property>
      </bean>  
          <bean id="extracter.XMLXPathSelector" class="org.alfresco.repo.content.selector.XPathContentWorkerSelector"
           init-method="init">
             <property name="workers">
                <map>
                   <entry key="/model">
                      <ref bean="extracter.xml.sample.AlfrescoModelMetadataExtracter" />
                   </entry>
                </map>
             </property>
          </bean>
         
          <bean id="extracter.xml.sample.AlfrescoModelMetadataExtracter"
           class="org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter"
           parent="baseMetadataExtracter" init-method="init">
                      <props>
                         <prop key="namespace.prefix.test">model</prop>
                         <prop key="firstName">test:firstName</prop>
                       </props>
                   </property>
                </bean>
             </property>
           
             <property name="xpathMappingProperties">
                <bean class="org.springframework.beans.factory.config.PropertiesFactoryBean">
                   <property name="properties">
                      <props>
                         <prop key="namespace.prefix.test">model</prop>
                         <prop key="firstName">/model/firstName/text()</prop>
                              </props>
                   </property>
                </bean>
             </property>
          </bean>
         
         <!--
         <bean id="extracter.xml.sample.XMLMetadataExtracter"
               class="org.alfresco.repo.content.metadata.xml.XmlMetadataExtracter"
               parent="baseMetadataExtracter">
            <property name="registry">
               <ref bean="avmMetadataExtracterRegistry" />
            </property>
            <property name="overwritePolicy">
               <value>EAGER</value>
            </property>
            <property name="selectors">
               <list>
                  <ref bean="extracter.xml.sample.selector.XPathSelector" />
               </list>
            </property>
         </bean>
         -->
      </beans>

miguel_gil_mart
Champ in-the-making
You closed a </property> tag but never opened it. Use an XML validator to check your code before deploying: http://validator.w3.org
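
A well-formedness check can also be run locally with the JDK's XML parser before the file goes into the extension folder. This is a minimal sketch, not a replacement for a full validator: the class name WellFormedCheck and the two sample strings are made up for illustration, and it only checks that tags are balanced, not that the bean definitions make sense to Spring.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Minimal sketch: report the first well-formedness error in an XML string.
// WellFormedCheck is a made-up helper name.
class WellFormedCheck {

    // Returns null when the XML is well-formed, otherwise a message
    // with the line and column of the first fatal error.
    static String firstError(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        } catch (Exception e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Invented samples: the second one closes a property that was never opened.
        String good = "<beans><bean><property name=\"a\"/></bean></beans>";
        String bad = "<beans><bean></bean></property></beans>";
        System.out.println("good: " + firstError(good));
        System.out.println("bad:  " + firstError(bad));
    }
}
```

To check a real context file, read its content into a string (or pass a file-based InputSource) and look at the reported line number.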

romschn
Star Collaborator
I just read this post,

so I am sharing what I know about configuring the metadata extractor.

Below are the basic steps required to configure it.

1. Create a model file and add an entry for the aspect, e.g. create a file called testModel.xml.
   <aspects>
      <!-- Test Aspect -->
      <aspect name="test:testAspect">
         <title>Test Aspect</title>
         <properties>
            <property name="test:launchdate">
               <title>Launch Date</title>
               <type>d:date</type>
            </property>
            <property name="test:expirationdate">
               <title>Expiration Date</title>
               <type>d:date</type>
            </property>
         </properties>
      </aspect>
   </aspects>
2. Register the model in the Spring context. Create a file called test-model-context.xml and put the entry below in it.
    <bean id="extension.dictionaryBootstrap" parent="dictionaryModelBootstrap" depends-on="dictionaryBootstrap">
        <property name="models">
            <list>
                <value>alfresco/extension/testModel.xml</value>
            </list>
        </property>
    </bean>
3. Create a Spring context file for the metadata extractor. Create a file called test-metadata-extractor-context.xml and put the entry below in it.
   <bean id="com.test.NewsEventMetadataExtracter"
      class="org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter"
      parent="baseMetadataExtracter" init-method="init">
      <property name="supportedDateFormats">
         <list>
            <value>yyyy-MM-dd</value>
         </list>
      </property>
      <property name="mappingProperties">
         <bean
            class="org.springframework.beans.factory.config.PropertiesFactoryBean">
            <property name="properties">
               <props>
                  <prop key="namespace.prefix.test">http://www.test.com/model/content/1.0</prop>
                  <prop key="newsevent_launch">test:launchdate</prop>
                  <prop key="newsevent_expiration">test:expirationdate</prop>
               </props>
            </property>
         </bean>
      </property>
      <property name="xpathMappingProperties">
         <bean
            class="org.springframework.beans.factory.config.PropertiesFactoryBean">
            <property name="properties">
               <props>
                  <prop key="namespace.prefix.test">http://www.test.com/model/content/1.0</prop>
                  <prop key="newsevent_launch">/news_events/launch_date/text()</prop>
                  <prop key="newsevent_expiration">/news_events/expiration_date/text()</prop>
               </props>
            </property>
         </bean>
      </property>
   </bean>
   <bean id="extracter.xml.sample.selector.XPathSelector"
      class="org.alfresco.repo.content.selector.XPathContentWorkerSelector"
      init-method="init">
      <property name="workers">
         <map>
            <entry key="/news_events">
               <ref bean="com.test.NewsEventMetadataExtracter" />
            </entry>
         </map>
      </property>
   </bean>

The files above need to be placed in the tomcat/shared/classes/alfresco/extension folder. Restart the server.

With the configuration above the metadata extractor will be in place, and all content created for news-events.xsd will have the aspect applied to it.
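
Before restarting the server, the two XPath mappings and the date format can be tried against a sample document with plain JDK classes. This is a rough sketch under assumptions: the sample news_events document and the class name NewsEventXPathCheck are invented here, and java.time is used as a stand-in for whatever date parsing the extracter does internally with the yyyy-MM-dd pattern.

```java
import java.io.StringReader;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Minimal sketch: apply the XPath mappings from step 3 to a sample document.
// NewsEventXPathCheck is a made-up helper, not an Alfresco class.
class NewsEventXPathCheck {

    // Same keys and XPath expressions as the xpathMappingProperties bean.
    static Map<String, String> xpathMappings() {
        Map<String, String> mappings = new LinkedHashMap<>();
        mappings.put("newsevent_launch", "/news_events/launch_date/text()");
        mappings.put("newsevent_expiration", "/news_events/expiration_date/text()");
        return mappings;
    }

    // Evaluates every mapped XPath and parses non-empty results as yyyy-MM-dd dates.
    static Map<String, LocalDate> extractDates(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            DateTimeFormatter format = DateTimeFormatter.ofPattern("yyyy-MM-dd");
            Map<String, LocalDate> result = new LinkedHashMap<>();
            for (Map.Entry<String, String> mapping : xpathMappings().entrySet()) {
                String raw = xpath.evaluate(mapping.getValue(), doc).trim();
                if (!raw.isEmpty()) {
                    result.put(mapping.getKey(), LocalDate.parse(raw, format));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate sample document", e);
        }
    }

    public static void main(String[] args) {
        // Invented sample content; use a real news-events file instead.
        String sample = "<news_events>"
                + "<launch_date>2009-07-15</launch_date>"
                + "<expiration_date>2009-12-31</expiration_date>"
                + "</news_events>";
        System.out.println(extractDates(sample));
    }
}
```

A key that is missing from the printed map means its XPath matched nothing, which is the same situation as an empty "Results: {}" entry in the extracter's debug log.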

I hope it helps.

Thanks,