WCM web form metadata extraction

wshen
Champ in-the-making
Is WCM web form metadata extraction still available in 3.2?

I downloaded and installed the 3.2 Preview2 lab edition and followed the instructions at http://wiki.alfresco.com/wiki/Metadata_Extraction#XML_Meta-data_Extractor_Configuration_for_WCM (I simply renamed wcm-xml-metadata-extracter-context.xml.sample to .xml in tomcat/shared/classes/alfresco/extension, and also added debug logging for …content.metadata.*). On starting Tomcat, I can see that the XPathMetadataExtracter has been registered:

13:15:17,086  DEBUG [content.metadata.AbstractMappingMetadataExtracter] Added mapping from author to [{http://www.alfresco.org/model/content/1.0}author]
13:15:17,086  DEBUG [content.metadata.AbstractMappingMetadataExtracter] Added mapping from description to [{http://www.alfresco.org/model/content/1.0}description]
13:15:17,086  DEBUG [content.metadata.AbstractMappingMetadataExtracter] Added mapping from title to [{http://www.alfresco.org/model/content/1.0}title]
13:15:17,096  DEBUG [metadata.xml.XPathMetadataExtracter] Added mapping from version to /model/version/text()
13:15:17,096  DEBUG [metadata.xml.XPathMetadataExtracter] Added mapping from author to /model/author/text()
13:15:17,096  DEBUG [metadata.xml.XPathMetadataExtracter] Added mapping from description to /model/description/text()
13:15:17,096  DEBUG [metadata.xml.XPathMetadataExtracter] Added mapping from title to /model/@name
13:15:17,108  DEBUG [content.metadata.MetadataExtracterRegistry] Registering metadata extracter: org.alfresco.repo.content.metadata.xml.XmlMetadataExtracter@c5ad0b

Then I created a "demo" web project and used the web client to add the stock Alfresco forumModel.xml to the root of "demo". The log only prints
13:25:10,883 User:admin DEBUG [content.metadata.MetadataExtracterRegistry] Finding extractors for text/xml
and nothing else; no metadata is extracted from the file.
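
To rule out the XPath expressions themselves, the mappings from the startup log can be checked outside Alfresco. The following is a minimal sketch using only the JDK's javax.xml.xpath; the class name and the sample XML are hypothetical stand-ins (the real forumModel.xml is larger and may declare namespaces, which this sketch does not handle), and it does not reproduce how Alfresco evaluates the mappings internally.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Alfresco: evaluates the XPath
// mappings shown in the startup log against an XML string.
final class XPathMappingCheck {

    static Map<String, String> extract(String xml) throws Exception {
        // Default factory settings (not namespace-aware); an assumption
        // made to keep the sketch simple.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Property name -> XPath, copied from the DEBUG log lines.
        Map<String, String> mappings = new LinkedHashMap<>();
        mappings.put("version", "/model/version/text()");
        mappings.put("author", "/model/author/text()");
        mappings.put("description", "/model/description/text()");
        mappings.put("title", "/model/@name");

        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : mappings.entrySet()) {
            // evaluate(String, Object) returns the string value of the match,
            // or an empty string when nothing matches.
            result.put(e.getKey(), xpath.evaluate(e.getValue(), doc));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Simplified, made-up sample shaped like a model file.
        String xml = "<model name=\"fm:forummodel\">"
                + "<description>Forum Model</description>"
                + "<author>Alfresco</author>"
                + "<version>1.0</version>"
                + "</model>";
        extract(xml).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

If all four values come back non-empty for a given file, the expressions are fine for that file and the problem is more likely in how (or whether) the extracter gets invoked.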

Second, I tried the web form metadata extraction described at http://blogs.alfresco.com/wp/jbarmash/2008/12/01/xml-metadata-extraction-for-wcm. This time nothing was logged at all, and no metadata was extracted.

I did some digging in the source from the public SVN head. In the first case, the MetadataExtracterRegistry used by Repository.extractMetadata is the Spring bean named "metadataExtracterRegistry", not the one named "avmMetadataExtracterRegistry" defined in wcm-xml-metadata-extracter-context.xml.
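
To illustrate why that mismatch would produce exactly the log output above (a lookup that finds nothing), here is a toy model in plain Java. ToyRegistry and RegistryMismatchDemo are hypothetical stand-ins, not Alfresco classes; the sketch only shows that an extracter registered with one registry instance is invisible to a lookup made against another.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a registry bean; NOT the Alfresco class.
final class ToyRegistry {
    private final Map<String, List<String>> byMimetype = new HashMap<>();

    void register(String mimetype, String extracterName) {
        byMimetype.computeIfAbsent(mimetype, k -> new ArrayList<>())
                  .add(extracterName);
    }

    List<String> find(String mimetype) {
        List<String> found = byMimetype.get(mimetype);
        return found != null ? found : Collections.<String>emptyList();
    }
}

final class RegistryMismatchDemo {
    public static void main(String[] args) {
        // Two separate instances, standing in for the two Spring beans.
        ToyRegistry defaultRegistry = new ToyRegistry();
        ToyRegistry avmRegistry = new ToyRegistry();

        // The XML extracter is registered only with the AVM registry.
        avmRegistry.register("text/xml", "XmlMetadataExtracter");

        // A caller wired to the default registry finds nothing.
        System.out.println(defaultRegistry.find("text/xml")); // prints []
        System.out.println(avmRegistry.find("text/xml"));     // prints [XmlMetadataExtracter]
    }
}
```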

In the second case, when content is created via the "press release" web form, the AVMNodeService is called to save the content as a content property. I don't see policy.onContentUpdate being invoked in this case, so extractMetadata is never called, even though AvmMetadataExtracter has registered itself with the policyComponent.

My questions are:
1. Is any extra configuration needed on a packaged installation, in addition to following the above samples?
2. Is this because it's Preview2, so that some existing features are not expected to work as they did in previous editions?
3. If (2) is true, will those features be available in 3.2 final this coming July?

Any help will be appreciated,
-wayne
2 REPLIES

pmonks
Star Contributor
Did you submit the content to the staging sandbox?  This shouldn't be necessary (metadata extraction happens in all types of sandbox), but just trying to determine if that behaviour has changed.

If you submit to staging and the issue still occurs, I'd suggest raising a JIRA ticket (at http://issues.alfresco.com/), then posting the ticket # back here so that we can all vote and/or comment on it.

Cheers,
Peter

wshen
Champ in-the-making
Thanks Peter. I verified that WCM metadata extraction seems to work on Lab3 Final, but not on 3.2 Preview2, so I created a new issue at http://issues.alfresco.com/jira/browse/ALFCOM-2965.