05-16-2020 10:30 PM
Hi,
I want to preserve the original created and modified dates of a document during upload. How can I achieve that?
If this is possible, will the dates also be preserved during FTP upload?
05-18-2020 08:44 AM
OK. If you can find the extracted metadata in the log output of AbstractMappingMetadataExtracter/PdfBoxMetadataExtracter, and the Found: {..............} entry contains the 'created/modified' metadata but Mapped and Accepted: {............} does not, then here is what could be happening.
The cm:created and cm:modified properties are declared as protected in the content model, so their values are set at the time of node creation and are read-only after that:
<property name="cm:created">
   <title>Created</title>
   <type>d:datetime</type>
   <protected>true</protected>
   <mandatory enforced="true">true</mandatory>
   <index enabled="true">
      <atomic>true</atomic>
      <stored>false</stored>
      <tokenised>both</tokenised>
      <facetable>true</facetable>
   </index>
</property>
<property name="cm:modified">
   <title>Modified</title>
   <type>d:datetime</type>
   <protected>true</protected>
   <mandatory enforced="true">true</mandatory>
   <index enabled="true">
      <atomic>true</atomic>
      <stored>false</stored>
      <tokenised>both</tokenised>
      <facetable>true</facetable>
   </index>
</property>
An alternative is to create custom properties in your own content model and store the extracted created/modified metadata values there. You could instead override the default behavior of the auditable aspect properties, but I believe that would not be a good idea.
For example:
Create the following properties in your custom content model:
<aspect name="demo:customAuditMetadata">
   <title>Custom Audit Metadata</title>
   <description>Custom Audit Metadata</description>
   <properties>
      <property name="demo:originCreatedDate">
         <title>Original Created Date</title>
         <description>Created date of files based on incoming metadata extracted from metadata extractor</description>
         <type>d:text</type>
      </property>
      <property name="demo:originModifiedDate">
         <title>Original Modified Date</title>
         <description>Modified date of files based on incoming metadata extracted from metadata extractor</description>
         <type>d:text</type>
      </property>
   </properties>
</aspect>
Add the following bean definition and map the above properties in mappingProperties:
<bean id="extracter.PDFBox" class="org.alfresco.repo.content.metadata.PdfBoxMetadataExtracter" parent="baseMetadataExtracter">
   <property name="documentSelector" ref="pdfBoxEmbededDocumentSelector" />
   <property name="inheritDefaultMapping">
      <value>false</value>
   </property>
   <property name="overwritePolicy">
      <!-- Allow extraction to happen every time (e.g. when content is updated or a new version is uploaded). -->
      <value>EAGER</value>
   </property>
   <property name="mappingProperties">
      <props>
         <prop key="namespace.prefix.demo">http://www.github.com/model/demo/1.0</prop>
         <prop key="created">demo:originCreatedDate</prop>
         <prop key="modified">demo:originModifiedDate</prop>
      </props>
   </property>
</bean>
- Update the share config to display the newly added properties on document-details page as needed.
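The Share update mentioned above could look roughly like the following. This is only a sketch, assuming the configuration lives in a share-config-custom.xml file and reuses the demo:customAuditMetadata aspect and property names from the model above. The field labels and the choice to show the fields in view mode only are assumptions for illustration.

```xml
<alfresco-config>
   <!-- Applies to any node that carries the custom aspect -->
   <config evaluator="aspect" condition="demo:customAuditMetadata">
      <forms>
         <form>
            <field-visibility>
               <!-- Show the extracted dates read-only on the document-details page -->
               <show id="demo:originCreatedDate" for-mode="view" />
               <show id="demo:originModifiedDate" for-mode="view" />
            </field-visibility>
            <appearance>
               <!-- Labels are illustrative; message keys could be used instead -->
               <field id="demo:originCreatedDate" label="Original Created Date" />
               <field id="demo:originModifiedDate" label="Original Modified Date" />
            </appearance>
         </form>
      </forms>
   </config>
</alfresco-config>
```

The fields only appear for nodes that actually have the aspect, so documents uploaded before the extractor change may not show them.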
05-25-2020 11:54 AM
@sanjaybandhniya Find the demo project here:
https://github.com/abhinavmishra14/alfresco-metadataextraction-demo
I noticed a difference between Community and Enterprise versions. The examples I gave above work perfectly fine with Enterprise 5.2.x (I used 5.2.6) and 6.1.x (I used 6.1), but on Community editions the properties files are not always picked up correctly (the behavior is intermittent).
The only change I made for Community edition is shown below, and with it the file is always picked up correctly.
<property name="mappingProperties">
   <bean class="org.springframework.beans.factory.config.PropertiesFactoryBean">
      <property name="location">
         <value>classpath:alfresco/module/${project.artifactId}/metadataextraction/TikaAutoMetadataExtracter.properties</value>
      </property>
   </bean>
</property>
On Enterprise both work fine, the path above as well as the path below:
<value>classpath:alfresco/metadata/TikaAutoMetadataExtracter.properties</value>
This one also works on both versions:
<value>classpath:alfresco/extension/metadata/TikaAutoMetadataExtracter.properties</value>
I am not sure how Community and Enterprise differ in terms of extension points; I looked at the source code but found no clues. The good news is that the other path I shared above (available in the demo project) works fine for both Community and Enterprise.
Hope this helps trim down your issue.
08-01-2024 07:55 AM
I have created an alternative deployment for Docker that allows applying this configuration.
A sample project is available at https://github.com/aborroy/alfresco-custom-metadata-extractor
05-17-2020 07:52 AM
I had a similar requirement in my project. To keep the original created/creator/modified/modifier values from the initial upload, we added a custom auditable aspect to our custom model. We used a document creation behavior to apply this aspect when documents are created, with the same values that Alfresco's auditable aspect has at that moment.
Alfresco then keeps updating its own auditable aspect on further updates to the document, while our custom aspect remains unchanged.
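The custom aspect described above might be declared roughly as follows in a content model. This is a sketch only: the demo prefix and all aspect and property names here are hypothetical, and the behavior that copies the values at creation time is not shown.

```xml
<aspect name="demo:originalAuditable">
   <title>Original Auditable</title>
   <properties>
      <!-- Snapshot of cm:created / cm:creator taken when the node is created -->
      <property name="demo:originalCreated">
         <type>d:datetime</type>
      </property>
      <property name="demo:originalCreator">
         <type>d:text</type>
      </property>
      <!-- Snapshot of cm:modified / cm:modifier taken when the node is created -->
      <property name="demo:originalModified">
         <type>d:datetime</type>
      </property>
      <property name="demo:originalModifier">
         <type>d:text</type>
      </property>
   </properties>
</aspect>
```

Because these properties are not marked as protected, they stay writable by custom code, unlike cm:created and cm:modified.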
05-17-2020 08:51 PM
I am not talking about the Alfresco upload date. I want to preserve the document's original creation date during upload.
Please check the image below.
05-18-2020 06:19 AM
The OOTB metadata extractor does handle extraction and mapping of the created date metadata. If you look at the PDFBox metadata extractor properties, you will notice that the "created" metadata is mapped to "cm:created".
# Mappings
author=cm:author
title=cm:title
description=cm:description
created=cm:created
This is the class that likely parses newly uploaded PDF files, extracts their available metadata, and maps it to content model properties:
You can enable the following loggers to see whether metadata is being extracted:
log4j.logger.org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter=DEBUG
log4j.logger.org.alfresco.repo.content.metadata.PdfBoxMetadataExtracter=DEBUG
log4j.logger.org.alfresco.repo.content.metadata.TikaAutoMetadataExtracter=DEBUG
Also have a look at this test class, which tests the "createdate" metadata:
You can also look at the auto metadata extractor implementation for reference: https://github.com/Alfresco/alfresco-repository/blob/master/src/main/java/org/alfresco/repo/content/...
05-18-2020 07:50 AM
Hi,
I have checked all these classes, but I still do not understand what customization I need to make to enable mapping of the original document created date to cm:created.