How to keep original file creation date

marc1911
Champ in-the-making
Hi there,

When I upload a Word document through the web client or the CIFS interface, the file's original creation date is automatically replaced with today's date. This is not what I expected. Is there any way to keep the original file date when (bulk) uploading? I need to upload financial documents from last year. If I continue, all of these documents will appear to be from July 2007, which is really not what we want.

Regards,

Marc
54 REPLIES

finner
Champ in-the-making
Hi,

From the Javadocs

public class OpenDocumentMetadataExtracter
extends AbstractMappingMetadataExtracter

Metadata extractor for the MIMETYPE_OPENDOCUMENT_XXX mimetypes.

   creationDate:           –      cm:created
   creator:                –      cm:author
   date:
   description:            –      cm:description
   generator:
   initialCreator:
   keyword:
   language:
   printDate:
   printedBy:
   subject:
   title:                  –      cm:title
   All user properties

… the OpenDocumentExtractor needs to be overridden for ODT documents.



. . .
    <bean class="org.alfresco.repo.content.metadata.OpenDocumentMetadataExtracter" parent="baseMetadataExtracter">
        <property name="inheritDefaultMapping">
            <value>true</value>
        </property>
        <property name="overwritePolicy">
            <value>EAGER</value>
        </property>
        <property name="mappingProperties">
            <props>
                <prop key="namespace.prefix.cm">http://www.alfresco.org/model/content/1.0</prop>
                <prop key="creationDate">cm:created</prop>
            </props>
        </property>
    </bean>
. . .

I actually have ALL the extractors overridden.

This is my PDF extractor :



. . .
    <bean class="org.alfresco.repo.content.metadata.PdfBoxMetadataExtracter" parent="baseMetadataExtracter">
        <property name="inheritDefaultMapping">
            <value>true</value>
        </property>
        <property name="overwritePolicy">
            <value>EAGER</value>
        </property>
        <property name="mappingProperties">
            <props>
                <prop key="namespace.prefix.cm">http://www.alfresco.org/model/content/1.0</prop>
                <prop key="created">cm:created</prop>
            </props>
        </property>
    </bean>
. . .

When I upload a PDF (using Add Content from the menu), the Created Date and Modified Date are both today's date. I also ran the Extract Common Metadata rule on the document again. The rule is applied to the Company Home space, and the document's Author is being extracted.
I'm obviously missing a step somewhere...?


Finner

derek
Star Contributor
Are the cm:title and cm:description fields being correctly set?

finner
Champ in-the-making
Title, description and Author are being extracted.
Do the extracters work over CIFS as well?
Finner

derek
Star Contributor
Yes.  If the cm:title and cm:description are being extracted when running via CIFS, then there is a bug with the overriding.  If not, then it's a rules issue - maybe a bug, even.

tschiller
Champ in-the-making
Same thing is happening here. This is my custom-metadata-extracters-context.xml file:
<beans>
    <bean id="extracter.Office" class="org.alfresco.repo.content.metadata.OfficeMetadataExtracter" parent="baseMetadataExtracter">
        <property name="inheritDefaultMapping">
            <value>true</value>
        </property>
        <property name="overwritePolicy">
            <value>EAGER</value>
        </property>
        <property name="mappingProperties">
            <props>
                <prop key="namespace.prefix.cm">http://www.alfresco.org/model/content/1.0</prop>
                <prop key="createDateTime">cm:created</prop>
                <prop key="subject">cm:description</prop>
            </props>
        </property>
    </bean>
</beans>

The subject gets mapped to cm:description (I used a test string which worked fine), but the creation date/time does not get mapped to cm:created (still comes out as date/time of upload). This happens both when I upload via CIFS and through normal document management spaces. I also tried putting in a local rule to extract common metadata (even though there's a rule at the Company Home space that applies to all subspaces), and still no luck.

It seems that Finner and I are having the exact same problem (with different extractors). This may not be a coincidence… What are your thoughts, derek?

derek
Star Contributor
Indeed. I've referred this to QA for further testing.

Regards

tschiller
Champ in-the-making
Thanks! Looking forward to an answer :D

tschiller
Champ in-the-making
So, in a strange turn of events… I was working on something else in Alfresco, and I got a file to upload with a creation date/time different from the upload date/time; however, it is not the one that Windows shows under the file properties.

Windows properties under General tab:
Created- Yesterday, November 13, 2007, 12:12:23 PM [date I put the file on my system]
Modified- Wednesday, January 03, 2007, 10:16:52 AM [the last date the file was modified]
Accessed- Today, November 14, 2007, 1:37:15 PM

Windows properties under Summary/Advanced tab:
Date Created- 7/29/2003 7:10am [the actual creation date of the template file]
Date Last Saved- 1/3/2007 10:16am
Last Printed- 1/2/2007 11:34am

When I upload to Alfresco, its properties show as:
Created date- 29 July 2003 08:10 [technically "correct", but I wanted the last save date in practice]
Modified date- 14 November 2007 13:53 [upload date/time]

So… I took a closer look at the files I was trying to get to work previously. When I use Alfresco to check the properties of the file, I do get the correct Created Date; however, it's the Modified Date that shows next to the icon in the space view, which may have been where I was going wrong to begin with. It looked like the file was being uploaded with the wrong creation date, but the mapping was actually working.

I will try changing the metadata extractor to pull the "modified" date/time from the file and let you know how it turns out. Stay tuned!

tschiller
Champ in-the-making
[update]
Ok. Sorta weird. Can't get the modified date to work, even when I changed the extractor xml file and added the last line:
<prop key="createDateTime">cm:created</prop>
<prop key="subject">cm:description</prop>
<prop key="lastSaveDateTime">cm:modified</prop>
It still shows the upload date/time as the modified date/time, although the creation date/time does work. (Yay!!)
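
For reference, here is roughly what the whole bean looks like with those three mappings in place. This is only a sketch assembled from the snippets already posted in this thread (same bean id, class and parent as the earlier custom-metadata-extracters-context.xml); it is not a confirmed working configuration, and as noted above the cm:modified mapping had no visible effect.

```xml
<beans>
    <!-- Overrides the Office extracter; id, class and parent are copied from the earlier post -->
    <bean id="extracter.Office" class="org.alfresco.repo.content.metadata.OfficeMetadataExtracter" parent="baseMetadataExtracter">
        <property name="inheritDefaultMapping">
            <value>true</value>
        </property>
        <property name="overwritePolicy">
            <value>EAGER</value>
        </property>
        <property name="mappingProperties">
            <props>
                <prop key="namespace.prefix.cm">http://www.alfresco.org/model/content/1.0</prop>
                <!-- Worked: the document's embedded creation date ends up in cm:created -->
                <prop key="createDateTime">cm:created</prop>
                <prop key="subject">cm:description</prop>
                <!-- Did not work in this test: cm:modified still shows the upload date/time -->
                <prop key="lastSaveDateTime">cm:modified</prop>
            </props>
        </property>
    </bean>
</beans>
```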

Oddly, when I use CIFS and check the Windows properties of the uploaded file, under Summary/Advanced tab, I do get the same (correct) Date Created, Date Last Saved, and Last Printed date/times as in the last post, but under the General tab it only shows the correct Created date/time, and the Modified and Accessed date/times are those of the upload date/time. Hmm…

A [global]
|
B [space]
|-- C [subspace that it worked in]
|-- D [subspace that it won't work in without local rule]

Another weird thing: I have to make the "extract common metadata" rule a local rule in that subspace (D) for it to work. Even though I have a "global" rule at the Company Home space (A, applied to all subspaces) that should extract the metadata, it does not seem to work there.

However, in the other thing I was working on, where I first discovered that the creation date/time did populate correctly, the global metadata extraction rule did seem to work without a local rule. That was in a different subspace (C) within the same space (B).

Any ideas as to what is going on?

davidd
Champ in-the-making
Did you ever get to the bottom of this one?