
Background

I recently blogged about integrating Open Migrate with the Alfresco Bulk Filesystem Import Tool. As part of that exercise, a colleague and I implemented support for importing versioned content into Alfresco. As before, Open Migrate’s direct-to-Alfresco implementation already supported this, but the Bulk Filesystem Import Tool did not. My previous post explains why I chose to combine these two tools. The implementation was ultimately fairly straightforward. Along the way I learned more about the Alfresco API and some of its nuances, and about how to implement this feature in a backwards-compatible manner that we could contribute back to the community.

Design Decisions

The initial design goal was simply to import versioned content from disk into Alfresco. That sounds simple, but it raises a question: how should versioned content be represented on disk? One option would have been to let the user provide a separate directory structure for versioned content and configure the tool with a separate versioning importer. However, I preferred to use the same directory structure for both versioned and non-versioned content.

The user is responsible for providing versioned content files if required. The current on-disk format is simply a directory, optionally with subdirectories, each containing content files. Each content file may optionally be accompanied by a file with the extension metadata.properties that supplies its metadata.

To support versioned content, the user may optionally supply files with a new extension: any file name ending with the pattern v[0-9]* (for example, .v1). This applies to both content files and metadata properties files. Versions are imported into Alfresco as follows:

Find all files in a given series, for example:

  • Head Revision
    • manual.pdf
    • manual.pdf.metadata.properties
  • First Version
    • manual.pdf.v1
    • manual.pdf.metadata.properties.v1
  • Second Version
    • manual.pdf.v2
    • manual.pdf.metadata.properties.v2

Note that the head revision of the file has no version identifier appended. This keeps the format backwards compatible with any existing file structures used with the tool: a file without any versions is created exactly as the tool has always created it. Files with version extensions (and their associated metadata files) are used to create the document and then each subsequent version, in order, until the final head revision has been created.
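To make the naming convention concrete, here is a minimal sketch of how files in a directory could be grouped into version series. This is not the code from the branch: the class and method names are hypothetical, and it assumes a version suffix is ".v" followed by one or more digits and that metadata files end in ".metadata.properties".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Bulk Filesystem Import Tool.
class VersionSeriesSketch {
    // Assumption: a version suffix looks like ".v1" or ".v02".
    private static final Pattern VERSION_SUFFIX = Pattern.compile("^(.*)\\.(v[0-9]+)$");
    private static final String METADATA_EXT = ".metadata.properties";

    /** Returns the version suffix such as "v1", or "" for a head revision. */
    static String versionSuffix(String fileName) {
        Matcher m = VERSION_SUFFIX.matcher(fileName);
        return m.matches() ? m.group(2) : "";
    }

    /** Returns the content file name that a content or metadata file belongs to. */
    static String seriesKey(String fileName) {
        Matcher m = VERSION_SUFFIX.matcher(fileName);
        String base = m.matches() ? m.group(1) : fileName;
        if (base.endsWith(METADATA_EXT)) {
            base = base.substring(0, base.length() - METADATA_EXT.length());
        }
        return base;
    }

    /** Groups file names by series; versioned files come first, head revision files last. */
    static Map<String, List<String>> groupBySeries(List<String> fileNames) {
        Map<String, List<String>> series = new TreeMap<>();
        for (String name : fileNames) {
            series.computeIfAbsent(seriesKey(name), k -> new ArrayList<>()).add(name);
        }
        for (List<String> files : series.values()) {
            files.sort((a, b) -> {
                String sa = versionSuffix(a);
                String sb = versionSuffix(b);
                if (sa.isEmpty() != sb.isEmpty()) {
                    return sa.isEmpty() ? 1 : -1;   // head revision goes last
                }
                int bySuffix = sa.compareTo(sb);    // plain alphabetical sort on the suffix
                return bySuffix != 0 ? bySuffix : a.compareTo(b);
            });
        }
        return series;
    }

    public static void main(String[] args) {
        List<String> onDisk = Arrays.asList(
            "manual.pdf", "manual.pdf.metadata.properties",
            "manual.pdf.v1", "manual.pdf.metadata.properties.v1",
            "manual.pdf.v2", "manual.pdf.metadata.properties.v2",
            "notes.txt");
        groupBySeries(onDisk).forEach((key, files) -> System.out.println(key + " -> " + files));
    }
}
```

A file such as notes.txt, with no versioned siblings, ends up in a series of its own, which mirrors the backwards-compatible behavior described above.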

The sorting is a simple alphabetical sort on the version extension. This allows for gaps in the version history as represented on disk (e.g. v1, v2, v4). Versions are created in Alfresco in the order in which they are found, with no gaps, even if there are gaps on disk. This also means the user can number versions v1, v2, etc. or v01, v02, etc. Be cautious, though: because the sort is alphabetical, versions named v1, v2, v10 will be imported in the order v1, v10, v2. If there are ten or more versions, the extensions should be zero-padded: v01, v02, v10.
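The ordering pitfall is easy to reproduce with a plain string sort. The sketch below is only an illustration: the class and the pad helper are hypothetical and not part of the tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of alphabetical ordering of version suffixes.
class VersionOrderSketch {
    /** Sorts version suffixes as plain strings, i.e. alphabetically. */
    static List<String> alphabetical(List<String> suffixes) {
        List<String> sorted = new ArrayList<>(suffixes);
        Collections.sort(sorted);
        return sorted;
    }

    /** Zero-pads the numeric part of a suffix such as "v1" to the given width. */
    static String pad(String suffix, int width) {
        int number = Integer.parseInt(suffix.substring(1));
        return String.format("v%0" + width + "d", number);
    }

    public static void main(String[] args) {
        List<String> unpadded = Arrays.asList("v1", "v2", "v10");
        System.out.println(alphabetical(unpadded));   // [v1, v10, v2]

        List<String> padded = new ArrayList<>();
        for (String suffix : unpadded) {
            padded.add(pad(suffix, 2));
        }
        System.out.println(alphabetical(padded));     // [v01, v02, v10]
    }
}
```

Renaming the files on disk with a consistent padding width before running the import avoids the problem without any change to the importer.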

Implementation Details

The current implementation is checked into a branch in the Google Code project here: https://alfresco-bulk-filesystem-import.googlecode.com/svn/branches/versioning/. The existing bulk-filesystem-import-web-scripts-context.xml file checked into the branch includes the relevant configuration, which I’ve listed here.

The existing importer bean is redefined to point to the versioning importer, and the versioning importer itself is defined. In this instance the async importer is used; the synchronous implementation works as well.

  

  
          
    
    
  
  
  
    
    
    
  

Note that the importers refer to versioning-metadata-loaders which are defined as follows. This bean refers to a new metadata loader for versioned properties:

 
    
      
        
        
        
      
    
  

  
    
  

As you can see, two new classes make up the bulk of the implementation: VersioningImporter and VersioningPropertiesFileMetadataLoader. These two classes drive the import operation. There are also some modifications to the status implementation to include the number of versions created. In a future version, the functionality could be merged directly into the existing abstract importer and properties metadata loader.

If you need to include version history or modify other version properties, just include them in the versioned metadata file (such as manual.pdf.metadata.properties.v1) for that version. For example:

  cm:versionLabel=Fixed typo.

I’ve found this implementation useful and have used it successfully to import versioned content. If you need to import multiple versions of documents, this approach could work for you. Likewise, if you read my previous post, you can see how the Open Migrate writer for the Bulk Filesystem Import Tool could be extended to write out versioned content.

I mentioned in the background section that I learned more about the Alfresco API and its nuances. What that boiled down to was the behavior of createVersion. This may be obvious to most, but since it was new to me, here’s what I found. After creating a document, it is necessary to call the createVersion API to create the first version. This wasn’t obvious to me, and I initially wrote code that created a document and set its properties, then created a version with the second version’s properties, inadvertently overwriting the first version’s properties. The control flow should be:

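// Version 1: create the document, set its properties, then snapshot it
NodeRef doc = fileFolderService.create(parentNodeRef, name, typeQName).getNodeRef();
versionService.createVersion(doc, versionProperties);
// Versions 2...n: apply the next version's content and properties, then snapshot again
versionService.createVersion(doc, versionProperties);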

Next Steps

I’m working with Peter Monks (@pmonks) to incorporate this into the Bulk Filesystem Import Tool. I’ve checked the code into a branch in the Google Code repository for the tool here. The work isn’t yet ready for multi-threaded execution of the tool, but Peter is refactoring for that, and we’ll merge the overall implementation into the base classes I’ve extended. We’re also adding support for metadata-only versions (i.e. on disk, a version can be represented by just a metadata.properties file without a supporting binary).
