Avoiding duplicate content in archives
Learn how to keep archives in check and avoid duplicate content, either by routing duplicates aside or by adding version numbering to files.
When configuring a workflow system that feeds content into an archive, it's useful to make sure you don't create duplicate content in the system. At times, however, you may want to keep several versions of the same file. Color Factory can help you with that too, using the Archive Management feature, which you enable by switching on the Image ID feature on the channel.
A fundamental prerequisite for this functionality is that the archive is indexed by an Index Manager server. That way, when a file is processed in the Color Factory channel, Color Factory will query the Index Manager server and establish whether a duplicate file exists. Normally, this should be based on the file's unique Image ID, but it can also be based on the filename or even Image Similarity, which is a "checksum" that Color Factory writes to a special metadata field when a file is processed. In this topic, however, we will assume that we have a unique Image ID to look for.
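As a rough illustration of the lookup described above, the following Python sketch models the index as a plain list and checks an incoming file against it using one of the three match criteria. It is not Color Factory or Index Manager code: the IndexedFile record, the find_duplicate function and the sample paths are hypothetical names chosen for this example, and the real query to the Index Manager server is not shown.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


# Hypothetical record describing one file as an index might report it.
@dataclass(frozen=True)
class IndexedFile:
    image_id: str    # the file's unique Image ID
    filename: str
    similarity: str  # stand-in for the Image Similarity "checksum"
    location: str    # where the file is stored


MATCH_FIELDS = ("image_id", "filename", "similarity")


def find_duplicate(
    index: Iterable[IndexedFile],
    incoming: IndexedFile,
    match_on: str = "image_id",
) -> Optional[IndexedFile]:
    """Return the first indexed entry that matches the incoming file, or None."""
    if match_on not in MATCH_FIELDS:
        raise ValueError(f"unsupported match criterion: {match_on}")
    wanted = getattr(incoming, match_on)
    for entry in index:
        if getattr(entry, match_on) == wanted:
            return entry
    return None


# Example data (invented for illustration only).
archive = [IndexedFile("ID-001", "harbour.jpg", "a1b2", "/archive/2023/harbour.jpg")]
incoming = IndexedFile("ID-001", "harbour_edit.jpg", "ffff", "/input/harbour_edit.jpg")

# Matching on the Image ID finds the archived file even though the filename differs.
match = find_duplicate(archive, incoming)
```

Note that the returned entry carries the archived file's location, which is the piece of information needed later if the original is to be overwritten in place.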
If a duplicate is found, you can choose to allow duplicates and add version numbering. Color Factory will then add versioning information to the Unique Image ID field, so that you can search for all versions of a file using its Image ID, for example in FotoStation.
Your other option is to replace the archived version with the file that is currently being processed. In this case there are two possibilities. You can store the processed file in the output folder defined in the channel settings, in which case the archived original is deleted. Alternatively, since Color Factory gets information about the original file's location when querying the Index Manager server, it can store the processed file in that location, overwriting the original.
Finally, you can choose not to allow duplicates at all. In that case, if Color Factory finds a matching file in the Index Manager archive, it will move the duplicate file that is being processed in the channel to a separate folder that you specify. You can, for example, monitor this folder manually and set up automatic purging to remove old files so they don't pile up in the long run.
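The options described in this topic can be summarized as a small decision table. The Python sketch below is only a way of reasoning about the outcomes, not how Color Factory is configured or implemented: DuplicatePolicy, Plan and plan_for are hypothetical names, and the assumption that a versioned duplicate is delivered to the channel's output folder is ours, since the topic does not state where versioned files are stored.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class DuplicatePolicy(Enum):
    ALLOW_WITH_VERSIONING = auto()         # keep both, add version information
    REPLACE_IN_OUTPUT_FOLDER = auto()      # store in output folder, delete original
    REPLACE_IN_ORIGINAL_LOCATION = auto()  # overwrite the archived original
    REJECT = auto()                        # move the incoming file aside


@dataclass(frozen=True)
class Plan:
    store_in: str                   # where the processed file ends up
    delete_archived_original: bool  # whether a separate delete is needed
    add_version_info: bool          # whether version info is added to the Image ID


def plan_for(
    policy: DuplicatePolicy,
    original_location: Optional[str],  # None means no duplicate was found
    output_folder: str,
    duplicates_folder: str,
) -> Plan:
    """Describe what happens to the processed file under the given policy."""
    if original_location is None:
        # No duplicate in the archive: normal delivery to the output folder.
        return Plan(output_folder, False, False)
    if policy is DuplicatePolicy.ALLOW_WITH_VERSIONING:
        # Assumption: the versioned file goes to the normal output folder.
        return Plan(output_folder, False, True)
    if policy is DuplicatePolicy.REPLACE_IN_OUTPUT_FOLDER:
        return Plan(output_folder, True, False)
    if policy is DuplicatePolicy.REPLACE_IN_ORIGINAL_LOCATION:
        # Overwriting replaces the original, so no separate delete is planned.
        return Plan(original_location, False, False)
    return Plan(duplicates_folder, False, False)


# Example: a duplicate exists and duplicates are not allowed.
rejected = plan_for(
    DuplicatePolicy.REJECT,
    original_location="/archive/2023",
    output_folder="/archive/incoming",
    duplicates_folder="/archive/duplicates",
)
```

Laying the choices out this way makes the difference between the two replace variants explicit: one stores the new file in the output folder and removes the old one, the other writes to the original's location.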