Using data mining to find your assets

Data mining is used for filtering the content in your archives by metadata or dates.

About data mining

Data mining is used to find files, sort files and change file metadata. It consists of two parts: the Calendar dates option, which groups files by their date attributes, and the Word list option, which groups files by their metadata.

To access the data mining tool, click the icon to the left of the Quick Search field. A screenshot of the icon is shown above. The data mining pane expands, and you can choose between Calendar dates and Word list in the dropdown list at the top of the pane.


Data mining and searching

You can search the archive using Quick Search or Advanced Search while data mining is open, and the Calendar dates or Word list view is updated to reflect the search you perform. For example, you can search for all files that contain the word “bike” and then open the Calendar dates view (as described earlier) to see when these files were created. You can also see who the Caption Writer was for all files with “bike” in their metadata by opening the Word list and choosing Caption Writer from the list, as described earlier.
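
As a rough mental model of how a search narrows what data mining shows, the sketch below filters a small in-memory list of files and then groups the matches by one field. It is not how the product is implemented: the record layout, the field names and the functions search and group_by are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical records; the keys are illustrative, not the product's schema.
FILES = [
    {"name": "a.jpg", "keywords": ["bike", "city"], "caption_writer": "Ann", "created": "2012-05-01"},
    {"name": "b.jpg", "keywords": ["bike"], "caption_writer": "Bob", "created": "2012-05-01"},
    {"name": "c.jpg", "keywords": ["car"], "caption_writer": "Ann", "created": "2012-06-10"},
    {"name": "d.jpg", "keywords": ["bike"], "caption_writer": "", "created": "2012-06-10"},
]


def search(files, word):
    """Return the files whose keywords contain the given word."""
    return [f for f in files if word in f["keywords"]]


def group_by(files, field):
    """Group file names by the value of one field; blank values go under '<empty>'."""
    groups = defaultdict(list)
    for f in files:
        groups[f[field] or "<empty>"].append(f["name"])
    return dict(groups)


hits = search(FILES, "bike")
print(group_by(hits, "caption_writer"))  # analogous to the Word list view
print(group_by(hits, "created"))         # analogous to the Calendar dates view
```

Grouping the search result rather than the whole archive is what makes the entry counts change after each search.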

Special behavior for some metadata fields

Metadata fields without drag and drop functionality

When using Word list data mining, you can drag a file onto an entry in the Word list to add that information to the file's metadata. This way you can, for example, quickly add a keyword to a number of files that don't have it. For some metadata fields, however, it doesn’t make sense to change the field by drag and drop, so dragging and dropping is disabled for them. The fields in question are Similarity Index, Caption, Master Document ID, Short Document ID, Unique Image ID, Owner ID, Padding, Raw file info, Classify state, EXIF Camera Info and Document History.

Repeatable metadata fields

The three metadata fields Keywords, Supplemental Category and Byline are special in that they allow multiple separate entries. This means that you can, for example, add several keywords to each file. Each keyword generates a separate entry in the Word list. Since a file can have more than one entry in the Word list for such a field, drag and drop only adds the new keyword and does not replace any existing ones. Note especially the Byline field, which is a repeatable field under the XMP standard but used to be a single-entry field under the IPTC standard.
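
The difference between repeatable and single-entry fields can be sketched as follows. This is a simplified model, not the product's code: the function drop_on_entry and the dictionary layout are hypothetical, replacing the value of a single-entry field is an assumption inferred from the description above, and skipping a value that is already present is also an assumption.

```python
# The three repeatable fields named in the text above.
REPEATABLE = {"Keywords", "Supplemental Category", "Byline"}


def drop_on_entry(metadata, field, value):
    """Model dropping a file on a Word list entry.

    Repeatable fields get the value appended; other fields are
    assumed to have their single value replaced.
    """
    updated = dict(metadata)  # leave the caller's dictionary untouched
    if field in REPEATABLE:
        existing = list(updated.get(field, []))
        if value not in existing:  # assumption: no duplicate entries
            existing.append(value)
        updated[field] = existing
    else:
        updated[field] = value
    return updated


photo = {"Keywords": ["bike"], "Caption Writer": "Ann"}
print(drop_on_entry(photo, "Keywords", "city"))
print(drop_on_entry(photo, "Caption Writer", "Bob"))
```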

Clearing a metadata field

It is not possible to drop one or more files into the <empty> section of the Word list. This is to make sure that you don’t clear a field by accident. We suggest using macros to clear fields.

Power Tip: Choosing the Similarity Index field in the Word list data mining will allow you to easily find duplicate files in your archive; any entry reporting more than one file indicates a duplicate.
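
The idea behind this tip can be illustrated with a short sketch: group files by their Similarity Index value and keep only the groups with more than one file. The function find_duplicates and the (name, index) pairs are hypothetical, and the index values are made up for the example.

```python
from collections import defaultdict


def find_duplicates(files):
    """Return {similarity_index: [names]} for indexes shared by more than one file."""
    groups = defaultdict(list)
    for name, similarity_index in files:
        groups[similarity_index].append(name)
    return {index: names for index, names in groups.items() if len(names) > 1}


# Made-up (file name, Similarity Index) pairs.
archive = [("a.jpg", "9f3c"), ("b.jpg", "41d0"), ("a_copy.jpg", "9f3c")]
print(find_duplicates(archive))
```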
