Using data mining to find your assets


Data mining filters the content in your archives by metadata or dates.

About data mining

Data mining is used to find files, sort files, and change the metadata for files. Data mining consists of two parts: the Calendar dates option (the files are grouped together based on date attributes) and the Word list option (the files are grouped together based on metadata).
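
The two groupings described above can be pictured with a small conceptual sketch. It is not the product's implementation: the record layout, the field names (`created`, `keywords`), and the function names are assumptions made up for illustration. Only the `<empty>` label comes from the Word list itself.

```python
from collections import defaultdict
from datetime import date

# Hypothetical file records; the field names are illustrative only.
files = [
    {"name": "a.jpg", "created": date(2023, 5, 1), "keywords": ["bike", "race"]},
    {"name": "b.jpg", "created": date(2023, 5, 1), "keywords": ["bike"]},
    {"name": "c.jpg", "created": date(2024, 1, 9), "keywords": []},
]

def group_by_date(records, attribute="created"):
    """Calendar dates idea: bucket file names by (year, month) of a date attribute."""
    groups = defaultdict(list)
    for record in records:
        d = record[attribute]
        groups[(d.year, d.month)].append(record["name"])
    return dict(groups)

def word_list(records, field="keywords"):
    """Word list idea: bucket file names by each value of a metadata field."""
    groups = defaultdict(list)
    for record in records:
        # Files without a value for the field end up in the <empty> section.
        values = record.get(field) or ["<empty>"]
        for value in values:
            groups[value].append(record["name"])
    return dict(groups)

calendar_groups = group_by_date(files)
keyword_groups = word_list(files)
```

In this sketch a file with several keywords appears under each of them, while a file with no keywords is listed only under `<empty>`.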

To access the data mining tool, select the icon on the left-hand side of the Quick Search field. The data mining pane expands and you can choose between Calendar dates and Word lists in the dropdown list at the top of the data mining pane.

Data mining and searching

You can search the archive using the Quick Search or Advanced search while you use data mining. The Calendar dates or Word list view is updated according to the search you perform. For example, you can search for all files that contain the word bike, then open the Calendar dates view (as described earlier) to see when these files were created. You can also see who the caption writer was for all files with bike as metadata by opening the Word list and opening Caption Writer from the list, as described earlier.
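
The bike example can be sketched as a search followed by a grouping step. This is only an illustration under stated assumptions: the records, the `caption_writer` and `keywords` field names, and both functions are hypothetical, and the search here looks only at keywords for simplicity.

```python
from collections import Counter

# Hypothetical archive content; names and values are made up for illustration.
files = [
    {"name": "a.jpg", "keywords": ["bike"], "caption_writer": "Ann"},
    {"name": "b.jpg", "keywords": ["bike", "race"], "caption_writer": "Bo"},
    {"name": "c.jpg", "keywords": ["car"], "caption_writer": "Ann"},
    {"name": "d.jpg", "keywords": ["bike"], "caption_writer": "Ann"},
]

def search(records, word):
    """Simplified search: keep the files whose keywords contain the word."""
    return [record for record in records if word in record["keywords"]]

def caption_writer_counts(records):
    """Count the files per caption writer, as a word list for that field would."""
    return Counter(record["caption_writer"] for record in records)

# Search first; the word list is then built from the search result only.
hits = search(files, "bike")
counts = caption_writer_counts(hits)
```

Because the grouping runs on the search result rather than the whole archive, `c.jpg` does not contribute to the counts.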

Special behavior for some metadata fields

Metadata fields without drag and drop functionality

When using Word list data mining, you can drag a file onto an entry in the word list to add that information to the file's metadata. By doing so, you can, for example, quickly add a keyword to a number of files that don't have that keyword. However, for some metadata fields it doesn’t make sense to change the field by dragging and dropping a file, so drag and drop is disabled for these fields. The fields in question are Similarity Index, Caption, Master Document ID, Short Document ID, Unique Image ID, Owner ID, Padding, Raw file info, Classify state, EXIF Camera Info, and Document History.

Repeatable metadata fields

Three metadata fields (Keywords, Supplemental Category, and Byline) are special in that they allow multiple separate entries. This means that you can, for example, add several keywords to each file. Each keyword generates a separate entry in the Word list. Since it is possible to have more than one entry in the Word list for each file, drag and drop only adds the new keyword and does not replace any old ones. Note especially the Byline field, which is a repeatable field under the XMP standard but used to be a single-entry field under the IPTC standard.
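
The difference between adding and replacing can be sketched as follows. This is a conceptual model, not the product's code: the function name and the dictionary layout are hypothetical, and two behaviors are assumptions on our part, namely that a single-entry field is overwritten by the drop and that a value already present is not added twice.

```python
# The three repeatable fields named in the text.
REPEATABLE_FIELDS = {"Keywords", "Supplemental Category", "Byline"}

def drop_on_entry(metadata, field, value):
    """Return a copy of the metadata after dropping the file on a word list entry."""
    updated = dict(metadata)
    if field in REPEATABLE_FIELDS:
        # Repeatable field: keep the old entries and add the new one.
        entries = list(updated.get(field, []))
        if value not in entries:  # assumption: no duplicate entries
            entries.append(value)
        updated[field] = entries
    else:
        # Assumption: a single-entry field simply takes the new value.
        updated[field] = value
    return updated

before = {"Keywords": ["bike"]}
after = drop_on_entry(before, "Keywords", "race")
```

Here `after["Keywords"]` holds both `bike` and `race`, while `before` is left unchanged.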

Clearing a metadata field

It is not possible to drop one or more files into the <empty> section of the Word list. This ensures that you don’t clear a field by accident. We suggest using macros to clear fields.

Tip: Choosing the Similarity Index field in Word list data mining allows you to easily find duplicate files in your archive; any entry that reports more than one file indicates a duplicate.
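
The logic behind this tip can be sketched in a few lines. The sketch assumes that each file carries one Similarity Index value and that equal values mean duplicate content; the function name and the sample values are hypothetical.

```python
from collections import defaultdict

def find_duplicates(records):
    """Group file names by Similarity Index and keep groups with more than one file."""
    by_index = defaultdict(list)
    for name, similarity_index in records:
        by_index[similarity_index].append(name)
    return {index: names for index, names in by_index.items() if len(names) > 1}

# Hypothetical (file name, Similarity Index) pairs.
duplicates = find_duplicates([
    ("a.jpg", "x1"),
    ("b.jpg", "x2"),
    ("c.jpg", "x1"),
])
```

Only the `x1` entry lists more than one file, so `a.jpg` and `c.jpg` are the duplicate candidates.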