Auto-tagging
In Fotoware, auto-tagging refers to the automated process of assigning descriptive tags to images.
Using Artificial Intelligence (AI), Fotoware can automatically detect keywords, brands, and faces and extract text content from images. Fotoware uses Azure Cognitive Services for AI. For more information, see Cognitive Services—APIs for AI Solutions | Microsoft Azure. Auto-tagging uses a set of features supported by Computer Vision in Azure Cognitive Services.
Fotoware sends images to Azure Cognitive Services by using either an action or an asset webhook, and the tagging results are saved to selected metadata fields on the asset in Fotoware. Auto-tagging is only available for image files; the file sent to Azure Cognitive Services is always a JPEG.
In Fotoware, auto-tagging is configured as an action or asset-ingested webhook. Both use the same URL format to call auto-tagging, and the configuration settings are provided as part of the URL in both cases. For more information, see Configuring auto-tagging. Because auto-tagging results are machine-generated, we recommend that a person reviews them. For that reason, we recommend tagging images on demand rather than tagging every image automatically.
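As an illustration of passing configuration settings as part of a URL, the following is a minimal Python sketch. The base URL and the parameter names (`version`, `features`, `language`) are hypothetical placeholders, not the actual Fotoware URL format; see Configuring auto-tagging for the real settings.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, for illustration only (not a real Fotoware URL).
BASE_URL = "https://autotag.example.com/tag"


def build_autotag_url(base_url, settings):
    """Append configuration settings to a webhook URL as a query string."""
    # Join list values with commas so each setting stays a single parameter.
    flat = {
        key: ",".join(value) if isinstance(value, (list, tuple)) else str(value)
        for key, value in settings.items()
    }
    return f"{base_url}?{urlencode(flat)}"


# Hypothetical parameter names; the same URL would be used for an action
# and for an asset-ingested webhook.
url = build_autotag_url(
    BASE_URL,
    {"version": "4", "features": ["tags", "ocr"], "language": "en"},
)
print(url)

# Reading the settings back shows that they travel entirely in the URL.
print(parse_qs(urlsplit(url).query))
```

Keeping every setting in the URL means that an action and a webhook can share one configuration string.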
Fotoware supports two versions of Azure Cognitive Services, version 3.2 and version 4, but you can only use one version per webhook. We recommend using version 4 if that matches your workflow requirements.
Note: Auto-tagging is a billable feature. For more information, contact Fotoware.
Features
Customers can configure which features they would like to use, and they can mix and match features (within the same version of Azure Cognitive Services) as necessary. Some features (descriptions, tags, and OCR) are available in both version 3.2 and version 4, whereas some features are available in only one of the versions.
The following table lists the available features, their corresponding functions, and the associated Azure Cognitive Services version.
Feature | Description | Version |
Description | Auto-tagging can analyze an image and generate a human-readable phrase that describes its content. The algorithm returns several descriptions based on different visual features, and each description is given a confidence score. The final output is a list of descriptions ordered from highest to lowest confidence. Azure Computer Vision only returns descriptions in English, but we use Azure Translator services to translate the descriptions into other languages. | 3.2 and 4 |
Tags | Image Analysis can return content tags for thousands of recognizable objects, living beings, scenery, and actions that appear in images. Tagging is not limited to the main subject, such as a person in the foreground; it also includes the setting (indoor or outdoor), furniture, tools, plants, animals, accessories, gadgets, and so on. | 3.2 and 4 |
Optical Character Recognition (OCR) | OCR is also referred to as text recognition or text extraction. With machine-learning-based OCR techniques, you can extract printed or handwritten text from images, such as posters, street signs, and product labels. The text is typically extracted as words, text lines, and paragraphs or text blocks, enabling access to the digital version of the scanned text. This eliminates or significantly reduces the need for manual data entry. | 3.2 and 4 |
Brands/Logos | Brand detection uses a database of thousands of global logos to identify commercial brands in images. The Computer Vision service detects whether there are brand logos in a given image; if there are, it returns the brand name. Note that this feature identifies logos only. Brand names that appear as plain text are not identified by this feature but could be picked up by OCR. | 3.2 |
Faces | Image Analysis can detect whether there are human faces within an image and return a true/false value. | 3.2 |
People | Version 4 of Image Analysis can detect people appearing in images. The feature returns the number of people detected, or the text 'No people detected' if no people are detected in the image. | 4 |
Objects | Object detection is similar to tagging, but the feature returns a tag for each object found in the image. For example, if an image contains a dog, a cat, and a person, the object detection operation will list each object. | 4 |
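Because features can only be combined within one version, it can help to check a planned feature set against the table above before configuring a webhook. The following is a minimal Python sketch of such a check. The feature keys (`description`, `brands`, and so on) are illustrative labels chosen for this example, not official Fotoware setting names.

```python
# Feature availability per Azure Cognitive Services version, taken from the
# table above. The keys are illustrative labels, not official setting names.
FEATURE_VERSIONS = {
    "description": {"3.2", "4"},
    "tags": {"3.2", "4"},
    "ocr": {"3.2", "4"},
    "brands": {"3.2"},
    "faces": {"3.2"},
    "people": {"4"},
    "objects": {"4"},
}


def unsupported_features(version, features):
    """Return the requested features that the given version does not offer."""
    unknown = [f for f in features if f not in FEATURE_VERSIONS]
    if unknown:
        raise ValueError(f"Unknown feature(s): {', '.join(unknown)}")
    return [f for f in features if version not in FEATURE_VERSIONS[f]]


# Tags, OCR, and people detection are all available in version 4.
print(unsupported_features("4", ["tags", "ocr", "people"]))    # []

# Brands and faces are only available in version 3.2.
print(unsupported_features("4", ["tags", "brands", "faces"]))  # ['brands', 'faces']
```

An empty result means the whole feature set can be served by a single webhook that uses the chosen version.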
Languages supported
Azure Cognitive Services - Computer Vision has different language support for different features. For an overview of supported languages, see Azure Cognitive Services - Computer Vision - Language support.
To offer the same set of languages for all auto-tagging features in Fotoware, we use the Azure translation service to translate text from English into the languages that Computer Vision does not support natively. This extends the set of supported languages, but automatic translation is not always accurate.
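The fallback described above can be sketched as a simple decision: use the native result when Computer Vision supports the requested language for that feature, and otherwise translate the English result. The Python sketch below is an illustration only. The function name and the sample language sets are assumptions made for this example; for the actual lists, see the Azure language support overview.

```python
def result_source(feature, language, native_languages):
    """Decide how a result in the requested language would be produced.

    native_languages maps a feature name to the set of language codes that
    Computer Vision supports natively for that feature.
    """
    if language in native_languages.get(feature, set()):
        return "native"
    # Languages without native support are translated from the English result.
    return "translated-from-english"


# Illustrative sample data only; not the real language support lists.
SAMPLE_NATIVE = {
    "description": {"en"},
    "tags": {"en", "es"},
}

print(result_source("description", "en", SAMPLE_NATIVE))
print(result_source("description", "nb", SAMPLE_NATIVE))
print(result_source("tags", "es", SAMPLE_NATIVE))
```

Results marked as translated are the ones most worth reviewing, since automatic translation is not always accurate.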