Your One-Stop-Shop for all GCP Vision and AWS Rekognition capabilities
The GCPAWS Vision API offers a comprehensive suite of features for analyzing images using pre-trained machine learning models. It provides the best of both Google Cloud Vision APIs and AWS Rekognition APIs all under a simplified, consumable interface at a reduced cost. These features allow developers to extract valuable information from images and integrate visual intelligence into their applications.
Face Detection: Locates human faces within an image and provides bounding polygons around each face. It can also identify facial "landmarks" such as eyes, nose, mouth, and ears, along with confidence values. Additionally, it estimates emotional states (joy, sorrow, anger, surprise) and general face properties (underexposed, blurred, headwear).
Label Detection: Identifies objects, locations, activities, animal species, products, and more present in an image, providing descriptive labels and confidence scores. This is useful for automatic image tagging and categorization.
Text Detection (OCR): Performs Optical Character Recognition (OCR) to extract text from images. It identifies and extracts UTF-8 text, providing bounding boxes for words and the recognized text. It's optimized for sparse text areas within larger images.
Safe Search Detection: Detects explicit content within an image across categories like adult, spoof, medical, violence, and racy, providing likelihood ratings for each. This is crucial for content moderation.
Web Detection: Discovers web references to an image by identifying Web Entities, Matching Images, and Best Guess Labels.
Logo Detection: Detects common product and brand logos within an image and provides bounding boxes and confidence scores.
Landmark Detection: Identifies popular natural and human-made landmarks in an image and provides their name, geographic coordinates (latitude and longitude), and a bounding box.
Image Properties: Analyzes the overall visual characteristics of an image, such as dominant colors.
Face Detection: Detects up to 100 faces in an image using Rekognition ML models. For each detected face, the operation returns face details, including a bounding box and a fixed set of attributes such as facial landmarks, pose, gender, age range, and emotional state.
Face Matching: Leverages AWS Rekognition to perform a biometric match between a source face image and a target face image.
Content Moderation: Detect explicit adult or suggestive content, violence, drugs, tobacco, alcohol, hate symbols, gambling, and disturbing content in images with AWS Rekognition's Content Moderation capabilities.
Celebrity Search: AWS celebrity recognition automatically recognizes tens of thousands of well-known personalities in images and videos using machine learning. The metadata provided by the celebrity recognition API significantly reduces the repetitive manual effort required to tag content and make it readily searchable.
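As an illustration of how the Safe Search likelihood ratings described above might feed a moderation decision, here is a minimal Python sketch. It assumes the annotation arrives as a plain dictionary mapping each category to a likelihood name, mirroring the upstream Cloud Vision JSON response; the GCPAWS response format itself is not specified here. The function name flagged_categories, the default threshold, and the sample data are hypothetical.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Likelihood ratings, ordered from least to most likely."""
    UNKNOWN = 0
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5


# The five Safe Search categories named in the feature list.
SAFE_SEARCH_CATEGORIES = ("adult", "spoof", "medical", "violence", "racy")


def flagged_categories(annotation, threshold=Likelihood.LIKELY):
    """Return the categories rated at or above `threshold`.

    `annotation` is assumed to map category names to likelihood names,
    e.g. {"adult": "VERY_UNLIKELY", ...}. Missing categories are
    treated as UNKNOWN and therefore never flagged.
    """
    flagged = []
    for category in SAFE_SEARCH_CATEGORIES:
        rating = Likelihood[annotation.get(category, "UNKNOWN")]
        if rating >= threshold:
            flagged.append(category)
    return flagged


# Made-up sample annotation, for illustration only.
sample_annotation = {
    "adult": "VERY_UNLIKELY",
    "spoof": "POSSIBLE",
    "medical": "UNLIKELY",
    "violence": "LIKELY",
    "racy": "VERY_LIKELY",
}

strict = flagged_categories(sample_annotation)
lenient = flagged_categories(sample_annotation, Likelihood.POSSIBLE)
```

Treating the ratings as an ordered scale keeps the policy in one place: lowering the threshold to POSSIBLE sends more images to human review, while raising it to VERY_LIKELY flags only the clearest cases.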
GCPAWS offers pre-trained or customizable computer vision APIs that can be quickly added to applications, allowing businesses to analyse images efficiently and augment human review tasks with AI.
Key business use cases include:
Content Moderation: Businesses can quickly and accurately identify unsafe or inappropriate content across their image and video assets based on general or business-specific standards and practices. This helps maintain brand safety and compliance.
Identity Verification: Utilizing facial comparison and analysis in user onboarding and authentication workflows allows businesses to remotely verify the identity of opted-in users. Features like Face Liveness detect real users and deter spoofs, while Face Compare and Search determine face similarity for verification against another picture or a private image repository.
Media Analysis: Businesses in media and content industries can automatically detect key segments in videos, such as black frames, credits, slates, colour bars, and shots. This helps to reduce the time, effort, and cost of video ad insertion, content operations, and content production. Celebrity recognition also helps to catalogue photos and footage for media, marketing, and advertising.
Custom Object Detection: Businesses can detect custom objects, such as brand logos, by training models with as few as 10 images using Custom Labels. This is useful for brand monitoring or asset tracking.
Text Extraction: Extracting text from images and videos, even if skewed or distorted, such as street signs, social media posts, and product packaging, can be valuable for businesses processing visual information.
Quick Integration of Basic Vision Features: The Cloud Vision API is best for businesses needing to easily integrate common vision detection features like image labeling, face and landmark detection, OCR, and tagging of explicit content.
Automating Document Workflows: Document AI is designed for businesses to extract text and data from scanned documents, transforming unstructured data into structured information and business insights. It uses OCR, NLP, and ML for document understanding and text extraction. A common use case is to detect text in raw files and automatically summarize large documents. Document AI can also be used to unlock insights from nuanced documents with higher accuracy and faster processing.
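For the Identity Verification use case above, the following Python sketch shows one way a face comparison result might be turned into a pass/fail decision. It assumes the result is a dictionary shaped like the AWS Rekognition CompareFaces response, with a FaceMatches list whose entries carry a Similarity score from 0 to 100; whether GCPAWS returns this shape unchanged is an assumption. The helper names, the 90.0 threshold, and the sample data are hypothetical and would need tuning for a real workflow.

```python
def best_similarity(response):
    """Return the highest Similarity among the face matches, or 0.0 if none.

    `response` is assumed to follow the Rekognition CompareFaces shape:
    {"FaceMatches": [{"Similarity": float, ...}, ...], ...}
    """
    matches = response.get("FaceMatches", [])
    return max((match["Similarity"] for match in matches), default=0.0)


def is_verified(response, min_similarity=90.0):
    """Decide whether the source face matches any face in the target image.

    `min_similarity` is an illustrative threshold, not a recommended value.
    """
    return best_similarity(response) >= min_similarity


# Made-up sample responses, for illustration only.
matching_response = {
    "FaceMatches": [
        {"Similarity": 97.5, "Face": {"Confidence": 99.9}},
        {"Similarity": 82.0, "Face": {"Confidence": 99.1}},
    ],
    "UnmatchedFaces": [],
}
no_match_response = {"FaceMatches": [], "UnmatchedFaces": [{"Confidence": 99.7}]}

accepted = is_verified(matching_response)
rejected = is_verified(no_match_response)
```

Keeping the threshold as a parameter lets a stricter value be used for high-risk steps such as account recovery, and a looser one for low-risk steps.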