OCR, or Optical Character Recognition, is the process of extracting text from images. Most OCR models return the text in the image along with bounding boxes that indicate where each piece of text is located. Running OCR with supercontrast is easy: all you have to do is provide the image URL and the provider you want to use.
In this example, we’ll specify that for Task.OCR, we want to use Provider.GCP, which uses Google’s Vision AI API.
For Task.OCR, the request schema is defined by OCRRequest and the response schema is defined by OCRResponse. The OCRResponse has a list of OCRBoundingBox objects, each representing a bounding box of text in the image.
OCRRequest
image: a string that can be either a URL to an image or a local file path.
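Because the image field accepts two kinds of string, it can help to check which kind you are about to send. The sketch below is one possible way to do that on the caller's side, using only the Python standard library. The helper names is_remote_image and describe_image_source are hypothetical and are not part of supercontrast; how the library itself tells the two forms apart is not documented here.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_remote_image(image: str) -> bool:
    """Return True when the string looks like an http(s) URL."""
    parsed = urlparse(image)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def describe_image_source(image: str) -> str:
    """Classify an image string as a URL, an existing file, or a missing file."""
    if is_remote_image(image):
        return "url"
    # Anything that is not an http(s) URL is treated as a local path (assumption).
    return "local file" if Path(image).is_file() else "missing local file"


print(describe_image_source("https://example.com/receipt.png"))
```

Checking for a missing local file before making a request avoids spending a provider call on a path that cannot be read.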
OCRResponse
all_text: a string of all the text in the image
bounding_boxes: a list of OCRBoundingBox objects, each representing a bounding box of text in the image
OCRBoundingBox
text: a string of the text in the bounding box
coordinates: a list of tuples, each representing a coordinate in the bounding box
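To make the shape of the response concrete, here is a minimal sketch that mirrors the fields described above using local dataclasses. These classes are stand-ins written for illustration and are not imported from supercontrast. The field types, the assumption that each coordinate is an (x, y) pair, the box_extent helper, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class OCRBoundingBox:
    text: str                               # text inside this box
    coordinates: List[Tuple[float, float]]  # assumed to be (x, y) points


@dataclass
class OCRResponse:
    all_text: str                           # all text found in the image
    bounding_boxes: List[OCRBoundingBox] = field(default_factory=list)


def box_extent(box: OCRBoundingBox) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) over a box's points."""
    xs = [x for x, _ in box.coordinates]
    ys = [y for _, y in box.coordinates]
    return min(xs), min(ys), max(xs), max(ys)


# Hand-built sample data, not real provider output.
sample = OCRResponse(
    all_text="Hello world",
    bounding_boxes=[
        OCRBoundingBox("Hello", [(10, 10), (60, 10), (60, 30), (10, 30)]),
        OCRBoundingBox("world", [(70, 10), (125, 10), (125, 30), (70, 30)]),
    ],
)

for box in sample.bounding_boxes:
    print(box.text, box_extent(box))
```

Reducing each box to its axis-aligned extent is a common first step when cropping or highlighting detected text, though a rotated box would need all of its points preserved.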