Revolutionizing Label Reading with AI: Kargo's Breakthrough Method
Kargo | 8 min
What You Need to Know
Warehouses, much like everything in the real world, are built with people in mind. From aisle layout to label placement, everything is designed for the people who work there.
The challenge lies in enabling technology to perceive and interpret as efficiently as a person. Kargo’s innovative AI-driven method, outlined in our latest patent, harnesses the power of advanced image processing to autonomously extract information from freight labels during loading and unloading.
Kargo's AI approach stands out for its ability to rapidly process data from labels with over 99% accuracy, irrespective of the environmental challenges.
Here, we’ll discuss this new method in the context of reading freight labels. We’ll explore Kargo’s approach, how it differs from existing methods, and its use in logistics applications.
Challenges of Label Reading
Reading labels in the physical world is hard for machines.
Labels can be damaged in transit, obscured by packaging, wrapped in plastic, or otherwise illegible. Traditional methods struggle because they rely on high-quality, standardized images and cannot adapt to context or geometric variations.
Endless label variations make it impossible to train models on every potential configuration. Multiple labels in one image add to the complexity.
The challenge of turning vectorized text into structured data is fourfold:
- Varying orientations. Every label image has a slight or drastic change in perspective. The label can be rotated, tilted, upside down, and so on.
- Missing information. Key words on the label might be missing because of damage to the image or the label itself.
- Unlimited variations. Labels have varying layouts, with text in different formats and orientations.
- Different data formats. Information on the label can be presented in varying formats. For example, dates can be written as 10/09/23, Oct-9th-23, or 09.10.2023.
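The date example in the last bullet can be made concrete with a short sketch. It is an illustration only, not Kargo's implementation: the function name `normalize_date` and the assumed conventions (slashes mean month/day/year, dots mean day.month.year) are hypothetical. Note that "10/09/23" would mean September 10 under a day-first convention, which is why the convention has to come from knowledge of the label format rather than from the string alone.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumed conventions (not from the article): slashes are month/day/year,
# month names come first, and dots are day.month.year.
CANDIDATE_FORMATS = ("%m/%d/%y", "%b-%d-%y", "%d.%m.%Y")


def normalize_date(raw: str) -> Optional[date]:
    """Return a date for the first candidate format that parses, else None."""
    # Drop ordinal suffixes such as "9th" -> "9" before parsing.
    cleaned = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", raw.strip())
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # try the next convention
    return None


# The three spellings from the bullet above map to the same calendar day
# under the assumed conventions.
for text in ("10/09/23", "Oct-9th-23", "09.10.2023"):
    print(text, "->", normalize_date(text))
```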
Kargo’s method offers a new way of turning OCR results into structured data.
Kargo's Method
Kargo’s groundbreaking method draws inspiration from the human approach to reading labels. When encountering a new label, a person analyzes each field to identify the required data. With familiarity, this process becomes more efficient; locating each data element becomes almost instinctual. Kargo's AI mirrors this learning process to rapidly identify and extract relevant information.
Knowing Where to Look
Kargo’s method relies on “templates” to understand label formats. A “template” functions as a format fingerprint, making it easy to repeatedly locate and extract relevant information.
For each new label format, a corresponding template is created to encode the locations of pertinent data fields and to enable machines to recognize and match other instances of the label. New templates can be created in seconds and require no model retraining or downtime. The AI instantly adapts to new labels.
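One way to picture a template is as a small data record. The sketch below is a guess at what such a "format fingerprint" might contain; the article does not describe the actual representation, so `FieldRegion`, `LabelTemplate`, `TemplateLibrary`, and the example label name are all hypothetical. It shows why adding a format can be a data change rather than a training run.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class FieldRegion:
    """Box holding one value, as fractions (0..1) of label width and height."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass
class LabelTemplate:
    """Hypothetical 'format fingerprint' for one label layout."""
    name: str
    anchor_words: frozenset  # words assumed to appear on every label of this format
    fields: dict = field(default_factory=dict)  # field name -> FieldRegion


class TemplateLibrary:
    """In-memory registry: registering a format only stores a record."""

    def __init__(self) -> None:
        self._templates: dict = {}

    def add(self, template: LabelTemplate) -> None:
        self._templates[template.name] = template

    def get(self, name: str) -> Optional[LabelTemplate]:
        return self._templates.get(name)

    def __len__(self) -> int:
        return len(self._templates)


library = TemplateLibrary()
library.add(LabelTemplate(
    name="acme_case_label",  # made-up supplier format
    anchor_words=frozenset({"ACME", "LOT", "QTY"}),
    fields={"lot": FieldRegion(0.3, 0.4, 1.0, 0.6)},
))
```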
Extracting the Information
Once a template is created, it is immediately ready to be used to structure data. Here are the three key steps Kargo performs to extract information from an image:
1. Image to vectorized text conversion. The label image is transformed into vectorized text using OCR technology.
2. Template selection. This text is analyzed to identify the label format, aligning it with the appropriate template. Unmatched labels are flagged as new formats, prompting the creation of a new template.
3. Data extraction. By referencing the template, Kargo precisely locates and extracts the relevant information, structuring it into easily interpretable key-value pairs.
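The three steps can be sketched end to end. This is a simplified illustration under stated assumptions, not Kargo's patented method: step 1 is represented by its output (a list of OCR words with positions), the label is assumed to be already straightened and scaled to a 0..1 coordinate frame, and the `Token` type, the `TEMPLATES` table, and the 0.6 match threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    """Step 1 output: one OCR word and its center, normalized to 0..1."""
    text: str
    x: float
    y: float


# Hypothetical template: anchor words identify the format; each field maps
# to a box (x_min, y_min, x_max, y_max) in the same normalized coordinates.
TEMPLATES = {
    "acme_case_label": {
        "anchors": {"ACME", "LOT", "QTY"},
        "fields": {"lot": (0.3, 0.4, 1.0, 0.6), "qty": (0.3, 0.6, 1.0, 0.8)},
    },
}


def select_template(tokens: list, min_overlap: float = 0.6) -> Optional[str]:
    """Step 2: pick the template sharing the most anchor words, if any."""
    words = {t.text.upper() for t in tokens}
    best_name, best_score = None, 0.0
    for name, tpl in TEMPLATES.items():
        score = len(tpl["anchors"] & words) / len(tpl["anchors"])
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_overlap else None


def extract_fields(tokens: list, template_name: str) -> dict:
    """Step 3: collect the words that fall inside each field's box."""
    result = {}
    for name, (x0, y0, x1, y1) in TEMPLATES[template_name]["fields"].items():
        inside = [t for t in tokens if x0 <= t.x <= x1 and y0 <= t.y <= y1]
        inside.sort(key=lambda t: (t.y, t.x))  # reading order
        result[name] = " ".join(t.text for t in inside)
    return result


def read_label(tokens: list) -> Optional[dict]:
    """Return key-value pairs, or None to flag an unmatched (new) format."""
    name = select_template(tokens)
    return None if name is None else extract_fields(tokens, name)


ocr_tokens = [
    Token("ACME", 0.5, 0.1),
    Token("LOT", 0.1, 0.5), Token("A1234", 0.5, 0.5),
    Token("QTY", 0.1, 0.7), Token("48", 0.5, 0.7),
]
print(read_label(ocr_tokens))  # {'lot': 'A1234', 'qty': '48'}
```

A label whose words match no template's anchors returns `None`, which corresponds to the "flagged as new format" branch in step 2.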
Leveraging Kargo’s Method
Kargo’s method helps overcome the challenges associated with label reading in the physical world.
It’s robust against low-quality or complicated images. The method emphasizes the relative position of key fields, rendering the quality of the image itself less critical.
It’s fast. This method is ten times faster and more resource-efficient than other reading methods. Speed is important because it unlocks real-time, actionable data.
It can handle unlimited label formats. Kargo’s method works with any label. There is no limit to the template library size. Adding or changing a template is a simple operational task.
It’s explainable. AI models can hallucinate, or generate incorrect information, without warning. Large deep learning models are “black boxes” that yield stochastic results. Kargo’s method is predictable and explainable with easily tunable parameters.
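The point about relative position can be illustrated with a toy example. The sketch below is an assumption about how such a scheme might work, not a description of Kargo's system: each visible keyword "votes" for where a value should be using a stored offset, so a keyword lost to damage does not prevent the value from being located. The offsets, keyword names, and `estimate_value_position` are hypothetical, and the label is assumed to be unrotated and at a known scale.

```python
from statistics import median
from typing import Optional

# Hypothetical template data: offset (dx, dy) from each keyword's center to
# the center of the lot-number value, in normalized label coordinates.
OFFSETS_TO_LOT_VALUE = {
    "LOT": (0.4, 0.0),
    "QTY": (0.4, -0.2),
    "ACME": (0.0, 0.4),
}


def estimate_value_position(keywords_seen: dict) -> Optional[tuple]:
    """Estimate where the value sits from whichever keywords were read.

    keywords_seen maps a keyword to the (x, y) center where OCR found it.
    """
    votes = [
        (x + OFFSETS_TO_LOT_VALUE[word][0], y + OFFSETS_TO_LOT_VALUE[word][1])
        for word, (x, y) in keywords_seen.items()
        if word in OFFSETS_TO_LOT_VALUE
    ]
    if not votes:
        return None  # nothing recognizable; cannot place the field
    # The median keeps one badly located keyword from dragging the estimate.
    return (median(v[0] for v in votes), median(v[1] for v in votes))


intact = {"ACME": (0.5, 0.1), "LOT": (0.1, 0.5), "QTY": (0.1, 0.7)}
damaged = {"ACME": (0.5, 0.1), "QTY": (0.1, 0.7)}  # "LOT" was unreadable

print(estimate_value_position(intact))
print(estimate_value_position(damaged))
```

Both calls land at approximately the same point, even though the keyword nearest the value is missing in the second case. Every intermediate value (offsets, votes, the median) can be inspected, which is the kind of predictability the last point describes.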
Deep Learning Methods of Label Reading
In the realm of AI-driven label reading, Kargo’s approach stands out against other prevalent methods, each of which has its own challenges.
Large Language Models
Large language models (LLMs) can be used to read labels by leveraging their advanced linguistic capabilities. However, they struggle with:
Speed. LLMs require more processing time because they have to read every word on the label, making this method slow and expensive. By comparison, Kargo’s method is 1,000 times more resource-efficient.
Imperfect images. LLMs have to read every label as if for the first time. That makes LLMs susceptible to small fluctuations in image quality. If the label image is not perfect or the label itself is damaged, LLMs lack the prior context to extrapolate the reading locations.
Label formats. Since LLMs are linguistic, they do not fully understand geometric structure. LLMs struggle with complicated label formats or data elements that lack descriptor key phrases.
Deep Learning Models
Deep learning models can be used together with OCR to identify the correct label template, but struggle to read the label. When used to choose a template, these models are less than ideal because they:
Need training. Deep learning models need to be trained on vast datasets, resulting in significant deployment delays.
Get one result. Deep learning models are only able to read one label per image.
Regex
After a deep learning model identifies and reads the text on labels, regular expressions (regex) can be used to parse and structure the data. Regex is powerful for pattern matching, but has its limitations when it comes to label reading:
Depends on consistency. Regex lacks flexibility, making it less effective in handling variations or unexpected formats. Its dependency on consistent formatting poses a challenge in scenarios where label formats are not standardized.
Lacks context. While great for pattern recognition, regex cannot understand context. This can lead to errors if the pattern matches but the context is incorrect. For example, it could mistake a part number for a date or struggle to distinguish between an expiration date and a production date.
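The context problem is easy to reproduce. In the sketch below, the label text and the pattern are made up for illustration; the pattern is a typical "looks like a date" expression, and `find_date_like` is a hypothetical helper.

```python
import re

# A common style of date pattern: two digits, a separator, two digits,
# a separator, then a two- to four-digit year.
DATE_LIKE = re.compile(r"\b\d{2}[/.-]\d{2}[/.-]\d{2,4}\b")


def find_date_like(text: str) -> list:
    """Return every substring shaped like a date, in order of appearance."""
    return DATE_LIKE.findall(text)


label_text = "PART 11-22-33 EXP 10/09/23 MFG 09/10/22"
print(find_date_like(label_text))
# ['11-22-33', '10/09/23', '09/10/22']
```

All three substrings match, but the first is a part number, and nothing in the matches says which of the other two is the expiration date. Resolving that requires knowing where each field sits on the label or which keyword it belongs to, which the pattern alone does not capture.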
Real World Application: Logistics
Kargo’s new method for extracting data from an image was developed for a highly variable environment: the loading dock.
Kargo Towers are installed on either side of loading dock doors to capture images of freight as the forklift loads or unloads the trailer. Kargo uses computer vision to extract information from freight labels and identify any visible damage. These signals enable Kargo to verify shipments, flag exceptions, and update existing inventory systems in real time.
Kargo's AI approach stands out for its ability to rapidly process data from images of labels captured at the loading dock with over 99% accuracy.
Fast label reading. Kargo’s AI is able to process and act on freight data in real time. There can often be tens or hundreds of labels per pallet that need to be captured and processed simultaneously. Kargo’s method is able to read them all in time, before the next pallet is loaded or unloaded, ensuring timely updates to the Kargo Platform and other digital systems.
Hard-to-see labels. Lighting, packaging, and varying label placement can make it difficult to capture high-quality images. Kargo’s method was designed to overcome these challenges, delivering high-quality results even with poor-quality, cropped, or damaged images.
Multiple different labels. Pallets are covered with many different label formats (case labels, master labels, LPNs, etc.). Warehouses and distribution centers receive products from a multitude of suppliers, each one opting for a custom format. Kargo’s method makes it simple for operations teams to index and maintain all these variations.
New labels. Using Kargo’s method requires no re-training. Kargo can begin capturing data from new labels in seconds, rather than hours or days. There is no downtime or “learning periods” for the system when a new label format is discovered.
Kargo’s method is a game changer for logistics, as well as other highly variable, fast-paced environments like live events or airports.
The AI Era
Kargo’s approach transcends the limitations of traditional methods by imitating how people are able to read and understand labels. Kargo’s technology doesn’t just “see,” it perceives, employing AI to dynamically recognize and interpret information from any image. Our commitment to developing innovative AI technologies has positioned Kargo not just as a participant but as a trailblazer in ushering the logistics sector into the new age of artificial intelligence.