Azure OCR C#

Quick Way To Identify A Number Plate By Using Azure Computer Vision

Matt 2021/04/12 16:06:39

Cognitive Services

Azure Cognitive Services provides a bunch of APIs, such as Computer Vision, that help us build useful services.

Here is our goal: how do we identify a vehicle's number plate? Since we chose Azure Cognitive Services, let's create a Computer Vision instance and try it.

img "Computer Vision instance of Azure Cognitive Service"

Once the service is created, we can manage its keys.

img "manage keys"

Keep them at hand. These two keys are important, because we will need one later when calling the OCR API.

OCR Service API

Optical Character Recognition, a.k.a. OCR, is included in Azure Computer Vision. It helps us extract printed or handwritten text from images.

We can easily try the Computer Vision API (v3.1) by using Postman.

We picked a sample number plate from Google Images.

img "sample XG-SF-67"

After the API is called successfully, we see the result in Postman,

img "ocr result from postman"

It brings us a JSON string as the result (indeed, it comes from the Azure Cognitive Services OCR API), shown below.

{
    "language": "zh-Hant",
    "textAngle": 0.0,
    "orientation": "Up",
    "regions": [
        {
            "boundingBox": "22,296,1547,578",
            "lines": [
                {
                    "boundingBox": "288,296,1184,208",
                    "words": [
                        {
                            "boundingBox": "288,296,272,208",
                            "text": "XG"
                        },
                        {
                            "boundingBox": "608,400,104,32",
                            "text": "-"
                        },
                        {
                            "boundingBox": "760,296,256,208",
                            "text": "SF"
                        },
                        {
                            "boundingBox": "1064,400,96,32",
                            "text": "-"
                        },
                        {
                            "boundingBox": "1216,296,256,208",
                            "text": "67"
                        }
                    ]
                },
                {
                    ......
                },
                {
                    ......
                }
            ]
        }
    ]
}

As we can see, the result contains the number plate data we expected. Every recognition result contains one or more regions, every region contains one or more lines, and every line contains one or more words.

First of all, let us find out what boundingBox means at each level.

Look at the number plate with (approximate) coordinates marked on it. Every boundingBox consists of 4 numbers,
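The nesting described above can be sketched as a handful of C# classes. This is only an illustration: the type name DetectedResult is borrowed from the C# sample further down, while the property layout and the other class names (OcrRegion, OcrLine, OcrWord, OcrModelDemo) are assumptions made here. The API returns camelCase names, so the sketch turns on case-insensitive matching in System.Text.Json.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical model classes mirroring the JSON shown above.
public class OcrWord
{
    public string BoundingBox { get; set; } = "";
    public string Text { get; set; } = "";
}

public class OcrLine
{
    public string BoundingBox { get; set; } = "";
    public List<OcrWord> Words { get; set; } = new();
}

public class OcrRegion
{
    public string BoundingBox { get; set; } = "";
    public List<OcrLine> Lines { get; set; } = new();
}

public class DetectedResult
{
    public string Language { get; set; } = "";
    public double TextAngle { get; set; }
    public string Orientation { get; set; } = "";
    public List<OcrRegion> Regions { get; set; } = new();
}

public static class OcrModelDemo
{
    // "regions" in the JSON must map to "Regions" in C#, so ignore case.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNameCaseInsensitive = true
    };

    public static DetectedResult Parse(string json) =>
        JsonSerializer.Deserialize<DetectedResult>(json, Options) ?? new DetectedResult();

    public static void Main()
    {
        // A shortened copy of the response above (only the first line).
        const string json = @"{
  ""language"": ""zh-Hant"", ""textAngle"": 0.0, ""orientation"": ""Up"",
  ""regions"": [ { ""boundingBox"": ""22,296,1547,578"", ""lines"": [ {
      ""boundingBox"": ""288,296,1184,208"",
      ""words"": [
        { ""boundingBox"": ""288,296,272,208"", ""text"": ""XG"" },
        { ""boundingBox"": ""608,400,104,32"", ""text"": ""-"" },
        { ""boundingBox"": ""760,296,256,208"", ""text"": ""SF"" },
        { ""boundingBox"": ""1064,400,96,32"", ""text"": ""-"" },
        { ""boundingBox"": ""1216,296,256,208"", ""text"": ""67"" }
      ] } ] } ]
}";
        var result = Parse(json);
        foreach (var region in result.Regions)
            foreach (var line in region.Lines)
                Console.WriteLine(string.Concat(line.Words.ConvertAll(w => w.Text)));
    }
}
```

Walking regions, then lines, then words is all it takes to get the text of each line back.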

img "number plate with coordinate"

e.g. "288,296,1184,208". So, what are they? They are the top-left X, the top-left Y, the width, and the height, measured from the origin (the top-left corner) of the image. Together they describe a rectangle, and each rectangle is an area where text was recognized.

Next, notice that each line contains several words. Combine the "text" values of all the words in a line and the vehicle license number is revealed: XG-SF-67.

We also find some noise in the other lines. We should find a way to get rid of it, but that is not our concern for now.

Work with the OCR API

As the Computer Vision API (v3.1) documentation shows, we can make a simple request from C#.
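To make the four numbers concrete, here is a minimal sketch that parses a boundingBox string and derives the opposite corner of the rectangle. The OcrBox and OcrBoxDemo names are made up for this illustration and are not part of any Azure SDK.

```csharp
using System;
using System.Globalization;

// Hypothetical helper type: left, top, width, height in pixels.
public readonly struct OcrBox
{
    public int Left { get; }
    public int Top { get; }
    public int Width { get; }
    public int Height { get; }

    // The opposite corner follows from the top-left corner plus the size.
    public int Right => Left + Width;
    public int Bottom => Top + Height;

    public OcrBox(int left, int top, int width, int height)
    {
        Left = left;
        Top = top;
        Width = width;
        Height = height;
    }

    // Parses a string such as "288,296,1184,208".
    public static OcrBox Parse(string value)
    {
        var parts = value.Split(',');
        if (parts.Length != 4)
            throw new FormatException($"Expected 4 comma-separated numbers, got: {value}");

        return new OcrBox(
            int.Parse(parts[0].Trim(), CultureInfo.InvariantCulture),
            int.Parse(parts[1].Trim(), CultureInfo.InvariantCulture),
            int.Parse(parts[2].Trim(), CultureInfo.InvariantCulture),
            int.Parse(parts[3].Trim(), CultureInfo.InvariantCulture));
    }
}

public static class OcrBoxDemo
{
    public static void Main()
    {
        var box = OcrBox.Parse("288,296,1184,208");
        // prints: top-left=(288,296) bottom-right=(1472,504)
        Console.WriteLine($"top-left=({box.Left},{box.Top}) bottom-right=({box.Right},{box.Bottom})");
    }
}
```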
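For readers who want to experiment anyway, one possible way to drop the noise is to keep only the lines that look like a plate. The sketch below assumes a plate is three hyphen-separated groups of two letters or digits, like the sample XG-SF-67; that pattern, the PlateFilterDemo name, and the noise strings are all made up for the example, and real plates may need a different rule.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlateFilterDemo
{
    // Assumed plate shape: "XX-XX-XX", where each X is a letter or a digit.
    private static readonly Regex PlatePattern =
        new Regex(@"^[A-Z0-9]{2}-[A-Z0-9]{2}-[A-Z0-9]{2}$", RegexOptions.Compiled);

    // Normalizes each recognized line and keeps only those matching the pattern.
    public static IEnumerable<string> KeepPlates(IEnumerable<string> lines) =>
        lines.Select(l => l.Trim().ToUpperInvariant())
             .Where(l => PlatePattern.IsMatch(l));

    public static void Main()
    {
        // The second and third entries are invented noise, not real OCR output.
        var lines = new[] { "XG-SF-67", "NL", "some watermark text" };

        foreach (var plate in KeepPlates(lines))
            Console.WriteLine(plate); // prints: XG-SF-67
    }
}
```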

Stay sharp.

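    // Namespaces used by this sample. Polly is a NuGet package; HttpClientFactory is
    // assumed here to be the one shipped in the Microsoft.AspNet.WebApi.Client package.
    using System;
    using System.Linq;
    using System.Net.Http;
    using System.Text;
    using System.Text.Json;
    using Polly;

    static void Main(string[] args)
    {
        const string ocrServiceUrl = "https://<YOUR_SERVICE_NAME_HERE>.cognitiveservices.azure.com/vision/v3.1/ocr?language=zh-Hant&detectOrientation=true";
        const string imgUrl = "https://thumbs.dreamstime.com/z/netherlands-car-plate-vehicle-registration-number-127253077.jpg";

        // Every request carries the subscription key in this header.
        var http = HttpClientFactory.Create();
        http.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YOUR_KEY_HERE");

        // Retry once, 3 seconds later, when the call fails with an HttpRequestException.
        var policy = Policy.Handle<HttpRequestException>().WaitAndRetry(new TimeSpan[] {
            TimeSpan.FromSeconds(3)
        });

        policy.Execute(() => {
            SendProcedure(http, ocrServiceUrl, imgUrl);
        });
    }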

    private static void SendProcedure(HttpClient http, string ocrServiceUrl, string imgUrl)
    {
        var req = new HttpRequestMessage
        {
            Method = HttpMethod.Post,
            RequestUri = new Uri(ocrServiceUrl),
            Content = new StringContent(JsonSerializer.Serialize(new
            {
                Url = imgUrl
            }), Encoding.UTF8, "application/json")
        };

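        // GetAwaiter().GetResult() rethrows HttpRequestException as-is. Using .Result would wrap it
        // in an AggregateException, which the retry policy in Main does not handle.
        var resp = http.SendAsync(req).GetAwaiter().GetResult();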
        FindWords(resp);
    }
	
    private static void FindWords(HttpResponseMessage resp)
    {
        if (resp.IsSuccessStatusCode)
        {
            var ret = JsonSerializer.Deserialize<DetectedResult>(resp.Content.ReadAsStringAsync().Result);
            if (ret is not null)
            {
                foreach (var region in ret.Regions)
                {
                    foreach (var line in region.Lines)
                    {
                        var wordsInLine = new StringBuilder();
                        _ = line.Words.Where(x => !string.IsNullOrEmpty(x.Text))
                                        .Aggregate(wordsInLine, (result, item) => result.Append(item.Text));
                        Console.WriteLine($"{wordsInLine}");
                    }
                }
            }
        }
        else
        {
            Console.WriteLine($"{resp.ReasonPhrase}\n{resp.Content.ReadAsStringAsync().Result}");
        }
    }

Write the code and run it. We will see the vehicle number "XG-SF-67" shown on the console.

In this code, we use a number plate image from the internet. What if we already have a physical image file on local storage? Can we send it directly? Yes, sure. As the documentation shows, the OCR API also accepts application/octet-stream and multipart/form-data, so we can send binary image data through the API.

Can you do that? Think about it and give it a try. If you are still interested, please try the sample code on gist.
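If you want a starting point before opening the gist, here is one possible sketch of the application/octet-stream variant. It is not the gist's code. The LocalImageRequestDemo name, the plate.jpg default path, and the OCR_ENDPOINT and OCR_KEY environment variables are assumptions made for this example; the header name is the same one used in the sample above.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

public static class LocalImageRequestDemo
{
    // Builds a POST request whose body is the raw image bytes.
    public static HttpRequestMessage BuildRequest(string ocrServiceUrl, byte[] imageBytes)
    {
        var content = new ByteArrayContent(imageBytes);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

        return new HttpRequestMessage(HttpMethod.Post, ocrServiceUrl) { Content = content };
    }

    public static void Main(string[] args)
    {
        // Hypothetical inputs: image path from the command line, endpoint and key from the environment.
        var path = args.Length > 0 ? args[0] : "plate.jpg";
        var endpoint = Environment.GetEnvironmentVariable("OCR_ENDPOINT");
        var key = Environment.GetEnvironmentVariable("OCR_KEY");

        if (!File.Exists(path) || string.IsNullOrEmpty(endpoint) || string.IsNullOrEmpty(key))
        {
            Console.WriteLine("Pass an image path and set OCR_ENDPOINT and OCR_KEY to call the service.");
            return;
        }

        using var http = new HttpClient();
        using var req = BuildRequest(endpoint, File.ReadAllBytes(path));
        req.Headers.Add("Ocp-Apim-Subscription-Key", key);

        using var resp = http.SendAsync(req).GetAwaiter().GetResult();
        Console.WriteLine(resp.Content.ReadAsStringAsync().GetAwaiter().GetResult());
    }
}
```

The only real difference from the URL version is the request body: raw bytes with an application/octet-stream content type instead of a small JSON document.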

Polly

If you looked closely, you might have noticed a code snippet that uses Polly. What is that?

"Polly is a .NET resilience and transient-fault-handling library that allows developers to express policies."

Try it here. It's really useful for our API calls.

Custom Vision Service

If we are not satisfied with that result, we still have the chance to train our own models and build our own Computer Vision.

img "custom vision ai"

Here, we introduce the Custom Vision Service, which lets us train our own computer vision model to fit our purpose.

BUT remember, it's a paid service, so we will not demo it this time.

Otherwise, we might try OCR services from other providers, such as Cloud Vision API or Free OCR API, for the detection we are targeting.


Cover image from: https://unsplash.com/photos/5_o-FheeEi0


References:

Azure Cognitive Services

Optical Character Recognition (OCR)

Computer Vision API (v3.1)

Computer Vision API v3.1 - OCR

Building a Number Plate Identification Service in 5 Minutes with Microsoft’s Custom Vision Service
