Barely a day goes by without a blog post, article, or TV segment about the rise of computer vision. And let’s be honest, most of the time these pieces focus on the potential pitfalls of the technology.
How does it work?
What these articles fail to mention is that, deep down, computer vision relies on a few simple elements:
- Hardware: Basically, you need a really good camera (even better if it can record in 3D) and a capture device. That device can be a computer as we know it or, to use the new “in-fashion” term, an edge computing unit: a computer that does most of the processing locally before sending the information to the cloud for further analysis.
- Data: Computer vision does not work without a trained model. Training is when you feed lots of data to a computer so that it learns. It’s a lot like school: you feed it information and it retains it. Of course, if you feed it wrong information, it will learn that as well and can give you some interesting results.
- Algorithm: A fancy word for a math formula! Really, you are telling your code to do something based on what the hardware has captured and what the model has learned from the data.
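To make the "data" point concrete, here is a minimal sketch of a model that learns from labeled examples. It is a toy nearest-centroid classifier, not a real computer vision model, and the two features (eye openness and blink rate) and the labels `alert` and `tired` are made up for illustration. It shows that the same code, fed swapped labels, happily learns the wrong lesson.

```python
from statistics import mean

def train(samples):
    """Learn one average feature vector (centroid) per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }

def predict(model, features):
    """Return the label whose centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

# Hypothetical features: (eye openness, blink rate), both scaled to 0..1.
good_data = [
    ((0.9, 0.1), "alert"), ((0.8, 0.2), "alert"),
    ((0.3, 0.7), "tired"), ((0.2, 0.8), "tired"),
]
# Same measurements, but every label was swapped by mistake.
bad_data = [(features, "tired" if label == "alert" else "alert")
            for features, label in good_data]

sample = (0.85, 0.15)  # wide-open eyes, few blinks
print(predict(train(good_data), sample))  # alert
print(predict(train(bad_data), sample))   # tired
```

Nothing in the algorithm changed between the two runs. Only the data did, which is why the quality of the training data matters so much.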
What can the computer see?
When it comes to facial analysis and facial recognition, a computer looks at an image or a video (which is a series of images) to find patterns it knows and understands.
Most basic systems start by looking at the eyes because they are among the most recognizable landmarks on a face. Then they try to find other data points, such as the nose and the mouth.
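As a rough illustration of "finding a pattern it knows", here is a toy sketch that slides a small known pattern over an image and keeps the position where the pixels match best. Real systems use trained detectors rather than a fixed template, and the tiny "eye" pattern and frame below are invented values, so treat this as an assumption-laden simplification.

```python
def find_pattern(image, pattern):
    """Slide the pattern over the image; return (row, col) of the best match.

    Both arguments are lists of rows of grayscale values (0 = black, 255 = white).
    The score is the sum of absolute differences, so lower means more similar.
    """
    p_rows, p_cols = len(pattern), len(pattern[0])
    best_score, best_pos = None, None
    for r in range(len(image) - p_rows + 1):
        for c in range(len(image[0]) - p_cols + 1):
            score = sum(
                abs(image[r + i][c + j] - pattern[i][j])
                for i in range(p_rows)
                for j in range(p_cols)
            )
            if best_score is None or score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

# A toy 2x3 "eye" pattern: a dark pupil between two bright pixels.
eye = [
    [200, 20, 200],
    [200, 20, 200],
]

# A toy 4x6 frame with that pattern placed at row 1, column 2.
frame = [
    [120, 120, 120, 120, 120, 120],
    [120, 120, 200,  20, 200, 120],
    [120, 120, 200,  20, 200, 120],
    [120, 120, 120, 120, 120, 120],
]

# A video is just a list of frames, so the same search runs on each one.
video = [frame, frame]
print([find_pattern(f, eye) for f in video])  # [(1, 2), (1, 2)]
```

Once the eyes are located, the same kind of search can be narrowed to the area below them to look for the nose and the mouth.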
At this stage, what can be seen and analyzed depends very much on the quality of the hardware and on how good your code and analysis are.
At Okaya, we are often asked how computer vision can see fatigue. Well, just like someone can tell you “Hey, you look tired,” a computer can see it too.
We look at basic landmarks, and we also extend our analysis to other patterns.
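To give a feel for how landmarks can turn into a fatigue signal, here is a sketch of one publicly known heuristic, the eye aspect ratio: the height of the eye divided by its width, computed from six points around the eye. This is not a description of Okaya's own method. The landmark order, the sample coordinates, and the 0.2 threshold are all assumptions chosen for illustration.

```python
from math import dist

def eye_aspect_ratio(eye):
    """Compute the eye aspect ratio from six (x, y) landmarks.

    Assumed order: p1 and p4 are the eye corners, p2 and p3 lie on the
    upper lid, and p6 and p5 lie on the lower lid directly below them.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = dist(p2, p6) + dist(p3, p5)
    horizontal = dist(p1, p4)
    return vertical / (2.0 * horizontal)

def closed_eye_fraction(ratios, threshold=0.2):
    """Fraction of frames whose ratio falls below the (hypothetical) threshold."""
    return sum(r < threshold for r in ratios) / len(ratios)

# Made-up landmark coordinates for a wide-open eye and a drooping one.
open_eye = [(0, 0), (2, 2), (4, 2), (6, 0), (4, -2), (2, -2)]
droopy_eye = [(0, 0), (2, 0.5), (4, 0.5), (6, 0), (4, -0.5), (2, -0.5)]

print(round(eye_aspect_ratio(open_eye), 2))    # 0.67
print(round(eye_aspect_ratio(droopy_eye), 2))  # 0.17
```

Tracked over many video frames, a ratio that stays low for long stretches is one possible hint that the eyes are closing more than usual. A single frame tells you very little, which is part of why looking at other patterns matters.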
There is one other thing that we do which is really important to us: We do not rely on just basic data sets.
As we mentioned in the first part of the blog, algorithms rely on training data to learn how to operate. Without this training data, the results are not all that reliable.
But we know that this data is often biased or incomplete, so we complement base models with our own models and approach.
And of course, we do all this while being mindful of GDPR and HIPAA, and basically treating other people’s data the way we’d like ours to be treated.
If you want to have some fun, try our demo to get a small taste of how computer vision really works!