Number Recognition Tutorial

'''This tutorial requires decent math skills and a good understanding of programming concepts. You need to know about vectors, linear interpolation, and how to work with data. This is not recommended for beginners!'''

I created a number recognition system in Petit Computer for my Sudoku program, and although I promised I would release a stand-alone version of the recognition system, I never did. Here's the next best thing: a (hopefully) descriptive rundown of the system itself. Here we go!

Basic Idea
In a number recognition system, you want the computer to convert a hand-drawn number or symbol into something the computer can understand, like the literal number "4". The computer isn't able to do this unless it has some previous data to compare against. The computer can't magically know that a bunch of lines that somebody drew just happens to be a 4; it'll have to look at previously drawn numbers and say "Oh, this bunch of lines looks like another bunch of lines which you previously told me was a 4, so I think this new bunch of lines is also a 4". In a sense, this is a standard process of learning: using previous knowledge to identify similar patterns.

In the broadest sense, there are only three steps to the recognition system:
 * Save original drawings as data and pair each one with the number it represents. This is "training" the computer.
 * Compare a new drawing against the original drawings and assign a "closeness" score to each one.
 * The best-fit number is the one with the lowest score (i.e. the closest drawing).

The first and last points are pretty straightforward, but the second point is the complicated one (and the most important). Let's break down how to assign a closeness score:
 * Break each drawing down into a set number of line segments
 * Compare line segments between a saved drawing and the new drawing
 * Use vector math to determine the "difference" between each pair of line segments and add them all up. This total is the closeness score

What kind of vector math, you ask?
 * Calculate the vector difference between the two line segments (one from new drawing, one from old)
 * Calculate the length of this vector
 * Square this length

That's all there is to it! The core of the whole recognition system is just subtracting lines and adding up squared lengths. All the fluff around it is what makes it complicated; the main part is easy.
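To make the scoring concrete, here's a minimal Python sketch of the idea (not the original Petit Computer code). It assumes each line segment is treated as the vector from one point to the next, and that both drawings have already been reduced to the same number of points. The names `segment_vectors`, `closeness`, and `best_match` are made up for this example.

```python
def segment_vectors(points):
    """Turn a list of (x, y) points into the vectors between neighbouring points."""
    return [(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]


def closeness(new_points, saved_points):
    """Sum of the squared lengths of the differences between matching segments."""
    new_segs = segment_vectors(new_points)
    old_segs = segment_vectors(saved_points)
    if len(new_segs) != len(old_segs):
        raise ValueError("both drawings must have the same number of segments")
    total = 0.0
    for (nx, ny), (ox, oy) in zip(new_segs, old_segs):
        dx, dy = nx - ox, ny - oy      # vector difference of the two segments
        total += dx * dx + dy * dy     # squared length of that difference
    return total


def best_match(new_points, saved_drawings):
    """saved_drawings maps a number to its saved points; lowest score wins."""
    return min(saved_drawings,
               key=lambda number: closeness(new_points, saved_drawings[number]))
```

Because only the segment vectors are compared, a drawing that has simply been moved to another spot on the screen scores 0 against itself in this sketch.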

How To Save A Drawing As Data
First, we need to store a drawing as data. Since Petit Computer (like most other touchscreen systems) lets you query the X and Y position that is currently being touched, we can simply save a drawn number as the series of points which make up the number. This means we can't use an image; we'll have to take the data directly from the touchscreen while the user is drawing. For every frame the touchscreen is being touched, we store the X and Y position. If we took a whole second to draw a number, we'd have 60 points of data (at 60 frames per second). It's up to you to handle the minutiae; all we really need are the points which make up the number.

Note that this works the same way for the "original" drawings and for the (temporary) data of the new drawings we want to recognize. What you'll want to do is ask the user to draw the numbers 0-9 and save the points for each drawing along with the number it represents. It doesn't matter how you do this, as long as you end up with something like:
 * 1 - (X1,Y1) (X2,Y2) (X3,Y3) etc.
 * 2 - (X1,Y1) (X2,Y2) (X3,Y3) etc.
 * 3 - etc.
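As a rough illustration of that layout, here's a small Python sketch. The touchscreen is faked: `frames` is a hypothetical list of `(touching, x, y)` samples, one per frame, standing in for whatever your system gives you. `record_points`, `train`, and `saved_drawings` are names invented for this example.

```python
def record_points(frames):
    """Keep one (x, y) point for every frame in which the screen is touched."""
    points = []
    for touching, x, y in frames:
        if touching:
            points.append((x, y))
    return points


# Training data: each number paired with the points of its drawing.
saved_drawings = {}


def train(number, frames):
    """Store the points of a drawing under the number it represents."""
    saved_drawings[number] = record_points(frames)
```

A drawing you want to recognize would go through the same `record_points` step; it just isn't stored in `saved_drawings`.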

How To Turn Drawing Into N Line Segments
We could technically attempt to recognize numbers with only the points. We could ask "How close are the points in a new drawing to the points in all the old drawings?" and leave it at that. The problem is that this makes a terrible recognizer. What if the user drew the number in a different spot on the screen? What if they drew it at a different size? What if they drew it faster this time and it doesn't have the same number of points? There are a whole bunch of things which make comparing just the points a bad idea.

Instead, we're going to convert a sequence of points into a set number of line segments. I think for Sudoku I only use 5 line segments, so even if your number has 60 lines, I convert it to 5. It's important to realize that even though our drawing data is a sequence of points, we can use these points to determine the lines which connect them.

But how do we get from more lines to fewer? Let's start with an easy example: I want to convert 10 points down to 5. This should be as easy as taking every other point... cool. (I'm still working with points instead of lines because this is how the data is stored.) But what if we want to convert 10 points down to 9? We don't want to just throw away points, because that would deform our number. Instead, we're going to use linear interpolation to determine a new set of points that lie partway between the old ones on the "continuous" line which represents the drawing. Going back to our 9 points from 10 example, the first new point would still be the old first point; the second new point would be 1/9 of the way between the old second and third points; the third new point would be 2/9 of the way between the old third and fourth; and so on.

How does this work? We're basically going to view our series of lines as one continuous line again, then move along it a certain amount at a time and pick our new points. The amount we move along this super line is determined by how many points we want to end up with. If we want to end up with 5 points, we'll want to move through the super line 1/5 at a time. If the line is originally 10 points, that's a pace of 10/5 of a point. Oh hey, this is 2. If we move along this line of 10 points at a pace of 2 points at a time, it's like selecting every other point, and we still end up with 5 points. If we want to take 3 points from 10, we'd move 10/3 of a point at a time. This is where linear interpolation comes into play: the first step lands between the 3rd and 4th point (at 3.3333), so we use linear interpolation to get the point which is 0.3333 of the way from the 3rd point to the 4th.
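Here's a small Python sketch of this "move along the super line" idea, under a couple of assumptions of mine: positions along the line are counted from zero at the first point, and the pace is the original point count divided by the desired count. `lerp` and `resample_naive` are names invented for this example.

```python
def lerp(p, q, t):
    """Point that lies a fraction t (0 to 1) of the way from p to q."""
    return (p[0] + (q[0] - p[0]) * t, p[1] + (q[1] - p[1]) * t)


def resample_naive(points, count):
    """Pick `count` points by stepping len(points) / count of a point at a time."""
    pace = len(points) / count
    result = []
    for i in range(count):
        position = i * pace        # fractional position, counted from zero
        index = int(position)      # the whole point just before that position
        t = position - index       # how far along we are toward the next point
        if index + 1 < len(points):
            result.append(lerp(points[index], points[index + 1], t))
        else:
            result.append(points[index])   # no next point to interpolate toward
    return result
```

With 10 points and `count=5` the pace is 2, so this picks every other point. With `count=3` the pace is 10/3, and the second point picked sits at position 3.3333, a third of the way between two of the original points. Note that this version never reaches the original last point.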

HOWEVER! These are numbers we're talking about, and the starting and ending locations are REALLY important. As a result, we don't want to shift either the starting point or the ending point, which changes how we determine where to place new points. Remember that we already keep the same starting point, so we really only need to worry about the ending point staying the same. This is as easy as subtracting one from both the original number of points and the desired number of points. So instead of saying we're going from 10 to 9 points, we say we're going from 9 to 8 points and just tack on the original last point.
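In code, the "subtract one from both" trick might look something like this Python sketch. It's my own reading of the rule, not the original Sudoku code: the pace becomes (original points - 1) / (desired points - 1), we pick all but the last point by stepping at that pace, and then tack on the original last point. The name `resample` is made up for this example.

```python
def resample(points, count):
    """Resample to `count` points, keeping the first and last points in place."""
    if count < 2 or len(points) < 2:
        raise ValueError("need at least two points going in and coming out")
    pace = (len(points) - 1) / (count - 1)
    result = []
    for i in range(count - 1):
        position = i * pace        # fractional position, counted from zero
        index = int(position)      # the whole point just before that position
        t = position - index       # fraction of the way to the next point
        x1, y1 = points[index]
        x2, y2 = points[index + 1]
        result.append((x1 + (x2 - x1) * t, y1 + (y2 - y1) * t))
    result.append(points[-1])      # tack on the original last point
    return result
```

Going from 10 points to 9 gives a pace of 9/8, so the second new point lands 1/8 of the way between the old second and third points, and the last new point is exactly the old last point. Since 5 line segments need 6 points, the call for that case would be `resample(points, 6)`.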

Oops, I got tired and didn't finish the tutorial. I'll finish it soon!