Comment Location:
http://christalks624.blogspot.com/2010/09/reading-8-lightweight-multistroke.html
Summary:
The previous paper (and most of the previous papers, I believe) focused on recognizing the shape of a single stroke. This paper seeks to do sketch recognition for objects made of multiple strokes. The algorithm, named $N, is a little more than twice as long as $1 in lines of code and claims to be accurate; it's also a template matcher. $N claims to improve upon all of $1's weaknesses and achieved an impressive 96.7% recognition rate for a set of 15 templates.
$N seeks to improve upon $1 by "(1) recognizing gestures comprising multiple strokes, (2) automatically generalizing from one multistroke template to all possible multistrokes with alternative stroke orderings and directions, (3) recognizing 1D gestures such as lines, and (4) providing bounded rotation invariance."
The paper can be reviewed for details on exactly how those four goals are accomplished. $N still suffers from the shortcoming of any template-based recognizer: it is not good with shapes that are not among its templates. Still, it was able to recognize letters and some mathematical symbols.
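To make goal (2) a bit more concrete, here is a rough Python sketch of how one multistroke template could be expanded into every stroke ordering and direction. This is my own illustration, not the paper's pseudocode: the function name `unistroke_permutations`, the point format, and the choice to join the strokes into one flat point list are all assumptions.

```python
from itertools import permutations, product


def unistroke_permutations(strokes):
    """Yield one flat point list per stroke ordering and direction choice.

    `strokes` is a list of strokes; each stroke is a list of (x, y) tuples.
    Assumption: strokes are simply joined end to end in the chosen order.
    """
    for order in permutations(strokes):
        # One True/False flag per stroke: True means "draw it backwards".
        for flips in product((False, True), repeat=len(order)):
            joined = []
            for stroke, flip in zip(order, flips):
                joined.extend(reversed(stroke) if flip else stroke)
            yield joined


# A two-stroke "X": one diagonal each way.
x_strokes = [[(0, 0), (1, 1)], [(1, 0), (0, 1)]]
variants = list(unistroke_permutations(x_strokes))
print(len(variants))  # 2! orderings * 2^2 directions = 8
```

The count grows as n! * 2^n, so a template with many strokes expands into a large number of variants. That is a cost to keep in mind with this kind of exhaustive expansion.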
Discussion:
I've noticed this in other papers too: the authors tend to use small data sets, and the sketches are rather similar to their intended images (with some flaws, of course). It's good that the sketchers could draw well and that a small data set was used to prove the concept, but that does not demonstrate an algorithm's robustness or its ability to recognize (or clean up) ANY image the user draws (like words or an airplane, cat, or car, for instance). Overall, the results have been somewhat biased; that seems to be true for a LOT of papers, but it's still rather discouraging to see it so frequently in a single discipline.
That said, I LOVE how the authors included pseudocode for their algorithm at the end of the paper; that's very rare in any paper in any discipline of computer science.
On a side note, is it possible to include pressure exerted in the data of each point?
Point p = (x,y,time,pressure)
While implementing this is primarily a hardware (and some back-end software) issue, I believe the user would exert more pressure on more significant lines and curves; that could aid in deciphering the user's intention behind the sketch.
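As a rough illustration of the (x, y, time, pressure) point above, here is a small Python sketch. Everything in it is hypothetical: the `Point` class, the 0-to-1 pressure scale, and the idea of ranking strokes by mean pressure are my own assumptions, not something from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float
    time: float
    pressure: float = 1.0  # assumed scale: 0.0 (no contact) to 1.0 (maximum)


def mean_pressure(stroke):
    """Average pressure over the points of one non-empty stroke."""
    return sum(p.pressure for p in stroke) / len(stroke)


def rank_strokes_by_pressure(strokes):
    """Return strokes ordered from heaviest to lightest mean pressure."""
    return sorted(strokes, key=mean_pressure, reverse=True)


light = [Point(0, 0, 0.0, 0.25), Point(1, 1, 0.1, 0.75)]
heavy = [Point(0, 1, 0.2, 1.0), Point(1, 0, 0.3, 0.5)]
ranked = rank_strokes_by_pressure([light, heavy])
print([mean_pressure(s) for s in ranked])
```

Whether heavier strokes really are the more significant ones is exactly the open question; this only shows that carrying the extra field around is cheap once the hardware reports it.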
Actually, I think sketch recognition is an application of HCI. What the authors focus on is making users comfortable, rather than designing a robust algorithm. Rubine, $1, Protractor and $N are all examples.
The pressure is good for recognition. I think it will improve the recognition rate.
All tests are sort of biased, in my opinion. It is impossible to test on every kind of database. I think a standard database should be used for testing. However, there seem to be very few standard databases.