Saturday, October 16, 2010

Reading #14. Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams (Bhat)

Comment Location:
http://pacocomputer.blogspot.com/2010/11/reading-14-using-entropy-to-distinguish.html

Summary:
This is another paper that distinguishes between text and shapes--good.  The algorithm relies on one feature: entropy--even better.  The author used zero-order entropy (meaning symbols are independent of each other) and this simplifies the algorithm.  It creates an alphabet of angle differences between stroke points and uses that as the basis of entropy.  Text characters have significant more variance in the angles than shapes. 

The evaluation metric appears sound.  A total of 756 strokes were used and the algorithm attained a high accuracy for the strokes it identified.  The problem is 25% of the strokes were not classified as text or shape with high confidence.  The algorithm identified the extreme cases, but balked at the ambigious ones.  The algorithm was more accurate with text strokes than geometric strokes.

Discussion:
The algorithm uses one feature to classify between shape and text.  I like it.  This is even better than the last paper.  Assuming the accuracy is to be believed, the algorithm is of great use to our programming project.  An additional filter to handle the ambiguous strokes could prove very useful. 

I'm a little unclear how the alphabet angles are grouped; I imagine it's the angle between 2 different stroke points, but I'm not sure.  If anyone knows for sure, please share. 

The paper said it used the data from the COA domain.  Isn't that the data WE are using in our 2nd programming assignment?  If so, then we should definitely use an algorithm that already works for this data set.  If anybody has an existing implementation of this algorithm, please post it to the Google groups for the class.

Reading #13. Ink Features for Diagram Recognition (Plimmer)

Comment Location:
http://pacocomputer.blogspot.com/2010/11/reading-13-ink-features-for-diagram.html

Summary:
The author attempted to distinguish between text and shapes.  The author attempted a linear split of the data.  First, the data is protted onto a graph according to its "bounding box width"; then the best fit vertical line is placed and the strokes are classified depending on which side of the line they are located.  The initial results showed a large number of misclassifications, particularly in the shape department.  The text had a much lower misclassification rate.

I did not find a step-by-step algorithm on how the bounding box was calculated.  I noticed Figure 3 revealed some information about the bounding box.  It seems the bounding box is a summary of the features of a given stroke.  Inter-stroke gaps (distance between strokes) was the biggest key feature of the feature set; it is smaller from text-to-text than shape-to-shape.

Here are the significant features:

Interstroke gaps: Time till next stroke, Speed till next stroke, Distance from last stroke, Distance to next stroke
Size: Bounding box width, Perimeter to area, Amount of ink inside
Curvature: Total angle

Discussion:
This algorithm has immediate potential for the 2nd programming project.  I plan on doing one of the sets involving characters, so this algorithm warrants close examination.  I am dissappointed in the results, but there is potential.  The author managed to find text remarkably well.  If an additional filter is used to eliminate the misclassified shapes, then the results will improve significantly.

Monday, October 11, 2010

Reading #12. Constellation Models for Sketch Recognition. (Sharon)

Comment Location:
http://martysimpossibletorememberurl.blogspot.com/2010/10/reading-12-constellation-models.html

Summary:
The paper takes a new twist on recognition by using constellation shapes as basis for shape recognition.  Remarkably, it demonstrates some success when recognizing crude facial sketches.  It relies on the connectivity between the shapes in the facial sketch.  The recognition algorithm may be called a constellation model, but it's really a graph-type recognition algoirthm.  Strokes/individual shapes form a node, and the connection between the nodes are used as part of the recognition algorithm.  I won't go into the specifics of the algorithm.

For evaluation, the author used 5 classes of objects with 7–15 labels each.  They have been tested on drawings having 3–200 strokes. The author used 20-60 training examples for each class.  The algorithm's major weakness was its computation time.  It took over an hour to recognize a face. 

Discussion:
This paper takes an...intersting...approach.  Truthfully, I would never have considered using constellation shapes as part of ANY recognition system; the thought never occurred to me.  I wonder if this will be of any help in the 2nd homework assingment...If anyone has any ideas regarding this, please share.  The author demonstrated (some) success in recognizing complex shapes.

I must admit, the numbers for evaluation (3-200 strokes in a drawing) were impressive.  The computation time is a problem, but the algorithm recognized complex shapes such as sailboats and faces.

Reading #11. LADDER, a sketching language for user interface developers. (Hammond)

Comment Location:
http://martysimpossibletorememberurl.blogspot.com/2010/10/reading-11-ladder.html

Summary:
LADDER is a subset of Paleo, if memory serves.  The paper says, "LADDER allows interface designers to describe how shapes in a domain are drawn, displayed, and edited".  Translation: LADDER gets some basic information about the stroke the programmer then plays with (like Paleo).  LADDER deals with primitive shapes only, so it can't recognize a drawing of a cat; the programmer would need to use the line information collected by LADDER to determine the sketch is a cat. 

LADDER defines a shape "in terms of previously defined shapes and constraints between them".  The pre-defined primitive shapes are everything we saw in the first homework except the multiple line types (like Polyline).  A sizable portion of the paper is devoted towards explaining the speicifics of the shape recognition, such as the definition of variables, variable values, procedure of recognition, etc.

Discussion:
Since we've already used LADDER (and Paleo) to do the 1st homework assignment and (I think) it went well, I've got no complaints.  It organizes strokes with enough diverse information so the programmer can do further recognition on the shape.  LADDER claims to have some more complex shapes such as rectangle and diamond.  If there was a triangle shape, I definitely would have liked to have seen it for the first homework assignment.