Saturday, October 16, 2010

Reading #14. Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams (Bhat)

Comment Location:
http://pacocomputer.blogspot.com/2010/11/reading-14-using-entropy-to-distinguish.html

Summary:
This is another paper that distinguishes between text and shapes--good.  The algorithm relies on a single feature: entropy--even better.  The author uses zero-order entropy (symbols are treated as independent of each other), which keeps the algorithm simple.  It builds an alphabet of angle differences between stroke points and uses the symbol frequencies as the basis for the entropy.  Text strokes show significantly more variance in those angles than shapes do.
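
The core idea is simple enough to sketch in code.  Below is a minimal Python version of how I read the feature, not the authors' implementation: quantize the angle at each stroke point into a small alphabet of symbols, then take the Shannon (zero-order) entropy of the symbol frequencies.  The exact angle definition and the bin count are my guesses.

```python
import math
from collections import Counter

def angle_at(p_prev, p, p_next):
    """Turn angle (degrees) at point p between the segments p_prev->p and
    p->p_next. This is my reading of the 'angle between stroke points'
    feature, not necessarily the paper's exact definition."""
    a1 = math.atan2(p[1] - p_prev[1], p[0] - p_prev[0])
    a2 = math.atan2(p_next[1] - p[1], p_next[0] - p[0])
    diff = abs(math.degrees(a2 - a1)) % 360
    return min(diff, 360 - diff)        # fold into [0, 180]

def to_symbols(points, bins=7):
    """Quantize each interior angle into one of `bins` alphabet symbols.
    The bin count is a placeholder, not the paper's published alphabet size."""
    width = 180.0 / bins
    symbols = []
    for i in range(1, len(points) - 1):
        a = angle_at(points[i - 1], points[i], points[i + 1])
        symbols.append(min(int(a // width), bins - 1))
    return symbols

def zero_order_entropy(symbols):
    """Shannon entropy H = -sum(p * log2 p), treating symbols as independent
    (zero-order: no context from neighboring symbols)."""
    if not symbols:
        return 0.0
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A text stroke should scatter its angles across many bins and score a higher entropy than a straight line or a smooth arc, which is the intuition behind the single-feature classifier.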

The evaluation metric appears sound.  A total of 756 strokes were used, and the algorithm attained high accuracy on the strokes it did label.  The problem is that 25% of the strokes were not classified as text or shape with high confidence.  The algorithm handled the extreme cases but balked at the ambiguous ones, and it was more accurate on text strokes than on geometric strokes.
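
My reading of the 25% figure is that the classifier uses two entropy thresholds: very low entropy is called shape, very high is called text, and the band in between is left unlabeled.  Here is a hedged sketch of that decision rule, with cutoff values I made up purely for illustration:

```python
def classify(entropy, shape_max=1.0, text_min=2.0):
    """Two-threshold rule: extreme entropy values get a label, the band in
    between is left unclassified. These cutoffs are placeholders for
    illustration, not the thresholds reported in the paper."""
    if entropy <= shape_max:
        return "shape"
    if entropy >= text_min:
        return "text"
    return "unclassified"
```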

Discussion:
The algorithm uses a single feature to classify strokes as shape or text.  I like it.  This is even better than the last paper.  Assuming the accuracy is to be believed, the algorithm would be of great use to our programming project.  An additional filter to handle the ambiguous strokes could prove very useful.

I'm a little unclear on how the alphabet angles are formed; I imagine it's the angle between two different stroke points, but I'm not sure.  If anyone knows for sure, please share.

The paper said it used data from the COA domain.  Isn't that the data WE are using in our 2nd programming assignment?  If so, then we should definitely use an algorithm that already works for this data set.  If anybody has an existing implementation of this algorithm, please post it to the Google Group for the class.

2 comments:

  1. I guess the author uses the time interval: if the time interval between two strokes is more than 100 ms, then they are divided into two groups. Then, for each group of strokes, the information entropy is calculated. (A sketch of this grouping step appears after the comments.)

    I guess maybe someone has the code, because the work was done in the Sketch Recognition Lab.

  2. I agree with JianJie that the strokes are grouped on a time basis; furthermore, a longer gap between strokes is allowed if the strokes overlap, since they are more likely part of the same shape. However, I think the question here was about the grouping of alphabet symbols, which is a later process. In that step each stroke is resampled, and the result is a list of ordered dots that you can join with straight lines, as in a "connect the dots" puzzle. Every consecutive pair of these lines, joined at a dot, forms an angle between them, and this is the angle used for symbol classification.

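Putting the two comments together, here is a hedged sketch of the time-based grouping step they describe; the 100 ms gap comes from the first comment, and the stroke data layout is my own assumption. The angle/alphabet step the second comment explains is already sketched in the summary above.

```python
def group_strokes(strokes, max_gap_ms=100):
    """Group time-ordered strokes whenever the pause between one stroke's end
    and the next stroke's start is at most `max_gap_ms`.

    Each stroke is assumed to be a dict with 'start' and 'end' timestamps in
    milliseconds plus a 'points' list; that layout is my assumption. The
    overlap exception mentioned in the second comment is omitted here."""
    groups, current, last_end = [], [], None
    for stroke in sorted(strokes, key=lambda s: s["start"]):
        if last_end is not None and stroke["start"] - last_end > max_gap_ms:
            groups.append(current)
            current = []
        current.append(stroke)
        last_end = stroke["end"]
    if current:
        groups.append(current)
    return groups
```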