Week Eight: I'll Let the Machine Do the Learning
After seeing that the original morphological text detection algorithm was not so robust to my fuzzy, lo-res natural scene images, I decided to spend a bit of time researching possible improvements. Recognizing text against a background is a pretty classic example of a machine needing intelligence (though some odd few may argue that vision is not an intelligence-based task), so I wanted to come up with more features from my data set that could help my programs detect and isolate a pattern. I spent a few hours looking up signal processing and machine-learning techniques - do wavelet transforms, Gabor filters, or Markov random fields ring a bell to anyone? - and quickly realized that nothing so sophisticated could be implemented in the next two weeks with enough time left for adequate testing. (It was really fun to read about, though! I was particularly fascinated by this paper; though it approaches the subject more from the angle of human perception of randomness, I thought it might be interesting to try to model text and background textures using Markov random fields. A project for the future - if I ever have the free time. Ha!)
I decided instead to use the low-tech information already within my test images - the distribution of colors and gray values of what my code deems "foreground" and "background", the contrast along edges in the image, the absolute and relative areas of the connected components in a region, and so on. I wrote a bunch of code to pull out 17 features of text candidate regions and 18 features of the connected (by pixel color) components within them, and since all that information is too much for my human brain to make sense of, I plopped my new data corpora into Weka, everyone's favorite data mining system. I am so glad I used Weka in a past research mini-project; it's been great to accumulate knowledge of what research tools are out there so I don't have to keep reinventing the wheel. In this case, I didn't want to have to reinvent clustering algorithms or attribute selection - they're complex algorithms already and probably a beast to debug! So now I have a round in my preprocessing suite in which Weka chews on my image metadata to give me a better idea of what might be text and what might be background. It's not a perfect science - Weka deems a different subset of my feature vector statistically significant for each image in my test set, so now I don't feel so crazy that, without Weka, I couldn't find a single feature that divided text from background properly in every case! So the next bit of time in my internship is devoted to making sense of Weka's judgment, so that my code can decide on each test run which of the data clusters is important. And after that, finally, the paper! *pant, wheeze*
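As a rough illustration of the feature-to-Weka step (a sketch under assumptions, not the code from this project), the Python below labels connected components in a thresholded region, computes four simple features per component, and writes them out in Weka's ARFF text format. The function names, the choice of 4-connectivity on a binary mask (rather than connectivity by pixel color), and the four features shown are all hypothetical stand-ins for the 18 component features mentioned above.

```python
from collections import deque


def connected_components(mask):
    """Return a list of components, each a list of (row, col) foreground pixels.

    Assumption: 4-connectivity on a binary mask (a simplification of
    "connected by pixel color").
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def component_features(gray, pixels):
    """Compute a few illustrative features for one component of a gray region."""
    region_area = len(gray) * len(gray[0])
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    values = [gray[y][x] for y, x in pixels]
    return {
        "area": len(pixels),                         # absolute area in pixels
        "relative_area": len(pixels) / region_area,  # share of the region
        "aspect_ratio": width / height,              # bounding-box shape
        "mean_gray": sum(values) / len(values),      # average gray value
    }


def to_arff(relation, rows):
    """Serialize a non-empty list of feature dicts as ARFF text (numeric attributes only)."""
    names = list(rows[0])
    lines = [f"@RELATION {relation}", ""]
    lines += [f"@ATTRIBUTE {name} NUMERIC" for name in names]
    lines += ["", "@DATA"]
    lines += [",".join(str(row[name]) for name in names) for row in rows]
    return "\n".join(lines) + "\n"
```

To try it, threshold a small gray-value grid into a mask (for example `mask = [[v > 128 for v in row] for row in gray]`), map `component_features` over the result of `connected_components(mask)`, and save the string from `to_arff` to a `.arff` file that Weka can open for clustering or attribute selection.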
The baking is still going strong - I made a chocolate souffle cake from my Nigella book and it turned out delicious! I was worried about it; in my absent-mindedness I left it cooking for 3/4 of the total time at 50 degrees too high, but thankfully I did catch it and the water bath the pan was in probably kept the texture light and moist. It did not turn out so pretty because of the aforementioned error - the top may have hardened a bit more than it should, and it cracked a bit - but with a bit of powdered sugar on top it looked and tasted like a restaurant-quality chocolate cake, if I do say so myself. Thanks, Nigella!
So You Think You Can Dance is also going strong - Twitch is my favorite person on the show, but I think Katee is the best dancer [at least, the best left! sigh] on the show, in terms of technique. I also like Chelsea a lot - yay, two of my top three picks are girls! Usually it is harder to go "wow" about the girls, since the guys they get on the show are so athletic and they throw a lot more stunts into their solos. The girls this season though are so technically strong, and they communicate really well through their art. Let's hope my picks make it to the top four!