Week Nine
Week Nine: The Code is Done!
I worked Weka's clustering techniques into my text candidate analysis this week, and after a bit of toying around I'd say I'm done with my code - well, done for the scope of the internship. Margaret and I have been discussing trying to publish some of this work, and I'm a bit conflicted about that. I've really liked my project and want to see the entire application take workable shape, not just the image processing stuff I've been focusing on here, but I'm still grappling with how to fit in everything I already have planned for this coming semester. I am applying to grad school this year, which is daunting in itself, and I already know of one paper my lab at Tech wants to submit (to NSDI! due in October! WE HAVE BARELY STARTED OH MY GOD) which will keep me completely busy. And to think I was planning on starting that work-life balance thing soon...
Anyway, the code! The code is good, kind of. When I look at my test images after processing them, the region-based clustering partitions text and non-text regions pretty well. The sad part is that to the computer they're just "cluster1" and "cluster2"; picking which cluster contains the text takes us right back to the original problem of not knowing what "text" looks like mathematically. I'm now seeing the work I did for this internship as a good starting point for future work (should I ever be less busy and able to pick it up again) - I've done a lot of background research and know the basics of the field I've gotten myself into, but the next step is to try the more sophisticated stuff, like applying some transform or training a machine learner on my corpus to tell us more about the images in it. There's much progress to be made, though my little foray has been enlightening, at least for me.
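To make the "cluster1"/"cluster2" problem concrete, here is a minimal sketch in plain Python. It is not my actual Weka pipeline: the `two_means` helper, the `edge_density` feature, and the numbers in it are all made up for illustration. It just shows that a two-cluster partition can come out identical while the cluster names flip, depending only on how the clusterer is initialized.

```python
def two_means(values, init, iterations=20):
    """Tiny 1-D k-means with k=2; returns a cluster index (0 or 1) per value."""
    centroids = list(init)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearest centroid.
        labels = [
            0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            for v in values
        ]
        # Move each centroid to the mean of its members (leave it if empty).
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return labels


# Hypothetical per-region feature (say, edge density); not data from the project.
edge_density = [0.82, 0.79, 0.11, 0.15, 0.88, 0.09]

# Same data, same algorithm, only the starting centroids are swapped.
labels_a = two_means(edge_density, init=(0.0, 1.0))
labels_b = two_means(edge_density, init=(1.0, 0.0))

# The grouping is identical, but every region's cluster name is flipped.
same_partition = all(a != b for a, b in zip(labels_a, labels_b))
print(labels_a, labels_b, same_partition)
```

Nothing in either label list says which group is the text. Deciding that needs some outside knowledge, such as a few hand-labeled regions or a rule about what text features look like, which is exactly the part I don't have yet.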
So I've started writing things up - namely all that background reading I did - and see a busy week of paper writing coming up. Thankfully I have had a few diversions, the most interesting of which was my boyfriend's graduation. I've actually been back in Atlanta for two weeks now doing my DMP work remotely (which has been working very well! I'm still productive here, and Margaret and I work well by email and Skype), so I was glad to be able to see my guy walk across the stage looking like all the other guys getting their not-diplomas from the interim President. ;) Lots of his family came, too, and it's always nice to see them. I'm glad he is now free - I only wish I were done with my work too! And so starts the grind to finish this paper. :)