Optimizing an image cropping algorithm

In my impending migration from aspimage.dll to a more efficient (crash free) image component I have discovered a grave shortfall of imageglue for asp classic.  There is one particular operation my code must perform that must be used.  That simple operation is GetPixel(x,y).  This has a counterpart in imageglue known as GetColor(“x,y”), both return the color at the current location.  By my estimate aspimage performs this operation nearly 50-100x faster than imageglue does.  This is huge when you have nested loops that need to read each pixel of the image.  In my testing today I plugged my nearly complete wrapper class into my existing code (see this post for more information).  The code was processing, but eventually timed out.  After some examination I discovered what was happening.  Imageglue was performing so much slower the code actually caused the page to time out :(.  So much for a quick and easy transition.  Immediately I realized what I was facing, this could be difficult to make work. 

First I should explain what this actually does. As you would assume this is generating images, specifically text on images.  Neither aspimage nor Imageglue have precise enough text measurement features to allow you to measure down to the pixel the text height.  This presents a problem if you intend to crop the picture directly to the edge of the text.  Enter the problem: In aspimage the previous coder (let’s not even go there) used a very inefficient algorithm to locate the edges of the text.  I put the image below as an example.  What is the most simple was to find the edges of this text on the image?  Read each pixel, starting from each of the 4 corners (separate loops) and when you meet up with a pixel that is a different color you have found your edge.  Once you have traversed all 4 edges you now have your crop area.  So, in my testing an image like this has over 50,000 pixel reads.  Ouch…

sometext.jpg

Amazingly, that approached worked nearly instantaneous with ASPimage.  I was really surprised by this since it is a lot of memory access to accomplish this quickly.  Imageglue on the other hand as I discovered takes over 50x longer to complete the same operation.  Luckily I have semi solution to the problem.   Obviously the algorithm needs some work.  Instead of sampling every pixel why not break the samples up into steps?  That is what I tried and it worked, but still really slow.  My goal was to get an image generated very quickly < 500ms if possible.  I was at about 2 seconds with this approach.  Below is an illustration of the second algorithm.  What you see is the sampling of pixels.  When it finds an edge, it stops and goes to the next side to sample.

sometext2.jpg

Shown above is one of 4 loops to find the edge, this particular one demonstrates locating the top X coordinate.  Now what is noteworthy by looking at the image is the location which the stepping found the edge.  If we were to crop the image at the inner most part it would cut off the S a bit.  In fact the range of this occuring is simply the same as the stepping value.  So if you step 10 pixels at a time you can “overshoot” your text as much 10 pixels.  So we have a trade off, the 1st algorithm finds the edge perfectly down to the nearest pixel, but took 50,000 reads.  This one looks to do alot better (about 1,500-2,000 reads) but now we have the issue of overshooting our target.  I will confess on one of my sites (using aspimage) in production I simply accepted this was the best the algorithm would do.  I tweaked the stepping and simply cropped the dimensions that it returned plusthe stepping value.  This insured a nicely cropped image with no cut off points, but it could be miss-cropped as much as my stepping value was set at (10px).  Now fast forward to this algorithm on imageglue.  We are still fighting slow pixel reads, about 1.5-2 seconds to generate this image.

My current solution is to take algorithm 2 and tweak further, we simply need less pixel reads, but more accuracy.  My idea was to essentially do a quick stepping with a large stepping value such as 20-25 pixels to locate the edge of the text quickly, then backup one column and do a fine pass (about 5px steps).  Below is an illustration of what this looks like.


sometext3.jpg

It may not be very clear on this example, but image a larger image where more scanning needs to be done, this is much faster.  It is about 2x as fast in-fact just from my initial test.  So I cut we now cut our pixel samples in roughly half.  This is where I am at right now with this issue.  I am now in the usable zone for use with image glue, but I’m frustrated by not getting a snappy image generated.  

I can hear my algorithm analysis professor now from college giving me a lecture on this issue.  Amazing, you never think you will need to know that stuff, but here I am :).  My question goes out to anyone reading this, what if any is a more efficient approach to this problem?

Tagged: , , ,