## Straight Outta Hough: Line Detection with Hough Transform

Although machine learning is great for shape classification, for shape recognition, we must still use the old methods. Methods such as Hough Transform, and RANSAC.

In this post, we’ll look into using Hough Transform for recognizing straight lines. The following is taken from E. R. Davies’ book, Computer Vision: Principals, Algorithms, Application, Learning and Image Digital Image Processing by Gonzalez and Woods.

Straight edges are amongst the most common features of the modern world, arising in perhaps the majority of manufactured objects and components – not least in the very buildings in which we live. Yet, it is arguable whether true straight lines ever arise in the natural state: possibly the only example of their appearance in virgin outdoor scenes is in the horizon – although even this is clearly seen from space as a circular boundary! The surface of water is essentially planar, although it is important to realize that this is a deduction: the fact remains that straight lines seldom appear in completely natural scenes. Be all this as it may, it is clearly vital both in city pictures and in the factory to have effective means of detecting straight edges. This chapter studies available methods for locating these important features.

Historically, HT has been the main means of detecting straight edges, and since the method was originally invented by Hough in 1962, it has been developed and refined for this purpose. We’re going to concentrate on it on this blog post, and this also prepares you to use HT to detect circles, ellipses, corners, etc, which we’ll talk about in the not-too-distant future. We start by examining the original Hough scheme, even thoupgh it is now seen to be wasteful in computation since it has evolved.

First, let us introduce Hough Transform. Often, we have to work in unstructured environments in which all we have is an edge map and no knowledge about where objects of interest might be. In such situations, all pixels are candidates for linking, and thus have to be accepted or eliminated based on predefined global properties. In this section, we develop an approach based on whether set of pixels lie on curves of a specified shape. Once detected, these curves form the edge or region boundaries of interest.

Given $n$ points in the image, suppose that we want to find subsets of these points that lie on straight lines. One possible soltion is to fine all lines determined by every pair of points, then find all subsets of points that are close to particular lines. This approach involves finding $n(n-1)/2 \sim n^2$ lines, then performing $(n)(n(n-1))/2 \sim n^3$ comparisons for every points to all lines. As you might have guessed, this is extremely computationally expensive task. Imagine this, we check every pixel for neighboring pixels and compare their distance to see if they form a straight line. Impossible!

Hough, as we said, in 1962, proposed an alternative approach to this scanline method. Commonly referred to as the Hough transform. Let $(x_i, y_i)$ denote p point in the xy-plane and consider the general equation of a straight line in slope-intercept form: $y_i = ax_i + b$. Infinitely many lines pass through $(x_i, y_i)$ but they all satisfy the equation we saw, for varying values of $a$ and $b$. However, writing this equation as $b = -x_i a+y_i$ and considering the ab-plane – also called parameter space, yields the equation of a single line in parameter space associated with it, which intersects the line associated with $(x_i, y_i)$ at some point $(a\prime, b\prime)$ in parameter space, where $a\prime$ is the slope and $b\prime$ is the intercept of the line containing the both $(x_i, y_i)$s in the xy-plane, and of course, we are assuming that lines are not parallel, in fact, all points on this line have lines in parameter space that intersect at $(a\prime, b\prime)$. Here, this figure illustrates s the concepts:

In principle, the parameter space lines corresponding to all points $(x_k, y_k)$ in the xy-plane could be plotted, and the principal (goddammit, principle, principal, fuck this language!) lines in that plane could be found by identifying points in parameter space where large numbers of parameter-space lines intersect. However, a difficulty with this approach is that $a$, approaches infinity as the lines approaches vertical direction. One way around this difficulty is to use the normal representation of a line:

$x \cos(\theta) + y sin(\theta) = \rho$

Figure on the right below demonstrates the geometrical interpretation of parameters $\rho$ and $\theta$. A horizontal line has $\theta = 0^\circ$, with $\rho$ being equal to the positive x-intercept. Similarly, a vertical line has $\theta = 90^\circ$, with $\rho$ being equal to positive y-intercept. Each sinusoidal curve in the middle of the figure below represents the family of lines that pass through a particular point $(x_k. y_k)$ in xy-plane.

Let’s talk about the properties of Hough transform. Figure below illustrates the Hough transform based on the equation above.

On the top, you see an image of size $M\times M \bigvee M=101$ with five labeled white points, and below it shows each of these points mapped into the parameter space, $\rho\theta$-plane using subdivisions of one unit for the $\rho$ and $\theta$ axes. The range of $\theta$ values is $\pm 90^\circ$ and the range of $\rho$ values is $\pm \sqrt{2} M$ As the bottom image shows, each curve has a different sinusoidal shape. The horizontal line resulting from the mapping of point 1 is a sinusoid of zero amplitude.

The points labeled A and B in the image on the bottom illustrate the colinearity detection property of the Hough transform. For exampele, point B marks the intersection of the curves corresponding to points 2, 3, and 4 in the xy image plane. The location of point A indicates that these thre points line on a straight line passing through the origin $(\rho = 1)$ and oriented at $-45^\circ$. Similarly, the curves intersecting at point B in parameter space indicate that 2, 3, and 4 line on a straight like oriented at $45^\circ$, and whose distance from origin is $\rho = 71$. Finally, Q, R, and S illustrate the fact that Hough transform exhibits a reflective adjacency relationship at the right and left edges of the parameter space.

Now that we know the basics of HT and line detection using HT, let’s take a look at Longitudinal Line Localization.

The previous method is insensitive to where along the infinite idealized line an observed segment appear. He reason for this is that we only have two parameters, $\rho$ and $\theta$.There is some advantage to be gained in this, in parital occlusion of line does not prevent its detection: indeed, if several segments of a line are visible, they can all contribute to the peak in parameter space, hence improving senitivity. On the other hand, for full image interpretation, it is useful to have information about the longitudinal placement of line segments.

Ths is achieved by a further stage of processing. The additional stage involves finding which points contributed to each peak in the main parameter space, and carrying out connectivity analysis in each case. Some call this process xy-gruping. It is not vital that the line segments should be 4-connected (meaning, a neighborhood with only the vertical and horizontal neighbors) or 8-connected (with diagonal neighbors) – just that there should be sufficient points on them so that adjacent points are withing a threshold distance apart, i.e. groups of points arem erged if they are withing prespecified distance. Finally, segments shorter than a certain minimum length can be ignored too insignificant to help with image interpretation.

The alternative method for saving computation time is the Foot-of-Normal method. Created by the author of book I’m quoting from, it eliminates the use of trigonometric functions such as arctan by employing a different parametrization scheme. Both the methods we’ve described employ abstract parameter spaces in which poitns bear no immediately obvious relation to image space. N the alternative scheme, the parameter spaces in which points bear no immediately obvious visual relation to image space. In this alternative scheme, the parameter space is a second image space, whifcfh is congruent to image space.

This type of parameter space is obtained in the following way. First, each edge fragment in the image is produced much as required previously so that $\rho%” can be measured, but this time the foot of the normal from the origin is taken as a voting position in the parameter space. Taking %(x_0, y_0) as the foot of the normal from the origin to the relevant line, it is found that: $b/a = y_0/x_0$ $(x-x_0)x_0 + (y-y_o)y_0$ Thes etwo equations are sufficient to compute the two coordinates,$(x_0, y_0)$. Solving for$x_0$and$y_0\$ gives:

$x_0 = va$

$y_0 = vb$

Where:

$\frac{ax + by}{a^2 + b^2}$

Well, we’re done for now! It’s time to take a shower, then study regression, as I’m done with classification. I’m going to write a post about regression, stay tuned!

## A Short Stint with Principal Component Analysis

I get like, ~100 visitors daily, and I know it’s a lot for a person who’s just started, and I’m not even blogging for visitors, because this blog is basically my study notes, but I think, if you look hard enough, if you slant your eyes, you may find something of value in it. So why not introduce it to your friends? Buddies? Comrades? Homies? Compadres? Dusts? Rafighs? Ais? Do it!

I wanna be honest with you. One thing that I never understood is Principal Component Analysis. It’s very hard. At least to me. Because my college doesn’t offer any courses in the way of statistics (yes, I’m an Associates student!) I thought to myself, “Shieeet, I’m gonna read every resource there is. From Youtube, which I kinda feel like is the worst resource for learning – At least, sometimes – to a book about Computer Vision!” and ta-dah! Here we are. A book about Computer Vision is about to clear everything PCA. Let us commence learning! Put your learning hat and goggles on, and wear your diaphragm, because we’re gonna get screwed!

The following is based on Computer Vision, Principles, Algorithms, Applications, Learning by E. R. Davies.

One powerful way of approaching the task of data representation is Principal Component Analysis, or PCA. This involves finding the mean of a cluster of points in feature space, and then finding the principal axes of the cluster in the following way: First, an axis is found which passes through the mean position and which gives the maximum variance when the data are projected onto it. Then, a second such axis is found which maximizes variance in a directional normal to the first. This process is carried out until a total of N principal axes have been found for N-dimensional feature space. The process is illustrated in this figure:

In fact, the process is entirely mathematical and need not be undertaken in the strict sequence indicated above, it merely involves finding a set of orthogonal axes which diagonalize the covariance matrix. A matrix is diagonalizable if it has distinct eigenvalues in , the eigenspace. Eigenvalues are defined as (I know what eigenvalues are, this is just for people who are reading this post in the future and don’t wanna google):

Where is the matrix, in this case, the covariance matrix, is the eigenvector, and is the eigenvalue.

The covariance matrix for the input population is defined as:

where is the location of the th data point and is the mean of data points, and indicates expectation value for the underlying population. We can estimate from the equations:

Where:

As is real and symmetric, it is possible to diagonalize it using a suitable orthogonal transformation matrix , obtaining a set of orthonormal eigenvectors with eigenvalues given by:

The vectors are derived from the original vectors by:

And the inverse transformation needed to recover the original data vectors is:

Here, we have recalled that for an orthogonal matrix

In fact, it may be shown that is the matrix whose rows are formed from the eigenvectors of , and that the diagonalized covariance matrix is given by:

So that:

In what follows, we shall assume that eigenvalues have been placed in an ordered sequence, starting with the largest. In that case, represents the most characteristic of the set data points, with later eigenvalues representing successively less significant characteristics. We could even go so far as to say that, in some sense, represents the most interesting characteristic of the data, whereas would be devoid of interest. More practically, if we ignored we might not lose much information. And indeed the last few eigenvalues would frequently represent characteristics which are not statistically significant and are essentially noise. For these reasons, PCA is commonly used for reduction in dimensionality of the feature space to some lower value .

Well, this was a short blog post. I have understood the concept behind PCA, I just haven’t understood how I can calculate it. No worries! There’s always NumPy… Unless I’m in C++, then I’m screwed!

## Pseudocolor Image Processing in a Nutshell

First, let’s take care of the pleasantries. I’m ok. Are you ok? That’s fine. How’s the missus? Oh, she is thinking about suicide? No worries, just chock a handful of Zoloft at her! How’s the kid? You don’t have a kid? Why? Because you take Risperidone? No comments!

I haven’t talked about DIP, for like, ever. I talked about DSP in one of my opi, but that was mostly about DSP rather than DIP. So it’s time dive into Gonzalez and Woods 4th edition, chapter 6, and learn about Pseudocolor Image Processing – Something I always wanted to learn about. Let’s hope this is going to be a short post because I wanna watch muh Veronica Mars1.

Pseudocolor image processing consists of assigning colors to gray values based on a specified criterion. The term pseudo or false color is used to differentiate the process of assigning colors to achromatic images from the process associated with true color images, a topic which I’m going to learn, and “teach” in the near future.

The principal use of pseudocolor is for human visualization and interpretation of grayscale events in a single image, or a sequence of images. But why do we want to convert a grayscale image to color? Are we all affected by the disease aptly called Ted Turneritus? No, we do this because humans can discern thousands of color shades, as compared to less than two dozen shades of gray.

The techniques of intensity slicing, and color coding are the simplest and earliest examples of pseudocolor processing of digital images. If an image is interpreted as a 3D function, like this:

the method can be viewed as one of placing planes parallel to the coordinate plane of the image; each plane then “slices” the function in the area of intersection. In the image below you see an example of using a plate at to slice the image intensity function into two levels.

If a different color is assigned to each side of the plane, any pixel whose intensity level is above the plane will be coded with one color, and any pixel below the plane will be coded with another. Levels that lie on the plane itself may be arbitrarily assigned one of the two colors, or they could be given a third color to highlight all the pixels at that level. The result is between two to three color images whose relative appearance can be controlled by moving the slice plane up and down in the intensity axis.

A simple but practical use of intensity slicing is shown below, with the grayscale image on the left and the pseudocolor image on the right:

This is an image of the Picker Thyroid Phantom, a radiation test pattern. Regions that appear of constant intensity in the grayscale image are actually quite variable, as shown by the various colors in the intensity image. For instance, the left lobe is a dull gray in the grayscale image, but has various colors of red, yellow and green in the intensity image. Below you see the representation of one of the grayscale intervals and its slicing:

In the preceding simple example the grayscale was divided into intervals, and a different color was assigned to each, with no regard for the meaning of the gray levels in the image. Interest in that case was simply to view the different gray levels constituting the image. Intensity slicing assumes a much more meaningful and useful role when subdivision of the grayscale is based on the physical characteristics of the image. For instance, in this image:

We see an X-ray image of a weld, containing several cracks and porosities. When there is porosity or crack in the weld, the full strength of the X-rays going through the object saturates the image sensor on the other side of the object. Thus, intensity values of 255, white, in an 8-bit image coming from such a system automatically imply a problem with the weld. If the human visual analysis is used to inspect welds, a simple color coding assigns one color to 255, and one color to the rest. Thus helping the inspectors to assess the problem better.

One of the things that interests geologists is measuring rainfall levels, especially in tropical regions of Earth. Tropical Rainfall Measuring Missions, TRMM satellite utilizes among others, three sensors especially designed to detect rain: a precipitation radar, a microwave imager, and a visible and infrared scanner. The results from the various rain sensors are processed, resulting in estimates of average rainfall over a given time period in the area monitored by the sensors. From these estimates, it is not difficult to generate grayscale images whose intensity values correspond directly to the rainfall, with each pixel representing a physical land area whose size depends on the resolution of the sensors. Such an intensity image is shown below:

Visual examination of such picture is prone to error. However, suppose that we code intensity levels from 0 to 255 using the colors shown here:

Which results in:

Thus making it easier for the scientists to meter up the rainfall levels.

There are other methods that are more general, and thus are capable of achieving a wider range of pseudocolor enhancement results than the simple slicing techniques discussed is the preceding section. Below you see an approach that is particularily attractive. Basically, the idea underlying this approach is to perform three independent transformations on the intensity of input levels. The three results are then fed separately into the red green, and blue channels of a color monitor. This method produces a composite image whose color content is modulated by the nature of the transformation function.

Now, look at this image:

It shows two monochrome images of a luggage obtained from an airport X-ray scanning system. The image on the left contains ordinary articles. The image on the right contains the same articles, as well as a block of simulated plastic explosives. The purpose of this example is to illustrate the use of intensity to color transformations to facilitate detection of explosives.

Look at this image:

It shows the transformation functions used. These sinusoidal functions contain regions of relatively constant value around the peaks as well as regions that change rapidly in the valleys. Changing the phase and frequency of each sinusoid can emphasize ranges in the grayscale. For instance, if all three transformations have the same phase and frequency, the output a grayscale image. A small change in the phase between the three transformations produces little change in pixels whose intensity correspond to peaks in the sinusoids, especially if the sinusoids have broad profiles (low frequencies). Pixels with intensity values in the steep section of the sinusoids are assigned a much stronger color content as a result of significant differences between amplitudes of the three sinusoids caused by the phase displacement between them.

The pseudocolored image of the luggage was obtained using the transformations we just saw in the above on the left, which shows the gray-level bands corresponding to the explosive, garment back, and background, respectively. Note that the explosive and background have quite different intensity levels, but they were both coded with approximately the same color as a result of the periodicity of the sine waves. The second pseudocolored image of the luggage was obtained with the transformation on the right. In this case, the explosives and garment bag intensity bands were mapped by similar transformations, and thus received essentially the same color assignment. Note that this mapping allows and observer to “see” through explosives.

It is often in our interest to combine several grayscale images, and then transform them to pseudocolor. Frequent use of this approach is in multispectral image processing, where different sensors produce individual grayscale images, each in a different spectral band.

Ok, that shall be it for tonight! I will be back tomorrow with a post about machine learning… Deep learning, perhaps? I’m not sure. There’s a lot to learn. There’s a lot to process.

My brother broke my hookah, and beat up my mom. I think, just like me, he’s got bipolarity. But well, he’s a proud 5th semester psychology student and there’s no way in hell we could get him to take medication for his mental handicap. We get bipolarity from our father’s side, and he gets it from his mother’s side. Both my uncles are bipolar, my youngest brother has Lithium deficiency with a wee bit of bipolarity, but my middle broter is… Kinda fucked up. The symptoms are showing. I don’t know why he’s so proud. Our second cousin once removed – my father’s cousin, has bipolarity. So do both my uncles, and they’re being medicated for it. Not many women get bipolarity. It’s just not in the evolution of our species. We’re bipolar because nature wanted us to hibernate during the winter, but it fucked up, so based on seasonal changes, bipolar people like me and many people in my family get depressed, it gets harder and harder for them to step out of their dwelling, they get depressed, and after depression passes, they become manic. It’s HELL.

Anyways, if you’re bipolar, I hope you’re getting the medication and the help you deserve. Many people in developing countries suffer for the cheapest of psych meds, even Lithium. This is bad.

Well, to the books!