Elimination Of Small Connected Components

Published Date: 02 Nov 2017

R.Sangeetha, P.G Student,

Department of Computer Science and Engineering, Anna University, Chennai.

Abstract: A computer-aided diagnosis system is described that identifies lung tissue patterns (healthy, emphysema, ground glass and tuberculosis) in CT images of the chest. The collected CT images are segmented using a region growing algorithm and a morphological closing operation. A set of texture features is derived from the quincunx wavelet frame transform, complemented by density measures taken from the gray level histogram. A support vector machine (SVM) classifier is trained with this feature set to discriminate among healthy and pathological lung tissue classes in the segmented CT images. Wavelets have proved particularly effective for extracting texture features, and their classification accuracy can be pushed further by optimizing their parameters: a random choice of the initial quincunx wavelet parameters, namely the degree of the wavelet transform (γ) and the number of iterations (N), affects the performance of the SVM classifier. The main aim of this paper is to optimize these wavelet parameters along with the SVM model parameters using the Particle Swarm Optimization (PSO) algorithm.

Index Terms: Interstitial lung diseases (ILDs), wavelet transform, lung tissue analysis, texture analysis.

I. INTRODUCTION

Continuous advances in information technology and in medical digital imaging have driven the development of image-based computer-aided diagnostic tools that identify lung tissue patterns in chest computed tomography images of patients affected by interstitial lung diseases. Interstitial lung diseases (ILDs) are among the most common medical conditions worldwide, and hundreds of millions of people suffer from them. The term refers to a group of diseases that affect the lung parenchyma and lead to respiratory failure if the cause is not removed, so the diagnosis of these pathologies is very important.

Initially, the primary imaging procedure used for the diagnosis of ILDs is the chest X-ray, owing to its low cost and low radiation exposure. It provides a quick overview of the entire chest. If the chest X-ray is normal, further tests are required to identify the cause of the shortness of breath. The chest X-ray provides a confident diagnosis in only 23% of cases with lung disease.

Computed Tomography (CT) images can provide an accurate assessment of lung tissue patterns. The appearance and quantification of the lung tissue patterns in CT are very informative for the diagnosis of ILD. However, interpreting chest CT images that show lung tissue disorders associated with ILDs is time-consuming and requires experience. Because of this complexity, computer-aided diagnosis (CAD) tools have been proposed to assist radiologists in CT interpretation tasks. Automatic detection and quantification of lung tissue patterns in CT images has the advantage of reducing the risk of overlooking important lesions. A reliable CAD system could improve radiologists' efficiency and avoid surgical lung biopsies for some patients.

CAD analysis can provide quick and valuable information for emergency radiologists and other non-chest specialists. Whereas a radiologist's ability to interpret CT data is likely to vary with field experience, ergonomics and time of day, computer-based classification of lung tissue patterns is reproducible.

Diffuse lung disease is a common abnormality of the lung parenchyma and can cause fatal illness. Although its correct diagnosis is very important, it is one of the most difficult tasks for radiologists, because the contrast between malignant and normal tissue may be present but below the threshold of human perception. For this reason, computer-aided diagnostic schemes have been developed to assist radiologists in the detection and diagnosis of diffuse lung disease. Early CAD schemes for diffuse lung disease were developed for chest radiography. Because thoracic computed tomography usually provides much clearer imaging of diffuse lung disease than chest radiography, researchers started to develop CAD schemes for diffuse lung disease in CT. The approach to designing such an image-based CAD system is to segment the lung regions, extract relevant texture features from the segmented image, and use supervised machine learning techniques to classify the image as a healthy or pathological pattern.

The rest of the paper is organized as follows. The second section presents the literature survey. The third section describes the system architecture and the methodologies used in this paper, including the modules that make up the work. The last section gives the conclusion and the future scope of the work.

II. RELATED WORKS

A number of different approaches have been proposed to identify the lung tissue patterns associated with ILDs.

Adrien Depeursing et al. (2012) developed a texture classification system to identify five types of lung tissue patterns (healthy, emphysema, ground glass, fibrosis and micronodules). The specific texture signatures of lung tissue patterns are poorly described by deterministic methods because intraclass variance is large, owing to factors such as the age of the patient, smoking history and the extent of the disease. To capture the subtle texture signatures of a given lung tissue pattern, they proposed a set of texture features invariant to translation, rotation and scale, based on the wavelet transform and complemented by gray level histograms, so that any texture appearance can be learned independently of orientation or size. They used High Resolution Computed Tomography (HRCT) lung images from patients affected by ILD and achieved a retrieval precision of 76.9%, against 33.1% for a single feature vector approach.

Lauge Sorenson et al. (2010) presented a texture classification based system for emphysema quantification in CT images that differentiates three classes: healthy tissue, centrilobular emphysema and paraseptal emphysema. They observed that the current standard measures, based on the relative area of emphysema, apply a single intensity threshold to individual pixels and so ignore the interrelations between pixels. They therefore proposed Local Binary Patterns (LBP) as texture features, a much richer representation that also takes the local structure around each pixel into account, together with intensity histograms for characterizing regions of interest. A combination of thresholding and connected component analysis was used to segment the lung parenchyma pixels in the HRCT image. They stated that emphysematous tissue contains more edges and homogeneous dark areas than healthy tissue, and that microstructures are expected to exist at different scales and frequencies depending on the severity of the disease. A K-Nearest Neighbor (K-NN) classifier, the natural choice when working in a distance representation of objects, was used to discriminate the lung tissue patterns in HRCT images of the chest. They achieved a classification accuracy of 95.2%, with sensitivity and specificity of 97.3% and 93.2%, respectively, and stated that the proposed technique performs slightly better than the Gaussian Filter Bank approach.

Panayiotis D. Korfiatis et al. (2010) presented a computer-aided scheme for the quantification of interstitial pneumonia (IP) patterns that differentiates three types of lung tissue: normal, ground glass and reticular. To capture diseased areas attached to the lung borders, which might otherwise be excluded from the subsequent texture analysis, they added lung field (LF) segmentation as a pre-processing step. Lung field segmentation was achieved using 3-D automated gray-level thresholding combined with an edge-highlighting wavelet preprocessing step, followed by a texture-based border refinement step. Vessel-tree segmentation was performed by unsupervised thresholding of the responses produced by 3-D multiscale enhancement filtering of the tubular structure of the vessels. The identified vessel tree volume is removed from the LF to obtain the lung parenchyma (LP) volume. Gray-level co-occurrence features were used to analyze the spatial distribution of gray levels in the image, with 130 features calculated per volume of interest. Stepwise discriminant analysis (SDA) was employed to reduce the dimension of the feature vector, and a K-NN classifier, with Euclidean distance computed between normalized feature vectors, was used to classify the lung tissue patterns. The data were CT scans of patients affected by IP, collected at the University Hospital of Patras, Greece: 13 CT scans corresponding to four normal patients and ten patients diagnosed with IP. Performance in identifying and characterizing ground glass and reticular patterns was evaluated by means of volume overlap (ground glass: 0.734 ± 0.057, reticular: 0.815 ± 0.037) on five Multidetector CT (MDCT) scans.

Elizabeth et al. (2009) proposed a computer-aided detection system for bronchiectasis in computed tomography images of the chest. The collected CT images were denoised using a Wiener filter, and an optimal thresholding method was used to segment the lung parenchyma. Feature vectors were constructed from the gray level co-occurrence matrix. They used the Mahalanobis distance measure and a Probabilistic Neural Network for the classification task and compared the performance of the system under both techniques. Their image database consisted of 1500 chest CT images, including 168 images affected by bronchiectasis and 159 normal images, with the remaining images affected by other lung disorders. The proposed approach achieved higher efficiency with the Probabilistic Neural Network when classifying images as diseased or not.

Elizabeth et al. (2012) proposed a novel segmentation approach for improving the diagnostic accuracy of CAD systems that detect lung cancer in chest computed tomography images. Lung tissue was segmented using an optimal thresholding technique and operations based on the convex edge and centroid properties of the lung region. Feature vectors were constructed from the gray level co-occurrence matrix, and a Probabilistic Neural Network was used for the classification task. The performance of the CAD system was tested using 20 images with peripherally placed pathology bearing regions (PBR), 80 images with internally placed PBR, and 100 normal lung images, and compared with other techniques. The CAD system that uses optimal thresholding based solely on grayscale values achieved an accuracy of 88.5%; optimal thresholding followed by a rolling ball operator achieved 95%; and the proposed segmentation technique achieved an even higher accuracy of 97%. They stated that, with the proposed approach, the lungs can be correctly segmented even in the presence of peripheral pathology bearing regions.

Compared with the existing system [1] discussed in the literature, the aim of the proposed work is to optimize the accuracy of the SVM classifier by automatically (1) estimating the best values of the wavelet parameters, which, as far as we know, has not previously been attempted in this field; (2) detecting the best set of discriminative features, since the number of features derived from the wavelet transform depends on the number of decomposition levels; and (3) solving the SVM model selection issue, that is, estimating the best values of the cost and kernel parameters. To attain this, the proposed system is derived from an optimization framework based on PSO.

III. SYSTEM ARCHITECTURE

The system design describes the methods of the proposed system in the following sequence: Segmentation, Feature extraction, Multiclass classification with parameters optimization using particle swarm optimization.

Figure 1 depicts the overall system architecture. Each component of the system is described in the following sections.

3.1 SEGMENTATION

Segmentation of the lung regions is a required preliminary step for lung tissue categorization. It isolates the lung parenchyma from the surrounding structures in the image, which decreases the volume of data to be analyzed. The input to this process is a JPEG image of a chest CT scan of size 512 x 512 pixels. It involves the following steps.

Optimal thresholding

Extraction of lung regions

Elimination of small connected components

Edge detection

Closing mathematical morphology operation

Figure 1. System Design. Block labels: Chest CT Images, Segmentation, Parameter Optimization using PSO, Feature Extraction, Multiclass Classification, Predicted class labels.

3.1.1 Optimal Thresholding

Optimal thresholding is the first step in segmenting the image. A CT image contains two main groups of pixels: high intensity pixels located in the body and low intensity pixels that are in the lung and the surrounding air. Because of this large difference in intensity between these two groups, thresholding leads to a good separation. The method applied here is the optimal thresholding proposed by Hu et al (2001). This iterative procedure computes the value of a threshold so that the two groups of pixels are well separated.
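The iterative procedure can be sketched as follows. This is a minimal illustration, assuming the common form of the scheme in which the new threshold is the midpoint of the mean intensities of the two pixel groups; the function name `optimal_threshold`, the starting value and the stopping tolerance are illustrative choices, not details taken from the paper.

```python
import numpy as np

def optimal_threshold(image, initial=None, tol=0.5, max_iter=100):
    """Iteratively place the threshold halfway between the two class means."""
    pixels = np.asarray(image, dtype=float).ravel()
    # Assumption: start from the global mean when no initial value is given.
    t = pixels.mean() if initial is None else float(initial)
    for _ in range(max_iter):
        low = pixels[pixels <= t]    # lung and surrounding air (dark pixels)
        high = pixels[pixels > t]    # body (bright pixels)
        if low.size == 0 or high.size == 0:
            break                    # only one group left, nothing to separate
        t_new = 0.5 * (low.mean() + high.mean())
        converged = abs(t_new - t) < tol
        t = t_new
        if converged:
            break
    return t

# Toy image: 30 dark pixels (value 20) and 70 bright pixels (value 200).
toy = np.array([20.0] * 30 + [200.0] * 70)
threshold = optimal_threshold(toy)
lung_candidates = toy <= threshold   # boolean mask of low-intensity pixels
```

On a real slice the resulting boolean mask would still contain the air around the body, which is why the later steps are needed.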

3.1.2 Region Growing Algorithm

The aim of this process is to remove the areas that do not belong to the actual lungs. It segments the image by examining the neighboring pixels of a set of points, known as seed points, and determining whether each neighbor should be assigned to the cluster of the seed point.

Input: Threshold image.

Process Logic: Step 1. Start with a seed pixel defined by the user and add it to the region.

Step 2. The region is grown iteratively by examining all unallocated pixels neighboring the region. The difference between a pixel's intensity value and the region's mean is used as the similarity measure. The pixel with the smallest difference measured this way is allocated to the region.

Step 3. The process finishes when the intensity difference between the region mean and the new pixel is larger than a certain threshold (computed from optimal thresholding).

Output: Region growing output image.
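The three steps above can be sketched as follows. This is a simple, unoptimized illustration that assumes 4-connected neighbors and a single seed; the name `region_grow` and the parameter `max_diff` (the stopping threshold) are hypothetical.

```python
import numpy as np

def region_grow(image, seed, max_diff):
    """Grow a region from `seed`, always adding the most similar neighbor."""
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    seed = tuple(seed)
    region = np.zeros(img.shape, dtype=bool)
    region[seed] = True
    total, count = img[seed], 1      # running sum and size for the region mean
    frontier = set()                 # unallocated pixels touching the region

    def add_neighbours(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not region[nr, nc]:
                frontier.add((nr, nc))

    add_neighbours(*seed)
    while frontier:
        mean = total / count
        # Step 2: pick the neighbor whose intensity is closest to the region mean.
        best = min(frontier, key=lambda p: abs(img[p] - mean))
        # Step 3: stop once even the closest neighbor is too different.
        if abs(img[best] - mean) > max_diff:
            break
        frontier.remove(best)
        region[best] = True
        total += img[best]
        count += 1
        add_neighbours(*best)
    return region

# Toy image: three dark columns (10) next to two bright columns (100).
toy = np.full((5, 5), 100.0)
toy[:, :3] = 10.0
grown = region_grow(toy, seed=(0, 0), max_diff=20.0)
```

Scanning the whole frontier on every iteration keeps the sketch short but is slow on a 512 x 512 slice; a practical implementation would likely use a priority queue or a library routine.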

3.1.3 Elimination of Small Connected Components

Small connected components with an area of less than 700 pixels, which are not part of the lung parenchyma, are identified and eliminated.

Input: Region growing output image.

Process Logic: Small connected components with an area less than 700 pixels are identified and filled with their surrounding pixel values.

Output: Region growing output image with the lung region alone without any connected components smaller than 700 pixels.
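A possible sketch of this step on a binary mask is given below. It assumes 4-connectivity and interprets "filled with their surrounding pixel values" as resetting the small component to background; the function name `remove_small_components` is illustrative.

```python
from collections import deque

import numpy as np

def remove_small_components(mask, min_area=700):
    """Clear foreground components whose area is below `min_area` pixels."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = mask.shape
    visited = np.zeros_like(mask)
    cleaned = mask.copy()
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0, c0] or visited[r0, c0]:
                continue
            # Breadth-first search collects one connected component.
            component = [(r0, c0)]
            visited[r0, c0] = True
            queue = deque(component)
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr, nc] and not visited[nr, nc]):
                        visited[nr, nc] = True
                        queue.append((nr, nc))
                        component.append((nr, nc))
            if len(component) < min_area:
                for r, c in component:
                    cleaned[r, c] = False   # assumed: fill with background
    return cleaned

# Toy mask: one 30 x 30 region (900 pixels) and one 3 x 3 speck (9 pixels).
toy = np.zeros((40, 40), dtype=bool)
toy[0:30, 0:30] = True
toy[35:38, 35:38] = True
cleaned = remove_small_components(toy, min_area=700)
```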

3.1.4 Edge Detection

After the region growing algorithm has finished, an edge detector is applied to the image to create a mask for the extraction of the lungs.

Input: Region growing output image with the lung region alone, without any connected components smaller than 700 pixels.

Process Logic: Sobel operator is applied for edge detection.

Output: Edge detection output image.
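The Sobel step can be illustrated with the sketch below, which computes the gradient magnitude from the two standard 3 x 3 Sobel kernels. Replicating the border pixels and thresholding the magnitude at zero to obtain a binary edge mask are assumptions made for the example; `sobel_edges` is an illustrative name.

```python
import numpy as np

def sobel_edges(image):
    """Return the Sobel gradient magnitude of a 2-D image."""
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    padded = np.pad(img, 1, mode="edge")   # assumed border handling
    kx = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])      # horizontal derivative kernel
    ky = kx.T                              # vertical derivative kernel
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Accumulate the weighted, shifted copies of the image for each kernel tap.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + rows, j:j + cols]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)

# Toy mask: left half background (0), right half region (1).
toy = np.zeros((6, 6))
toy[:, 3:] = 1.0
magnitude = sobel_edges(toy)
edge_mask = magnitude > 0   # pixels on either side of the vertical boundary
```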

3.1.5 Closing Mathematical Morphology Operation

Once segmentation is complete, morphological operations can be used to remove imperfections in the segmented image. The region growing algorithm represents the global lung regions well, but the resulting image contains many holes. The closing operation is used to fill these holes.

Input: Edge detection output image.

Process Logic: A morphological closing operation is applied to the binary mask to close the borders of the edges, using a spherical structuring element of radius 1. Finally, edges outside the lung region are filled. Extraction is done by combining the binary map of the entire lung with the binary map of its edges alone.

Output: Segmented image.
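Closing is a dilation followed by an erosion with the same structuring element. The sketch below assumes that, on a 2-D slice, the radius-1 element reduces to a 3 x 3 cross (the pixel and its four direct neighbors); the helper names and the border handling are illustrative choices.

```python
import numpy as np

# Assumed 2-D counterpart of a radius-1 sphere: center plus 4 direct neighbors.
CROSS = ((0, 0), (-1, 0), (1, 0), (0, -1), (0, 1))

def _shifted(mask, dr, dc, fill):
    """Return an array whose entry (r, c) is mask[r + dr, c + dc]."""
    rows, cols = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=fill)
    return padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]

def dilate(mask):
    out = np.zeros_like(mask)
    for dr, dc in CROSS:
        out |= _shifted(mask, dr, dc, False)
    return out

def erode(mask):
    out = np.ones_like(mask)
    for dr, dc in CROSS:
        # Outside the image counts as foreground so borders are not eaten away.
        out &= _shifted(mask, dr, dc, True)
    return out

def close_mask(mask):
    """Morphological closing: dilation followed by erosion."""
    mask = np.asarray(mask, dtype=bool)
    return erode(dilate(mask))

# Toy mask: a 5 x 5 square with a one-pixel hole in the middle.
toy = np.zeros((9, 9), dtype=bool)
toy[2:7, 2:7] = True
toy[4, 4] = False
closed = close_mask(toy)
```

In this toy case the hole is filled and the outline of the square is unchanged, which is the behavior the section relies on to repair holes left by region growing.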

3.2 FEATURE EXTRACTION

The feature extraction step is crucial in automatic texture classification, because a classifier can operate reliably only if the features of each event are selected properly. To categorize every pixel of the segmented image automatically, the image is divided into 32x32 blocks, and the following feature extraction techniques are applied to each block. The input to this process is the output of lung image segmentation. The output is the feature vector V for each block, computed using the quincunx wavelet frame transform and the gray level histogram. The method applied here is that proposed by A. Depeursing et al. (2012).

$V = \{bin_0, bin_1, \ldots, bin_{21}, pix_{air}, \mu_0, \sigma_{1,0}, \sigma_{2,0}, \mu_1, \sigma_{1,1}, \sigma_{2,1}, \ldots, \mu_7, \sigma_{1,7}, \sigma_{2,7}\}$

It involves the following steps.

Gray level histogram

Quincunx wavelet frame decomposition

Gaussian scale mixture model with Expectation Maximization algorithm

3.2.1 Gray Level Histogram

The gray level histogram counts how many pixels have a given gray level value, which corresponds to the density of the observed tissue. Gray level values always contain useful information about the image. The distribution of gray level values shows high variability among the four lung tissue patterns.

Input: 32x32 block of segmented image.

Process Logic: Step 1. Find the minimum and maximum values in the [0 - 255] display range by evaluating the linear equation below at x = -1000 and x = 600 (A. C Horwood et al. (2001)). This gives min = 0 and max = 204.

y (greyscale) = 127.5 + 0.1275x.

Step 2. Initialize Number of bins = 22.

Step 3. Calculate bin width.

Bin Width = (max - min) / Number of bins

Step 4. Compute the bin positions using bin width between min and max values.

Step 5. Compute the histogram of pixel values between the min and max values.

Output: 23 features.

(i) 22 histogram bins [bin0, bin1, ..., bin21].

(ii) Number of pixels with value = 0 [Pixair].
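The 23 histogram features can be sketched with NumPy as below. The example assumes equal-width bins spanning the mapped range and that pixels brighter than the upper limit are simply left out of the histogram; `hu_to_gray` and `histogram_features` are illustrative names.

```python
import numpy as np

def hu_to_gray(x):
    """Linear mapping from the text: y = 127.5 + 0.1275 x."""
    return 127.5 + 0.1275 * x

def histogram_features(block, n_bins=22):
    """Return 22 bin counts followed by the number of zero-valued pixels."""
    block = np.asarray(block, dtype=float)
    lo = hu_to_gray(-1000.0)   # lower end of the range (0 on the gray scale)
    hi = hu_to_gray(600.0)     # upper end of the range (204 on the gray scale)
    # Equal-width bins between lo and hi; values outside the range are ignored.
    counts, _ = np.histogram(block, bins=n_bins, range=(lo, hi))
    pix_air = int(np.count_nonzero(block == 0))
    return np.append(counts, pix_air)

# Toy 32 x 32 block: gray value 100 everywhere except four zero pixels.
toy = np.full((32, 32), 100.0)
toy[0, :4] = 0.0
features = histogram_features(toy)
```

With a bin width of 204 / 22 (about 9.27), the value 100 falls into bin 10, so that bin holds the 1020 non-zero pixels of the toy block.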

3.2.2 Quincunx Wavelet Frame Decomposition

The input image is decomposed into a multiresolution hierarchy of sub band (scale) images. These sub band images contain some additional / hidden information about the original image which is used for image analysis. This decomposition yields set of wavelet coefficients which characterize the frequency content of the image. From these coefficients we can compute texture parameters.

Input: 32x32 block of segmented image.

Process Logic: Wavelet coefficients are obtained by applying quincunx wavelet frame decomposition using randomly chosen parameters (γ, N).

Output: Wavelet coefficients for each sub band.

3.2.3 Gaussian Scale Mixture Model with Expectation Maximization Algorithm

This step computes texture parameters from the wavelet coefficients. The distribution of the wavelet coefficients in each sub band is characterized through the parameters of a simple Gaussian scale mixture (GSM) model of two Gaussians with a fixed common mean $\mu_1 = \mu_2 = \mu$ and two standard deviations $\sigma_1$ and $\sigma_2$, which are estimated using the expectation-maximization (EM) algorithm. The method applied here is that proposed by A. P. Dempster et al. (1977).

Input: Wavelet coefficients for each sub band.

Process Logic: Step 1. The two Gaussians have a weight of 0.5 each. $\mu$ is initialized with the mean of the wavelet coefficients $S_j$. $\sigma_1$ and $\sigma_2$ are initialized using the range

$r_{S_j} = \max(S_j) - \min(S_j)$ as follows:

$\sigma_1 = r_{S_j}, \quad \sigma_2 = r_{S_j} / 10.$

Step 2. The Expectation step makes a soft classification of the values into one of the two Gaussian mixture components.

Step 3. The Maximization step estimates the parameter (sigma) of each class.

Step 4. Steps 2 and 3 are repeated until the difference between the previous and current sigma values is less than or equal to the epsilon value 1.0e-4.

Output: 3 Parameters.

(i) Mean ($\mu$)

(ii) Standard deviations ($\sigma_1$, $\sigma_2$)
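The EM loop for one sub band might look like the sketch below. It follows the initialization and stopping rule given in the steps, keeps the mean and the 0.5 weights fixed, and updates only the two standard deviations; the function name `fit_gsm`, the iteration cap and the small numerical guards are assumptions added for the example.

```python
import numpy as np

def fit_gsm(coeffs, eps=1e-4, max_iter=500):
    """Fit two zero-offset Gaussians sharing one mean; return (mu, s1, s2)."""
    s = np.asarray(coeffs, dtype=float).ravel()
    mu = s.mean()                        # fixed common mean
    r = s.max() - s.min()                # range of the coefficients
    if r == 0.0:
        return mu, 0.0, 0.0              # constant sub band: nothing to fit
    sigma = np.array([r, r / 10.0])      # initialization from Step 1
    sq = (s - mu) ** 2
    tiny = 1e-300                        # guard against division by zero
    for _ in range(max_iter):
        # E-step: soft assignment; the equal weights of 0.5 cancel out.
        dens = np.exp(-sq[None, :] / (2.0 * sigma[:, None] ** 2)) / sigma[:, None]
        resp = dens / (dens.sum(axis=0) + tiny)
        # M-step: responsibility-weighted standard deviation of each component.
        new_sigma = np.sqrt((resp * sq).sum(axis=1) / (resp.sum(axis=1) + tiny))
        new_sigma = np.maximum(new_sigma, 1e-12)
        converged = np.max(np.abs(new_sigma - sigma)) <= eps
        sigma = new_sigma
        if converged:
            break
    return mu, sigma[0], sigma[1]

# Synthetic coefficients: a wide and a narrow Gaussian around zero.
rng = np.random.default_rng(0)
toy = np.concatenate([rng.normal(0.0, 5.0, 2000), rng.normal(0.0, 0.5, 2000)])
mu, s1, s2 = fit_gsm(toy)
```

Applied to each of the eight sub bands, this yields the 24 wavelet entries of the feature vector, in addition to the 23 histogram features.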

3.3 MULTICLASS TEXTURE CLASSIFICATION WITH PARAMETER OPTIMIZATION

The chest CT images are classified using a support vector machine, which decides which of the four lung tissue patterns a given CT image shows. The parameters are optimized with particle swarm optimization; the method applied here is that proposed by J. Kennedy et al. (2001).

Input: Segmented image.

Process Logic:

Step 1. Initialize maximum no. of iterations and no. of parameters to be optimized.

Step 2. Generate initial set of parameter values.

Step 3. Initialize velocity vectors vi associated with the particles.

Step 4. For each position of the particle Pi from the population, compute features using quincunx wavelet frame transform, train the classifier and compute the classification accuracy.

Step 5. Initialize the best position of each particle with its initial position.

Step 6. Find the best global position in the population exhibiting the maximum value of the considered fitness function.

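Step 7. Update the velocity of each particle using

$v_i(t+1) = w\,v_i(t) + c_1 r_1 \left(p_i^{best} - p_i(t)\right) + c_2 r_2 \left(p_g^{best} - p_i(t)\right)$

where $w$ is an inertia weight factor, $p_i^{best}$ is the best local position of particle $i$ and $p_g^{best}$ is the best global position.

$r_1$ and $r_2$ are random values from the range [0,1].

$c_1$ and $c_2$ are acceleration constants used to regulate the relative velocities with respect to the best local and global positions respectively.

Step 8. Update the value of each particle using

$p_i(t+1) = p_i(t) + v_i(t+1)$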

Step 9. For each particle pi, train the classifier and compute the corresponding fitness function f (i).

Step 10. Update the best local value of each particle if its current fitness value is greater than the previous one.

Step 11. If the maximum number of iterations has not yet been reached, return to step (6).

Step 12. Compute the features using the QWF transform fed with the global best values of the two wavelet parameters N and γ, and then train the classifier with the global best values of the SVM kernel parameter (gamma) and cost.

Step 13. Classify the CT scan images with the trained classifier.

Output: Predicted class labels.
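The optimization loop in Steps 1 to 11 can be sketched as a generic PSO that maximizes a fitness function. In the proposed system the fitness of a particle would be the SVM classification accuracy obtained with that particle's wavelet and SVM parameters; here a simple quadratic stands in for it so the example stays self-contained. The standard inertia-weight velocity update is assumed, and the names `pso_maximize` and `toy_fitness`, the values of `w`, `c1` and `c2`, and the clipping of positions to the search bounds are illustrative choices rather than settings from the paper.

```python
import numpy as np

def pso_maximize(fitness, bounds, n_particles=20, n_iter=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `fitness` over a box given as a list of (low, high) pairs."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = lo.size
    pos = rng.uniform(lo, hi, size=(n_particles, dim))  # initial parameter values
    vel = np.zeros((n_particles, dim))                  # initial velocities
    pbest = pos.copy()                                  # best position per particle
    pbest_fit = np.array([fitness(p) for p in pos])
    g = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]    # best global position
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia, pull to local best, pull to global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)                # assumed bound handling
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = int(np.argmax(pbest_fit))
        gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, gbest_fit

def toy_fitness(p):
    """Stand-in for classification accuracy; its maximum of 0 is at (1, -2)."""
    return -((p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2)

toy_bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best, best_fit = pso_maximize(toy_fitness, toy_bounds)
```

For the real system each fitness evaluation involves feature extraction and classifier training, so the number of particles and iterations directly determines the computational cost.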

IV. CONCLUSION

This paper described a texture classification system with a PSO-based wavelet optimization approach that identifies lung tissue patterns in computed tomography images of patients affected by interstitial lung diseases. The work aims to optimize the accuracy of the SVM by finding the best set of features and estimating the best values of the wavelet parameters along with the SVM parameters. We believe that the proposed PSO-based Wavelet-SVM classification system should boost the generalization capability achievable with the SVM classifier.
