
Kockara et al. BMC Bioinformatics 2010, 11(Suppl 6):S26

http://www.biomedcentral.com/1471-2105/11/S6/S26

PROCEEDINGS Open Access

Analysis of density based and fuzzy c-means

clustering methods on lesion border extraction

in dermoscopy images

Sinan Kockara1*, Mutlu Mete2*, Bernard Chen1, Kemal Aydin3

From Seventh Annual MCBIOS Conference. Bioinformatics: Systems, Biology, Informatics and Computation

Jonesboro, AR, USA. 19-20 February 2010

Abstract

Background: Computer-aided segmentation and border detection in dermoscopic images is one of the core

components of diagnostic procedures and therapeutic interventions for skin cancer. Automated assessment tools

for dermoscopy images have become an important research field mainly because of inter- and intra-observer

variations in human interpretation. In this study, we compare two approaches for automatic border detection in

dermoscopy images: density based clustering (DBSCAN) and Fuzzy C-Means (FCM) clustering algorithms. In the first

approach, if there is sufficient density (more than a certain number of points) around a point, then either a new cluster is formed around the point or an existing cluster grows by including the point and its neighbors. In the

second approach FCM clustering is used. This approach has the ability to assign one data point into more than

one cluster.

Results: Each approach is examined on a set of 100 dermoscopy images whose borders, manually drawn by a dermatologist, are used as the ground truth. False positives and false negatives, along with true positives and true negatives, are quantified by comparing results with the manually determined borders.

The assessments obtained from both methods are quantitatively analyzed over three accuracy measures: border

error, precision, and recall.

Conclusion: In addition to achieving low border error and high precision and recall, the visual outcome showed that DBSCAN effectively delineated the targeted lesion and is promising; the FCM, however, performed poorly, especially on the border error metric.

1 Computer Science Department, University of Central Arkansas, Conway, AR, USA. 2 Department of Computer Science, Texas A&M University-Commerce, Commerce, TX, USA. Full list of author information is available at the end of the article.

© 2010 Kockara and Mete; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction

Melanoma is the fifth most common malignancy in the United States and has rapidly become one of the leading cancers in the world. Malignant melanoma is the deadliest form of skin cancer and the fastest growing skin cancer type; an estimated 8,441 deaths out of 68,720 incidences occurred in the United States in 2009 [1]. If it is detected early, melanoma can often be cured with a simple excision operation.

Dermoscopy is the major non-invasive skin imaging technique that is extensively used in the diagnosis of melanoma and other skin lesions. Dermoscopy improves upon simple photography by revealing more of the subsurface structures underneath the skin, and is now widely used by dermatologists. The contact dermoscopy technique consists of placing a fluid such as mineral oil, water, or alcohol on the skin lesion, which is subsequently inspected using a digital camera and a hand-held dermoscopy attachment such as Dermlite. The fluid placed on the lesion eliminates surface reflection and renders the cornified layer translucent, allowing a better visualization of pigmented structures within the epidermis, the dermoepidermal junction, and the superficial dermis.

For early detection of melanoma, finding lesion borders is the first and key step of the diagnosis, since the border structure provides life-saving information for accurate diagnosis. Currently, dermatologists draw borders manually, which is described as tedious and time consuming; this is where computer-aided border detection of dermoscopy images comes into the picture. Detecting melanoma early is critical because melanoma not detected early can be fatal. Speed is also critical because there are too few dermatologists to screen all the images, and physician error increases with rapid evaluation of cases [2]. For these very reasons, automated systems would be a significant help for dermatologists. The proposed method in this study is a fully automated system: prior to and during the processing there is no human intervention in the system. The purpose of this system is to increase the dermatologist's comfort level with his or her decision; however, a dermatologist always makes the final decision on the subject.

Background

Differentiating or partitioning objects from the background, or from other objects, in an image is called image segmentation. Solutions to image segmentation have widespread applications in various fields, including medical diagnosis and treatment. Plenty of methods have been developed for grayscale and color image segmentation [3-6]. Popular approaches [7] for image segmentation include: edge-based methods, threshold techniques [8], neighborhood-based techniques, graph-based methods [9,10], and cluster-based methods. Edge-based techniques investigate discontinuities in the image, whereas neighborhood-based methods examine the similarity among different regions. Threshold methods identify different parts of an image by combining peaks and valleys of 1D or 3D histograms (RGB). There also exist numerous innovative graph-based image segmentation approaches in the literature. Shi et al. 1997-1998 [9,10] treated segmentation as a graph partitioning problem and proposed a novel unbiased measure for segregating subgroups of a graph, known as the Normalized Cut criterion. More recently, Felzenszwalb et al. [11] developed another segmentation technique by defining a predicate for the existence of boundaries between regions, utilizing graph-based representations of images. In this study, however, we focus on cluster-based segmentation methods, in which individual image pixels are considered as general data samples and a correspondence is assumed between homogeneous image regions and clusters in the spectral domain.

Dermoscopy involves optical magnification of the region-of-interest, which makes subsurface structures more easily visible when compared to conventional macroscopic images [12]. This in turn improves screening characteristics and provides greater differentiation between difficult lesions such as pigmented Spitz nevi and small, clinically equivocal lesions [13]. However, it has also been demonstrated that dermoscopy may actually lower the diagnostic accuracy in the hands of inexperienced dermatologists [14]. Therefore, novel computerized image understanding frameworks are needed to minimize the diagnostic errors that result from the difficulty and subjectivity of visual interpretations [15,16].

For melanoma investigation, delineation of the region-of-interest is the key step in the computerized analysis of skin lesion images for many reasons. First of all, the border structure provides important information for accurate diagnosis. Asymmetry, border irregularity, and abrupt border cutoff are a few of many clinical features calculated based on the lesion border. Furthermore, the extraction of other important clinical indicators such as atypical pigment networks, globules, and blue-white veil areas critically depends on the border detection [17]. The blue-white veil is described as an irregular area of blended blue pigment with a ground-glass (white) haze, as if the image were out of focus.

At the first stage of analysis of dermoscopy images, automated border detection is usually applied [16]. Many factors make automated border detection challenging, e.g. low contrast between the surrounding skin and the lesion, fuzzy and irregular lesion borders, and intrinsic artifacts such as cutaneous features (air bubbles, blood vessels, hairs, and black frames), to name a few [17]. According to Celebi et al. 2009 [17], automated border detection can be divided into four sections: pre-processing, segmentation, post-processing, and evaluation. The pre-processing step involves color space transformations [18,19], contrast enhancement [20,21], and artifact removal [22-28]. The segmentation step involves partitioning of an image into disjoint regions [23,28,29]. The post-processing step is used to obtain the lesion border [16,30]. The evaluation step involves dermatologists' evaluations of the border detection results.

Regarding the boundary of clusters, Lee and Castro [31] introduced a new polygonization algorithm based on the boundary of resulting point clusters. Recently, Nosovskiy et al. [32] used another theoretical approach to find the boundary of clusters in order to infer an accurate boundary between close neighboring clusters. These two works principally study boundaries of finalized data groups (clusters). Schmid et al. [23] proposed an algorithm based on color clustering. First, a two-dimensional


histogram is calculated from the first two principal components of the CIE L*u*v* color space. The histogram is then smoothed and initial cluster centers are obtained from the peaks using a perceptron classifier. At the final step, the lesion image is segmented.

In this study, for computer-aided border detection we use two clustering algorithms, density based clustering (DBSCAN) [33] and multi-level fuzzy c-means clustering (FCM), and compare their performances for border detection over dermoscopy images. In the context of dermoscopic images, clustering corresponds to finding whether or not each pixel in an image belongs to the skin lesion border. Automatic border detection makes the dermatologist's tedious manual border drawing procedure faster and easier.

DBSCAN

With the aim of separating the background from the skin lesion to target possible melanoma, we cluster pixels of thresholded images by using DBSCAN. It takes a binary (segmented) image and delineates only significantly important regions by clustering. The expected outcome of this framework is the desired boundary of the lesion in a dermoscopy image.

Technically, it is appropriate to tailor a density based algorithm in which the cluster definition guarantees that the number of positive pixels is equal to or greater than a minimum number of pixels (MinPxl) in a certain neighborhood of core points. A core point is one whose neighborhood of a given radius (Eps) contains at least a minimum number of positive pixels (MinPxl), i.e., the density in the neighborhood exceeds the pre-defined threshold (MinPxl). The definition of a neighborhood is determined by the choice of a distance function for two pixels p and q, denoted by dist(p, q). For instance, when the Manhattan distance is used in 2D space, the shape of the neighborhood would be rectangular. Note that DBSCAN works with any distance function, so an appropriate function can be designed for other specific applications. DBSCAN is significantly more effective in discovering clusters of arbitrary shapes. It has been successfully used for synthetic datasets as well as earth science and protein datasets. Theoretical details of DBSCAN are given in [33]. Once the two parameters Eps and MinPxl are defined, DBSCAN starts to cluster data points (pixels) from an arbitrary point q as illustrated in Figure 1.

Figure 1 Direct density reachable (left) and density reachable property of DBSCAN (right).

Let I be a subimage of dimension N × N. For a pixel p, let px and py denote its position, where the top-left corner of I is (0, 0). Let cxy represent the color at (px, py). The Eps-neighborhood of a pixel p, denoted by NEps(p), is defined by NEps(p) = {q ∈ I | dist(p, q) ≤ Eps}, where dist is the Euclidean distance. Two kinds of pixels can be found in a cluster: 1) pixels inside the cluster (core pixels) and 2) pixels on the border of the cluster (border pixels). As expected, a neighborhood query for a border pixel returns notably fewer points than a neighborhood query for a core pixel. Thus, in order to include all points belonging to the same segment, we should set the minimum number of pixels (MinPxl) to a comparatively low value. This value, however, would not be characteristic for the respective cluster, particularly in the presence of negative (non-cluster) pixels. Therefore, we require that for every pixel p in a cluster C there is a pixel q in C such that p is inside the Eps-neighborhood of q and NEps(q) contains at least MinPxl pixels: |NEps(q)| ≥ MinPxl and dist(p, q) ≤ Eps. A pixel p is called density-reachable from a pixel q when there is a chain of pixels p1, p2, ..., pn, where p1 = q and pn = p. This is illustrated in Figure 1. A cluster C (segment) in the image is a non-empty subset of pixels given as:

C = {p | ∃q ∈ C : p ∈ NEps(q) and |NEps(q)| ≥ MinPxl},

where q is density reachable from p. DBSCAN centers around the key idea that, to form a new cluster or grow an existing cluster, the Eps-neighborhood of a point should contain at least a minimum number of points (MinPxl).

Algorithm 1 DBSCAN

DBSCAN(SubImage, Eps, MinPxl)
  ClusterId := nextId(NOISE);
  FOR i FROM 1 TO SubImage.height DO
    FOR j FROM 1 TO SubImage.width DO
      Point := SubImage.get(i, j);
      IF Point.Cid = UNCLASSIFIED AND Point.positive = TRUE THEN
        IF ExpandCluster(SubImage, Point, ClusterId, Eps, MinPxl) THEN
          ClusterId := nextId(ClusterId)
        END IF;
      END IF;
    END FOR;
  END FOR;
END DBSCAN;


Algorithm 1 summarizes DBSCAN for image segmentation. Once the two parameters Eps and MinPxl are defined, DBSCAN starts to cluster data points from an arbitrary point q. It begins by finding the neighborhood of point q, i.e., all points that are directly density reachable from point q. This neighborhood search is called a region query. For an image, we start with the top-left pixel (not necessarily a corner pixel; any arbitrary pixel can be chosen for the first iteration) as our first point in the dataset (subimage). We look for the first pixel satisfying the core pixel condition as a starting (seed) point. If the neighborhood is sparsely populated, i.e., it has fewer than MinPxl points, then point q is labeled as noise. Otherwise, a cluster is initiated and all points in the neighborhood of point q are marked with the new cluster's ID. Next, the neighborhoods of all of q's neighbors are examined iteratively to check whether they can be added into the cluster. If a cluster cannot be expanded any more, DBSCAN chooses another arbitrary unlabeled point and repeats the process to form another cluster. This procedure is iterated until all data points in the dataset have been labeled as noise or with a cluster ID. Figure 2 illustrates an example cluster expansion.

Figure 2 Example Cluster Expanding: New points (green ones in circle) are expanding cluster.

Fuzzy c-means clustering

Clustering, a major area of study in the scope of unsupervised learning, deals with recognizing meaningful groups of similar items. Under the influence of fuzzy logic, fuzzy clustering assigns each point a degree of belonging to each cluster, instead of membership in exactly one cluster. In fuzzy event modeling, pixel colors in a dermoscopy image can be viewed as a probability space, where pixels with some colors can belong partially to the background class and/or the skin lesion. The main advantage of this method is that it does not require a priori knowledge about the number of objects in the image.

The Fuzzy C-Means (FCM) clustering algorithm [34,35] is one of the most popular fuzzy clustering algorithms. FCM is based on minimization of the objective function Fm(u, c) [35]:

F_m(u, c) = Σ_{k=1..n} Σ_{i=1..C} (u_ik)^m d²(x_k, c_i)

FCM computes the memberships u_ij and the cluster centers c_j by:

u_ij = 1 / Σ_{k=1..C} ( ||x_i − c_j|| / ||x_i − c_k|| )^(2/(m−1))   (1)

c_j = ( Σ_{i=1..N} (u_ij)^m x_i ) / ( Σ_{i=1..N} (u_ij)^m )   (2)

where m, the fuzzification factor, a weighting exponent on each fuzzy membership, is any real number greater than 1; u_ij is the degree of membership of x_i in cluster j; x_i is the i-th d-dimensional measured datum; c_j is the center of cluster j; d²(x_k, c_i) is a distance measure between object x_k and cluster center c_i; and ||·|| is any norm expressing the similarity between measured data and the center.
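The membership and centroid update equations above can be turned into a short runnable sketch. The following Python is an illustrative implementation, not the authors' code; points are d-dimensional tuples, the initial membership matrix is random, and the function, parameter, and variable names are our own.

```python
import math
import random

def fcm(points, c, m=2.0, tol=1e-4, max_iter=100, seed=0):
    """Fuzzy C-Means. Returns (U, centers): U[i][j] is the membership of
    points[i] in cluster j; centers[j] is the j-th cluster center."""
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    # Random initial membership matrix U(0); each row sums to 1.
    U = []
    for _ in range(n):
        row = [rng.random() for _ in range(c)]
        s = sum(row)
        U.append([u / s for u in row])
    centers = []
    for _ in range(max_iter):
        # Centroid update: c_j = sum_i u_ij^m * x_i / sum_i u_ij^m
        centers = []
        for j in range(c):
            w = [U[i][j] ** m for i in range(n)]
            tot = sum(w)
            centers.append(tuple(sum(w[i] * points[i][k] for i in range(n)) / tot
                                 for k in range(d)))
        # Membership update: u_ij = 1 / sum_k (||x_i-c_j|| / ||x_i-c_k||)^(2/(m-1))
        new_U = []
        for i in range(n):
            dist = [math.dist(points[i], ctr) or 1e-12 for ctr in centers]
            new_U.append([1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c))
                          for j in range(c)])
        # Stop when memberships change less than tol.
        diff = max(abs(new_U[i][j] - U[i][j]) for i in range(n) for j in range(c))
        U = new_U
        if diff < tol:
            break
    return U, centers
```

The `or 1e-12` guard avoids division by zero when a point coincides with a center; m = 2 reproduces the setting used in the experiments below.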


The FCM algorithm involves the following steps:

1. Set values for c and m.
2. Initialize the membership matrix U = [u_ij] as U(0) (i = number of members, j = number of clusters).
3. At step k: calculate the centroids for each cluster through equation (2) if k ≠ 0. (If k = 0, initialize the centroid locations randomly.)
4. For each member, calculate the membership degree by equation (1) and store the information in U(k).
5. If the difference between U(k) and U(k+1) is less than a certain threshold, then STOP; otherwise, return to step 3.

In the FCM, the number of classes (c in equation 1) is a user input. In our experiments, we tested whether the number of unclassified data points exceeded several threshold (T) values (30, 40, 50, 60, and 70). Since the number of classes is a user input in FCM, there is a risk of over-segmentation. For instance, when the number of segments in a skin image is 3 and we force the number of clusters to be found by FCM to be 6, the FCM over-segments the image. This was one of the principal challenges we encountered with FCM. Thus, we ran FCM for different numbers of clusters and different threshold values and found that with five initial clusters and a threshold value of 30, FCM gave good accuracy in segmentation. Therefore, we used these values in all of our experiments. Moreover, in all of our experiments the fuzzification factor m is taken as 2. Figure 3 shows how the FCM detected area (red region) changes with the threshold.

Figure 3 Overlay images of FCM with different threshold values a) 30, b) 40, c) 50, d) 60, e) 70.

Experiments and results

The proposed methods are tested on a set of 100 dermoscopy images obtained from the EDRA Interactive Atlas of Dermoscopy [12]. These are 24-bit RGB color images with dimensions ranging from 577 × 397 pixels to 1921 × 1285 pixels. The benign lesions include nevocellular nevi and dysplastic nevi. The distance function used is the Euclidean distance between pixels p and q, given as d(p, q) = sqrt((p.x − q.x)² + (p.y − q.y)²), where p.x and p.y denote the position of pixel p at the xth column and yth row with respect to the top-left corner (0, 0) of the image. We ran DBSCAN on each image with an Eps of 5 and a MinPxl of 60.

We evaluated the border detection errors of the DBSCAN and FCM by comparing our results with physician-drawn boundaries as a ground truth. Manual borders were obtained by selecting a number of points on the lesion border, connecting these points by a second-order B-spline, and finally filling the resulting closed curve [22]. Using the dermatologist-determined borders, the automatic borders obtained from the DBSCAN and FCM are compared using three quantitative error metrics: border error, precision, and recall. The border error metric was developed by Hance et al. [18] and is


currently the most important metric for assessing the quality of any automatic border detection algorithm. It is given by:

Border Error = 100 × Area(AutomaticBorder ⊕ ManualBorder) / Area(ManualBorder),

where AutomaticBorder is the binary image obtained from the DBSCAN or FCM and ManualBorder is the binary image obtained from a dermatologist (see Figure 4, right side). The exclusive OR operator, ⊕, essentially emphasizes the disagreement between the target (ManualBorder) and predicted (AutomaticBorder) regions. In information retrieval terminology, the numerator of the border error is the sum of the False Positives (FP) and False Negatives (FN). The denominator is obtained by adding the True Positives (TP) to the False Negatives (FN). An illustrative example is given in Figure 5.

Figure 4 An exemplary dermoscopy image (left) and corresponding dermatologist drawn border (right).

Figure 5 Illustration of components used in accuracy and error quantification.

Regarding the image in Figure 5, assume that red is drawn by a dermatologist and blue is the automated line. TP indicates the correct lesion region found automatically. Similarly, TN shows the healthy region (background) that both the manual and computer assessments agree on. FN and FP label the missed lesion and the erroneous positive regions, respectively. In addition to border error, we also report precision (positive predictive value) and recall (sensitivity) for each experimental image, in Table 1 for results generated with the DBSCAN and in Table 2 for results generated with the FCM. Precision and recall are defined as TP / (TP + FP) and TP / (TP + FN), respectively.

Note that all definitions operate on the number of pixels in the particular region. Analogously, the Area function returns the number of active pixels in a binary image. Table 1 gives the border error, precision, and recall rates generated from the DBSCAN for each image, whereas Table 2 gives the border error, precision, and recall rates generated from the FCM. It can be seen that the results vary significantly across the images.

In Figure 4, an exemplary dermoscopy image, which is determined to be melanoma, and its corresponding dermatologist drawn border are illustrated. Figure 6 illustrates the DBSCAN generated result, in red, for the same image; it is overlaid on top of the dermatologist drawn border image, shown in black. As seen from the figure, hair is detected as a false positive. Figure 3 shows results generated from the FCM with different threshold values.

For example, for the melanoma image given in Figure 4, the FCM's precision, recall, and border error rates are 99.4%, 75.4%, and 100.4%, respectively, whereas the DBSCAN's precision, recall, and border error rates for the same image are 94%, 84%, and 2.2%, respectively. The following tables show the results generated with the DBSCAN and the FCM for the 100-image dataset.

Since the most important metric for evaluating the performance of a lesion detection algorithm is the border error metric, the border errors for DBSCAN and FCM are illustrated in Figure 7. In the figure, the X-axis shows image IDs in random order. As seen from Figure 7, DBSCAN outperforms FCM for lesion border detection on dermoscopy images: the overall average border error for DBSCAN is 6.94%, whereas the overall average border error for FCM is 100%. As for recall and precision, DBSCAN averaged 76.66% recall and 99.26% precision, whereas FCM averaged 55% recall and 100% precision.

Automatically drawn boundaries are usually found at the more intense regions of a lesion (see Figures 8a, 8b, 8c, 8e, 8f), giving promising assessments with DBSCAN. In Figure 8(d), the DBSCAN also marked outer regions. Obviously, the gradient region between the blue and red boundaries seems to be a major problem for the DBSCAN. We believe that even though inter-dermatologist agreement on manual borders is not perfect, most
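The border error, precision, and recall definitions above can be computed directly from the two binary masks. The following Python is an illustrative sketch, not the authors' code; each mask is given as a set of positive-pixel (row, col) coordinates.

```python
def border_metrics(auto_border, manual_border):
    """Border error (%), precision, and recall from two binary masks,
    each given as a set of positive-pixel (row, col) coordinates."""
    tp = len(auto_border & manual_border)   # lesion area found by both
    fp = len(auto_border - manual_border)   # erroneous positive region
    fn = len(manual_border - auto_border)   # missed lesion region
    xor_area = fp + fn                      # Area(Automatic XOR Manual)
    border_error = 100.0 * xor_area / len(manual_border)  # denominator = TP + FN
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return border_error, precision, recall
```

For example, a 4-pixel manual lesion matched on 2 pixels with 1 spurious automatic pixel yields a border error of 75%, precision of 2/3, and recall of 0.5.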


Table 1 DBSCAN border error, precision, and recall measures for each image in the dataset

Img. ID Border Error Precision Recall Img. ID Border Error Precision Recall

1 8.2% 0.98 0.79 51 5.1% 1.00 0.81

2 8.0% 0.93 0.86 52 6.9% 1.00 0.80

3 4.9% 0.89 0.85 53 7.4% 1.00 0.78

4 6.2% 1.00 0.82 54 1.5% 1.00 0.95

5 5.4% 1.00 0.88 55 4.2% 1.00 0.88

6 4.6% 1.00 0.83 56 14.9% 1.00 0.60

7 3.9% 0.96 0.91 57 9.4% 1.00 0.77

8 3.2% 1.00 0.87 58 5.9% 1.00 0.82

9 3.4% 1.00 0.82 59 4.6% 1.00 0.75

10 2.2% 1.00 0.91 60 2.9% 1.00 0.81

11 0.9% 1.00 0.91 61 6.5% 0.90 0.74

12 6.5% 1.00 0.61 62 5.8% 1.00 0.74

13 10.0% 1.00 0.70 63 5.6% 1.00 0.77

14 14.8% 1.00 0.70 64 2.9% 1.00 0.82

15 5.9% 1.00 0.67 65 2.2% 0.94 0.84

16 6.8% 1.00 0.76 66 8.3% 0.89 0.79

17 6.0% 1.00 0.67 67 6.3% 0.98 0.83

18 4.0% 1.00 0.86 68 3.2% 1.00 0.79

19 6.4% 1.00 0.71 69 2.4% 1.00 0.79

20 8.0% 1.00 0.80 70 4.6% 1.00 0.74

21 8.8% 1.00 0.78 71 8.8% 1.00 0.71

22 12.6% 1.00 0.73 72 3.5% 0.94 0.84

23 8.6% 1.00 0.76 73 1.8% 0.99 0.86

24 9.0% 1.00 0.72 74 2.9% 1.00 0.90

25 5.7% 1.00 0.79 75 5.9% 1.00 0.71

26 33.9% 1.00 0.51 76 9.2% 1.00 0.74

27 9.0% 1.00 0.74 77 3.3% 1.00 0.72

28 8.0% 1.00 0.65 78 13.6% 1.00 0.61

29 10.6% 1.00 0.75 79 10.4% 1.00 0.71

30 11.3% 1.00 0.74 80 6.7% 1.00 0.65

31 9.7% 1.00 0.72 81 1.8% 1.00 0.65

32 10.8% 1.00 0.77 82 7.5% 1.00 0.82

33 3.3% 1.00 0.86 83 9.9% 1.00 0.54

34 4.2% 1.00 0.88 84 3.1% 1.00 0.74

35 2.7% 1.00 0.88 85 6.4% 1.00 0.79

36 6.0% 1.00 0.79 86 7.5% 0.98 0.79

37 4.0% 1.00 0.85 87 7.2% 1.00 0.73

38 8.0% 1.00 0.71 88 5.1% 1.00 0.59

39 3.4% 1.00 0.76 89 5.5% 0.91 0.82

40 3.6% 1.00 0.82 90 17.0% 1.00 0.56

41 8.0% 1.00 0.73 91 8.1% 1.00 0.61

42 3.2% 1.00 0.85 92 4.3% 1.00 0.89

43 7.3% 1.00 0.74 93 1.7% 1.00 0.93

44 17.7% 1.00 0.70 94 14.6% 1.00 0.66

45 3.6% 1.00 0.84 95 3.0% 1.00 0.68

46 5.2% 1.00 0.88 96 7.8% 1.00 0.75

47 2.5% 1.00 0.91 97 21.8% 1.00 0.66

48 3.0% 1.00 0.87 98 4.0% 1.00 0.85

49 10.9% 1.00 0.68 99 11.5% 1.00 0.65

50 12.0% 1.00 0.68 100 3.1% 1.00 0.66


Table 2 FCM border error, precision, and recall measures for each image in the dataset

Img. ID Border Error Precision Recall Img. ID Border Error Precision Recall

1 99% 1 0.635 51 99.9% 0.99 0.66

2 99.98% 1 0.65 52 132.5% 1 0.5

3 100% 1 0.62 53 69.93% 1 0.45

4 101% 1 0.54 54 100% 1 0.56

5 98% 1 0.66 55 89.47% 1 0.45

6 96% 1 0.55 56 108.13% 1 0.52

7 105% 1 0.645 57 96% 0.99 0.65

8 100% 1 0.66 58 105% 1 0.62

9 89% 1 0.7 59 100% 1 0.56

10 106% 1 0.7 60 78.32% 1 0.51

11 100% 1 0.79 61 96.82% 1 0.53

12 98% 1 0.35 62 106.83% 1 0.34

13 97% 1 0.45 63 100% 1 0.71

14 99% 1 0.76 64 103.33% 0.98 0.56

15 103% 1 0.23 65 101% 1 0.47

16 98% 1 0.63 66 96.86% 0.95 0.52

17 100% 1 0.2 67 100% 1 0.65

18 89% 1 0.54 68 106.83% 1 0.62

19 99% 1 0.33 69 99% 1 0.34

20 99.9% 1 0.67 70 106.67% 1 0.49

21 92.9% 1 0.65 71 102.3% 1 0.65

22 98% 1 0.71 72 99.9% 1 0.71

23 78.3% 1 0.56 73 123% 1 0.48

24 96.8% 1 0.45 74 105.3% 1 0.53

25 106% 1 0.5 75 103.6% 1 0.5

26 123% 1 0.65 76 98% 1 0.58

27 105.4% 1 0.56 77 106.8% 1 0.45

28 104.7% 1 0.59 78 107% 1 0.76

29 98% 0.99 0.501 79 89.3% 1 0.69

30 95% 1 0.63 80 96.8% 1 0.59

31 93.7% 1 0.34 81 100% 1 0.63

32 96.8% 1 0.49 82 102.3% 1 0.34

33 100% 1 0.53 83 103.3% 1 0.56

34 101% 1 0.43 84 100% 1 0.32

35 98% 1 0.39 85 100% 1 0.67

36 103% 1 0.65 86 89% 1 0.65

37 98% 1 0.62 87 106.6% 1 0.71

38 100% 1 0.6 88 99% 1 0.56

39 89% 1 0.46 89 106.8% 0.97 0.5

40 106.6% 1 0.48 90 118.4% 1 0.48

41 93.6% 1 0.54 91 98.3% 1 0.65

42 96.8% 1 0.59 92 99.6% 1 0.62

43 100% 1 0.57 93 122.4% 1 0.45

44 89% 1 0.63 94 100% 1 0.48

45 107% 1 0.76 95 106.6% 1 0.56

46 89.3% 1 0.64 96 93.6% 1 0.43

47 96.8% 1 0.45 97 96.8% 0.99 0.51

48 106.7% 1 0.48 98 100% 1 0.53

49 99% 1 0.56 99 89% 1 0.49

50 99.9% 1 0.59 100 107% 1 0.53


Figure 6 Overlay image of DBSCAN.

dermatologists will draw borders approximately at the red borders shown in the images of Figure 8. This is because the reddish area just outside the obvious tumor border is part of the lesion.

We also made a rough comparison of the DBSCAN with prior state-of-the-art lesion border detection methods proposed by Celebi et al. 2008 [22] and 2009 [17]. Comparisons showed that the mean error of DBSCAN (6.94%) is clearly lower than their results. However, we cannot make an image-by-image comparison, since they used a subset (90 images) of the 100 dermoscopy image dataset, and their image IDs might differ from our image IDs even for the same image. Therefore, for now the mean error rate is the only indication we have that DBSCAN performs better than the studies given in [17] and [22].

Conclusion

In this study, we introduced two approaches for automatic detection of skin lesions. First, a fast density based algorithm, DBSCAN, is introduced for dermoscopy imaging. Second, the FCM is used for lesion border detection. The assessments obtained from both methods are quantitatively analyzed over three accuracy measures: border error, precision, and recall. In addition to achieving low border error and high precision and recall, the visual outcome showed that DBSCAN effectively delineated the targeted lesion and is promising; the FCM, however, performed poorly, especially on the border error metric. As a next step, we will focus in more detail on intra-dermatologist variability and post-assessment during performance analysis of the intelligent systems. Additionally, the performance of DBSCAN will be evaluated over different polygon-unioning algorithms. In terms of border errors, we plan to develop models that are more sensitive to melanoma lesions. A thresholding method that is well integrated with the clustering rationale, such as the one described in [36], will be preferred in the future because of the unexpected difference between precision and recall rates.

Figure 7 Border errors generated by DBSCAN (red) and FCM (green).


Figure 8 Sample images showing assessments of the dermatologist (red), automated frameworks DBSCAN (blue) and FCM (green) introduced in

this study.

Published: 7 October 2010

Acknowledgements
This article has been published as part of BMC Bioinformatics Volume 11 Supplement 6, 2010: Proceedings of the Seventh Annual MCBIOS Conference. Bioinformatics: Systems, Biology, Informatics and Computation. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/11?issue=S6.

Author details
1Computer Science Department, University of Central Arkansas, Conway, AR, USA. 2Department of Computer Science, Texas A&M University-Commerce, Commerce, TX, USA. 3Computer Science Department, University of Arkansas at Pine Bluff, Pine Bluff, AR, USA.

Authors' contributions
SK and MM have made equal contributions to this study. Both of the authors participated in the overall design of the study. MM designed the density-based algorithms. SK developed the general comparison testbed, performed data analysis, algorithm testing, statistical measurements, and benchmarking. BC designed and implemented the FCM. KA integrated it into the general framework. SK and MM contributed to the writing of this manuscript. All of the authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests in regards to this study.

References
1. Jemal A, Siegel R, Ward E, Hao Y, Xu J, Murray T, Thun MJ: Cancer statistics. CA: A Cancer Journal for Clinicians 2008, 59:225-249.
2. Levy P, Berk W, Compton S, King J, Sweeny P, Kuhn G, Bock B: Effect of Rapid Patient Evaluation on the Rate and Type of Error Citation of Emergency Physicians. Annals of Emergency Medicine 2005, 46(3):108-109.
3. Pantofaru C, Hebert M: A Comparison of Image Segmentation Algorithms. Tech. report CMU-RI-TR-05-40, Robotics Institute, Carnegie Mellon University 2005.
4. Pathegama M, Göl Ö: Edge-end pixel extraction for edge-based image segmentation. Transactions on Engineering, Computing and Technology, 2:213-216.
5. Pal NR, Pal SK: A review of image segmentation techniques. Pattern Recogn 1993, 26(9):1277-1294.
6. Skarbek W, Koschan A: Colour image segmentation: a survey. Technical report, Institute for Technical Informatics, Technical University of Berlin, October 1994 [http://citeseer.nj.nec.com/skarbek94colour.html].
7. Sezgin M, Sankur B: Image thresholding techniques: Quantitative performance evaluation. Journal of Electronic Imaging 2004, SPIE.
8. Shapiro LG, Stockman GC: Computer Vision. New Jersey: Prentice-Hall, 279-325.
9. Shi J, Belongie S, Leung T, Malik J: Image and Video Segmentation: The Normalized Cut Framework. Proc. of 1998 Int'l Conf. on Image Processing (ICIP 98), Chicago, IL 1998, 1:943-947.


10. Jianbo S, Malik J: Normalized cuts and image segmentation. IEEE Trans. on Pattern Analysis and Machine Intelligence 2000, 22(8):888-905.
11. Felzenszwalb PF, Huttenlocher DP: Efficient graph-based image segmentation. International Journal of Computer Vision 2004, 59(2):167-181.
12. Argenziano G, Soyer HP, De Giorgi V: Dermoscopy: A Tutorial. EDRA Medical Publishing & New Media 2002.
13. Steiner K, Binder M, Schemper M, Wolff K, Pehamberger H: Statistical evaluation of epiluminescence dermoscopy criteria for melanocytic pigmented lesions. Journal of the American Academy of Dermatology 1993, 29(4):581-588.
14. Binder M, Schwarz M, Winkler A, Steiner A, Kaider A, Wolff K, Pehamberger H: Epiluminescence Microscopy: a Useful Tool For The Diagnosis Of Pigmented Skin Lesions For Formally Trained Dermatologists. Archives of Dermatology 1995, 131(3):286-291.
15. Fleming MG, Steger C, Zhang J, Gao J, Cognetta AB, Pollak I, Dyer CR: Techniques for a structural analysis of dermatoscopic imagery. Comput Med Imaging Graph 1998, 22(5):375-389.
16. Celebi ME, Aslandogan YA, Stoecker WV, Iyatomi H, Oka H, Chen X: Unsupervised Border Detection in Dermoscopy Images. Skin Res Technol 2007, 13(4):454-462.
17. Celebi ME, Iyatomi H, Schaefer G, Stoecker WV: Lesion Border Detection in Dermoscopy Images. Comput Med Imaging Graph 2009, 33(2):148-153.
18. Hance GA, Umbaugh SE, Moss RH, Stoecker WV: Unsupervised color image segmentation with application to skin tumor borders. IEEE Engineering in Medicine and Biology 1996, 15(1):104-111.
19. Pratt WK: Digital image processing: PIKS inside. Hoboken, NJ: John Wiley & Sons 2007.
20. Carlotto MJ: Histogram Analysis Using a Scale-Space Approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 1987, 1(9):121-129.
21. Delgado D, Butakoff C, Ersboll BK, Stoecker WV: Independent histogram pursuit for segmentation of skin lesions. IEEE Trans on Biomedical Engineering 2008, 55(1):157-161.
22. Celebi ME, Kingravi HA, Iyatomi H, et al: Border Detection in Dermoscopy Images Using Statistical Region Merging. Skin Res Technol 2008, 14(3):347-353.
23. Schmid P: Segmentation of Digitized Dermoscopic Images by Two-Dimensional Color Clustering. IEEE Transactions on Medical Imaging 1999, 18:164-171.
24. Geusebroek J-M, Smeulders AW, van de Weijer J: Fast anisotropic gauss filtering. IEEE Trans on Image Processing 2003, 12(8):938-943.
25. Perreault S, Hébert P: Median filtering in constant time. IEEE Trans on Image Processing 2007, 16(9):2389-2394.
26. Lee TK, Ng V, Gallagher R: Dullrazor: A software approach to hair removal from images. Comput Biol Med 1997, 27(6):533-543.
27. Zhou H, Chen M, Gass R, et al: Feature-preserving artifact removal from dermoscopy images. Proceedings of the SPIE Medical Imaging 2008 Conference 2008, 6914:69141B-69141B-9.
28. Wighton P, Lee TK, Atkins MS: Dermoscopic hair disocclusion using inpainting. Proceedings of the SPIE Medical Imaging 2008 Conference, 6914:691427-691427-8.
29. Sonka M, Hlavac V, Boyle R: Image processing, analysis, and machine vision. Cengage-Engineering 2007.
30. Melli R, Grana C, Cucchiara R: Comparison of color clustering algorithms for segmentation of dermatological images. Proceedings of the SPIE Medical Imaging 2006 Conference 2006, 3S1-9S39.
31. Lee I, Estivill-Castro V: Polygonization of Point Clusters Through Cluster Boundary Extraction For Geographical Data Mining. Proceedings of the 10th International Symposium on Spatial Data Handling 2002, 27-40.
36. Celebi ME, Iyatomi H, Schaefer G, Stoecker WV: Approximate Lesion Localization in Dermoscopy Images. Skin Res Technol 2009, 15(3):314-322.

doi:10.1186/1471-2105-11-S6-S26
Cite this article as: Kockara et al.: Analysis of density based and fuzzy c-means clustering methods on lesion border extraction in dermoscopy images. BMC Bioinformatics 2010, 11(Suppl 6):S26.
