Kockara et al. BMC Bioinformatics 2010, 11(Suppl 6):S26
http://www.biomedcentral.com/1471-2105/11/S6/S26
PROCEEDINGS Open Access
Analysis of density based and fuzzy c-means
clustering methods on lesion border extraction
in dermoscopy images
Sinan Kockara1*, Mutlu Mete2*, Bernard Chen1, Kemal Aydin3
From Seventh Annual MCBIOS Conference. Bioinformatics: Systems, Biology, Informatics and Computation
Jonesboro, AR, USA. 19-20 February 2010
Abstract
Background: Computer-aided segmentation and border detection in dermoscopic images are among the core
components of diagnostic procedures and therapeutic interventions for skin cancer. Automated assessment tools
for dermoscopy images have become an important research field mainly because of inter- and intra-observer
variations in human interpretation. In this study, we compare two approaches for automatic border detection in
dermoscopy images: density based clustering (DBSCAN) and Fuzzy C-Means (FCM) clustering algorithms. In the first
approach, a new cluster is formed around a point, or an existing cluster grows to include the point and its
neighbors, whenever the density around that point is sufficient, i.e., its neighborhood contains at least a certain
number of points. In the second approach, FCM clustering is used; it has the ability to assign one data point to
more than one cluster.
Results: Each approach is examined on a set of 100 dermoscopy images whose borders, manually drawn by a
dermatologist, are used as the ground truth. Error rates (false positives and false negatives), along with true
positives and true negatives, are quantified by comparing results with the manually determined borders.
The assessments obtained from both methods are quantitatively analyzed over three accuracy measures: border
error, precision, and recall.
Conclusion: In addition to low border error and high precision and recall, the visual outcomes showed that
DBSCAN effectively delineated the targeted lesions and is a promising approach; the FCM, however, performed
poorly, especially on the border error metric.
* Correspondence: ********@***.***; **********@****-********.***
1Computer Science Department, University of Central Arkansas, Conway, AR, USA
2Department of Computer Science, Texas A&M University-Commerce, Commerce, TX, USA
Full list of author information is available at the end of the article
© 2010 Kockara and Mete; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction
Melanoma is the fifth most common malignancy in the United States and has rapidly become one of the leading cancers in the world. Malignant melanoma is the deadliest form of skin cancer and the fastest growing skin cancer type in the human body. An estimated 8,441 deaths out of 68,720 new cases were expected in the United States in 2009 [1]. If it is detected early, melanoma can often be cured with a simple excision operation.

Dermoscopy is the major non-invasive skin imaging technique that is extensively used in the diagnosis of melanoma and other skin lesions. Dermoscopy improves upon simple photography by revealing more of the subsurface structures underneath the skin, and is now widely used by dermatologists. The contact dermoscopy technique consists of placing a fluid such as mineral oil, water, or alcohol on the skin lesion, which is subsequently inspected using a digital camera and a hand-held dermoscopy attachment such as a Dermlite. The fluid placed on the lesion eliminates surface reflection and renders the cornified layer translucent, thus allowing a better visualization of pigmented structures within the epidermis, the dermoepidermal junction, and the superficial dermis.
For early detection of melanoma, finding lesion borders is the first and key step of the diagnosis, since the border structure provides life-saving information for accurate diagnosis. Currently, dermatologists draw borders manually, which is described as tedious and time consuming. That is where computer-aided border detection of dermoscopy images comes into the picture. Detecting melanoma early is critical because melanoma that is not detected early can be fatal. Also, speed is critical because of a lack of dermatologists to screen all the images, and physician error increases with rapid evaluation of cases [2]. For these very reasons, automated systems would be a significant help for dermatologists. The proposed method in this study is a fully automated system; by fully automated, the authors mean that there is no human intervention in the system prior to or during processing. The purpose of this system is to increase the dermatologist's comfort level with his/her decision; however, a dermatologist always makes the final decision on the subject.

Background
Differentiating or partitioning objects from the background or from other objects in an image is called image segmentation. Solutions to image segmentation have widespread applications in various fields, including medical diagnosis and treatment. Many methods have been developed for grayscale and color image segmentation [3-6]. Popular approaches [7] for image segmentation are: edge-based methods, threshold techniques [8], neighborhood-based techniques, graph-based methods [9,10], and cluster-based methods. Edge-based techniques investigate discontinuities in an image, whereas neighborhood-based methods examine the similarity (neighborhoods) among different regions. Threshold methods identify different parts of an image by combining peaks and valleys of 1D or 3D histograms (RGB). There also exist numerous innovative graph-based image segmentation approaches in the literature. Shi et al. 1997-1998 [9,10] treated segmentation as a graph partitioning problem and proposed a novel unbiased measure for segregating subgroups of a graph, known as the Normalized Cut criterion. More recently, Felzenszwalb et al. [11] developed another segmentation technique by defining a predicate for the existence of boundaries between regions, utilizing graph-based representations of images. In this study, however, we focus on cluster-based segmentation methods. In cluster-based methods, individual image pixels are considered as general data samples, and a correspondence is assumed between homogeneous image regions and clusters in the spectral domain.

Dermoscopy involves optical magnification of the region-of-interest, which makes subsurface structures more easily visible when compared to conventional macroscopic images [12]. This in turn improves screening characteristics and provides greater differentiation between difficult lesions such as pigmented Spitz nevi and small, clinically equivocal lesions [13]. However, it has also been demonstrated that dermoscopy may actually lower the diagnostic accuracy in the hands of an inexperienced dermatologist [14]. Therefore, novel computerized image understanding frameworks are needed to minimize the diagnostic errors that result from the difficulty and subjectivity of visual interpretations [15,16].

For melanoma investigation, delineation of the region-of-interest is the key step in the computerized analysis of skin lesion images for many reasons. First of all, the border structure provides important information for accurate diagnosis. Asymmetry, border irregularity, and abrupt border cutoff are a few of the many clinical features calculated based on the lesion border. Furthermore, the extraction of other important clinical indicators such as atypical pigment networks, globules, and blue-white veil areas critically depends on the border detection [17]. The blue-white veil is described as an irregular area of blended blue pigment with a ground-glass (white) haze, as if the image were out of focus.

At the first stage of analysis of dermoscopy images, automated border detection is usually applied [16]. There are many factors that make automated border detection challenging, e.g., low contrast between the surrounding skin and the lesion, fuzzy and irregular lesion borders, and intrinsic artifacts such as cutaneous features (air bubbles, blood vessels, hairs, and black frames), to name a few [17]. According to Celebi et al. 2009 [17], automated border detection can be divided into four sections: pre-processing, segmentation, post-processing, and evaluation. The pre-processing step involves color space transformations [18,19], contrast enhancement [20,21], and artifact removal [22-28]. The segmentation step involves partitioning of an image into disjoint regions [23,28,29]. The post-processing step is used to obtain the lesion border [16,30]. The evaluation step involves dermatologists' evaluations of the border detection results.

Regarding boundaries of clusters, Lee and Estivill-Castro [31] introduced a new algorithm of polygonization based on the boundary of resulting point clusters. Recently, Nosovskiy et al. [32] used another theoretical approach to find the boundary of clusters in order to infer an accurate boundary between close neighboring clusters. These two works principally study boundaries of finalized data groups (clusters). Schmid et al. [23] proposed an algorithm based on color clustering.
First, a two-dimensional histogram is calculated from the first two principal components of the CIE L*u*v* color space. The histogram is then smoothed, and initial cluster centers are obtained from the peaks using a perceptron classifier. At the final step, the lesion image is segmented.

In this study, for computer-aided border detection, we use two clustering algorithms, density based clustering (DBSCAN) [33] and multi-level fuzzy c-means clustering (FCM), and compare their performance on dermoscopy images for border detection. In the context of dermoscopic images, clustering corresponds to finding whether or not each pixel in an image belongs to the skin lesion border. Automatic border detection makes the dermatologist's tedious manual border drawing procedure faster and easier.

DBSCAN
With the aim of separating the background from the skin lesion to target possible melanoma, we cluster pixels of thresholded images by using DBSCAN. It takes a binary (segmented) image and delineates only significantly important regions by clustering. The expected outcome of this framework is the desired boundary of the lesion in a dermoscopy image.

Technically, it is appropriate to tailor a density based algorithm in which the cluster definition guarantees that the number of positive pixels is equal to or greater than a minimum number of pixels (MinPxl) in a certain neighborhood of core points. A core point is one whose neighborhood of a given radius (Eps) contains at least a minimum number of positive pixels (MinPxl), i.e., the density in the neighborhood has to exceed a pre-defined threshold (MinPxl). The definition of a neighborhood is determined by the choice of a distance function for two pixels p and q, denoted by dist(p, q). For instance, when the Manhattan distance is used in 2D space, the shape of the neighborhood would be rectangular. Note that DBSCAN works with any distance function, so that an appropriate function can be designed for other specific applications. DBSCAN is significantly more effective in discovering clusters of arbitrary shapes. It has been successfully used on synthetic datasets as well as earth science and protein datasets. Theoretical details of DBSCAN are given in [33]. Once the two parameters Eps and MinPxl are defined, DBSCAN starts to cluster data points (pixels) from an arbitrary point q, as illustrated in Figure 1.

Figure 1 Direct density reachable (left) and density reachable property of DBSCAN (right).

Let I be a subimage of dimension N x N. For a pixel p, let px and py denote its position, where the top-left corner of I is (0, 0), and let cxy represent the color at (px, py). The Eps-neighborhood of a pixel p, denoted by NEps(p), is defined by NEps(p) = { q ∈ I | dist(p, q) ≤ Eps }, where dist is the Euclidean distance. Two kinds of pixels can be found in a cluster: 1) pixels inside the cluster (core pixels) and 2) pixels on the border of the cluster (border pixels). As expected, a neighborhood query for a border pixel returns notably fewer points than a neighborhood query for a core pixel. Thus, in order to include all points belonging to the same segment, we should set the minimum number of pixels (MinPxl) to a comparatively low value. This value, however, would not be characteristic for the respective cluster, particularly in the presence of negative pixels (non-cluster). Therefore, we require that for every pixel p in a cluster C there is a pixel q in C so that p is inside the Eps-neighborhood of q and NEps(q) contains at least MinPxl pixels: |NEps(q)| ≥ MinPxl and dist(p, q) ≤ Eps. A pixel p is called density-reachable from a pixel q when there is a chain of pixels p1, p2, ..., pn, where p1 = q and pn = p. This is illustrated in Figure 1. A cluster C (segment) in an image is a non-empty subset of pixels given as:

C = { p | ∃ q ∈ C : p ∈ NEps(q) and |NEps(q)| ≥ MinPxl },

where q is density reachable from p. DBSCAN centers around the key idea that, to form a new cluster or to grow an existing cluster, the Eps-neighborhood of a point should contain at least a minimum number of points (MinPxl).

Algorithm 1 DBSCAN
DBSCAN(SubImage, Eps, MinPxl)
  ClusterId := nextId(NOISE);
  FOR i FROM 1 TO SubImage.height DO
    FOR j FROM 1 TO SubImage.width DO
      Point := SubImage.get(i, j);
      IF Point.Cid = UNCLASSIFIED AND Point.positive = TRUE THEN
        IF ExpandCluster(SubImage, Point, ClusterId, Eps, MinPxl) THEN
          ClusterId := nextId(ClusterId)
        END IF;
      END IF;
    END FOR;
  END FOR;
END DBSCAN;
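To make Algorithm 1 and the definitions above concrete, the following Python fragment is a minimal sketch (our illustration, not the authors' original implementation; the helper names region_query and expand_cluster, and the use of a NumPy binary image whose positive pixels are non-zero, are assumptions) of density based clustering of a thresholded image:

import numpy as np

NOISE, UNCLASSIFIED = -1, 0

def region_query(img, p, eps):
    # All positive pixels within Euclidean distance eps of pixel p (its Eps-neighborhood).
    y, x = p
    ys, xs = np.ogrid[:img.shape[0], :img.shape[1]]
    mask = ((ys - y) ** 2 + (xs - x) ** 2 <= eps ** 2) & (img > 0)
    return list(zip(*np.nonzero(mask)))

def expand_cluster(img, labels, seed, cluster_id, eps, min_pxl):
    # Grow a cluster from a seed pixel; return False if the seed is not a core pixel.
    neighbors = region_query(img, seed, eps)
    if len(neighbors) < min_pxl:
        labels[seed] = NOISE
        return False
    for q in neighbors:
        labels[q] = cluster_id
    frontier = [q for q in neighbors if q != seed]
    while frontier:
        q = frontier.pop()
        q_neighbors = region_query(img, q, eps)
        if len(q_neighbors) >= min_pxl:          # q is also a core pixel, so keep expanding
            for r in q_neighbors:
                if labels[r] == UNCLASSIFIED:
                    frontier.append(r)
                if labels[r] in (UNCLASSIFIED, NOISE):
                    labels[r] = cluster_id
    return True

def dbscan(img, eps=5, min_pxl=60):
    # Label every positive pixel of a binary image with a cluster ID, or with NOISE.
    labels = np.full(img.shape, UNCLASSIFIED, dtype=int)
    cluster_id = 1
    for p in zip(*np.nonzero(img)):              # raster scan restricted to positive pixels
        if labels[p] == UNCLASSIFIED:
            if expand_cluster(img, labels, p, cluster_id, eps, min_pxl):
                cluster_id += 1
    return labels

With the thresholded lesion mask as input and the Eps and MinPxl values reported later in the Experiments section (5 and 60), a cluster covering the lesion region can then be selected from the returned label image.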
Algorithm 1 summarizes DBSCAN for image segmentation. Once the two parameters Eps and MinPxl are defined, DBSCAN starts to cluster data points from an arbitrary point q. It begins by finding the neighborhood of point q, i.e., all points that are directly density reachable from point q. This neighborhood search is called a region query. For an image, we start with the top-left pixel (not necessarily a corner pixel; any arbitrary pixel can be chosen for the first iteration) as our first point in the dataset (subimage). We look for the first pixel satisfying the core pixel condition as a starting (seed) point. If the neighborhood is sparsely populated, i.e., it has fewer than MinPxl points, then point q is labeled as noise. Otherwise, a cluster is initiated and all points in the neighborhood of point q are marked with the new cluster's ID. Next, the neighborhoods of all of q's neighbors are examined iteratively to check whether they can be added into the cluster. If a cluster cannot be expanded any more, DBSCAN chooses another arbitrary unlabeled point and repeats the process to form another cluster. This procedure is iterated until all data points in the dataset have been labeled either as noise or with a cluster ID. Figure 2 illustrates an example cluster expansion.

Figure 2 Example Cluster Expanding: New points (green ones in circle) are expanding cluster.

Fuzzy c-means clustering
Clustering, a major area of study in the scope of unsupervised learning, deals with recognizing meaningful groups of similar items. Under the influence of fuzzy logic, fuzzy clustering assigns each point a degree of belonging to clusters, instead of belonging to exactly one cluster. In fuzzy event modeling, pixel colors in a dermoscopy image can be viewed as a probability space where pixels with some colors can belong partially to the background class and/or the skin lesion. The main advantage of this method is that it does not require a priori knowledge about the number of objects in the image.

The Fuzzy C-Means (FCM) clustering algorithm [34,35] is one of the most popular fuzzy clustering algorithms. FCM is based on minimization of the objective function F_m(u, c) [35]:

F_m(u, c) = \sum_{k=1}^{n} \sum_{i=1}^{C} (u_{ik})^m d^2(x_k, c_i)

FCM computes the memberships u_{ij} and the cluster centers c_j by:

u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{\|x_i - c_j\|}{\|x_i - c_k\|} \right)^{2/(m-1)}}    (1)

c_j = \frac{\sum_{i=1}^{N} u_{ij}^m x_i}{\sum_{i=1}^{N} u_{ij}^m}    (2)

where m, the fuzzification factor, which is a weighting exponent on each fuzzy membership, is any real number greater than 1; u_{ij} is the degree of membership of x_i in cluster j; x_i is the i-th item of the d-dimensional measured data; c_j is the d-dimensional center of the cluster; d^2(x_k, c_i) is a distance measure between object x_k and cluster center c_i; and \|\cdot\| is any norm expressing the similarity between any measured data and the center.
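As a minimal sketch of these update rules (our illustration under the notation above, not the authors' implementation; the function name fcm and its parameters are assumptions), the following Python fragment alternates the membership update (1) and the centroid update (2) until the membership matrix stops changing, following the step-by-step outline given next:

import numpy as np

def fcm(x, n_clusters, m=2.0, tol=1e-4, max_iter=100, seed=None):
    # x: (N, d) array of feature vectors, e.g. flattened pixel colors of a dermoscopy image.
    rng = np.random.default_rng(seed)
    u = rng.random((x.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)                # random initial membership matrix U(0)
    for _ in range(max_iter):
        um = u ** m
        c = (um.T @ x) / um.sum(axis=0)[:, None]     # centroid update, equation (2)
        d = np.linalg.norm(x[:, None, :] - c[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                        # guard against zero distances
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True) # membership update, equation (1)
        if np.abs(u_new - u).max() < tol:            # stop when U changes less than the threshold
            u = u_new
            break
        u = u_new
    return u, c

Each pixel can then be assigned to the cluster for which it has the highest membership value.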
The FCM algorithm involves the following steps:

1. Set values for c and m.
2. Initialize the membership matrix U = [u_{ij}], which is U(0) (i = number of members, j = number of clusters).
3. At step k: calculate the centroids for each cluster through equation (2) if k ≠ 0 (if k = 0, initialize the centroid locations at random).
4. For each member, calculate the membership degree by equation (1) and store the information in U(k).
5. If the difference between U(k) and U(k+1) is less than a certain threshold, then STOP; otherwise, return to step 3.

In the FCM, the number of classes (c in equation 1) is a user input. We checked whether the number of unclassified data points is greater than several threshold (T) values (30, 40, 50, 60, and 70) in our experiments. Since the number of classes is a user input in FCM, there is a risk of over-segmentation. For instance, when the number of segments in a skin image is 3 and we force the number of clusters to be found by FCM to be 6, FCM over-segments the image. This was one of the principal challenges we encountered with FCM. Thus, we ran FCM for different numbers of clusters and different threshold values and found that, for a value of five initial clusters and a threshold value of 30, FCM gave good accuracy in segmentation. Therefore, we used these values in all of our experiments. Moreover, in all of our experiments the fuzzification factor m is taken as 2. Figure 3 shows how the FCM-detected area (red region) changes with the change in threshold.

Figure 3 Overlay images of FCM with different threshold values a) 30, b) 40, c) 50, d) 60, e) 70.

Experiments and results
The proposed methods are tested on a set of 100 dermoscopy images obtained from the EDRA Interactive Atlas of Dermoscopy [12]. These are 24-bit RGB color images with dimensions ranging from 577 x 397 pixels to 1921 x 1285 pixels. The benign lesions include nevocellular nevi and dysplastic nevi. The distance function used is the Euclidean distance between pixels p and q, given as d(p, q) = \sqrt{(p.x - q.x)^2 + (p.y - q.y)^2}, where p.x and p.y denote the position of pixel p at the x-th column and y-th row with respect to the top-left corner (0, 0) of the image. We run DBSCAN on each image with an Eps of 5 and a MinPxl of 60.

We evaluated the border detection errors of DBSCAN and FCM by comparing our results with physician-drawn boundaries as the ground truth. Manual borders were obtained by selecting a number of points on the lesion border, connecting these points by a second-order B-spline, and finally filling the resulting closed curve [22]. Using the dermatologist-determined borders, the automatic borders obtained from DBSCAN and FCM are compared using three quantitative error metrics: border error, precision, and recall.
Border error was developed by Hance et al. [18] and is currently the most important metric for assessing the quality of any automatic border detection algorithm. It is given by:

Border Error = \frac{Area(AutomaticBorder \oplus ManualBorder)}{Area(ManualBorder)} \times 100,

where AutomaticBorder is the binary image obtained from DBSCAN or FCM and ManualBorder is the binary image obtained from a dermatologist (see Figure 4, right side). The exclusive OR operator, \oplus, essentially emphasizes the disagreement between the target (ManualBorder) and predicted (AutomaticBorder) regions. Referring to information retrieval terminology, the numerator of the border error is the sum of False Positives (FP) and False Negatives (FN). The denominator is obtained by adding True Positives (TP) to False Negatives (FN). An illustrative example is given in Figure 5.

Figure 4 An exemplary dermoscopy image (left) and corresponding dermatologist drawn border (right).

Figure 5 Illustration of components used in accuracy and error quantification.

Regarding the image in Figure 5, assume that red is drawn by a dermatologist and blue is the automated line. TP indicates the lesion region correctly found automatically. Similarly, TN shows the healthy region (background) that both the manual and computer assessments agree on. FN and FP are labels for missed lesion and erroneously positive regions, respectively. In addition to border error, we also report precision (positive predictive value) and recall (sensitivity) for each experimental image in Table 1 and Table 2, for results generated with DBSCAN and FCM, respectively. Precision and recall are defined as TP / (TP + FP) and TP / (TP + FN), respectively. Note that all definitions operate on the number of pixels in the particular region. Analogously, the Area function returns the number of active pixels in a binary image.

Table 1 gives the border error, precision, and recall rates generated from DBSCAN for each image, whereas Table 2 presents the border error, precision, and recall rates generated from FCM. It can be seen that the results vary significantly across the images. In Figure 4, an exemplary dermoscopy image, which is determined to be melanoma, and its corresponding dermatologist-drawn border are illustrated. Figure 6 illustrates the DBSCAN-generated result in red for the same image; it is overlaid on top of the dermatologist-drawn border image, shown in black. As seen from the figure, hair is detected as a false positive. Figure 3 shows results generated from FCM with different fuzzification factors. For example, for the melanoma image given in Figure 4, FCM's precision, recall, and border error rates are 99.4%, 75.4%, and 100.4%, respectively, whereas DBSCAN's precision, recall, and border error rates for the same image are 94%, 84%, and 2.2%, respectively. The following tables show the results generated with DBSCAN and FCM for the 100 image dataset.
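As a small illustration of how these three measures can be computed from two binary masks (a sketch with hypothetical names, not the authors' code; automatic and manual stand for the binary images defined above), consider the following Python fragment:

import numpy as np

def border_metrics(automatic, manual):
    # automatic, manual: boolean lesion masks from the algorithm and the dermatologist.
    automatic = automatic.astype(bool)
    manual = manual.astype(bool)
    tp = np.sum(automatic & manual)              # lesion pixels found by both
    fp = np.sum(automatic & ~manual)             # erroneously positive pixels
    fn = np.sum(~automatic & manual)             # missed lesion pixels
    # Area(Automatic XOR Manual) = FP + FN, and Area(Manual) = TP + FN
    border_error = 100.0 * (fp + fn) / (tp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return border_error, precision, recall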
Since the most important metric for evaluating the performance of a lesion detection algorithm is the border error metric, the border errors for DBSCAN and FCM are illustrated in Figure 7. In the figure, the x-axis shows image IDs in random order. As seen from Figure 7, DBSCAN outperforms FCM for lesion border detection on dermoscopy images: for DBSCAN, the overall average border error ratio is 6.94%, whereas the overall average border error ratio for FCM is 100%. As for recall and precision, DBSCAN averaged 76.66% and 99.26%, and FCM averaged 55% and 100%, respectively.

Automatically drawn boundaries are usually found at the more intense regions of a lesion (see Figure 8a, 8b, 8c, 8e, 8f), giving promising assessments with DBSCAN. In Figure 8(d), DBSCAN also marked outer regions. Obviously, the gradient region between the blue and red boundaries seems to be a major problem for DBSCAN. We believe that, even though inter-dermatologist agreement on manual borders is not perfect, most
Table 1 DBSCAN border error, precision, and recall measures for each image in the dataset
Img. ID Border Error Precision Recall Img. ID Border Error Precision Recall
1 8.2% 0.98 0.79 51 5.1% 1.00 0.81
2 8.0% 0.93 0.86 52 6.9% 1.00 0.80
3 4.9% 0.89 0.85 53 7.4% 1.00 0.78
4 6.2% 1.00 0.82 54 1.5% 1.00 0.95
5 5.4% 1.00 0.88 55 4.2% 1.00 0.88
6 4.6% 1.00 0.83 56 14.9% 1.00 0.60
7 3.9% 0.96 0.91 57 9.4% 1.00 0.77
8 3.2% 1.00 0.87 58 5.9% 1.00 0.82
9 3.4% 1.00 0.82 59 4.6% 1.00 0.75
10 2.2% 1.00 0.91 60 2.9% 1.00 0.81
11 0.9% 1.00 0.91 61 6.5% 0.90 0.74
12 6.5% 1.00 0.61 62 5.8% 1.00 0.74
13 10.0% 1.00 0.70 63 5.6% 1.00 0.77
14 14.8% 1.00 0.70 64 2.9% 1.00 0.82
15 5.9% 1.00 0.67 65 2.2% 0.94 0.84
16 6.8% 1.00 0.76 66 8.3% 0.89 0.79
17 6.0% 1.00 0.67 67 6.3% 0.98 0.83
18 4.0% 1.00 0.86 68 3.2% 1.00 0.79
19 6.4% 1.00 0.71 69 2.4% 1.00 0.79
20 8.0% 1.00 0.80 70 4.6% 1.00 0.74
21 8.8% 1.00 0.78 71 8.8% 1.00 0.71
22 12.6% 1.00 0.73 72 3.5% 0.94 0.84
23 8.6% 1.00 0.76 73 1.8% 0.99 0.86
24 9.0% 1.00 0.72 74 2.9% 1.00 0.90
25 5.7% 1.00 0.79 75 5.9% 1.00 0.71
26 33.9% 1.00 0.51 76 9.2% 1.00 0.74
27 9.0% 1.00 0.74 77 3.3% 1.00 0.72
28 8.0% 1.00 0.65 78 13.6% 1.00 0.61
29 10.6% 1.00 0.75 79 10.4% 1.00 0.71
30 11.3% 1.00 0.74 80 6.7% 1.00 0.65
31 9.7% 1.00 0.72 81 1.8% 1.00 0.65
32 10.8% 1.00 0.77 82 7.5% 1.00 0.82
33 3.3% 1.00 0.86 83 9.9% 1.00 0.54
34 4.2% 1.00 0.88 84 3.1% 1.00 0.74
35 2.7% 1.00 0.88 85 6.4% 1.00 0.79
36 6.0% 1.00 0.79 86 7.5% 0.98 0.79
37 4.0% 1.00 0.85 87 7.2% 1.00 0.73
38 8.0% 1.00 0.71 88 5.1% 1.00 0.59
39 3.4% 1.00 0.76 89 5.5% 0.91 0.82
40 3.6% 1.00 0.82 90 17.0% 1.00 0.56
41 8.0% 1.00 0.73 91 8.1% 1.00 0.61
42 3.2% 1.00 0.85 92 4.3% 1.00 0.89
43 7.3% 1.00 0.74 93 1.7% 1.00 0.93
44 17.7% 1.00 0.70 94 14.6% 1.00 0.66
45 3.6% 1.00 0.84 95 3.0% 1.00 0.68
46 5.2% 1.00 0.88 96 7.8% 1.00 0.75
47 2.5% 1.00 0.91 97 21.8% 1.00 0.66
48 3.0% 1.00 0.87 98 4.0% 1.00 0.85
49 10.9% 1.00 0.68 99 11.5% 1.00 0.65
50 12.0% 1.00 0.68 100 3.1% 1.00 0.66
Table 2 FCM border error, precision, and recall measures for each image in the dataset
Img. ID Border Error Precision Recall Img. ID Border Error Precision Recall
1 99% 1 0.635 51 99.9% 0.99 0.66
2 99.98% 1 0.65 52 132.5% 1 0.5
3 100% 1 0.62 53 69.93% 1 0.45
4 101% 1 0.54 54 100% 1 0.56
5 98% 1 0.66 55 89.47% 1 0.45
6 96% 1 0.55 56 108.13% 1 0.52
7 105% 1 0.645 57 96% 0.99 0.65
8 100% 1 0.66 58 105% 1 0.62
9 89% 1 0.7 59 100% 1 0.56
10 106% 1 0.7 60 78.32% 1 0.51
11 100% 1 0.79 61 96.82% 1 0.53
12 98% 1 0.35 62 106.83% 1 0.34
13 97% 1 0.45 63 100% 1 0.71
14 99% 1 0.76 64 103.33% 0.98 0.56
15 103% 1 0.23 65 101% 1 0.47
16 98% 1 0.63 66 96.86% 0.95 0.52
17 100% 1 0.2 67 100% 1 0.65
18 89% 1 0.54 68 106.83% 1 0.62
19 99% 1 0.33 69 99% 1 0.34
20 99.9% 1 0.67 70 106.67% 1 0.49
21 92.9% 1 0.65 71 102.3% 1 0.65
22 98% 1 0.71 72 99.9% 1 0.71
23 78.3% 1 0.56 73 123% 1 0.48
24 96.8% 1 0.45 74 105.3% 1 0.53
25 106% 1 0.5 75 103.6% 1 0.5
26 123% 1 0.65 76 98% 1 0.58
27 105.4% 1 0.56 77 106.8% 1 0.45
28 104.7% 1 0.59 78 107% 1 0.76
29 98% 0.99 0.501 79 89.3% 1 0.69
30 95% 1 0.63 80 96.8% 1 0.59
31 93.7% 1 0.34 81 100% 1 0.63
32 96.8% 1 0.49 82 102.3% 1 0.34
33 100% 1 0.53 83 103.3% 1 0.56
34 101% 1 0.43 84 100% 1 0.32
35 98% 1 0.39 85 100% 1 0.67
36 103% 1 0.65 86 89% 1 0.65
37 98% 1 0.62 87 106.6% 1 0.71
38 100% 1 0.6 88 99% 1 0.56
39 89% 1 0.46 89 106.8% 0.97 0.5
40 106.6% 1 0.48 90 118.4% 1 0.48
41 93.6% 1 0.54 91 98.3% 1 0.65
42 96.8% 1 0.59 92 99.6% 1 0.62
43 100% 1 0.57 93 122.4% 1 0.45
44 89% 1 0.63 94 100% 1 0.48
45 107% 1 0.76 95 106.6% 1 0.56
46 89.3% 1 0.64 96 93.6% 1 0.43
47 96.8% 1 0.45 97 96.8% 0.99 0.51
48 106.7% 1 0.48 98 100% 1 0.53
49 99% 1 0.56 99 89% 1 0.49
50 99.9% 1 0.59 100 107% 1 0.53
dermatologists will draw borders approximately like the red borders shown in the images of Figure 8. This is because the reddish area just outside the obvious tumor border is part of the lesion.

We also made a rough comparison of DBSCAN with prior state-of-the-art lesion border detection methods proposed by Celebi et al. 2008 [22] and 2009 [17]. The comparison showed that the mean error of DBSCAN (6.94%) is clearly lower than their results. However, we cannot make an image-by-image comparison, since they used a subset of the 100 dermoscopy image dataset (90 images) and their image IDs might be different from our image IDs even for the same image. Therefore, for now, the mean error rate is the only indication we have that DBSCAN performs better than the studies given in [17] and [22].

Figure 6 Overlay image of DBSCAN.

Figure 7 Border errors generated by DBSCAN (red) and FCM (green).

Conclusion
In this study, we introduced two approaches for automatic detection of skin lesions. First, a fast density based algorithm, DBSCAN, is introduced for dermoscopy imaging. Second, FCM is used for lesion border detection. The assessments obtained from both methods are quantitatively analyzed over three accuracy measures: border error, precision, and recall. In addition to low border error and high precision and recall, the visual outcomes showed that DBSCAN effectively delineated the targeted lesions and is a promising approach; FCM, however, had poor performance, especially on the border error metric. As a next step, we will focus in more detail on intra-variability and post-assessment during performance analysis of the intelligent systems. Additionally, the performance of DBSCAN will be evaluated over different polygon-unioning algorithms. In terms of border errors, we plan to develop models that are more sensitive to melanoma lesions. A thresholding method that is well integrated with the clustering rationale, such as the one described in [36], will be preferred in the future because of the unexpected difference between precision and recall rates.
Figure 8 Sample images showing assessments of the dermatologist (red), automated frameworks DBSCAN (blue) and FCM (green) introduced in
this study.
Acknowledgements
This article has been published as part of BMC Bioinformatics Volume 11 Supplement 6, 2010: Proceedings of the Seventh Annual MCBIOS Conference. Bioinformatics: Systems, Biology, Informatics and Computation. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/11?issue=S6.

Author details
1Computer Science Department, University of Central Arkansas, Conway, AR, USA. 2Department of Computer Science, Texas A&M University-Commerce, Commerce, TX, USA. 3Computer Science Department, University of Arkansas at Pine Bluff, Pine Bluff, AR, USA.

Authors' contributions
SK and MM have made equal contributions to this study. Both authors participated in the overall design of the study. MM designed the density-based algorithms. SK developed the general comparison testbed and performed data analysis, algorithm testing, statistical measurements, and benchmarking. BC designed and implemented the FCM. KA integrated it into the general framework. SK and MM contributed to the writing of this manuscript. All of the authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests in regards to this study.

Published: 7 October 2010

References
1. Jemal A, Siegel R, Ward E, Hao Y, Xu J, Murray T, Thun MJ: Cancer statistics. CA: A Cancer Journal for Clinicians 2008, 59:225-249.
2. Levy P, Berk W, Compton S, King J, Sweeny P, Kuhn G, Bock B: Effect of Rapid Patient Evaluation on the Rate and Type of Errors of Emergency Physicians. Annals of Emergency Medicine 2005, 46(3):108-109.
3. Pantofaru C, Hebert M: A Comparison of Image Segmentation Algorithms. Tech. report CMU-RI-TR-05-40, Robotics Institute, Carnegie Mellon University 2005.
4. Pathegama M, Göl Ö: Edge-end pixel extraction for edge-based image segmentation. Transactions on Engineering, Computing and Technology 130*-****-****, 2:213-216.
5. Pal NR, Pal SK: A review of image segmentation techniques. Pattern Recogn 1993, 26(9):1277-1294.
6. Skarbek W, Koschan A: Colour image segmentation: a survey. Technical report, Institute for Technical Informatics, Technical University of Berlin 1994 [http://citeseer.nj.nec.com/skarbek94colour.html], October 1994.
7. Sezgin M, Sankur B: Image thresholding techniques: Quantitative performance evaluation. Journal of Electronic Imaging 2004, SPIE.
8. Shapiro LG, Stockman GC: Computer Vision. New Jersey, Prentice-Hall 0-13-030***-*-****, 279-325.
9. Shi J, Belongie S, Leung T, Malik J: Image and Video Segmentation: The Normalized Cut Framework. Proc. of 1998 Int'l Conf. on Image Processing (ICIP 98), Chicago, IL 1998, 1:943-947.
10. Shi J, Malik J: Normalized cuts and image segmentation. IEEE Trans. on Pattern Analysis and Machine Intelligence 2000, 22(8):888-905.
11. Felzenszwalb PF, Huttenlocher DP: Efficient graph-based image segmentation. International Journal of Computer Vision 2004, 59(2):167-181.
12. Argenziano G, Soyer HP, De Giorgi V: Dermoscopy: A Tutorial. EDRA Medical Publishing & New Media 2002.
36. Celebi ME, Iyatomi H, Schaefer G, Stoecker WV: Approximate Lesion Localization in Dermoscopy Images. Skin Res Technol 2009, 15(3):314-322.
13. Steiner K, Binder M, Schemper M, Wolff K, Pehamberger H: Statistical
evaluation of epiluminescence dermoscopy criteria for melanocytic
pigmented lesions. Journal of the American Academy of Dermatology 1993,
29(4):581-588.
14. Binder M, Schwarz M, Winkler A, Steiner A, Kaider A, Wolff K,
Pehamberger H: Epiluminescence Microscopy: a Useful Tool For The
Diagnosis Of Pigmented Skin Lesions For Formally Trained
Dermatologists. Archives Of Dermatology 1995, 131(3):286-291.
15. Fleming MG, Steger C, Zhang J, Gao J, Cognetta AB, Pollak I, Dyer CR:
Techniques for a structural analysis of dermatoscopic imagery. Comput
Med Imaging Graph 1998, 22(5):375-389.
16. Celebi ME, Aslandogan YA, Stoecker WV, Iyatomi H, Oka H, Chen X:
Unsupervised Border Detection in Dermoscopy Images. Skin Res Technol
2007, 13(4):454-462.
17. Celebi ME, Iyatomi H, Schaefer G, Stoecker WV: Lesion Border Detection in
Dermoscopy Images. Comput Med Imaging Graph 2009, 33(2):148-153.
18. Hance GA, Umbaugh SE, Moss RH, Stoecker WV: Unsupervised color image
segmentation with application to skin tumor borders. IEEE Engineering in
Medicine and Biology 1996, 15(1):104-11.
19. Pratt WK: Digital image processing: PIKS inside. Hoboken, NJ: John Wiley
& Sons 2007.
20. Carlotto MJ: Histogram Analysis Using a Scale-Space Approach. IEEE
Transaction on Pattern Analysis and Machine Intelligence 1987, 1(9):121-129.
21. Delgado D, Butakoff C, Ersboll BK, Stoecker WV: Independent histogram
pursuit for segmentation of skin lesions. IEEE Trans on Biomedical
Engineering 2008, 55(1):157-61.
22. Celebi ME, Kingravi HA, Iyatomi H, et al: Border Detection in Dermoscopy
Images Using Statistical Region Merging. Skin Res Technol 2008,
14(3):347-353.
23. Schmid P: Segmentation Of Digitized Dermoscopic Images by Two-
Dimensional Color Clustering. IEEE Transaction on Medical Imaging 1999,
18:164-171.
24. Geusebroek J-M, Smeulders AW, van de Weijer MJ: Fast anisotropic gauss filtering. IEEE Trans on Image Processing 2003, 12(8):938-43.
25. Perreault S, Hébert P: Median filtering in constant time. IEEE Trans on Image Processing 2007, 16(9):2389-94.
26. Lee TK, Ng V, Gallagher R: Dullrazor: A software approach to hair removal
from images. Comput Biol Med 1997, 27(6):533-543.
27. Zhou H, Chen M, Gass R, et al: Feature-preserving artifact removal from
dermoscopy images. Proceedings of the SPIE medical imaging 2008
conference 2008, 6914:69141B-69141B-9.
28. Wighton P, Lee TK, Atkins MS: Dermoscopic hair disocclusion using
inpainting. Proceedings of the SPIE medical imaging 2008, Conference
6914:691427-691427-8.
29. Sonka M, Hlavac V, Boyle R: Image processing, analysis, and machine
vision. Cengage-Engineering 2007.
30. Melli R, Grana C, Cucchiara R: Comparison of color clustering algorithms
for segmentation of dermatological images. Proceedings of the SPIE
medical imaging 2006 Conference 2006, 3S1-9S39.
31. Lee I, Estivill-Castro V: Polygonization of Point Clusters Through Cluster
Boundary Extraction For Geographical Data Mining. Proceedings of the
10th International Symposium on Spatial Data Handling 2002, 27-40.