Mobile Call

Location:

United States

Posted:

November 16, 2012

Contact this candidate

Resume:

Predicting Human Movement based on Telecom s Handoff in Mobile Networks

Yi-Bing Lin, Fellow, IEEE, Chien-Chun Huang-Fu

Department of Computer Science

National Chiao Tung University, Hsinchu, Taiwan

{liny,jjfu}@csie.nctu.edu.tw

and

Nabil Alrajeh

Biomedical Technology Department

King Saud University, Saudi Arabia

*****@***.***.**

Abstract

Investigating human movement behavior is important for studying issues such as prediction of

vehicle traffic and spread of contagious diseases. Since mobile telecom network can effi-

ciently monitor the movement of mobile users, the telecom s mobility management is an ideal

mechanism for studying human movement issues. The problem can be abstracted as follows:

What is the probability that a person at location A will move to location B after T hours. The

answer cannot be directly obtained because commercial telecom networks do not exactly trace

the movement history of every mobile user. In this paper, we show how to use the standard

outputs (handover rates, call arrival rates, call holding time and call traffic) measured in a

mobile telecom network to derive the answer for this problem.

Index Terms: Human movement, Little s Law, mobile computing, mobility management.

1. Introduction

Prediction of human movement behavior is important for studying issues such as prediction of

vehicle traffic and spread of contagious diseases, which requires tracing the movement of

people. In [1], a wireless sensor network technology is utilized to obtain high-resolution data

of close proximity interactions which cause the spread of most contagious diseases. However,

this method needs extra effort to distribute the senor network motes, and is constrained in a

small area (e.g., a campus).

An alternative to investigate user movement can be achieved through mobile telecom service.

In a mobile telecom network (e.g., UMTS, cdma2000, and GSM [6]), the users are tracked by

the mobility management mechanism so that the network can connect incoming calls to the

users through Base Stations (BSs) [2]. For this purpose, BSs in the service area are grouped

into location areas (LAs). The users are tracked at the accuracy of a LA coverage, and when

an incoming call arrives, all BSs in that LA will page the user. Since this mobility mana ge-

ment mechanism provides the position information of a user at the accuracy of one LA cov-

erage that may include 10-100 BSs, it cannot be used for location-based applications that re-

quire position accuracy within the size smaller than a cell (the radio coverage of a BS or a

sector of the BS). Location-based services that need to accurately track the position of a user

require specific techniques described in 3GPP TS 25.305 [3]. Details of these methods were

described in [4], and are elaborated here for the reader s benefit:

The Cell-ID-based method determines the mobile user s position based on the coverage of

Service Areas (SAs). An SA includes one or more cells. At most one-cell-sized accuracy

(about 500 meters) can be achieved when the SA includes only one cell. The Observed Time

Difference of Arrival (OTDOA) method utilizes trilateration to determine the mobile user s

position. At least three concurrent downlink signals from different cells are measured by the

mobile phone. The time differences among the signal arrivals are calculated to form hyper-

bolic curves. The intersection of these curves is then used to indicate the mobile user s posi-

tion. This method provides location accuracy within 50-150 meters. The Assisted Global Po-

sitioning System (A-GPS) method speeds up GPS positioning by downloading GPS infor-

mation through the Radio Access Network (RAN). Execution of A-GPS positioning only re-

quires several seconds while execution of normal GPS positioning requires 30 seconds to

several minutes. GPS modules are installed in both the mobile phone and the RAN. This

method provides location accuracy within 5-15 meters. The Uplink Time Difference of Arrival

(U-TDOA) method evolves from OTDOA, which utilizes uplink signals instead of downlink

signals. A normal uplink signal from the mobile user is measured in different cells, and no

extra signal is required. Same calculation process as OTDOA is then conducted to find out the

mobile user s position. Since the measurement and the calculation process are exercised onl y

in the RAN, this method does not require any modification to the mobile phone. This method

provides location accuracy within 50-150 meters.

The aforementioned techniques can effectively monitor the behaviors of specific mobile users

at the cost of modifications to telecom network, which are not appropriate to generate behav-

ior statistics for a large number of users that are typically required to study problems such as

pedestrian movement and contagious disease spread. In other words, these techniques cannot

be used to answer questions like What is the probability that a person at location A

will move to location B after T hours.

In [5], two data sets of the BS locations are utilized to analyze the human mobility patterns.

The first data set records the BS locations of 100,000 individuals for six -month when they

initiated/received a call or a short message. The second data set captures the BS locations of

206 individuals recorded every two hours for one week. The distribution of the displacements

calculated from these two data sets is found to be well approximated by a truncated pow-

er-law equation. This method as well as other solutions [9,10] require quasi-anonymous

phone identities for tracing individual movements which causes extra undesirable overhead

for the telecom operators. Furthermore, the data cannot be processed quickly (say, in one day)

if the number of mobile users is larger than millions.

In this paper, we propose a novel approach to address the spread problem by only using the

statistics from the standard mobile telecom switches such as Mobile Switching Centers

(MSCs), and Serving General Packet Radio Service Support Nodes (SGSNs) [6]. Our ap-

proach does not need to identify individual users and therefore does not cause any customer

privacy problem. The notation used in this paper is listed in Appendix A.

2. Spread Prediction Model

Figure 1 illustrates a mobile telecom service area covered by several BSs. In this figure, a cell

of a BS is represented by a circle. A mobile user is represented by a vehicle moving around

the cells. If a user in conversation moves from one cell to another, then the call connection

must be switched from the old cell to the new cell. This switching operation is called a hand-

over. When a call arrives at a user or when he/she performs a handover, the activity is recor d-

ed at the MSC/SGSN. The mobile telecom network collects the statistics of the activities for

every t interval typically ranging from 15 minutes to several hours. The mobile operator can

then investigate these statistics (output measures) for future network planning.

Mobile Telecom Network

MSC/SGSN

Base Station (BS)

i,j

BSj

BSi

j,i

Cell i Cell j

Figure 1. A simplified mobile telecom network

Four major measures provided by mobile telecom network are the expected call holding time,

the numbers of handovers in and out of the cells, the number of new call arrivals of the cells,

and the voice/data traffic (in Erlang) of the cells. For time define as the timeslot

+ 1 . Suppose that a mobile user in conversation moves from cell i to cell j at

time, then he/she contributes to one handover out of cell i, and one handover into cell j in

timeslot . The mobile telecom network measures, the number of handovers from

cell i to cell j in timeslot . Note that it is meaningless that a call hands over to the same cell,

and therefore = 0. A mobile user resides at cell i in timeslot may receive an in-

coming call or originate an outgoing call. Such new calls are counted by the measure .

In telephony engineering, an Erlang represents the continuous use of one voice path. Let

be the measure of the Erlang traffic of cell i in Then is the number of calls

arriving at cell i in times the expected call holding time. Practically, the mobile telecom

by summing up all conversation minutes of cell i in The ex-

network measures

pected call holding time E[ ] is the average of all collected call holding times over a long

observed period.

Consider a cell as the granularity of location coverage. The movement of a mobile user is de-

scribed as follows. The user stays at one cell for a period of time, and then moves to the next

cell. If we sum up the residence times of the cells he/she visited, then we know exactly which

cell the user visited after a specific elapsed time. Unfortunately, the above task cannot be

achieved because the standard outputs E[ ],, and cannot tell you how

a specific mobile user moves exactly. On the other hand, these outputs can be used to derive

the probability that where and when a user moves. When a user arrives at cell i in timeslot,

let be the average residence time before the user moves out of the cell. Let be

the transition probability that a user moves from cell i to cell j in timeslot . If both

and are known, then we can predict the probability of the user s location at time +

. That is, the user moves to cell j with probability . Note that = 0

because = 0. We use a prediction model to approximate and by using

E[ ],, and, and show how the model computes the probability

that starting from cell i, a user will move to cell j after a time period T.

The concept behind the prediction model is Little s Law [7], which says that the expected

number of users in a system is the arrival rate of the users times the expected response

time R that a user stays in the system; i.e.,

Equation (1) is used to compute . We first derive the average number of users at

cell i in timeslot . The intuition behind the derivation of is the following: if every-

one at cell i makes at most one call in and every call takes E[ ] minutes, then

/E[ ]. Let tR be the true cell residence time. Assume that

will discuss what happens if this assumption is violated later). When E[ ], the con-

versation minutes contributed by the user is, therefore the denominator of the equa-

tion is adjusted by min E[ ] and the equation is re-written as

min E[ ]

When E[ ], most calls occurring in are new calls (Section 3 will show in (13) that

handover calls rarely occur when E[ ] is small), and their call holding times are completely

before ends. Therefore, above equation is reasonably accurate when

measured in

E[ ] or E[ ] . When E[ ], many ongoing calls (either in or out of the

cell) are observed in the beginning or the end of, and such a call contributes much less

conversation minutes than min E[ ] . Therefore, we scale down the conversation

minutes by a linear factor expressed as

E[ ] E[ ]

= + [1 ] where 0 1

max E[ ] max E[ ]

Note that when E[ ] or E[ ], 1, and the conversation minutes are

min E[ ] . When E[ ],, the conversation minutes are {min E[ ] }

for incomplete calls that do not begin or end in . The final equation in our predic-

tion model is

{min E[ ] }

Based on the above discussion, accuracy of (2) is affected by two effects:

Effect 1 (Multiple-Calls-Per-Cell Effect on ). If a user generates multiple calls per cell,

then (2) overestimates .

Effect 2 (Observed Timeslot). If E[ ] and E[ ] E[ ], then the measured

conversation minutes of a call should be min E[ ] E[ ], and (2) will underestimate .

In Section 3, we show that this error can partially corrected by selecting a smaller value (e.g.,

= 0.4 .

In timeslot, the number of handovers flowing into cell i is

= 3

Four types of users are observed at cell i. In timeslot a type 1 user moves into the cell

when he/she is in phone conversation (and a handover occurs). A type 2 user is not in phone

conversation when he/she moves into the cell, and then has phone calls at this cell in . A

type 3 user moves in and/or out of cell i without any call activity. A type 4 user arrives at cell i

earlier than, and then has at least one phone call at this cell in .

The number of type 1 users moving into cell i in timeslot is . If Multi-

ple-Calls-Per-Cell effect does not exist, then is the number of type 2 and type 4 users

of cell i in timeslot .

According to (1), the average residence time of a user arriving in cell i in timeslot

is approximated as

From (2), we have

{min E[ ] }[ + ]

where in (4) is computed from (3).

Effect 3. (Multiple-Calls-Per-Cell Effect on ). If Effect 1 is significant, i.e., a user tends

to make multiple calls in a cell, then overestimates the number of type 2 users.

Note that due to Effect 1, (4) overestimates the cell residence time. Due to Effect 3, (4) un-

derestimates the cell residence time.

Effect 4. A type 4 user is not supposed to contribute to the arrival rate in (1). However, a

type 4 user does contribute to, and therefore (4) underestimates the cell resi-

dence time.

If > E[ ], a type 4 user becomes a type 2 user. Note that type 3 users contribute to neither

nor, and are reasonable to be ignored in computing (4). The predicted routing

probability is expressed as

= 5

However, behavior of Type 3 users does affect the routing probability, which results in the

following effect.

Effect 5. Movement of type 3 user is not included in, which affects the accuracy of

By using (4) and (5), (T) can be expressed recursively as

(T)=

= + . For 0

where

1 for =

0 for

0. We note that the above recursive algorithm is given for the

The recursion stops when

description purpose. In our implementation of the prediction model, an iterative algorithm is

used to compute (6), where the details are given in Appendix B.

3. Numerical Examples

We have collected, E[ ], and statistics from a commercial mobile tele-

com service area in Hsinchu, Taiwan. The statistics were measured when = 1 hour, which

can be translated into = 15 minutes, and are listed below.

E[ ]: 3.3725 movements per 15 minutes (13.49 movements per hour)

E[ ] : 1.0325 Erlangs per 15 minutes (4.103 Erlangs per hour)

E[ ]: 1-5 minutes for voice calls and 10-20 minutes for data sessions

Due to the Personal Information Protection Act in Taiwan, we are not allowed to publish the

mappings between the collected statistics and the base stations in that area. Therefore, we as-

sume that the cell layout is of the Manhattan Street fashion with 7x7 cell structure, where

every cell has 4 neighbors (the boundary cells have 2 or 3 neighbors and the users visiting

these cells will bounce back).

We consider a baseline scenario where call arrivals are a Poisson process with the expected

inter-call arrival time E[ ] = 2 hours, and the call holding time is Exponentially distributed

with the mean E[ ] = 2 minutes. Based on these assumptions as well as the E[ ] and

the E[ ] statistics obtained from the commercial mobile telecom network, we select the

number of users as follows. The total call traffic generated in this system is E[ ]

49 =4.103 49 = 201.05 Erlangs per hour. The Erlang traffic generated from a user is

E[ ] 1 1

= 60. Therefore, the number of users in the system is 201.05 (60) = 12062.82. The

E[ ]

numbers of users considered in our experiments are 12000 and 24000, respectively. Also, as a

baseline scenario, the expected cell residence time E[ ] can be selected as follows. Since we

assume that every cell has 4 neighbors, the handover rate out of a cell is E[ ]

4=13.49 4=53.96 per hour. The expected number of users in a cell is 12000 49 = 244.9.

Since every user leaves the cell with rate 1/E[ ], and such cell crossing is a handover with

E[ ] 1

probability we have

E[ ]+E[ ] 61

1 1

244.9 (E[ ]) (61) =53.96

That is, E[ ] = 0.08 hours or 4.8 minutes. Our experiments consider E[ ] = 5, 10, 15,

and 20 minutes.

Based on the above parameters, we simulate user movement and call activities for 24 hours.

The cell residence time has an arbitrary distribution (we specifically consider Gamma,

Normal, and Weibull distributions; the results for Gamma distribution are elaborated in this

paper, other distributions show similar results and are not presented). In our experiments, the

cell residence times E[ ] and the routing probabilities are given. Probabilities

are randomly generated to avoid homogeneous routing; i.e., our experiments arrange

that for cell i s neighboring cells . At the end of a simulation run, we

= 6 12 18 and 24 hours). During the experi-

obtain the real probabilities ) for

ments, the simulated mobile telecom network produces i,j, tc, and in every 15

minutes (i.e., t=15 minutes). Then we use the prediction model to compute

), and compare them with the real values, E[ ], and

and .

In our experiments, phone calls (connected to the MSC) are represented by E[ ] 5

minutes, and data sessions (connected to the SGSN) are represented by E[ ] 10 minutes.

The accuracy of the prediction model is investigated through the following measures (where

the number of users is 24000 and the expected inter call arrival time E[ ] =2 hours) :

Measure 1 is the average error between the predicted and the real routing probabilities. Let

be the error between the predicted routing probabilities and the real values

. That is,

Then we have

1 =

where is the number of cells in the mobile network, and is the set of neighboring cells

of cell .

As pointed out in Effect 5, the users who do not make calls when they cross the cells also af-

fect which are not captured by (5). If increases, decreases, or decreases,

more users are in conversation when they move into a cell (see (13) and (14)), and therefore

there are more handovers. Having more handovers means that the effect of the users who do

not make calls when crossing the cells becomes insignificant and the measured is

more accurate, as indicated in Figure 2.

Figure 2. Accuracy of routing probability prediction (E[ ]=2 hours)

Measure 2 is the error between the real cell residence time E[ ] and the predicted cell

residence time computed by (4), which is mainly caused by Effects 1, 2 (on Equation

(2)), 3 and 4. Figure 3 (a) plots 2 against where E[ ]=15 minutes. The figure indicates

that when = 1 (i.e., no adjustment), the largest 2 =50% occurs at E[ ]= t=15 minutes.

This result is exactly what we expected (see Effect 2). When t, 2 decreases as E[ ] decreases due to Effect 6. The above state-

ment is also true for E[ ]

Contact this candidate