CS Colloquium by Virgilio Almeida on Modeling Spam (Wed June 13 at 11:00)

Azer Bestavros
Mon Jun 11 21:34:10 EDT 2007

Computer Science Department
Boston University

Date: Wednesday, June 13th 2007
Time: 11:00am
Place: Room MCS 135, 111 Cummington Street
       (for directions, see www.cs.bu.edu/colloquium) 


Characterizing and Modeling Spam Traffic and Anti-Social Behavior

Virgilio Almeida

Computer Science Department 
Federal University of Minas Gerais, Brazil



This talk is divided into two parts. In the first one, we present an
extensive characterization of spam traffic. As a basis for our
characterization, standard spam detection techniques are used to
classify a large number of incoming e-mails to a large university into
two categories, namely spam and non-spam. For each of the two resulting
workloads, as well as for the aggregate workload, we analyze a set of
parameters, aiming at identifying the characteristics that significantly
distinguish spam from non-spam traffic, assessing the qualitative impact
of spam on the aggregate traffic and, possibly, drawing insights into
the design of more effective spam detection techniques. Our
characterization reveals significant differences in the spam and
non-spam traffic patterns. The e-mail arrival process, the size
distribution, and the distributions of popularity and temporal locality
of e-mail recipients are key workload aspects that distinguish spam from
traditional e-mail traffic. In the second part, we turn to e-mail
networks, which have been used to illustrate general properties of
social networks of communication and collaboration. Increasingly,
however, the majority of e-mail traffic reflects opportunistic, rather
than symbiotic, social relations. We use e-mail data from a large university
to construct networks of e-mail exchange that quantify the differences
between social and antisocial behavior in networks of communication. We
show that while structural characteristics typical of other social
networks are shared to a large extent by the legitimate component, they
are not characteristic of antisocial traffic (e.g., spam traffic). 

Short Biography:

Virgilio Almeida is a professor in the Computer Science Department at
the Federal University of Minas Gerais, Brazil. His research interests
include performance modeling and analysis of large-scale distributed
systems and large social networks. In particular, his current research
work is focused on the interaction of social networks and system
behavior, including factors such as performance, availability and
malicious behavior. Virgilio is a recipient of a Fulbright Research
Scholar Award. He has held visiting positions at Boston University, the
Polytechnic University of Catalunya in Barcelona, and Polytechnic
University-Brooklyn, as well as visiting appointments at Xerox PARC and
Hewlett-Packard Research Laboratory. Professor Almeida is co-author of
four books on performance modeling (e.g., "Performance by Design:
computer capacity planning by example" and "Capacity Planning for Web
Services: metrics, models, and methods") and is the author of over fifty
papers on performance modeling and characterization of computer systems.

Host: Azer Bestavros

