[Busec] Fwd: BUCS Friday Theory Seminar: David Kempe on "Subset Selection for Linear Regression" [Today 02/24 @ 3:00 pm in MCS 148]

Sharon Goldberg goldbe at cs.bu.edu
Fri Feb 24 10:51:48 EST 2012

Hi Group,

Not sure if you've seen this talk announcement for 3pm today -- I just
noticed it.

Could be interesting.


---------- Forwarded message ----------
From: BU CS Colloquium <bucscolloquium at gmail.com>
Date: Fri, Feb 24, 2012 at 10:06 AM
Subject: BUCS Friday Theory Seminar: David Kempe on "Subset Selection
for Linear Regression" [Today 02/24 @ 3:00 pm in MCS 148]
To: colloq-l at cs.bu.edu

Boston University -- Computer Science Department

Friday Theory Seminar

Friday, February 24, 2012
3:00 PM
MCS 148

"Subset Selection for Linear Regression"

David Kempe
USC - Currently visiting MSR

Abstract:  One of the central problems in many data-driven sciences is
the right selection of attributes to observe in order to accurately
predict a variable of interest. Applications abound in areas as
diverse as medical sciences (predicting medical conditions based on
observable attributes), social sciences (predicting future behaviors
or outcomes) and sensor networks, among others.

In many of these cases, time or cost constraints prohibit sampling
more than a few attributes. Also, many application domains use linear
regression as a method of prediction, and evaluate the quality of
attributes in terms of the R^2 fit with the quantity to be predicted.
This motivates the following formal problem definition: "Given the
covariances between observable variables X_i and a target variable Z,
select k of the variables X_i such that the selected set has the best
possible R^2 fit with Z."

The main result presented in this talk is that so long as the
covariance  matrix between the X_i variables is far from singular,
greedy algorithms frequently used in practice are provably
constant-factor approximations. The proof is based on extending the
widely used concept of submodularity to a notion of approximate
submodularity, and relating it to the spectrum of the covariance
matrix. Furthermore, we will investigate various graph-theoretical
properties of covariance matrices which allow for efficient exact or
approximate algorithms.

We conclude with several exciting open questions.

[This talk is based on joint work with Abhimanyu Das, appearing in
ICML 2011 and STOC 2008.]

Host: Steve Homer

Sharon Goldberg
Computer Science, Boston University

More information about the Busec mailing list