Archive for the ‘Microsoft Research’ Category

Supercomputers: The Amazing Race

January 13, 2015

Source: Microsoft Research (Gordon Bell)

The “ideal supercomputer” has an infinitely fast clock and executes a single-instruction-stream program operating on data stored in an infinitely large and fast single memory. Backus established the von Neumann programming model with FORTRAN. Supercomputers have evolved in steps: increasing processor speed, processing vectors, adding processors for a program held in a single-memory monocomputer, and interconnecting multiple computers over which a distributed program runs in parallel. Thus, supercomputing has evolved from a hardware engineering design challenge of the Cray Era (1960-1995) of the monocomputer to the challenge of creating programs that operate on distributed (mono)computers of the Multicomputer Era (1985-present).

Identifying Presentation Styles in Online Educational Videos

January 6, 2015

Source: Microsoft Research

The rapid growth of online educational videos has resulted in huge redundancy. The same underlying content is often available in multiple videos with varying quality, presenter, and presentation style (slide show, whiteboard presentation, demo, etc.). The fact that there are so many videos on the same content makes it important to retrieve videos that are attuned to user preferences. While there are several aspects that drive user engagement, we focus on the presentation style of the video. Based on a large-scale manual study, we identify the 11 dominant presentation styles that are typically employed. We propose a reference algorithm combining a set of 3-Way Decision Forests with probabilistic fusion and using a large set of image, face and motion features. We analyze our empirical results to provide understanding of the difficulties of the problem and to highlight directions for future research on this new application. We also make the data available.

Optimizing Human Computation to Save Time and Money

January 6, 2015

Source: Microsoft Research

Crowd-sourcing is increasingly being used for providing answers to online polls and surveys. However, existing systems, while taking care of the mechanics of attracting crowd workers, poll building, and payment, generally provide little by way of cost-management (e.g. working with a tight budget), time-management (e.g. obtaining results as quickly as possible), and controlling the margin of error (e.g. working on a sample population which is largely different from the general census statistics). The problems above create significant pain points for those wanting to run large-scale surveys, such as people doing polling for political campaigns, marketing professionals, and the like.

Our work unlocks the possibility of large-scale polling on a budget through the use of novel optimization strategies. Our work is based on InterPoll, a platform for programming crowdsourced polls. In this paper, we present three static and three runtime optimizations for InterPoll polls represented as LINQ queries. The former share some similarities with traditional compiler optimizations, while the latter borrow insight from databases and real-life polling strategies.

These optimizations lead to significant improvements in practice. In our experiments we observed tenfold savings in survey cost and time savings of as much as 20 hours for some of the queries.

Search and Breast Cancer: On Disruptive Shifts of Attention over Life Histories of an Illness

November 24, 2014

Source: Microsoft Research

We seek to understand the evolving needs of people who are faced with a life-changing medical diagnosis based on analyses of queries extracted from an anonymized search query log. Focusing on breast cancer, we manually tag a set of Web searchers as showing disruptive shifts in focus of attention and long-term patterns of search behavior consistent with the diagnosis and treatment of breast cancer. We build and apply probabilistic classifiers to detect these searchers from multiple sessions and to detect the timing of diagnosis, using a variety of temporal and statistical features. We explore the changes in information-seeking over time before and after an inferred diagnosis of breast cancer by aligning multiple searchers by the likely time of diagnosis. We automatically identify 1700 candidate searchers with an estimated 90% precision, and we predict the day of diagnosis within 15 days with an 88% accuracy. We show that the geographic and demographic attributes of searchers identified with high probability are strongly correlated with ground truth of reported incidence rates. We then analyze the content of queries over time from searchers for whom diagnosis was predicted, using a detailed ontology of cancer-related search terms. Our analysis reveals the rich temporal structure of the evolving queries of people likely diagnosed with breast cancer. Finally, we focus on subtypes of illness based on inferred stages of cancer and show clinically relevant dynamics of information seeking based on dominant stage expressed by searchers.

Turk-Life in India

November 19, 2014

Source: Microsoft Research

Previous studies on Amazon Mechanical Turk (AMT), the most well-known marketplace for microtasks, show that the largest population of workers on AMT is U.S. based, while the second largest is based in India. In this paper, we present insights from an ethnographic study conducted in India to introduce some of these workers or ‘Turkers’ – who they are, how they work and what turking means to them. We examine the work they do to maintain their reputations and their work-life balance. In doing this, we illustrate how AMT’s design practically impacts on turk-work. Understanding the ‘lived work’ of crowdwork is a valuable first step for technology design.

Urban Computing: Concepts, Methodologies, and Applications

November 10, 2014

Source: Microsoft Research

Urbanization’s rapid progress has modernized many people’s lives, and has also engendered big issues, such as traffic congestion, energy consumption, and pollution. Urban computing aims to tackle these issues by using the data that has been generated in cities, e.g., traffic flow, human mobility and geographical data. Urban computing connects urban sensing, data management, data analytics, and service provision into a recurrent process for an unobtrusive and continuous improvement of people’s lives, city operation systems, and the environment. Urban computing is an interdisciplinary field where computer sciences meet conventional city-related fields, like transportation, civil engineering, environment, economy, ecology, and sociology, in the context of urban spaces. This article first introduces the concept of urban computing, discussing its general framework and key challenges from the perspective of computer sciences. Secondly, we classify the applications of urban computing into seven categories, consisting of urban planning, transportation, the environment, energy, social, economy, and public safety & security, presenting representative scenarios in each category. Thirdly, we summarize the typical technologies that are needed in urban computing in four groups: urban sensing, urban data management, knowledge fusion across heterogeneous data, and urban data visualization. Finally, we look ahead to the future of urban computing, suggesting a few research topics that are somewhat missing in the community.

Privacy Considerations for a Pervasive Eye Tracking World

November 5, 2014

Source: Microsoft Research

Multiple vendors now provide relatively inexpensive desktop eye and gaze tracking devices. With miniaturization and decreasing manufacturing costs, gaze trackers will follow the path of webcams, becoming ubiquitous and inviting many of the same privacy concerns. However, whereas the privacy loss from webcams may be obvious to the user, gaze tracking is more opaque and deserves special attention. In this paper, we review current research in gaze tracking and pupillometry and argue that gaze data should be protected by both policy and good data hygiene.

