
CVPR 2016 & Research at Google



This week, Las Vegas hosts the 2016 Conference on Computer Vision and Pattern Recognition (CVPR 2016), the premier annual computer vision event comprising the main conference and several co-located workshops and short courses. As a leader in computer vision research, Google has a strong presence at CVPR 2016, with many Googlers presenting papers and giving invited talks, tutorials and workshops.

We congratulate Google Research Scientist Ce Liu and Google Faculty Advisor Abhinav Gupta, who were selected as this year’s recipients of the PAMI Young Researcher Award for outstanding research contributions within computer vision. We also congratulate Googler Henrik Stewenius for receiving the Longuet-Higgins Prize, a retrospective award that recognizes up to two CVPR papers from ten years ago that have made a significant impact on computer vision research, for his 2006 CVPR paper “Scalable Recognition with a Vocabulary Tree”, co-authored with David Nister.

If you are attending CVPR this year, please stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for hundreds of millions of people. The Google booth will also showcase several recent efforts, including the technology behind Motion Stills and a live demo of neural network-based image compression. Learn more about our research being presented at CVPR 2016 in the list below (Googlers highlighted in blue).

Oral Presentations
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao, Jonathan Huang, Alexander Toshev, Oana Camburu, Alan L. Yuille, Kevin Murphy

Detecting Events and Key Actors in Multi-Person Videos
Vignesh Ramanathan, Jonathan Huang, Sami Abu-El-Haija, Alexander Gorban, Kevin Murphy, Li Fei-Fei

Spotlight Session: 3D Reconstruction
DeepStereo: Learning to Predict New Views From the World’s Imagery
John Flynn, Ivan Neulander, James Philbin, Noah Snavely

Posters
Discovering the Physical Parts of an Articulated Object Class From Multiple Videos
Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

Blockout: Dynamic Model Selection for Hierarchical Deep Networks
Calvin Murdock, Zhen Li, Howard Zhou, Tom Duerig

Rethinking the Inception Architecture for Computer Vision
Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, Zbigniew Wojna

Improving the Robustness of Deep Neural Networks via Stability Training
Stephan Zheng, Yang Song, Thomas Leung, Ian Goodfellow

Semantic Image Segmentation With Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform
Liang-Chieh Chen, Jonathan T. Barron, George Papandreou, Kevin Murphy, Alan L. Yuille

Tutorial
Optimization Algorithms for Subset Selection and Summarization in Large Data Sets
Ehsan Elhamifar, Jeff Bilmes, Alex Kulesza, Michael Gygli

Workshops
Perceptual Organization in Computer Vision: The Role of Feedback in Recognition and Reorganization
Organizers: Katerina Fragkiadaki, Phillip Isola, Joao Carreira
Invited talks: Viren Jain, Jitendra Malik

VQA Challenge Workshop
Invited talks: Jitendra Malik, Kevin Murphy

Women in Computer Vision
Invited talk: Caroline Pantofaru

Computational Models for Learning Systems and Educational Assessment
Invited talk: Jonathan Huang

Large-Scale Scene Understanding (LSUN) Challenge
Invited talk: Jitendra Malik

Large Scale Visual Recognition and Retrieval: BigVision 2016
General Chairs: Jason Corso, Fei-Fei Li, Samy Bengio

ChaLearn Looking at People
Invited talk: Florian Schroff

Medical Computer Vision
Invited talk: Ramin Zabih

ICML 2016 & Research at Google



This week, New York hosts the 2016 International Conference on Machine Learning (ICML 2016), a premier annual Machine Learning event supported by the International Machine Learning Society (IMLS). Machine Learning is a key focus area at Google, with highly active research groups exploring virtually all aspects of the field, including deep learning and more classical algorithms.

We work on an extremely wide variety of machine learning problems that arise from a broad range of applications at Google. One particularly important setting is that of large-scale learning, where we utilize scalable tools and architectures to build machine learning systems that work with large volumes of data that often preclude the use of standard single-machine training algorithms. In doing so, we are able to solve deep scientific problems and engineering challenges, exploring theory as well as application, in areas of language, speech, translation, music, visual processing and more.

As Gold Sponsor, Google has a strong presence at ICML 2016 with many Googlers publishing their research and hosting workshops. If you’re attending, we hope you’ll visit the Google booth and talk with our researchers to learn more about the exciting work, creativity and fun that goes into solving interesting ML problems that impact millions of people. You can also learn more about our research being presented at ICML 2016 in the list below (Googlers highlighted in blue).

ICML 2016 Organizing Committee
Area Chairs include: Corinna Cortes, John Blitzer, Maya Gupta, Moritz Hardt, Samy Bengio

IMLS
Board Members include: Corinna Cortes

Accepted Papers
ADIOS: Architectures Deep In Output Space
Moustapha Cisse, Maruan Al-Shedivat, Samy Bengio

Associative Long Short-Term Memory
Ivo Danihelka, Greg Wayne, Benigno Uria, Nal Kalchbrenner, Alex Graves

Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu

Binary embeddings with structured hashed projections
Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann LeCun

Discrete Distribution Estimation Under Local Privacy
Peter Kairouz, Keith Bonawitz, Daniel Ramage

Dueling Network Architectures for Deep Reinforcement Learning (Best Paper Award recipient)
Ziyu Wang, Nando de Freitas, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot

Exploiting Cyclic Symmetry in Convolutional Neural Networks
Sander Dieleman, Jeffrey De Fauw, Koray Kavukcuoglu

Fast Constrained Submodular Maximization: Personalized Data Summarization
Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, Amin Karbasi

Greedy Column Subset Selection: New Bounds and Distributed Algorithms
Jason Altschuler, Aditya Bhaskara, Gang Fu, Vahab Mirrokni, Afshin Rostamizadeh, Morteza Zadimoghaddam

Horizontally Scalable Submodular Maximization
Mario Lucic, Olivier Bachem, Morteza Zadimoghaddam, Andreas Krause

Continuous Deep Q-Learning with Model-based Acceleration
Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

Meta-Learning with Memory-Augmented Neural Networks
Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap

One-Shot Generalization in Deep Generative Models
Danilo Rezende, Shakir Mohamed, Daan Wierstra

Pixel Recurrent Neural Networks (Best Paper Award recipient)
Aaron Van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu

Pricing a low-regret seller
Hoda Heidari, Mohammad Mahdian, Umar Syed, Sergei Vassilvitskii, Sadra Yazdanbod

Primal-Dual Rates and Certificates
Celestine Dünner, Simone Forte, Martin Takac, Martin Jaggi

Recommendations as Treatments: Debiasing Learning and Evaluation
Tobias Schnabel, Thorsten Joachims, Adith Swaminathan, Ashudeep Singh, Navin Chandak

Recycling Randomness with Structure for Sublinear Time Kernel Expansions
Krzysztof Choromanski, Vikas Sindhwani

Train faster, generalize better: Stability of stochastic gradient descent
Moritz Hardt, Ben Recht, Yoram Singer

Variational Inference for Monte Carlo Objectives
Andriy Mnih, Danilo Rezende

Workshops
Abstraction in Reinforcement Learning
Organizing Committee: Daniel Mankowitz, Timothy Mann, Shie Mannor
Invited Speaker: David Silver

Deep Learning Workshop
Organizers: Antoine Bordes, Kyunghyun Cho, Emily Denton, Nando de Freitas, Rob Fergus
Invited Speaker: Raia Hadsell

Neural Networks Back To The Future
Organizers: Léon Bottou, David Grangier, Tomas Mikolov, John Platt

Data-Efficient Machine Learning
Organizers: Marc Deisenroth, Shakir Mohamed, Finale Doshi-Velez, Andreas Krause, Max Welling

On-Device Intelligence
Organizers: Vikas Sindhwani, Daniel Ramage, Keith Bonawitz, Suyog Gupta, Sachin Talathi
Invited Speakers: Hartwig Adam, H. Brendan McMahan

Online Advertising Systems
Organizing Committee: Sharat Chikkerur, Hossein Azari, Edoardo Airoldi
Opening Remarks: Hossein Azari
Invited Speakers: Martin Pál, Todd Phillips

Anomaly Detection 2016
Organizing Committee: Nico Goernitz, Marius Kloft, Vitaly Kuznetsov

Tutorials
Deep Reinforcement Learning
David Silver

Rigorous Data Dredging: Theory and Tools for Adaptive Data Analysis
Moritz Hardt, Aaron Roth

Googlers on the road: OSCON 2016 in Austin

Developers and open source enthusiasts converge on Austin, Texas in just under two weeks for O’Reilly Media’s annual open source conference, OSCON, and the Community Leadership Summit (CLS) that precedes it. CLS runs May 14-15 at the Austin Convention Center followed by OSCON from May 16-19.

OSCON 2014 program chairs including Googler Sarah Novotny.
Photo licensed by O'Reilly Media under CC-BY-NC 2.0.

This year we have 10 Googlers hosting sessions covering topics including web development, machine learning, devops, astronomy and open source. A list of all of the talks hosted by Googlers alongside related events can be found below.

If you’re a student, educator, mentor, past or present participant in Google Summer of Code or Google Code-in, or just interested in learning more about the two programs, make sure to join us Monday evening for our Birds of a Feather session.

Have questions about Kubernetes, Google Summer of Code, open source at Google or just want to meet some Googlers? Stop by booth #307 in the Expo Hall.


Thursday, May 12th - GDG Austin
7:00pm   Google Developers Group Austin Meetup


Sunday, May 15th - Community Leadership Summit

Monday, May 16th
7:00pm   Google Summer of Code and Google Code-in Birds of a Feather


Thursday, May 19th
11:00am  Kubernetes hackathon at OSCON Contribute hosted by Brian Dorsey, Nikhil Jindal, Janet Kuo, Jeff Mendoza, John Mulhausen, Sarah Novotny, Terrence Ryan and Chao Xu
5:10pm    PANOPTES: Open source planet discovery by Jennifer Tong and Wilfred Gee

Haven’t registered for OSCON yet? You can knock 25% off the cost of registration by using discount code Google25, or attend parts of the event including our Birds of a Feather session for free by using discount code OSCON16XPO.

See you at OSCON!

By Josh Simmons, Open Source Programs Office

Research at Google and ICLR 2016



This week, San Juan, Puerto Rico hosts the 4th International Conference on Learning Representations (ICLR 2016), a conference focused on how one can learn meaningful and useful representations of data for Machine Learning. ICLR includes conference and workshop tracks, with invited talks along with oral and poster presentations of some of the latest research on deep learning, metric learning, kernel learning, compositional models, non-linear structured prediction, and issues regarding non-convex optimization.

At the forefront of innovation in Neural Networks and Deep Learning, Google focuses on both theory and application, developing learning approaches to understand and generalize. As Platinum Sponsor of ICLR 2016, Google will have a strong presence with over 40 researchers attending (many from the Google Brain team and Google DeepMind), contributing to and learning from the broader academic research community by presenting papers and posters, in addition to participating on organizing committees and in workshops.

If you are attending ICLR 2016, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people. You can also learn more about our research being presented at ICLR 2016 in the list below (Googlers highlighted in blue).

Organizing Committee

Program Chairs
Samy Bengio, Brian Kingsbury

Area Chairs include:
John Platt, Tara Sainath

Oral Sessions

Neural Programmer-Interpreters (Best Paper Award Recipient)
Scott Reed, Nando de Freitas

Net2Net: Accelerating Learning via Knowledge Transfer
Tianqi Chen, Ian Goodfellow, Jon Shlens

Conference Track Posters

Prioritized Experience Replay
Tom Schaul, John Quan, Ioannis Antonoglou, David Silver

Reasoning about Entailment with Neural Attention
Tim Rocktäschel, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Phil Blunsom

Neural Programmer: Inducing Latent Programs With Gradient Descent
Arvind Neelakantan, Quoc Le, Ilya Sutskever

MuProp: Unbiased Backpropagation For Stochastic Neural Networks
Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih

Multi-Task Sequence to Sequence Learning
Minh-Thang Luong, Quoc Le, Ilya Sutskever, Oriol Vinyals, Lukasz Kaiser

A Test of Relative Similarity for Model Selection in Generative Models
Eugene Belilovsky, Wacha Bounliphone, Matthew Blaschko, Ioannis Antonoglou, Arthur Gretton

Continuous control with deep reinforcement learning
Timothy Lillicrap, Jonathan Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra

Policy Distillation
Andrei Rusu, Sergio Gomez, Caglar Gulcehre, Guillaume Desjardins, James Kirkpatrick, Razvan Pascanu, Volodymyr Mnih, Koray Kavukcuoglu, Raia Hadsell

Neural Random-Access Machines
Karol Kurach, Marcin Andrychowicz, Ilya Sutskever

Variable Rate Image Compression with Recurrent Neural Networks
George Toderici, Sean O'Malley, Damien Vincent, Sung Jin Hwang, Michele Covell, Shumeet Baluja, Rahul Sukthankar, David Minnen

Order Matters: Sequence to Sequence for Sets
Oriol Vinyals, Samy Bengio, Manjunath Kudlur

Grid Long Short-Term Memory
Nal Kalchbrenner, Alex Graves, Ivo Danihelka

Neural GPUs Learn Algorithms
Lukasz Kaiser, Ilya Sutskever

ACDC: A Structured Efficient Linear Layer
Marcin Moczulski, Misha Denil, Jeremy Appleyard, Nando de Freitas

Workshop Track Posters

Revisiting Distributed Synchronous SGD
Jianmin Chen, Rajat Monga, Samy Bengio, Rafal Jozefowicz

Black Box Variational Inference for State Space Models
Evan Archer, Il Memming Park, Lars Buesing, John Cunningham, Liam Paninski

A Minimalistic Approach to Sum-Product Network Learning for Real Applications
Viktoriya Krakovna, Moshe Looks

Efficient Inference in Occlusion-Aware Generative Models of Images
Jonathan Huang, Kevin Murphy

Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke

Deep Autoresolution Networks
Gabriel Pereyra, Christian Szegedy

Learning visual groups from co-occurrences in space and time
Phillip Isola, Daniel Zoran, Dilip Krishnan, Edward H. Adelson

Adding Gradient Noise Improves Learning For Very Deep Networks
Arvind Neelakantan, Luke Vilnis, Quoc V. Le, Ilya Sutskever, Lukasz Kaiser, Karol Kurach, James Martens

Adversarial Autoencoders
Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow

Generating Sentences from a Continuous Space
Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, Samy Bengio

NIPS 2015 and Machine Learning Research at Google



This week, Montreal hosts the 29th Annual Conference on Neural Information Processing Systems (NIPS 2015), a machine learning and computational neuroscience conference that includes invited talks, demonstrations and oral and poster presentations of some of the latest in machine learning research. Google will have a strong presence at NIPS 2015, with over 140 Googlers attending in order to contribute to and learn from the broader academic research community by presenting technical talks and posters, in addition to hosting workshops and tutorials.

Research at Google is at the forefront of innovation in Machine Intelligence, actively exploring virtually all aspects of machine learning including classical algorithms as well as cutting-edge techniques such as deep learning. Focusing on both theory as well as application, much of our work on language understanding, speech, translation, visual processing, ranking, and prediction relies on Machine Intelligence. In all of those tasks and many others, we gather large volumes of direct or indirect evidence of relationships of interest, and develop learning approaches to understand and generalize.

If you are attending NIPS 2015, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people. You can also learn more about our research being presented at NIPS 2015 in the list below (Googlers highlighted in blue).

Google is a Platinum Sponsor of NIPS 2015.

PROGRAM ORGANIZERS
General Chairs
Corinna Cortes, Neil D. Lawrence
Program Committee includes:
Samy Bengio, Gal Chechik, Ian Goodfellow, Shakir Mohamed, Ilya Sutskever

ORAL SESSIONS
Learning Theory and Algorithms for Forecasting Non-stationary Time Series
Vitaly Kuznetsov, Mehryar Mohri

SPOTLIGHT SESSIONS
Distributed Submodular Cover: Succinctly Summarizing Massive Data
Baharan Mirzasoleiman, Amin Karbasi, Ashwinkumar Badanidiyuru, Andreas Krause

Spatial Transformer Networks
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu

Pointer Networks
Oriol Vinyals, Meire Fortunato, Navdeep Jaitly

Structured Transforms for Small-Footprint Deep Learning
Vikas Sindhwani, Tara Sainath, Sanjiv Kumar

Spherical Random Features for Polynomial Kernels
Jeffrey Pennington, Felix Yu, Sanjiv Kumar

POSTERS
Learning to Transduce with Unbounded Memory
Edward Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, Phil Blunsom

Deep Knowledge Tracing
Chris Piech, Jonathan Bassen, Jonathan Huang, Surya Ganguli, Mehran Sahami, Leonidas Guibas, Jascha Sohl-Dickstein

Hidden Technical Debt in Machine Learning Systems
D Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, Dan Dennison

Grammar as a Foreign Language
Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton

Stochastic Variational Information Maximisation
Shakir Mohamed, Danilo Rezende

Embedding Inference for Structured Multilabel Prediction
Farzaneh Mirzazadeh, Siamak Ravanbakhsh, Bing Xu, Nan Ding, Dale Schuurmans

On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators
Changyou Chen, Nan Ding, Lawrence Carin

Spectral Norm Regularization of Orthonormal Representations for Graph Transduction
Rakesh Shivanna, Bibaswan Chatterjee, Raman Sankaran, Chiranjib Bhattacharyya, Francis Bach

Differentially Private Learning of Structured Discrete Distributions
Ilias Diakonikolas, Moritz Hardt, Ludwig Schmidt

Nearly Optimal Private LASSO
Kunal Talwar, Li Zhang, Abhradeep Thakurta

Learning Continuous Control Policies by Stochastic Value Gradients
Nicolas Heess, Greg Wayne, David Silver, Timothy Lillicrap, Tom Erez, Yuval Tassa

Gradient Estimation Using Stochastic Computation Graphs
John Schulman, Nicolas Heess, Theophane Weber, Pieter Abbeel

Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Samy Bengio, Oriol Vinyals, Navdeep Jaitly, Noam Shazeer

Teaching Machines to Read and Comprehend
Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, Phil Blunsom

Bayesian dark knowledge
Anoop Korattikara, Vivek Rathod, Kevin Murphy, Max Welling

Generalization in Adaptive Data Analysis and Holdout Reuse
Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, Aaron Roth

Semi-supervised Sequence Learning
Andrew Dai, Quoc Le

Natural Neural Networks
Guillaume Desjardins, Karen Simonyan, Razvan Pascanu, Koray Kavukcuoglu

Revenue Optimization against Strategic Buyers
Andres Munoz Medina, Mehryar Mohri


WORKSHOPS
Feature Extraction: Modern Questions and Challenges
Workshop Chairs include: Dmitry Storcheus, Afshin Rostamizadeh, Sanjiv Kumar
Program Committee includes: Jeffrey Pennington, Vikas Sindhwani

NIPS Time Series Workshop
Invited Speakers include: Mehryar Mohri
Panelists include: Corinna Cortes

Nonparametric Methods for Large Scale Representation Learning
Invited Speakers include: Amr Ahmed

Machine Learning for Spoken Language Understanding and Interaction
Invited Speakers include: Larry Heck

Adaptive Data Analysis
Organizers include: Moritz Hardt

Deep Reinforcement Learning
Organizers include: David Silver
Invited Speakers include: Sergey Levine

Advances in Approximate Bayesian Inference
Organizers include: Shakir Mohamed
Panelists include: Danilo Rezende

Cognitive Computation: Integrating Neural and Symbolic Approaches
Invited Speakers include: Ramanathan V. Guha, Geoffrey Hinton, Greg Wayne

Transfer and Multi-Task Learning: Trends and New Perspectives
Invited Speakers include: Mehryar Mohri
Poster presentations include: Andres Munoz Medina

Learning and privacy with incomplete data and weak supervision
Organizers include: Felix Yu
Program Committee includes: Alexander Blocker, Krzysztof Choromanski, Sanjiv Kumar
Speakers include: Nando de Freitas

Black Box Learning and Inference
Organizers include: Ali Eslami
Keynotes include: Geoff Hinton

Quantum Machine Learning
Invited Speakers include: Hartmut Neven

Bayesian Nonparametrics: The Next Generation
Invited Speakers include: Amr Ahmed

Bayesian Optimization: Scalability and Flexibility
Organizers include: Nando de Freitas

Reasoning, Attention, Memory (RAM)
Invited speakers include: Alex Graves, Ilya Sutskever

Extreme Classification 2015: Multi-class and Multi-label Learning in Extremely Large Label Spaces
Panelists include: Mehryar Mohri

Machine Learning Systems
Invited speakers include: Jeff Dean


SYMPOSIA
Brains, Minds and Machines
Invited Speakers include: Geoffrey Hinton, Demis Hassabis

Deep Learning Symposium
Program Committee Members include: Samy Bengio, Phil Blunsom, Nando De Freitas, Ilya Sutskever, Andrew Zisserman
Invited Speakers include: Max Jaderberg, Sergey Ioffe, Alexander Graves

Algorithms Among Us: The Societal Impacts of Machine Learning
Panelists include: Shane Legg


TUTORIALS
NIPS 2015 Deep Learning Tutorial
Geoffrey E. Hinton, Yoshua Bengio, Yann LeCun

Large-Scale Distributed Systems for Training Neural Networks
Jeff Dean, Oriol Vinyals

VLDB 2015 and Database Research at Google



This week, Kohala, Hawaii hosts the 41st International Conference on Very Large Data Bases (VLDB 2015), a premier annual international forum for data management and database researchers, vendors, practitioners, application developers and users. As a leader in database research, Google will have a strong presence at VLDB 2015 with many Googlers publishing work, organizing workshops and presenting demos.

The research Google is presenting at VLDB involves the work of Structured Data teams who are building intelligent and efficient systems to discover, annotate and explore structured data from the Web, surfacing them creatively through Google products (such as structured snippets and table search), as well as engineering efforts that create scalable, reliable, fast and general-purpose infrastructure for large-scale data processing (such as F1, Mesa, and Google Cloud's BigQuery).

If you are attending VLDB 2015, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people. You can also learn more about our research being presented at VLDB 2015 in the list below (Googlers highlighted in blue).

Google is a Gold Sponsor of VLDB 2015.

Papers:
Keys for Graphs
Wenfei Fan, Zhe Fan, Chao Tian, Xin Luna Dong

In-Memory Performance for Big Data
Goetz Graefe, Haris Volos, Hideaki Kimura, Harumi Kuno, Joseph Tucek, Mark Lillibridge, Alistair Veitch

The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing
Tyler Akidau, Robert Bradshaw, Craig Chambers, Slava Chernyak, Rafael Fernández-Moctezuma, Reuven Lax, Sam McVeety, Daniel Mills, Frances Perry, Eric Schmidt, Sam Whittle

Resource Bricolage for Parallel Database Systems
Jiexing Li, Jeffrey Naughton, Rimma Nehme

AsterixDB: A Scalable, Open Source BDMS
Sattam Alsubaiee, Yasser Altowim, Hotham Altwaijry, Alex Behm, Vinayak Borkar, Yingyi Bu, Michael Carey, Inci Cetindil, Madhusudan Cheelangi, Khurram Faraaz, Eugenia Gabrielova, Raman Grover, Zachary Heilbron, Young-Seok Kim, Chen Li, Guangqiang Li, Ji Mahn Ok, Nicola Onose, Pouria Pirzadeh, Vassilis Tsotras, Rares Vernica, Jian Wen, Till Westmann

Knowledge-Based Trust: A Method to Estimate the Trustworthiness of Web Sources
Xin Luna Dong, Evgeniy Gabrilovich, Kevin Murphy, Van Dang, Wilko Horn, Camillo Lugaresi, Shaohua Sun, Wei Zhang

Efficient Evaluation of Object-Centric Exploration Queries for Visualization
You Wu, Boulos Harb, Jun Yang, Cong Yu

Interpretable and Informative Explanations of Outcomes
Kareem El Gebaly, Parag Agrawal, Lukasz Golab, Flip Korn, Divesh Srivastava

Take me to your leader! Online Optimization of Distributed Storage Configurations
Artyom Sharov, Alexander Shraer, Arif Merchant, Murray Stokely

TreeScope: Finding Structural Anomalies In Semi-Structured Data
Shanshan Ying, Flip Korn, Barna Saha, Divesh Srivastava

Workshops:
Workshop on Big-Graphs Online Querying - Big-O(Q) 2015
Workshop co-chair: Cong Yu

3rd International Workshop on In-Memory Data Management and Analytics
Program committee includes: Sandeep Tata

High-Availability at Massive Scale: Building Google's Data Infrastructure for Ads
Invited talk at BIRTE by: Ashish Gupta, Jeff Shute

Demonstrations:
KATARA: Reliable Data Cleaning with Knowledge Bases and Crowdsourcing
Xu Chu, John Morcos, Ihab Ilyas, Mourad Ouzzani, Paolo Papotti, Nan Tang, Yin Ye

Error Diagnosis and Data Profiling with Data X-Ray
Xiaolan Wang, Mary Feng, Yue Wang, Xin Luna Dong, Alexandra Meliou

Pulling Back the Curtain on Google’s Network Infrastructure



Pursuing Google’s mission of organizing the world’s information to make it universally accessible and useful takes an enormous amount of computing and storage. In fact, it requires coordination across a warehouse-scale computer. Ten years ago, we realized that we could not purchase, at any price, a datacenter network that could meet the combination of our scale and speed requirements. So, we set out to build our own datacenter network hardware and software infrastructure. Today, at the ACM SIGCOMM conference, we are presenting a paper with the technical details on five generations of our in-house data center network architecture. This paper presents the technical details behind a talk we presented at Open Network Summit a few months ago.

From relatively humble beginnings, and after a misstep or two, we’ve built and deployed five generations of datacenter network infrastructure. Our latest-generation Jupiter network has improved capacity by more than 100x relative to our first generation network, delivering more than 1 petabit/sec of total bisection bandwidth. This means that each of 100,000 servers can communicate with one another in an arbitrary pattern at 10Gb/s.
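The bandwidth claim above is straightforward to sanity-check; the snippet below is just the back-of-the-envelope arithmetic, not anything from Google's tooling:

```python
# 1 Pb/s of bisection bandwidth shared across 100,000 servers works out
# to 10 Gb/s per server (1 petabit = 1,000,000 gigabits).
servers = 100_000
per_server_gbps = 10

total_gbps = servers * per_server_gbps   # 1,000,000 Gb/s
total_pbps = total_gbps / 1_000_000      # convert gigabits to petabits
print(total_pbps)  # 1.0
```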

Such network performance has been tremendously empowering for Google services. Engineers were liberated from optimizing their code for various levels of bandwidth hierarchy. For example, they initially faced painful tradeoffs between careful data locality (placing servers under the same top-of-rack switch for bandwidth) and the correlated failures that a single switch failure could cause. A high-performance network supporting tens of thousands of servers with flat bandwidth also enabled individual applications to scale far beyond what was otherwise possible, and enabled tight coupling among multiple federated services. Finally, we were able to substantially improve the efficiency of our compute and storage infrastructure: as quantified in this recent paper, scheduling a set of jobs over a single larger domain supports much higher utilization than scheduling the same jobs over multiple smaller domains.
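The pooling effect at the end of the paragraph can be illustrated with a toy example. The first-fit packer and the job sizes below are illustrative assumptions, not the methodology of the cited paper:

```python
def greedy_pack(jobs, domains):
    """First-fit placement: returns how many jobs could be scheduled."""
    free = list(domains)  # remaining capacity per scheduling domain
    placed = 0
    for demand in jobs:
        for i, capacity in enumerate(free):
            if demand <= capacity:
                free[i] -= demand
                placed += 1
                break
    return placed

jobs = [60] * 6  # six jobs, each needing 60 units of CPU

# Four isolated domains of 100 units each strand 40 unusable units per
# domain, so only four of the six jobs fit.
print(greedy_pack(jobs, [100, 100, 100, 100]))  # 4

# One pooled domain with the same total capacity runs all six.
print(greedy_pack(jobs, [400]))  # 6
```

The stranded capacity in the partitioned case is exactly the fragmentation that a single large scheduling domain avoids.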

Delivering such a network meant we had to solve some fundamental problems in networking. Ten years ago, networking was defined by the interaction of individual hardware elements, e.g., switches, speaking standardized protocols to dynamically learn what the network looks like. Based on this dynamically learned information, switches would set their forwarding behavior. While robust, these protocols targeted deployment environments with perhaps tens of switches communicating among multiple organizations. Configuring and managing switches in such an environment was manual and error-prone. Changes in network state would spread slowly through the network using a high-overhead broadcast protocol. Most challenging of all, the system could not scale to meet our needs.

We adopted a set of principles for organizing our networks that has since become a primary driver of networking research and industrial innovation: Software Defined Networking (SDN). We observed that we could arrange emerging merchant switch silicon around a Clos topology to scale to the bandwidth requirements of a data center building. The topology of all five generations of our data center networks follows the blueprint below. Unfortunately, this meant that we would potentially require 10,000+ individual switching elements. Even if we could overcome the scalability challenges of existing network protocols, managing and configuring such a vast number of switching elements by hand would be impossible.
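For a sense of the numbers, here is a sketch using the textbook k-ary fat-tree formulas for a folded-Clos fabric built from identical k-port switches. This is the standard academic construction, not the specifics of any generation of Google's fabrics:

```python
def fat_tree(k: int) -> dict:
    """Hosts and switch count for a standard k-ary fat-tree (k even)."""
    assert k % 2 == 0, "switch port count must be even"
    hosts = k ** 3 // 4          # servers supported at full bisection bandwidth
    edge = agg = k * (k // 2)    # k pods, each with k/2 edge and k/2 agg switches
    core = (k // 2) ** 2
    return {"hosts": hosts, "switches": edge + agg + core}

print(fat_tree(48))  # {'hosts': 27648, 'switches': 2880}
print(fat_tree(96))  # {'hosts': 221184, 'switches': 11520}
```

Even with 96-port building blocks, reaching a couple hundred thousand hosts already requires on the order of 10,000 switching elements, which is exactly the management problem described above.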
So, our next observation was that we knew the exact configuration of the network we were trying to build. Every switch would play a well-defined role in a larger ensemble. That is, from the perspective of a datacenter building, we wished to build a logical building-scale network. For network management, we built a single configuration for the entire network and pushed that single global configuration to all switches; each switch would simply pull out its role in the larger whole. Finally, we transformed routing from a pair-wise, fully distributed (but somewhat slow and high-overhead) scheme to a logically centralized scheme under the control of a single dynamically elected master, as illustrated in the figure below from our paper.
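The "single global configuration" approach described above can be sketched as follows; the data layout, switch names, and roles are purely illustrative assumptions, not Google's actual configuration format:

```python
# One declarative description of the whole fabric is generated centrally
# and pushed to every switch; each switch then extracts only its own role.
GLOBAL_CONFIG = {
    "fabric": "building-fabric-1",
    "switches": {
        "edge-0-1": {"role": "edge", "uplinks": ["agg-0-1", "agg-0-2"]},
        "agg-0-1":  {"role": "agg",  "uplinks": ["core-1", "core-2"]},
        "core-1":   {"role": "core", "uplinks": []},
    },
}

def local_view(switch_id: str, config: dict) -> dict:
    """What an individual switch keeps after receiving the global push."""
    return config["switches"][switch_id]

print(local_view("agg-0-1", GLOBAL_CONFIG))
```

Because every switch derives its behavior from the same authoritative document, configuration drift between devices (a chronic failure mode of box-by-box management) is eliminated by construction.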
Our SIGCOMM paper on Jupiter is one of four Google-authored papers being presented at the conference. Taken together, these papers show the benefits of embedding research within production teams, reflecting both collaborations with PhD students carrying out extended research efforts with Google engineers during internships and key insights from deployed production systems:

  • Our work on Bandwidth Enforcer shows how we can allocate wide area bandwidth among tens of thousands of individual applications based on centrally configured policy, substantially improving network utilization while simultaneously isolating services from one another.
  • Condor addresses the challenges of designing data center network topologies. Network designers can specify constraints for data center networks; Condor efficiently generates candidate network designs that meet these constraints, and evaluates these candidates against a variety of target metrics.
  • Congestion control in datacenter networks is challenging because of tiny buffers and very small round trip times. TIMELY shows how to manage datacenter bandwidth allocation while maintaining highly responsive and low latency network roundtrips in the data center.

These efforts reflect the latest in a long series of substantial Google contributions to networking. We are excited to be increasingly open about the results of our research work: to solicit feedback on our approach, to influence future research and development directions so that we can benefit from community-wide efforts to improve networking, and to attract the next generation of great networking thinkers and builders to Google. Our focus on Google Cloud Platform further increases the importance of being open about our infrastructure. Since the same network that has powered Google's infrastructure for a decade also underpins our Cloud Platform, all developers can leverage it to build highly robust, manageable, and globally scalable services.

Say hello to the Enigma conference



USENIX Enigma is a new conference focused on security, privacy and electronic crime through the lens of emerging threats and novel attacks. The goal of this conference is to help industry, academic, and public-sector practitioners better understand the threat landscape. Enigma will have a single track of 30-minute talks that are curated by a panel of experts, featuring strong technical content with practical applications to current and emerging threats.
Google is excited to both sponsor and help USENIX build Enigma, since we share many of its core principles: transparency, openness, and cutting-edge security research. Furthermore, we are proud to provide Enigma with engineering and design support, as well as volunteer participation in program and steering committees.

The first instantiation of Enigma will be held January 25-27 in San Francisco. You can sign up for more information about the conference or propose a talk through the official conference site at http://enigma.usenix.org

KDD 2015 Best Research Paper Award: “Algorithms for Public-Private Social Networks”



The 21st ACM conference on Knowledge Discovery and Data Mining (KDD’15), a main venue for academic and industry research in data management, information retrieval, data mining and machine learning, was held last week in Sydney, Australia. In the past several years, Google has been actively participating in KDD, with several Googlers presenting work at the conference in the research and industrial tracks. This year Googlers presented 12 papers at KDD (listed below, with Googlers in blue), all of which are freely available at the ACM Digital Library.

One of these papers, Efficient Algorithms for Public-Private Social Networks, co-authored by Googlers Ravi Kumar, Silvio Lattanzi, Vahab Mirrokni, former Google intern Alessandro Epasto, and research visitor Flavio Chierichetti, was awarded Best Research Paper. The inspiration for this paper comes from studying social networks and the importance of addressing privacy issues in analyzing such networks.

Privacy issues dictate the way information is shared among the members of the social network. In the simplest case, a user can mark some of her friends as private; this would make the connections (edges) between this user and these friends visible only to the user. In a different instantiation of privacy, a user can be a member of a private group; in this case, all the edges among the group members are to be considered private. Thus, each user in the social network has her own view of the link structure of the network. These privacy issues also influence the way in which the network itself can be viewed and processed by algorithms. For example, one cannot use the list of private friends of user X for suggesting potential friends or public news items to another user on the network, but one can use this list for the purpose of suggesting friends for user X.

As a result, enforcing these privacy guarantees translates to solving a different algorithmic problem for each user in the network, and for this reason, developing algorithms that process these social graphs and respect these privacy guarantees can become computationally expensive. In a recent study, Dey et al. crawled a snapshot of 1.4 million New York City Facebook users and reported that 52.6% of them hid their friends list. As more users make a larger portion of their social neighborhoods private, these computational issues become more important.

Motivated by the above, this paper introduces the public-private model of graphs, where each user (node) in the public graph has an associated private graph. In this model, the public graph is visible to everyone, and the private graph at each node is visible only to each specific user. Thus, any given user sees their graph as a union of their private graph and the public graph.
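In code, a user's view of a public-private graph is easy to state. The snippet below is a toy illustration of the model itself (representing edges as pairs), not the paper's implementation:

```python
def user_view(public_edges, private_edges, user):
    """A user's view of the network is the union of the public graph
    and that user's own private graph."""
    return set(public_edges) | set(private_edges.get(user, []))

public = {("alice", "bob"), ("bob", "carol")}
private = {"alice": [("alice", "dave")]}   # visible only to alice

# Alice sees her private edge; Bob does not.
alice_sees = ("alice", "dave") in user_view(public, private, "alice")
bob_sees = ("alice", "dave") in user_view(public, private, "bob")
```

The algorithmic difficulty the paper addresses follows directly from this definition: since each of millions of users has a different union graph, naively recomputing a graph algorithm per user is prohibitively expensive.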

From an algorithmic point of view, the paper explores two powerful computational paradigms for efficiently studying large graphs, namely sketching and sampling, and focuses on some key problems in social networks such as similarity ranking and clustering. In the sketching model, the paper shows how to efficiently approximate the neighborhood function, which in turn can be used to approximate various notions of centrality for each node; centrality scores such as PageRank have important applications in ranking and recommender systems. In the sampling model, the paper focuses on all-pair shortest path distances, node similarities, and correlation clustering, and develops algorithms that compute these notions efficiently on a given public-private graph. The paper also illustrates the effectiveness of the model and the computational efficiency of the algorithms through experiments on real-world social networks.
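For intuition, the neighborhood function that the sketches approximate is simple to state exactly: the number of nodes within distance t of a source. The brute-force BFS below computes it directly (the paper's contribution is approximating this without a fresh traversal for every user's public-plus-private view):

```python
from collections import deque

def neighborhood_function(adj, source, t):
    """Exact count of nodes within distance t of source, via BFS over
    an adjacency-list graph. Sketching replaces this per-user
    traversal with small, mergeable summaries."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == t:
            continue
        for nbr in adj.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return len(seen)

adj = {"a": ["b"], "b": ["c"], "c": ["d"]}
within_two = neighborhood_function(adj, "a", 2)  # counts a, b, c
```

Because sketches of this kind can typically be merged, a user's view can be answered by combining a precomputed public-graph sketch with a small sketch of that user's private edges, rather than re-running BFS per user.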

The public-private model is an abstraction that can be used to develop efficient social network algorithms. This work leaves a number of interesting open research directions, such as obtaining efficient algorithms for the densest subgraph/community detection problems, influence maximization, computing other pairwise similarity scores, and, most importantly, recommendation systems.

KDD’15 Papers, co-authored by Googlers:

Efficient Algorithms for Public-Private Social Networks (Best Paper Award)
Flavio Chierichetti, Alessandro Epasto, Ravi Kumar, Silvio Lattanzi, Vahab Mirrokni

Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC
Sungjin Ahn, Anoop Korattikara, Nathan Liu, Suju Rajan, Max Welling

TimeMachine: Timeline Generation for Knowledge-Base Entities
Tim Althoff, Xin Luna Dong, Kevin Murphy, Safa Alai, Van Dang, Wei Zhang

Algorithmic Cartography: Placing Points of Interest and Ads on Maps
Mohammad Mahdian, Okke Schrijvers, Sergei Vassilvitskii

Stream Sampling for Frequency Cap Statistics
Edith Cohen

Dirichlet-Hawkes Processes with Applications to Clustering Continuous-Time Document Streams
Nan Du, Mehrdad Farajtabar, Amr Ahmed, Alexander J. Smola, Le Song

Adaptation Algorithm and Theory Based on Generalized Discrepancy
Corinna Cortes, Mehryar Mohri, Andrés Muñoz Medina (now at Google)

Estimating Local Intrinsic Dimensionality
Laurent Amsaleg, Oussama Chelly, Teddy Furon, Stéphane Girard, Michael E. Houle, Ken-ichi Kawarabayashi, Michael Nett

Unified and Contrasting Cuts in Multiple Graphs: Application to Medical Imaging Segmentation
Chia-Tung Kuo, Xiang Wang, Peter Walker, Owen Carmichael, Jieping Ye, Ian Davidson

Going In-depth: Finding Longform on the Web
Virginia Smith, Miriam Connor, Isabelle Stanton

Annotating needles in the haystack without looking: Product information extraction from emails
Weinan Zhang, Amr Ahmed, Jie Yang, Vanja Josifovski, Alexander Smola

Focusing on the Long-term: It's Good for Users and Business
Diane Tang, Henning Hohnhold, Deirdre O'Brien

ICSE 2015 and Software Engineering Research at Google



The large scale of our software engineering efforts at Google often pushes us to develop cutting-edge infrastructure. In May 2015, at the International Conference on Software Engineering (ICSE 2015), we shared some of our software engineering tools and practices and collaborated with the research community through a combination of publications, committee memberships, and workshops. Learn more about some of our research below (Googlers highlighted in blue).

Google was a Gold supporter of ICSE 2015.

Technical Research Papers:
A Flexible and Non-intrusive Approach for Computing Complex Structural Coverage Metrics
Michael W. Whalen, Suzette Person, Neha Rungta, Matt Staats, Daniela Grijincu

Automated Decomposition of Build Targets
Mohsen Vakilian, Raluca Sauciuc, David Morgenthaler, Vahab Mirrokni

Tricorder: Building a Program Analysis Ecosystem
Caitlin Sadowski, Jeffrey van Gogh, Ciera Jaspan, Emma Soederberg, Collin Winter

Software Engineering in Practice (SEIP) Papers:
Comparing Software Architecture Recovery Techniques Using Accurate Dependencies
Thibaud Lutellier, Devin Chollak, Joshua Garcia, Lin Tan, Derek Rayside, Nenad Medvidovic, Robert Kroeger

Technical Briefings:
Software Engineering for Privacy in-the-Large
Pauline Anthonysamy, Awais Rashid

Workshop Organizers:
2nd International Workshop on Requirements Engineering and Testing (RET 2015)
Elizabeth Bjarnason, Mirko Morandini, Markus Borg, Michael Unterkalmsteiner, Michael Felderer, Matthew Staats

Committee Members:
Caitlin Sadowski - Program Committee Member and Distinguished Reviewer Award Winner
James Andrews - Review Committee Member
Ray Buse - Software Engineering in Practice (SEIP) Committee Member and Demonstrations Committee Member
John Penix - Software Engineering in Practice (SEIP) Committee Member
Marija Mikic - Poster Co-chair
Daniel Popescu and Ivo Krka - Poster Committee Members