Vũ Lê

About
PortraitI am a Research Manager in Microsoft's PROSE team working on various Copilots in Microsoft.
My research powered popular software such as Excel, Visual Studio, Power BI, MSSQL, Power Apps, Power Automate, PowerShell, etc. Prior research helped find 600+ real bugs in GCC and LLVM compilers.
  • Email:
  • Phone: 425-421-4582
  • Address: 1 Microsoft Way
    Redmond, WA 98052
  • Curriculum Vitae: Download

Education

University of California, Davis, USA

Ph.D. (2009 - 2015)

Advisors: Zhendong Su and Sumit Gulwani
Thesis: Program Synthesis for Empowering End Users and Stress-Testing Compilers

University of Technology - Vietnam National University

B.Eng. in Computer Science (2001 - 2006)

Awards

  • PLDI Distinguished Paper Award, PLDI '14, Edinburgh, 2014
  • Best Graduate Researcher Award, Computer Science Department, UC Davis, 2014
  • Gold Medal at ACM Student Research Competition, PLDI '13, Seattle, 2013
  • PhD Fellowship, Vietnam Education Foundation, 2009

Publications

Research Projects

PROSE

Microsoft

  • Period: July 2015 - Present
  • Collaborators: PROSE team

Microsoft Program Synthesis using Examples (PROSE) SDK is a framework of technologies for programming by examples – automatic generation of programs from input-output examples at runtime. Given a domain-specific language (DSL) and some input-output examples for the desired program’s behavior, PROSE algorithms synthesize a ranked set of DSL programs that are consistent with the given examples. The PROSE SDK also includes a pre-defined set of specific technologies for various data wrangling domains, such as: FlashFill, a library of string transformations from Microsoft Excel 2013; text extraction library with a DSL of programs that find desired regions in a textual file; web extraction library with a DSL of programs that select elements on a web page.

Critical Software Validation

UC Davis

  • Period: January 2013 - Present
  • Collaborators: Merhdad Afshari, Zhendong Su, Chengnian Sun

An umbrella project whose goal is to improve the quality of all aspects of critical software, such as compilers and database engines. We introduce Equivalence Modulo Inputs, a novel methodology to generate valid tests from existing programs to find bugs in compiler optimizers [PLDI14a, OOPSLA15]. We are also developing techniques to find bugs in other important compiler components: link-time optimizers [ISSTA15] and warning infrastructures [ICSE16]. In total, we have reported 600+ new bugs (majority are miscompilation) in GCC and LLVM, most of which have been confirmed and fixed by developers.

FlashM

Microsoft Research

  • Period: June 2014 - June 2015
  • Main Collaborator: Sumit Gulwani

A project that combines various program synthesis techniques and tools (e.g., FlashExtract, FlashFill, and NLyze) to enable users to perform more sophisticated tasks that otherwise cannot be done using a single technique alone [UIST15]. For example, we enable editing by examples by using FlashExtract to highlight regions that need to be edited repetitively, and applying transformation tools such as FlashFill to transform these regions.

FlashExtract

Microsoft Research

  • Period: April 2013 - present
  • Collaborator: Sumit Gulwani

A general framework that allows users to extract relevant data from semi-structured documents (text files, web pages, spreadsheets) using examples (demo video) [PLDI14b]. FlashExtract has been shipped as part of PowerShell in Windows 10 and as the Custom Fields feature in Microsoft Operations Management Suite.

SmartSynth

UC Davis, Microsoft Research

  • Period: January 2011 - September 2013
  • Collaborators: Peli de Halleux, Sumit Gulwani, Zhendong Su

An end-to-end programming system for synthesizing smartphone automation scripts from natural language descriptions [MobiSys13a]. SmartSynth has been integrated into TouchDevelop, a popular programming environment on mobile devices (demo video) [MobiSys13b].

Ariadne

UC Davis

  • Period: June 2010 - July 2012
  • Collaborators: Earl Barr, Zhendong Su, Thanh Vo

A practical symbolic execution system that was specifically designed for detecting floating-point exceptions [POPL13]. Ariadne has found 2,091 floating-point exceptions in the GNU Scientific Library (GSL).

Work Experience

Researcher

Microsoft Corporation

  • Period: July 2015 - present
  • Team: PROSE

Working on the PROSE framework.

Consultant

Microsoft Research Redmond

  • Period: October 2014 - June 2015
  • Supervisor: Sumit Gulwani

Worked on the FlashM project that combines multiple synthesis paradigms. Improving FlashExtract logics and ranking. Collaborating with other product groups in Microsoft to bring FlashExtract technology to more customers.

Research Intern

Microsoft Research Redmond

  • Period: June 2014 - September 2014
  • Supervisor: Sumit Gulwani

Worked with the PowerShell team to help transfer FlashExtract to PowerShell (shipped with Windows 10).

Research Intern (twice)

Microsoft Research Redmond

  • Period: April 2013 - September 2013
  • Supervisor: Sumit Gulwani

Designed and developed FlashExtract, a general program synthesis framework that enables data extraction from semi-structured documents using examples (published in PLDI 2014).

Research Intern

Microsoft Research Redmond

  • Period: June 2012 - September 2012
  • Supervisor: Peli de Halleux and Sumit Gulwani

Designed and developed a natural language front-end for TouchDevelop to synthesize TouchDevelop scripts from English keywords (integrated into TouchDevelop).