title: "Wrex: A Unified Programming-by-Example Interaction for Synthesizing Readable Code for Data Scientists"
authors: Ian Drosos, Titus Barik, Philip J. Guo, Robert DeLine, Sumit Gulwani
venue: ACM Conference on Human Factors in Computing Systems (CHI)
year: 2020
footer: "Best Paper Award"
tweet: Wrex synthesizes readable data wrangling code within Jupyter Notebooks using programming-by-example
abstract: >
  Data wrangling is a difficult and time-consuming activity in computational
  notebooks, and existing wrangling tools do not fit the exploratory workflow
  for data scientists in these environments. We propose a unified interaction
  model based on programming-by-example that generates readable code for a
  variety of useful data transformations, implemented as a Jupyter notebook
  extension called Wrex. User study results demonstrate that data scientists
  are significantly more effective and efficient at data wrangling with Wrex
  over manual programming. Qualitative participant feedback indicates that
  Wrex was useful and reduced barriers in having to recall or look up the
  usage of various data transform functions. The synthesized code allowed
  data scientists to verify the intended data transformation, increased their
  trust and confidence in Wrex, and fit seamlessly within their cell-based
  notebook workflows. This work suggests that presenting readable code to
  professional data scientists is an indispensable component of offering data
  wrangling tools in notebooks.
bibtex: >
  @inproceedings{Drosos2020,
    author = {Drosos, Ian and Barik, Titus and Guo, Philip J. and DeLine, Robert and Gulwani, Sumit},
    title = {Wrex: A Unified Programming-by-Example Interaction for Synthesizing Readable Code for Data Scientists},
    year = {2020},
    isbn = {9781450367080},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3313831.3376442},
    doi = {10.1145/3313831.3376442},
    booktitle = {Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems},
    pages = {1--12},
    numpages = {12},
    keywords = {data science, program synthesis, computational notebooks},
    location = {Honolulu, HI, USA},
    series = {CHI '20}
  }