Sound Like Me: Findings from a Randomized Experiment

Sound Like Me: Findings from a Randomized Experiment. SSRN Working Paper 4648689. With Donald Ngwe.

A new version of Copilot for Microsoft 365 includes a feature that lets Outlook draft messages that “Sound Like Me” (SLM), based on training from messages in a user’s Sent Items folder. We sought to evaluate whether SLM lives up to its name. We find that it does, and more. Users widely and systematically praise SLM-generated messages as clearer, more concise, and more “couldn’t have said it better myself.” When presented with a human-written message and an SLM rewrite, users say they’d rather receive the SLM rewrite. All these findings are statistically significant. Furthermore, when presented with human and SLM messages, users struggle to tell the difference, in one specification doing worse than random guessing.

(Also summarized in What Can Copilot’s Earliest Users Teach Us About Generative AI at Work? at “Email effectiveness.” Also summarized in AI and Productivity Report at “Outlook Email Study.”)