In 2015, Salesforce researchers working out of a basement under a Palo Alto West Elm furniture store developed the prototype of what would become Einstein, Salesforce’s AI platform that powers predictions across its products. As of November, Einstein is serving over 80 billion predictions per day for tens of thousands of businesses and millions of users. But while the technology remains core to Salesforce’s business, it’s but one of many areas of research under the purview of Salesforce Research, Salesforce’s AI R&D division.

Salesforce Research, whose mission is to advance AI techniques that pave the path for new products, applications, and research directions, is an outgrowth of Salesforce CEO Marc Benioff’s commitment to AI as a revenue driver. In 2016, when Salesforce first announced Einstein, Benioff characterized AI as “the next platform” on which he predicted companies’ future applications and capabilities would be built. The next year, Salesforce released research suggesting that AI’s impact through customer relationship management software alone would add over $1 trillion to gross domestic products around the globe and create 800,000 new jobs.

Today, Salesforce Research’s work spans a number of domains including computer vision, deep learning, speech, natural language processing, and reinforcement learning. Far from exclusively commercial in nature, the division’s projects run the gamut from drones that use AI to spot great white sharks to a system that’s able to identify signs of breast cancer from images of tissue. Work continues even as the pandemic forces Salesforce’s scientists out of the office for the foreseeable future. Just this past year, Salesforce Research released the AI Economist, an environment for understanding how AI could improve economic design; a tool for testing natural language model robustness; and a framework spelling out the uses, risks, and biases of AI models.

According to Einstein GM Marco Casalaina, the bulk of Salesforce Research’s work falls into one of two categories: pure research or applied research. Pure research includes things like the AI Economist, which isn’t immediately relevant to tasks that Salesforce or its customers do today. Applied research, on the other hand, has a clear business motivation and use case.

One particularly active subfield of applied research at Salesforce Research is speech. Last spring, as customer service representatives were increasingly ordered to work from home in Manila, the U.S., and elsewhere, some companies began to turn to AI to bridge the resulting gaps in service. Casalaina says that this spurred work on the call center side of Salesforce’s business.

“We’re doing a lot of work for our customers … with regard to real-time voice cues. We offer this whole coaching process for customer service representatives that takes place after the call,” Casalaina told VentureBeat in a recent interview. “The technology identifies moments that were good or bad but that were coachable in some fashion. We’re also working on a number of capabilities like auto escalations and wrap-up, as well as using the contents of calls to prefill fields for you and make your life a little bit easier.”


AI with health care applications is another research pillar at Salesforce, Richard Socher, former chief scientist at Salesforce, told VentureBeat during a phone interview. Socher, who came to Salesforce following the acquisition of MetaMind in 2016, left Salesforce Research in July 2020 to found a search engine startup but remains a scientist emeritus at Salesforce.

“Medical computer vision in particular can be highly impactful,” Socher said. “What’s interesting is that the human visual system hasn’t necessarily developed to be very good at reading x-rays, CT scans, MRI scans in three dimensions, or more importantly images of cells that might indicate a cancer … The challenge is predicting diagnoses and treatment.”

To develop, train, and benchmark predictive health care models, Salesforce Research draws from a proprietary database comprising tens of terabytes of data collected from clinics, hospitals, and other points of care in the U.S. It’s anonymized and deidentified, and Andre Esteva, head of medical AI at Salesforce Research, says that Salesforce is committed to adopting privacy-preserving techniques like federated learning that ensure patients a level of anonymity.


Salesforce Research’s ethical AI work straddles applied and pure research. There’s been increased interest in it from customers, according to Casalaina, who says he’s had a number of conversations with clients about the ethics of AI over the past six months.

In January, Salesforce researchers released Robustness Gym, which aims to unify a patchwork of libraries to bolster natural language model testing strategies. Robustness Gym provides guidance on how certain variables can help prioritize what evaluations to run. Specifically, it describes how three factors shape that choice: the task itself, through its structure and known prior evaluations; needs such as testing generalization, fairness, or security; and constraints like expertise, compute access, and human resources.

In the study of natural language, robustness testing tends to be the exception rather than the norm. One report found that 60% to 70% of answers given by natural language processing models were embedded somewhere in the benchmark training sets, indicating that the models were often simply memorizing answers. Another study found that metrics used to benchmark AI and machine learning models tended to be inconsistent, irregularly tracked, and not particularly informative.
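
To make the memorization concern concrete, the kind of overlap check behind such a finding can be sketched in a few lines of Python. This is an illustrative sketch only, not the method used in the report cited above or anything taken from Robustness Gym: the function names `normalize` and `answer_overlap` and the toy data are hypothetical, and real contamination studies typically use fuzzier matching than the exact phrase lookup assumed here.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def answer_overlap(test_answers, train_texts):
    """Return the fraction of test answers that appear verbatim
    (after normalization, on word boundaries) in any training text.

    Hypothetical helper for illustration; assumes exact phrase matching.
    """
    # Pad with spaces so matches respect word boundaries.
    corpus = [f" {normalize(doc)} " for doc in train_texts]
    answers = [a for a in (normalize(ans) for ans in test_answers) if a]
    if not answers:
        return 0.0
    hits = sum(any(f" {a} " in doc for doc in corpus) for a in answers)
    return hits / len(answers)


if __name__ == "__main__":
    # Toy data, invented for this example.
    train = [
        "The capital of France is Paris.",
        "Water boils at 100 degrees Celsius.",
    ]
    test = ["Paris", "100 degrees", "Berlin"]
    print(f"{answer_overlap(test, train):.0%} of test answers appear in the training text")
```

A high overlap figure on a benchmark would suggest that a model's score may reflect recall of training text rather than generalization, which is the gap robustness testing is meant to expose.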