Cohere, a startup creating large language models to rival those from OpenAI, today announced the launch of a nonprofit research lab: Cohere For AI. Cohere says that the lab, headed by Google alum Sara Hooker, will work to solve some of the industry’s toughest challenges by contributing “fundamental research” to the open source community.
“We’re really excited to be leading a new nonprofit AI research lab as we continue to broaden how and where research is done. There’s so much to be uncovered, and our focus will be on collaborating openly and contributing fundamental research,” Hooker told TechCrunch via email. “At the same time, a key component of our work will be to expand the community and help train the next generation of talent, creating new entry points to work on core research.”
There’s long been a concern within the AI community that not enough funding is being set aside for AI research outside of wealthy corporations. One study found that ties to corporations — either funding or affiliation — in AI research significantly grew from 2008 to 2019. Another study showed that Google parent company Alphabet, as well as Amazon and Microsoft, hired a whopping 52 tenure-track AI professors between 2004 and 2018, removing these would-be teachers from academic and nonprofit work.
Concentrating power within corporations has a number of obvious downsides, but one of the most alarming is that it tends to underemphasize values such as beneficence, justice, and inclusion on the research side. A number of experts, speaking to Wired for a 2020 piece, pointed out that corporate AI projects have led to an “unscientific fixation” on projects only possible for people with access to powerful data centers. Regardless of domain, work within companies is often closely guarded, taking years to see the light of day, if it ever does.
“Our agenda is centered on advancing progress on machine learning questions alongside community-focused research,” Hooker said. “We also want to have a proactive research agenda so that we can identify major challenges before they become problems we need to fix retroactively. We’re focused on a variety of different disciplines to work on bias mitigation, for instance, and a very core piece of the research surrounds AI safety and the robust use of models.”
Another core component Cohere For AI is hoping to build out is access to compute resources, Hooker said, specifically helping researchers better utilize “cutting-edge” models to help develop their work. The role of compute access is changing, as illustrated by the trends in language models (i.e., AI systems that understand and generate text). Only a few years ago, creating a highly sophisticated language model required massive compute resources. But now, thanks to academic breakthroughs and the work of the open source community, the barriers to entry are far lower than they used to be.
Road to nonprofit
Backed by AI luminaries including UC Berkeley AI lab co-director Pieter Abbeel, Cohere was founded in 2019 by a pedigreed team that includes Aidan Gomez, Ivan Zhang, and Nick Frosst. Gomez coauthored the academic paper “Attention Is All You Need,” which introduced the world to a fundamental AI model architecture called the Transformer. (Among other high-profile systems, OpenAI’s GPT-3 and Codex are based on the Transformer architecture.) Zhang, alongside Gomez, is a contributor at For.ai, an open AI research collective involving data scientists and engineers.
“For.ai was designed to help early-career enthusiasts better engage with more experienced researchers,” Hooker said. “Many of the founding members went on to pursue Ph.Ds., or work at academic or industry labs. At the time, For.ai was one of the first community-driven research groups to support independent researchers around the world. Now, the Cohere team and its supporters are excited to reintroduce the original concept but with more resources built out from Cohere.”
According to Hooker, Cohere For AI will offer ways for data scientists to “meet and collaborate” through mentorship research opportunities, engagement with traditional conferences, and contributions to research journals. Part of this will be through promoting stewardship of open source scientific practices and the “responsible” release of code, as well as supporting efforts that encourage “scientific communication” through different mediums, like blog posts.
“We really want to fashion Cohere For AI as an ambitious research lab that is contributing to the research community, but is also prioritizing how to better involve a diverse set of voices. We want to help change where, how, and by whom research is done,” Hooker said.
Despite its lofty goals, Cohere For AI — which Cohere itself will fund — is likely to invite skepticism from researchers wary of Cohere’s corporate ties. Cohere has raised $170 million to date from institutional venture capital firms including Tiger Global Management and Index Ventures, and has a number of associations with Google. Google Cloud AI chief scientist Fei-Fei Li and Google fellow Geoffrey Hinton were early backers of Cohere, and Gomez and Frosst previously worked at Google Brain, one of Google’s AI research divisions. Cohere also has a partnership with Google to train large language models on the company’s dedicated hardware infrastructure.
Google infamously dissolved an AI advisory board in 2019 just one week after forming it. And in 2020, the company fired leading AI researcher Timnit Gebru in what she claimed was retaliation for sending colleagues an email critical of Google’s managerial practices. Google subsequently dismissed another ethicist, Margaret Mitchell, who’d publicly denounced the company’s handling of the situation, and a third, Satrajit Chatterjee, after he coauthored a paper questioning Google’s work in AI-powered chip design systems.
Paved with good intentions
Broadly speaking, nonprofit initiatives to fund AI research have been a mixed bag.
Among the success stories is the Allen Institute for AI (AI2), founded by the late Microsoft cofounder Paul Allen, which seeks to achieve scientific breakthroughs by constructing AI systems with reasoning capabilities. While not strictly nonprofit, Anthropic, launched by former OpenAI executives, has raised over half a billion dollars to research “reliable, interpretable, and steerable” AI systems.
But for every AI2 and Anthropic, there’s an OpenAI, which began as a nonprofit before transitioning to a capped-profit structure and accepting a $1 billion investment from Microsoft. Meanwhile, former Google chairman Eric Schmidt’s recently announced $125 million fund for AI research attracted fresh controversy after Politico reported that Schmidt wields unusually heavy sway over the White House Office of Science and Technology Policy. (One of the first recipients, Berkeley professor Rediet Abebe, asked for her name to be removed from consideration.)
Some newer collectives have shown promise, however, notably Gebru’s Distributed AI Research, a global nonprofit for AI research. Projects like Hugging Face’s BigScience and EleutherAI are other strong examples of what can be achieved in AI beyond the bounds of corporate influence.
“Ultimately, it’s up to us to prove that Cohere For AI won’t turn to venture over time,” Hooker said. “Though Cohere For AI will rely on Cohere for resources and funding, a distinct separation has been created between the two to preserve its independence as a research lab. This separation is crucial so that it can continue to contribute to and serve the broader community as an independent entity. Cohere For AI is structured as a nonprofit, and was intentionally designed to collaborate openly with many different organizations. Its work will be open source to enable more access to the wider community.”