How Ethan and Lilach Mollick started automating their jobs

March 14, 2023

About 10 minutes into my interview with Ethan Mollick, a professor at the University of Pennsylvania’s Wharton business school who has become a prominent evangelist for AI tools, it became clear that he was going to use Bing to interview me.

He started by asking the Microsoft search engine, newly infused with a generative AI model from OpenAI, “Can you look at the work of Dylan Matthews of Vox and tell me some common themes, as well as any strengths or weaknesses.” In a couple seconds, Bing had an answer: “Dylan Matthews is one of the senior correspondents at Vox. He covers topics such as effective altruism, philanthropy, global health, and social justice.” (So far, so good.)

Dylan “often uses charts, graphs, tables, and quotes from experts and sources to support his arguments,” it continued, but “other Vox writers may have different writing styles and tones depending on their topic and audience.” For instance, “Some may aim to entertain readers with interesting facts or stories,” which I guess is not something the machines think I do.

Mollick wasn’t done interrogating. He asked for examples of some of the best praise and criticism of my articles, and unearthed some scathing critiques of an old tongue-in-cheek defense of monarchy I once wrote (“This is a terrible article,” noted one poster. “It’s full of cherry-picked data”), and some nice notes on a feature I wrote about effective altruism last summer.

Taking that thread and running with it, Mollick asked Bing for ideas of papers on the topic of effective altruism and some names of journals that might take them; he got three suggestions, with links to previous articles the journals had run on the topic. (Notably, given generative AI’s occasional tendency to hallucinate false facts, one journal was paired with an article it didn’t run, and with an author who did not even write that article.)

Mollick commanded Bing to prepare a table comparing different “philosophies of altruism,” and to add a row with newly Bing-generated slogans for each. This is what it delivered:

Bing creates a table comparing psychological, biological, and ethical altruism on command.
Prompts by Ethan Mollick

While “Survive and thrive by helping your kin” was not the way my evolutionary biology professor in college explained kin selection … it’s a lot catchier than anything you’ll find in a textbook.

Neither Ethan Mollick nor Lilach, his equally AI-obsessed research collaborator at Wharton and his spouse, is an AI expert by background. Ethan researches and teaches entrepreneurship, while Lilach develops interactive simulations meant to help students try out scenarios like job interviews, elevator pitches to investors, and running an early-stage startup. But the two have become some of the most active (and, in Ethan’s case, most vocal) power users of generative AI, a category that spans from Bing and ChatGPT on the text side to DALL-E and Stable Diffusion for images.

When she started using ChatGPT, Lilach recalls, “My world fell apart. I thought, ‘This is crazy.’ I couldn’t believe the output it was giving me. I couldn’t believe the feedback it was giving me.”

Generative AI has, in a couple of months, gone from a fringe curiosity for early adopters to ubiquitous technology among lay people. ChatGPT racked up over 660 million visits in January. The bank UBS estimates that it took two months for the software to gain 100 million monthly active users; for comparison, TikTok took nine months, and Facebook took four and a half years. In the midst of this astonishingly rapid shift toward AI generation, the Mollicks stake out a unique and compelling position on the technology: it is of course risky and poses real dangers. It will get things wrong. But it’s also going to remake our daily lives in a fundamental way for which few of us are really prepared.

It’s a mistake to ignore the risks posed by these large language models (LLMs), which range from making up facts to belligerent behavior to the possibility that even sophisticated users will begin thinking the AI is sentient. (It’s not.) But the Mollicks argue it would also be a mistake to miss what the existence of these systems means, concretely, right now, for jobs that consist of producing text. Which includes a lot of us: journalists like me, but also software engineers, academics and other researchers, screenwriters, HR staffers, accountants, hell, anyone whose job requires what we used to call paperwork of any kind. “If we stop with Bing, it would be enough to disrupt like 20 different major industries,” Ethan argued to me. “If you’re not using Bing for your writing, you’re probably making a mistake.”

I hadn’t been using Bing for writing until I heard him say that. Now I can’t stop.

Generative AI’s potential

Don’t take the Mollicks’ word for it: Just read the studies, which Ethan enthusiastically sends to his more than 17,000 (free) Substack subscribers and more than 110,000 Twitter followers.

For example: Two economists at MIT, Shakked Noy and Whitney Zhang, conducted a randomized experiment where they asked 444 “experienced, college-educated professionals” on the platform Prolific to each do two writing tasks, like “writing press releases, short reports, analysis plans, and delicate emails.” Noy and Zhang then had another team of professionals, matched to the same occupations as the test subjects, review their work, with each piece of writing read three times.

An image generated by Midjourney when given the prompt: “words forming hands, holding hands with AI.”
Midjourney

Half the participants, though, were instructed to sign up for ChatGPT, trained in it, and told they could use it on the second of their two tasks. The ChatGPT group completed the assignment in an average of 17 minutes, compared to 27 in the control group, cutting the time required by over a third. Evaluators also graded the ChatGPT output as substantially better: On a scale of 1 to 7, the ChatGPT group averaged a 4.5, compared to 3.8 for the control group. And the participants managed these results in the few months (weeks, really) the application had been around, before almost anyone had time to master it.

Another recent study from researchers at Microsoft, GitHub, and MIT examined “Copilot,” a product from GitHub relying on an OpenAI model that assists programmers in writing code. “Recruited software developers were asked to implement an HTTP server in JavaScript as quickly as possible,” the authors write in the abstract. “The treatment group, with access to the AI pair programmer, completed the task 55.8% faster than the control group.” That’s not the hardest programming task there is — but still. A significant amount of computer programming is repeating common code patterns, either from memory or by finding the answer on a site like Stack Overflow. AI can make that part of the job much, much faster.
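To see why that result is plausible, consider what the study’s task actually involves. A bare-bones HTTP server is almost pure boilerplate. Below is a minimal sketch of that kind of code, written in TypeScript for Node; the port and response text are illustrative assumptions of mine, not the study’s actual assignment.

```typescript
// A minimal HTTP server in Node, written in TypeScript: the kind of
// create-server, set-headers, respond, listen-on-a-port boilerplate
// that tools like Copilot can fill in from a short comment.
// The port and response text are illustrative, not from the study.
import * as http from "http";

const server = http.createServer((req, res) => {
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end("Hello, world\n");
});

server.listen(8080, () => {
  console.log("Listening on http://localhost:8080");
});
```

Nothing here is hard; it is remembered syntax, which is exactly the part of the job AI handles fastest.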

A third paper, from Princeton’s Edward Felten, Penn’s Manav Raj, and NYU’s Robert Seamans, tried to systematically estimate which jobs will be most exposed to, or affected by, the rise of large language models. They found that the single most affected occupation class is telemarketers — perhaps unsurprising, given that their entire job revolves around language. Every single other job in the top 10 is some form of college professor, from English to foreign languages to history. Lest the social scientists get too smug about their struggling humanities peers, sociology, psychology, and political science aren’t far behind.

Once upon a time, people like academics, journalists, and computer programmers could take some satisfaction in our status as “knowledge workers,” or parts of the “creative class.” Our jobs might be threatened by low ad revenue or state budget cuts, and the compensation was somewhat lacking, but those jobs were literally high-minded. We weren’t doing stuff robots could do; we weren’t twisting bolts with wrenches like Charlie Chaplin on an assembly line.

Now, however, we have tools with the potential to automate a significant portion of our jobs. They can’t automate the whole thing (not yet, as long as they can’t reliably distinguish accurate sentences from inaccurate ones or construct narratives thousands of words long), but then again, what tool has ever met that standard? Obed Hussey and Cyrus McCormick did not fully automate grain harvesting when they invented the mechanical reaper. But they still transformed farming forever. (And if you don’t know who Hussey and McCormick are … ask ChatGPT.)

Academia after the bots

The Mollicks don’t just talk the talk. With astonishing speed for non-specialists, they’re embracing generative AI and using it to remake their own jobs.

Beginning in December, Ethan used ChatGPT to devise a syllabus for an introductory course on entrepreneurship, to come up with a final assignment, and to develop a grading rubric for that assignment. He then used it to produce a test submission for the assignment, and to grade that submission using the rubric the AI had created.

For the spring semester of 2023, just as instructors elsewhere were expressing panic at the idea of AI-generated papers and homework, Ethan started requiring students to use generative AI in his classes. As Ann Christine Meidinger, an exchange student from Chile who is in two of his classes this semester, put it, “Basically both of his classes turned out to be the AI classes. That’s how we refer to them — ‘the AI class.’”

What’s striking is that neither class is about AI, per se. One, “Change, Innovation & Entrepreneurship,” is a how-to course on leadership and related skills, built around interactive simulations, that he’s taught for the last four years.

The other course, “Special Topics in Entrepreneurship: Specialization Is For Insects,” named after a quote from the sci-fi writer Robert Heinlein, is a kind of potpourri of skills training. Week two teaches students to make physical product prototypes and app prototypes; week three is about running a kitchen for a restaurant business.

These don’t seem like obvious courses in which to put AI to work. But Meidinger says that AI proved essential in a simulation of a startup business in the entrepreneurship class. Students were assigned to a wacky scientist’s food startup and instructed to turn it into a real business, from finding funders to preparing pitches for them and divvying up shares. “Within five, six sessions we ended up coming up with a full-on business, to work on the financials, the cash flow statement — probably as close as it can get to real life,” Meidinger recalls.

AI was the only way she got through with her wits about her. “You get these monster emails” as part of the simulation, she said. “It’s faster to just copy-paste it in and say ‘summarize’ in AI. It would give you a three-line summarization instead of having to go through this massive email.” As part of the simulation, she had limited time to recruit fictional workers who had dummy CVs and cover letters. The AI let her summarize all those in seconds. “The simulation is paced to make you feel always a little behind, with less time than you would want to,” she recalls. That makes sense: Starting a business is a hectic, harried experience, one where time is quite literally money. “But in our team, we had down moments, we literally had everything sorted out. … That was, I think, only possible thanks to AI.”

Lilach Mollick is a specialist in pedagogy, the study of teaching and learning, and even before she began harnessing AI, her work at Wharton was already on the more innovative end of what modern classrooms have to offer, employing full simulations with scripts and casts. She helped design the business simulation Meidinger did, for instance.

“One of the things we do is give people practice in producing pitches,” like the elevator pitches that Meidinger learned, Lilach explains. “We give students practice with it, we give them feedback, we let them try it again within a simulation. This takes months and months of work, the hiring of actors, the scripting, the shaping — it’s kind of crazy.”

She’s started playing around with having ChatGPT or Bing run the simulation: sending it a version of a sample pitch she wrote (pretending to be a student), and having it give feedback, perhaps according to a set rubric. “It wasn’t perfect, but it was pretty good. As a tutor, that takes you through some deliberate practice, I think this has real potential.”

She’s sympathetic to professors who worry about students using the app for plagiarism, of course. But part of the harm of plagiarism, she notes, is that it’s a shortcut. It lets students get out of actually learning. She strongly believes that generative AI, used correctly, is “not a shortcut to learning. In fact, it pushes you to learn in new and interesting ways.”

Ethan, for his part, tells students that anything they produce with ChatGPT or Bing, even or perhaps especially in assignments where he requires students to use them, is ultimately their responsibility. “Don’t trust anything it says,” his AI policy states. “If it gives you a number or fact, assume it is wrong unless you either know the answer or can check in with another source. You will be responsible for any errors or omissions provided by the tool.” So far, he says his students have lived up to that policy. They’re not idiots. They know it’s a tool with limitations — but a very cool tool that can supercharge their output, too.

Do journalist androids summarize studies about electric sheep?

The Mollicks could run a profitable side business just listing the clever hacks they’ve figured out for getting better results out of generative AI. (At least until the AI starts doing that itself.) Do you want to improve the style of its writing? Ask it to look up the style of writers you admire. Want better substance? Act like its editor, giving it specific feedback for incremental improvements after each draft. And make sure to ask for “drafts” of writing: Lilach notes that Bing will sometimes raise ethical objections when asked to do certain tasks, such as writing in the style of a specific individual, but if it’s merely “drafting,” it forgets its objections. Ask it to “look up” information so it’s sure to search and get sources.
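To make the “act like its editor” tip concrete, here is a minimal sketch of what that draft-and-feedback loop looks like when scripted against OpenAI’s chat completions API rather than typed into a chat window. This is my illustration, not the Mollicks’ workflow; the model name, prompts, and feedback text are all assumptions for the example.

```typescript
// A sketch of the "act like its editor" loop: request a draft, then send
// specific, incremental feedback and request a revision. Uses OpenAI's
// public chat completions endpoint; requires Node 18+ (built-in fetch)
// and an OPENAI_API_KEY environment variable. The model, prompts, and
// feedback text are illustrative assumptions.

type Message = { role: "system" | "user" | "assistant"; content: string };

async function chat(messages: Message[]): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model: "gpt-3.5-turbo", messages }),
  });
  const data: any = await res.json();
  return data.choices[0].message.content;
}

async function main() {
  const history: Message[] = [
    { role: "user", content: "Draft a one-paragraph elevator pitch for a campus food startup." },
  ];
  const draft = await chat(history);
  history.push({ role: "assistant", content: draft });

  // The editor's move: specific, incremental feedback, not "make it better."
  history.push({
    role: "user",
    content: "Good draft. Cut the cliches, name one concrete customer, and end with a question. Revise it.",
  });
  console.log(await chat(history));
}

main().catch(console.error);
```

The scripting is incidental; the point is the shape of the loop: keep the previous draft in the conversation and respond to it with specific notes, exactly as an editor would.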

I figured I should try these tips out myself. In early March, I finally got off the waitlist to use the new AI-inflected Bing. This is Vox, so I asked it to explain the news.

I wanted Bing to walk me through how the Russian invasion of Ukraine has progressed in 2023. It took a few attempts to really get what I wanted.

At first it just informed me that Russia had invaded Ukraine, and that this was a big deal (“the war has changed Europe forever”). Accurate but not very impressive.

But I kept asking it questions, and importantly, asking it better questions. “Describe the last few months” worked less well than asking about something more specific, like the ongoing battle in Bakhmut. Asking it to look up information always helped, and reduced inaccuracies (which could be fairly frequent in the early going). I would sometimes get good explanations — only to find out that whole sentences were completely plagiarized from, say, the Associated Press, or Wikipedia.

Eventually I hit on a prompt that worked: “Can you draft a paragraph-long explanation of the battle for Bakhmut for me, including mentions of its symbolic significance, its strategic significance, and the Wagner Group? Please don’t copy whole paragraphs from existing sources but compose new ones.” Here’s what it gave me:

The battle for Bakhmut is a fierce and ongoing confrontation between Ukrainian and Russian forces in eastern Ukraine. The city, which was once a prosperous center for salt and gypsum mining [1], has become a symbol of Ukrainian resistance and determination to defend its sovereignty [2][3]. Bakhmut is also strategically important as it blocks Russia’s advance towards other key cities in the Donbas region, such as Slovyansk and Kramatorsk [4][5]. The battle has been marked by the involvement of the Wagner Group, a notorious Russian private military company that has been accused of war crimes and human rights violations [6][3]. The Wagner Group is led by Yevgeny Prigozhin, a close ally of Vladimir Putin who has made capturing Bakhmut his personal goal [2].

Honestly? I’ve turned in much worse drafts than this. Running it through online plagiarism checkers, I found no copying. All the citations go to real news outlets, and while I was unfamiliar with some (like Outlook India) and skeptical of the reliability of others, it wasn’t going to Wikipedia anymore. Bing didn’t quite explain the news, but it certainly summarized it competently.

An image generated by Midjourney when given the prompt: “texts and patterns forming a journalist.”
Midjourney

I’m not freaking out yet that AI will replace people in jobs like mine. Historically, automation has led to more and better employment, not less and worse. But it has also dramatically changed what those jobs, and our world, look like. In 1870, about half of United States workers worked in agriculture. By 1900, only a third did. Last year, just 1.4 percent did. The consequence is not that Americans starve, but that a vastly more productive, heavily automated farming sector feeds us and frees the other 98.6 percent of the workforce to do other work, hopefully work that interests us more.

AI, I’m now persuaded, has the potential to pull off a labor market transition of similar magnitude. The Mollicks have convinced me that I am — we all are — sleeping on top of a volcano. I do not know when exactly it will erupt. But it will erupt, and I don’t feel remotely prepared for what’s coming.
