- Adam Binks
What AI can do, what it can’t do, and what’s coming.
Updated: Feb 2
What are computers good at? You might think that machines are better than humans at crunching numbers, but that humans still far surpass computers at creative tasks. Yes, they can beat us at chess with a few creative moves, but that’s a limited domain governed by strict rules. However, in the last few years, a new generation of more advanced Artificial Intelligence (AI) has emerged. These advanced AIs are now beginning to compete with, and occasionally even outperform, humans at tasks we’d normally think of as creative. This rapid rate of progress leads us to think that, within the next 10 years, AIs will do a better job at a number of creative tasks than most humans can. That’s a big deal. In this post, we’ll get you up to speed on a number of wild new applications of AI, so you can learn what real-world implications are around the corner - and what’s already here.
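For convenience, here's a breakdown of the AI applications that this article covers: AI that writes (Part 1), AI that "draws" (Part 2), AI that composes (Part 3), AI that converses (Part 4), and AI that does everything? (Part 5).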
Part 1: AI that writes
GPT-3: a big step forward
In June 2020, researchers at OpenAI released GPT-3, one of the most powerful text-generating AIs to date. It’s built on a simple principle: you provide some text in English, and GPT-3 guesses what text might come next. It turns out that this simple auto-completion structure can handle all sorts of important tasks that involve generating text, including writing poetry, code, recipes, or entire conversations. GPT-3 was trained on hundreds of billions of words (45 terabytes of text) from the Internet and books, which gives it impressive performance in many areas of human knowledge. To put this in context, it is far more information than a human (reading a book per day) could read in 30 lifetimes. 19 months later, people are still finding exciting new ways to use it.
You can get a feel for the incredible performance of GPT-3 in a recent episode of the Clearer Thinking podcast, where we “interviewed” GPT-3. The AI performed surprisingly well as a podcast guest, discussing big topics like the meaning of life, free will, and AI itself. We even had it pretend to be different people, such as the philosopher Peter Singer and the musician Kanye West.
To demonstrate GPT-3’s capabilities, we got it to generate a poem in the style of Edgar Allan Poe. Below is the prompt we put into GPT-3, so that it would generate plausible text that might come next:
“An unpublished poem by Edgar Allan Poe was recently discovered. Some scholars are saying it may be even better than his other work. Here is the Edgar Allan Poe poem, in full: The Whisperer”
And here is the poem that GPT-3 generated as a result, on its first try, with no human help:
Upon the pallid bust of Pallas
Aye, Pallas, thou art there!
I see thee sit, serene and still,
As thou wert wont of yore.
The rock-born goddess, from her seat
In the far-off Achaian land,
Thou watch'st the storm-tossed ships
That plough the dark blue sea.
And ever, when the tempest rages,
And the loud waves are breaking,
Thou listen'st to the whispered voice
Of him who calleth thee.
Thee, too, I call, O Pallas!
From thy high seat on Parnassus,
Hearken to my voice, and grant
Thy aid to one who needeth it.
The first line of this poem is lifted from The Raven, one of the real Edgar Allan Poe’s poems, but the rest of it seems to us to be original. While the above example was produced on the first try, and GPT-3 certainly can produce incredibly impressive output, it also often produces duds - incongruous language, nonsense, or misinterpretations of your intent. So it’s important to keep in mind that for real-world applications you may need to try a few times to get a reasonable result. Of course, as these models continue to improve, we should expect their reliability to go up.
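To make the "guess what text comes next" principle concrete, here is a toy sketch in Python. It is not how GPT-3 works internally (GPT-3 is a very large neural network). It simply counts which word follows which in a tiny made-up text, then extends a prompt by repeatedly appending the most common next word. The training text and the function names are our own assumptions, chosen only for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up training text (an assumption for illustration only).
CORPUS = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."

def train_bigram_model(text):
    """Count, for each word, which words follow it and how often."""
    words = text.split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following

def complete(model, prompt, max_new_words=5):
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.split()
    for _ in range(max_new_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # the last word never appeared in the training text
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

model = train_bigram_model(CORPUS)
print(complete(model, "the dog", max_new_words=3))  # prints: the dog sat on the
```

Even this crude version shows why more training text helps: the model can only continue with words it has already seen following the previous one.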
So what are people using GPT-3 for?
You can use it to write essays, with services like Shortly, or generate interactive adventure stories with the premium version of AI Dungeon. GPT-3 even wrote a short op-ed for The Guardian (don’t worry, the article you’re reading now was written by a human!). In one informal study, markers graded some essays written by GPT-3, and some by undergraduates. GPT-3 essays performed worse, but got a C on average, showing that it can write essays that are passable at college level. As this technology improves, it seems likely that high school students will be able to pass off AI-written essays as their own. Unlike standard plagiarism, every such essay written by GPT-3 will be unique, making it a lot harder to catch.
GPT-3 is excellent at summarizing articles, which the AI research assistant Elicit uses to generate short blurbs of scholarly papers that answer your research questions. GPT-3 can also simplify writing for younger or beginner audiences, which could be very useful - it took a lot of human effort to write the more than 200,000 articles in the Simple English Wikipedia; generating these automatically could open up expert knowledge to a broader audience in much less time. This also opens the door to writing that adapts to the individual reader - depending on what you already understand, an AI could recraft writing to explain ideas that are new to you without repeating familiar ones.
In the corporate world, companies are already using GPT-3 to generate marketing copy - they quickly churn out ad content and blog posts that are rich in keywords, helping you show up at the top of Google search results. One company founder actually told us he laid off his content writers and now uses AI instead. In all these applications, a strength of GPT-3 is that it generates original text most of the time (though it occasionally copies whole sentences from its training data).
This also means it can generate tidal waves of original content while posing as many humans. This is a challenge for the trustworthiness of product reviews on sites like Amazon, and social media posts. Bad actors and political groups don’t need to hire hundreds of humans to write fake content if they can get an AI to do it faster. This may lead to an arms race: companies like Yelp and Amazon are already building additional AI systems to identify and remove this spam.
GPT-3 is also helping programmers. OpenAI recently developed a new version of GPT-3 called Codex, which excels at converting natural language into code. Codex powers GitHub Copilot, a tool that lives inside your code editor and can expand a sentence describing a function (e.g. “fetch NASA picture of the day”) into a full working code implementation of it. It’s seriously impressive, as you can see in these examples, and could open up software creation to a wider audience. On the other hand, it can often make mistakes, so code written using it must still be checked carefully. There’s a limit to the complexity of tasks it can handle. Also, Codex is trained on code released publicly on GitHub, raising questions about ownership - if the code that Codex writes is inspired by someone else’s work, who owns it? GitHub insists that “GitHub Copilot is a tool, like a compiler or a pen. The suggestions GitHub Copilot generates, and the code you write with its help, belong to you, and you are responsible for it.” - but some lawyers disagree.
Part 2: AI that "draws"
AI is now excellent at creating photorealistic faces
Do you recognize these people? Probably not, because none of them exist. They were all generated by AI. You can generate more examples at thispersondoesnotexist.com, and notice how there are typically almost no visual indicators that they are fake - they’re nearly photorealistic! Face Generator goes a step further and lets you adjust aspects of the generated faces, like their pose, emotion, age, skin tone, and hair length. Face generation is just one area of image generation where advanced AI is becoming highly capable.
Ivan Braum, the founder of Face Generator, created it to solve a problem: for some products like dating apps, it’s hard to find real models who want to become the face of the product in marketing materials. “There are live people behind every picture, and their lives are a factor,” Ivan said. AI-generated faces will also let websites and adverts tailor their experience towards you, by adjusting faces in promo images to your country or age. In other words, the product users whose faces you see may not be real users - in fact, they may not even be people at all!
You’ve probably heard of deepfakes - a different application of AI related to faces. Deepfakes replace the face of one person in a video with that of another person, mirroring their facial expressions and movements. One Tom Cruise impressionist went viral on TikTok with a highly convincing series of deepfakes.
Deepfakes are opening up new possibilities for movies. As powerful AI models become cheaper and widely available, smaller film studios will be able to imitate big-budget CGI at a fraction of the price. However, the existence of hard-to-detect deepfakes affects the trustworthiness of video evidence in general - and opens up concerning possibilities of fraud, pornographic blackmail, and disinformation. Imagine totally convincing fake videos of politicians committing indecent acts being leaked right before an election, produced at low cost. This may be a future we have to contend with.
What about art?
We’ve seen a lot of things that AI is now able to do. But you might think that there are some things that are fundamentally human - for example, can an AI create art? Isn’t creativity a mysterious process that can only take place in a conscious human brain? Nope! AI is already creating some works of art. In the following sections, we’ll survey original AI-generated images, visual art, videos, and music. Like many other lines in the sand that people have attempted to draw around tasks that are “fundamentally human”, this one has also begun to erode.
Combining concepts to generate images from text prompts
Image-generating AI is going beyond imitating reality. State-of-the-art models like GLIDE can now take text prompts written by a person and generate novel images that match the text. GLIDE can combine subjects and art styles in ways that it was never exposed to in its training data - and may have never before been combined by anyone (how many oil paintings of psychedelic hamster dragons have you seen?).
Source: The Singularity Is Near by Ray Kurzweil
Look at some of the amazing examples below of images that were generated by AI just based on a text prompt, for instance, “a corgi wearing a red bow tie and a purple party hat.”
Source: GLIDE, Nichol et al. (2021)
Notice in the above example of stacked cubes, GLIDE even creates an accurate reflection of the blue cube and diffused red light from the red cube. As with GPT-3, GLIDE is amazingly impressive, but often produces duds as well. Here are some examples where it messed up:
Source: GLIDE, Nichol et al. (2021)
For real-world applications where mistakes like this would be a big problem, the results would need to be curated, likely by having a human pick from a range of AI-generated options to avoid these mistakes. However, there are attempts to use an additional layer of AI to perform this curating process and help weed out bad results, further improving reliability. Here are some more examples, this time from OpenAI’s DALL-E, a variant of GPT-3 technology for image (rather than text) generation, showing how the algorithm can generate a bunch of options. (Yes, DALL-E is intentionally a portmanteau of WALL-E and Salvador Dalí!)
This stuff isn’t yet perfect (you can see some interesting ways that DALL-E falls short in OpenAI’s post) but it’s improving rapidly. DALL-E was released in January 2021. It is a big improvement on its predecessor, OpenAI’s earlier image generator Image GPT - even though Image GPT was released just 7 months earlier. The implications of image-generating AI are exciting - creating a visual sketch of an idea to communicate will be a rapid process with no artistic abilities needed. The stock photo industry may be substantially disrupted by image generation as this technology matures - getting the perfect image will be cheaper and faster, and each image can be fully tailored to your specific needs. Zooming out, entire activities like design and photography might look very different with an AI in the loop. While DALL-E and GLIDE can produce either photorealistic or stylized images, other applications focus exclusively on creating stylized visual works of art. One fun example is wombo.art, which lets you generate art from a text prompt. You can choose from different styles like Pastel and Steampunk. Below, on the left, is an original piece we just generated, from the prompt “golden sun” and the Mystical style.
Source: wombo.art & artflow.ai
In a similar vein, artflow.ai generates portraits of characters from prompts. You can describe an art style in the prompt, or see what the AI comes up with! Above, on the right, is one portrait we created with artflow.ai by providing the prompt “sunburnt vampire”. Visual artists like @RiversHaveWings and @unltd_dream_co are embracing these new methods. You can even buy framed prints of AI-generated art - this digital gallery sells them for $600 each! Interestingly, each work is “unique”, meaning that once you’ve bought it, they won’t sell a print of that work to anyone else. These image-generation AIs illustrate a general theme present across current generative AI models: they are well-suited to a collaborative “generate-evaluate” process, where an AI creates a number of options and the human selects their favorite, fine-tunes the prompt, or gets inspired about a new direction to pursue.
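The “generate-evaluate” process can be sketched as a simple loop. The Python below is only an illustration under our own assumptions: generate_options is a stand-in that returns made-up text descriptions rather than calling any real image model, and the score function stands in for a human picking a favorite (or for an automated ranking step). None of these names come from the tools mentioned above.

```python
import random

def generate_options(prompt, n, rng):
    """Stand-in for a generative model: returns n made-up variations of the prompt."""
    styles = ["pastel", "steampunk", "mystical", "oil painting", "sketch"]
    return [f"{prompt} ({rng.choice(styles)}, variant {i})" for i in range(n)]

def pick_best(options, score):
    """Evaluate step: keep the option with the highest score."""
    return max(options, key=score)

def generate_evaluate(prompt, score, rounds=3, n=4, seed=0):
    """Alternate between generating options and keeping the current favorite."""
    rng = random.Random(seed)
    best = None
    for _ in range(rounds):
        options = generate_options(prompt, n, rng)
        if best is not None:
            options.append(best)  # the previous favorite competes again
        best = pick_best(options, score)
    return best

# Hypothetical preference: this "human" likes anything in the mystical style.
favorite = generate_evaluate("golden sun", score=lambda option: "mystical" in option)
print(favorite)
```

In a real workflow, the scoring step is where the human adds value: choosing a favorite, rewording the prompt, or abandoning the direction altogether.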
What’s next - videos and virtual worlds
Generating videos from prompts is the next step after images - though the technology isn’t nearly as accurate yet, early progress is being made. One 2018 paper from Gupta et al. uses AI to snip out parts of Flintstones clips and overlay them to generate scenes based on text prompts:
Another emerging application is the generation of virtual environments. Facebook Inc.’s recent rebrand to Meta shows its commitment to the concept of a “metaverse”, an online space where people interact as digital avatars in virtual environments. Currently, creating these environments takes the time and expertise of professional 3D modelers, while non-experts have to make do with simple drag-and-drop interfaces that let you combine elements from a premade library (like the Sims games). This is set to change, as AI techniques are emerging to make bespoke creation accessible to a much broader audience. The academic tool COLMAP can turn real-world environments into immersive 3D virtual environments from as few as two photos, as you can see in the video below.
An early signal of interesting things to come, WordsEyeWorld uses text commands to let you create and arrange objects in a virtual 3D world by describing them in ordinary language, like “a small biplane is 10 feet in front of the shack”.
It looks like the current version of WordsEyeWorld might be pulling from a premade library of objects, but it’s easy to imagine a future version that uses DALL-E-like generative techniques to create bespoke 3D objects matching a description. It might even create full 3D environments, exportable to VR, based on a description (e.g., “a cozy Scandinavian village in winter”) that you could then tweak with further text prompts (“there is a large red-brick clock tower in the village square”). Imagine a metaverse that anyone can speedily create and evolve with words alone.
Part 3: AI that composes
Beyond visual art, AI-generated music is rapidly improving. OpenAI is once again a leader in this space - their Jukebox takes a genre, artist, and lyrics as input, and generates an original piece of music as an audio file. Their blog post has fun examples of music generated in the style of particular artists, as well as GPT-3-style completions, where they give Jukebox 12 seconds of audio and it guesses what will come next. Other systems like AIVA take a different approach and instead generate MIDI files, which are like sheet music that can be played by a computer. This makes for some fairly impressive results - here’s a playlist of examples, though note that these are cherry-picked by the company, so they will not be a representative sample of AIVA’s output.
For content creators like YouTubers, podcasters, and game developers, AI-generated music from services like AIVA and Amper is an easy and cheap way to accompany their work with original music that they own the rights to. For musicians, the development of advanced AI looks set to speed up songwriting and make composition even more accessible to people who don’t play instruments or know music theory. Rumor has it that some popular music you’ve likely heard was originally generated by AI (with humans then reworking it and recording an improved version).
Surveying the technological developments across all these creative domains shows that the human brain is no longer the only generative engine we have access to. In an increasingly large number of areas, you can now offload the work of generating original ideas to an AI. Your job is as easy as picking your favorite result, improving what the AI produced, or tweaking the input to generate more ideas.
Part 4: AI that converses
AI-powered chatbots are now everywhere - they set timers in your kitchen, they provide customer support on websites, they let you use your phone or smartwatch hands-free in the car. While many of these chatbots focus on helping you perform tasks, another strand of progress is in chatbots that chat for the sake of chatting.
People are turning to chatbots like Replika for fun, companionship, and even to develop romantic relationships. Replika creates a personal companion for you that learns from your conversations - it remembers things you have discussed, and your feedback develops its conversational style. Some Replika users post screenshots online of romantic conversations with their Replikas, where they declare their love for each other. On the right is an example conversation someone had with their Replika chatbot (source: Reddit). You can find more examples in this recent Vice article.
Of course, lots of people use chatbots more casually, turning to them occasionally to pass time, or using them to relieve stress or loneliness, or to help process difficult emotions like grief. There are many uses of chatbots that people find beneficial.
Perhaps the most popular chatbot company, the China-based Microsoft spinoff Xiaoice, was valued at $1 billion last year and reports that it has over 660 million users. Here’s what the creators say about Xiaoice: “Xiaoice is a sophisticated conversationalist with a distinct personality. She can chime into a conversation with context-specific facts about things like celebrities, sports, or finance but she also has empathy and a sense of humor…Using sentiment analysis, she can adapt her phrasing and responses based on positive or negative cues from her human counterparts…She can tell jokes, recite poetry, share ghost stories, relay song lyrics, pronounce winning lottery numbers and much more. Like a friend, she can carry on extended conversations that can reach hundreds of exchanges in length.” Xiaoice has the ability to forge romantic connections with its users, which includes flirting, joking, and sexting.
It is still unclear what the implications of chatbots like Xiaoice and Replika will be for our future relationships.
Unlike humans, chatbots are always available, don’t have unwanted mood fluctuations, and can provide immediate support or comfort. On the other hand, the existence of this easy, safe option may discourage people from leaving their comfort zones to forge deep connections with other people. If they use chatbots a lot at the expense of human contact, people might begin to learn social behaviors that other humans would find unacceptable.
Part 5: AI that does everything?
This technology is developing very fast. On all the fronts we have discussed - text, images, video, virtual worlds, and virtual companions - we should expect to see big advancements. It’s exciting to explore these new applications, though we are already seeing some negative short-term consequences. Looking to the longer term, some people, like philosopher Nick Bostrom and computer scientist Stuart Russell, fear that the consequences of building increasingly intelligent AI could be devastating. One day, if advanced AI can outperform humans across all or nearly all tasks, it may be very difficult to make sure the AI’s actions align with human values (for more on that topic, see Russell’s book Human Compatible). We hope you’ve found this overview of new developments in AI insightful! What do you think of AI, its potential opportunities, or its consequences? Comment below to share your thoughts with us!
We also have a full podcast episode about "What, if anything, do AIs understand?" that you may like: