Although we don’t yet have the robot butlers that science fiction has promised us, artificial intelligence is already here to help make your life easier. Large Language Models (LLMs), such as ChatGPT, Gemini, Claude, and Copilot, are a relatively new kind of artificial intelligence that can completely reshape how you perform a variety of tasks, boosting your productivity and taking weight off your shoulders. Yet many people don't know how to get the most out of these tools - either not using them at all, or using them in ways that are far from tapping their full potential.
This library of articles will help you get the most out of AI tools so you can achieve your goals more effectively! We believe you'll find it useful whether you have no knowledge of AI or already use it daily. It is a deep dive into how to use these tools to boost your productivity.
How They Work
There is a new type of artificial intelligence on the block, called Large Language Models (LLMs, for short). They’re machine learning algorithms that have been taught to recognize patterns in language and predict what text is likely to come next. As a result, they can generate very sophisticated and extremely helpful responses to your prompts.
So, how do LLMs work? The simplest explanation is that LLMs work very much like predictive text on your phone, but much, much smarter. Like predictive text, LLMs generate their responses by predicting the next word over and over (strictly speaking, the next ‘token’, which is similar enough to a word that we don’t need to worry about the difference here), but they operate on a vastly larger scale. Their "knowledge" stems from being trained on an astronomical number of articles, websites, books and other text sources. This huge pool of data allows them not only to predict the next word but also to give the appearance of understanding context, tone, and even many intricate nuances of language.
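To make the "predictive text" idea concrete, here is a minimal sketch of next-word prediction using simple word-pair counts. This is a toy illustration only: real LLMs use neural networks trained on vastly more text, and the function names and the tiny corpus below are made up for this example.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(model, start_word, max_words=5):
    """Repeatedly predict the next word, feeding each prediction back in."""
    output = [start_word]
    for _ in range(max_words):
        next_word = predict_next(model, output[-1])
        if next_word is None:
            break
        output.append(next_word)
    return " ".join(output)

# A made-up, tiny "training set" (an assumption for illustration).
corpus = "the cat sat on the mat . the cat ate the fish . the cat sat on the sofa ."
model = train_bigram_model(corpus)

print(predict_next(model, "cat"))           # sat
print(generate(model, "cat", max_words=3))  # cat sat on the
```

The toy model only ever looks at the single previous word, which is why its output quickly becomes repetitive. An LLM takes a much longer stretch of preceding text into account when predicting each next token.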
While LLMs have their problems (discussed below), they provide much better results than traditional search engines for many (but certainly not all) types of queries, and they excel at making sure their responses fit the context of the query. Let’s look at a simple example. Compare the results from Google and ChatGPT 4 for the same simple query: “What is 100 x 25 cents?”
Here's what you get from a traditional Google search: [screenshot showing the answer "2500 cents"]
And here's what you get from ChatGPT: [screenshot showing the answer "$25"]
Clearly, both answers are correct, but the answer from ChatGPT 4 ($25) is much more useful than the answer from Google (2500 cents). The LLM is able to reply in a way that is appropriate to the usual context of that kind of question so that the answer is as helpful as possible.
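The arithmetic behind both answers is the same. The short sketch below (with function names chosen just for this example) shows that the only difference is the unit in which the result is presented.

```python
def total_in_cents(quantity, unit_price_cents):
    """Multiply the quantity by the price of one unit, in cents."""
    return quantity * unit_price_cents

def format_as_dollars(cents):
    """Convert a whole number of cents into a dollar string."""
    dollars, remainder = divmod(cents, 100)
    return f"${dollars}.{remainder:02d}"

cents = total_in_cents(100, 25)
print(f"{cents} cents")          # 2500 cents (the literal answer)
print(format_as_dollars(cents))  # $25.00 (the same amount in the unit people usually want)
```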
How They're Trained
LLMs trained on large datasets can use that learning to perform tasks they weren't explicitly trained on. In other words, they "generalize" from examples they have seen to new situations they have never seen (but that, nevertheless, partially resemble what they've seen). This makes LLMs incredibly versatile and useful. But the big online LLMs (like ChatGPT and Gemini) have also undergone a lot of fine-tuning. This is a process whereby they are further trained on smaller, more specific datasets, to perform better in certain domains such as conversation, coding, creative writing, legal writing, copywriting, and so on. Fine-tuning also sometimes includes (but is not limited to):
Supervised learning: the model is given explicit examples of inputs and the desired outputs if it were given those inputs (typically, these desired outputs are written by humans).
Reinforcement learning from human feedback: Sometimes abbreviated to ‘RLHF’, this involves using feedback from human evaluators. The LLM's responses are rated by humans, and these ratings are used to train a second AI model (often called a 'reward model') that can automatically score future outputs of the original model. These scores are then used to adjust the original model's parameters so that, in the future, it produces outputs that are more likely to be rated highly by humans. This method is particularly useful for aligning the model’s outputs with human values and preferences (e.g., avoiding violent or pornographic content in the LLM’s responses).
Adversarial training: This involves training the model with challenging examples, including attempts to trick or confuse the model. It's designed to improve the model's robustness and ability to handle edge cases.
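The RLHF loop described above (human ratings, then a model that scores outputs automatically, then an adjustment of the original model) can be sketched in miniature. This is a heavily simplified toy under stated assumptions, not how production systems are built: the "reward model" here is just an average rating per word, the "LLM" is a table of preferences over two fixed responses, and all names and example data are hypothetical.

```python
import math
from collections import defaultdict

def train_reward_model(rated_responses):
    """Learn an average human rating per word (a stand-in for a neural reward model)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, rating in rated_responses:
        for word in set(text.lower().split()):
            totals[word] += rating
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}

def score(reward_model, text, default=0.0):
    """Score text as the mean learned rating of its words (unknown words get `default`)."""
    words = text.lower().split()
    if not words:
        return default
    return sum(reward_model.get(w, default) for w in words) / len(words)

def softmax(logits):
    """Turn preference values into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {k: math.exp(v - peak) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

def update_policy(logits, reward_model, learning_rate=1.0):
    """Raise preferences for responses scoring above the current average, lower the rest."""
    probs = softmax(logits)
    rewards = {r: score(reward_model, r) for r in logits}
    baseline = sum(probs[r] * rewards[r] for r in logits)
    return {r: logits[r] + learning_rate * (rewards[r] - baseline) for r in logits}

# Step 1: hypothetical human ratings (+1 = good, -1 = bad).
human_ratings = [
    ("happy to help with that", 1.0),
    ("that is a stupid question", -1.0),
    ("glad to help", 1.0),
    ("stupid request", -1.0),
]
# Step 2: train the automatic scorer from those ratings.
reward_model = train_reward_model(human_ratings)

# Step 3: the toy "LLM" starts with no preference between two responses
# that no human rated directly, then is adjusted using the scorer.
policy = {"happy to help": 0.0, "stupid question": 0.0}
before = softmax(policy)
policy = update_policy(policy, reward_model)
after = softmax(policy)

print(before["happy to help"], after["happy to help"])
```

After one update, the probability of the polite response rises above its starting value of 0.5, even though humans never rated that exact response. That is the point of training a separate scorer: it lets the feedback generalize beyond the examples people actually rated.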
There have been some controversies over the labor conditions involved in some of this training. We talk briefly about this in the Some Things to be Careful About section, below.
Limitations
Large Language Models are incredibly powerful and can help you in all sorts of ways, but they do have limits and problems. Let’s talk about them.
Remember that LLMs work like predictive text (albeit much smarter): they’re predicting the best next words to continue from whatever prompt or query you input, based on the language patterns they’ve learned from their training data. So, if the answer to your query does not show up as a pattern in that training data, LLMs cannot have learned it and cannot base answers on it. For example, this limitation means that LLMs will struggle with the following:
Uncommon knowledge or recent developments: If you ask an LLM about very recent events or highly specialized knowledge that hasn't been widely written about, it may struggle to provide accurate information. Examples include the details of a scientific discovery made just a few months ago, or specific data from a study that has not been widely published. Most of the big LLMs can now search the internet when constructing their responses, which makes them more reliable on these topics. Still, extra caution is warranted when asking about topics like these, particularly if the AI does not use real-time search or if you can't confirm the reliability of the sources it retrieved.
Personalized or context-specific advice: LLMs can't provide advice or answers that require personal context they don't have access to. For example, if you ask, "Should I accept the job offer I received yesterday?" the LLM can't assess your personal circumstances, career goals, or the specifics of the job offer to give a tailored response. Of course, if you provide it with those details, it will do its best to offer advice based on the patterns in language it has learned and its training on how to give answers that humans tend to give high ratings to.
Creative solutions to new problems: If a problem is unique and doesn't have well-established solutions or discussions that occur in the training data, an LLM may not provide effective or innovative solutions. For example, brainstorming a novel marketing strategy for a new technology, or asking it to write your university-level essays might yield generic responses not fit for purpose. On the other hand, you may still find it useful as a dialogue partner in those situations to help you flesh out or explore your own ideas, or to help you expand your own brainstorming.
(Note: if you're using LLMs for any kind of school work, make sure you check your school's policy on their usage.)
Localized or cultural-specific queries: Questions that require understanding of local customs, dialects, or recent cultural phenomena might not be accurately answered if those specifics are not well-represented or patterned in the training data. For instance, asking for the significance of a local festival in a small community might not yield detailed answers. This is something they are improving at - particularly the models that are able to search the internet in real time.
Deep logical reasoning or complex problem-solving: While LLMs can simulate certain types of reasoning, they may struggle with complex problem-solving that requires deep logical reasoning, especially if it's not a common pattern in the data. For example, while LLMs have gotten better and better at solving math problems, a complex, multi-step math problem might be beyond their capabilities. You may even find simple math problems that they fail at, if the problem is of a type that they have not seen in their training data, or one that they can't generalize to (even though their training data contained related examples).
The Strawberry Problem: Because of the way that LLMs split words up into 'tokens' when they read prompts (and when they build responses), they don't see words the way we do, and this means they struggle with some queries about words themselves. A popular illustration of this is the question "How many 'r's are there in the word 'strawberry'?" The correct answer is three, but LLMs have often gotten this wrong.
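To illustrate the Strawberry Problem, the sketch below contrasts what ordinary code sees (individual letters) with what an LLM receives (token IDs). The token split and the ID numbers are invented for illustration. Real tokenizers differ between models and may split the word differently.

```python
word = "strawberry"

# Ordinary code works on individual characters, so counting letters is trivial.
letter_count = word.count("r")
print(letter_count)  # 3

# Hypothetical token split and vocabulary (assumptions, not real tokenizer output).
hypothetical_tokens = ["str", "aw", "berry"]
hypothetical_vocab = {"str": 101, "aw": 102, "berry": 103}

# The model receives only the numeric IDs, not the letters inside each token.
token_ids = [hypothetical_vocab[token] for token in hypothetical_tokens]
print(token_ids)  # [101, 102, 103]

# The letters are still "in there" if you can look inside each token,
# but nothing in the ID sequence itself spells them out.
r_per_token = {token: token.count("r") for token in hypothetical_tokens}
print(r_per_token)  # {'str': 1, 'aw': 0, 'berry': 2}
```

From the model's point of view, the question is about the sequence 101, 102, 103. Answering it correctly requires having learned which letters each token contains, rather than simply reading them off.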
Some Things to be Careful About
The way that LLMs are trained, and the nature of the technology itself, also give rise to some issues that are worth keeping in mind when you use them:
Hallucinations: When LLMs answer your prompts confidently with false information, that’s called a ‘hallucination’. These hallucinations occur because LLMs are trained to look for patterns (i.e., they are trying to predict text that would plausibly follow the given text and that a human would rate highly). That is not the same as only saying true things. Most people would say that LLMs do not truly 'understand' the information they generate (at least, not in the way humans understand), and they do not verify it. They're not interested in truth; they're just predicting the best next words over and over again. For this reason, if you’re relying on LLMs to inform you about something, it’s best to make sure either:
(1) you can independently verify the information they give you, or
(2) the stakes are low, so you won’t run into trouble if the information the LLM gives you is wrong.
While LLM creators are working to reduce hallucinations, this is still a very serious issue. To make matters worse, LLMs typically state their answers confidently, even when they are hallucinating. In this way, using an LLM is a bit like knowing an incredibly knowledgeable person who is also extremely overconfident and occasionally completely wrong - while you can often trust what such a person says, just because they state something confidently doesn't mean it's true (indeed, it has even been argued that this counts as a form of 'bullshitting'). So when it's important, you need to double-check!
Biases: LLMs are trained on vast amounts of data taken from the internet, books, and so on. This means that the biases that show up as patterns in that training data (e.g., gender, racial, and cultural biases) can sometimes be found in LLMs’ responses. For instance, an LLM might associate certain professions or roles predominantly with one gender, reflecting societal biases. It’s very important to be aware of this if you’re planning to use LLMs in decision-making processes, content creation, or as part of systems that interact with a diverse user base. Even the steps that AI companies take to correct biases can end up causing other biases. Reducing bias in AI is a tricky problem and an area of active research.
Ethical concerns: You may also wish to be careful about a variety of other ethical issues relating to LLMs. For example:
Reporting from TIME revealed that OpenAI (the organization behind ChatGPT) relied on Kenyan workers who were paid less than $2 an hour (which is above the local average hourly pay, but very low by U.S. standards) to ‘label’ extremely graphic and disturbing text content, work that has been alleged to have caused mental health problems among those workers.
There are also concerns about the environmental impact of the technology.
Additionally, massive amounts of copyrighted materials have been used during the training of LLMs (without compensating the copyright holders), and the output of LLMs can sometimes be very similar to copyrighted works.
You might also worry about contributing to speeding up the risks associated with artificial intelligence. There is a growing movement of people worried that artificial intelligence poses an existential threat to humanity.
For some people, these issues are insurmountable. If you find yourself needing to use LLMs and want to mitigate ethical concerns, you can consider using free versions, or using 'API playgrounds' where you pay per use (instead of a fixed monthly rate for unlimited use). For most people, paying per use will cost dramatically less than the typical monthly subscription fee (e.g., $1 instead of $30).
These kinds of concerns might affect which LLM you choose to use, or whether you choose to use any at all.