Chapter 1 Basics of Large Language Models

1.1 Generative AI and Large Language Models (LLMs)

Generative AI and Large Language Models (LLMs) are at the forefront of artificial intelligence, revolutionizing how we interact with technology and process information. Here’s an introduction to these cutting-edge technologies:

1.1.1 Generative AI

Generative AI refers to systems capable of creating new content, such as text, images, music, or even code, based on the data they have been trained on. Unlike traditional AI, which focuses on recognizing patterns and making predictions, generative AI can produce original outputs. This capability is powered by advanced machine learning techniques, particularly deep learning.

Generative AI has a fascinating history that spans several decades. Here’s a brief overview:

1.1.2 Early Beginnings

  • 1950s-1960s: The foundation of generative AI was laid with the development of early machine learning algorithms. Arthur Samuel created one of the first self-learning programs in 1952, a checkers player that improved with experience. Between 1964 and 1966, Joseph Weizenbaum developed ELIZA, one of the first chatbots, which could simulate conversation using pattern matching and substitution methodology.

1.1.3 Advancements in Neural Networks

  • 1970s-1980s: The concept of neural networks gained traction, building on Frank Rosenblatt’s Perceptron (1957), an early neural network model. The backpropagation algorithm, essential for training deep neural networks, was popularized in the 1980s.

1.1.4 The Rise of Generative Models

  • 1990s-2000s: Research on generative models continued steadily, but the next significant leap came with the introduction of Generative Adversarial Networks (GANs) by Ian Goodfellow and his collaborators in 2014. GANs consist of two neural networks, a generator and a discriminator, trained against each other to create realistic data.

1.1.5 Modern Era

  • 2010s-Present: The development of transformer models, such as GPT (Generative Pre-trained Transformer) by OpenAI, revolutionized generative AI. These models can generate high-quality text, images, and even music. ChatGPT, for example, is a well-known application of this technology.

1.1.6 Large Language Models (LLMs)

LLMs are a subset of generative AI specifically designed to understand and generate human language. These models are trained on vast amounts of text data and can perform a variety of language-related tasks, such as translation, summarization, question answering, and content creation. The most well-known LLMs include OpenAI’s GPT series and Google’s BERT.

1.1.7 Key Concepts

  1. Transformer Architecture: LLMs are built on the transformer architecture, which allows them to process and generate text efficiently. This architecture uses self-attention mechanisms to understand the context of words in a sentence (see the sketch after this list).
  2. Training Data: LLMs are trained on diverse datasets that include books, articles, websites, and other text sources. The quality and diversity of this data are crucial for the model’s performance.
  3. Applications: LLMs are used in various applications, from chatbots and virtual assistants to automated content generation and language translation. They are also being explored for more complex tasks like coding and scientific research.
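To make the self-attention idea in point 1 concrete, here is a minimal NumPy sketch of scaled dot-product self-attention. It omits the learned query/key/value projections and multiple heads that real transformer layers use; it is an illustration under those simplifying assumptions, not a production implementation.

import numpy as np

def self_attention(X):
    # X: (sequence_length, d_model) matrix of token embeddings
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                              # pairwise similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax over each row
    return weights @ X                                         # context-aware token representations

# Example: 4 tokens with 8-dimensional embeddings
X = np.random.randn(4, 8)
print(self_attention(X).shape)  # (4, 8)

Each output row is a weighted mixture of all token embeddings, which is how every position can "see" every other position in a single step.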

1.1.8 Practical Use Cases

  • Customer Support: Automating responses to common customer queries.
  • Content Creation: Generating articles, reports, and creative writing.
  • Language Translation: Providing accurate translations between different languages.
  • Data Analysis: Summarizing and interpreting large datasets.

1.2 Advantages of Transformer Models over LSTM

Parallelization:

  • Transformers: Process all positions of an input sequence in parallel, significantly speeding up training and inference.
  • LSTMs: Process input sequences one step at a time, which is slower and less efficient.

Handling Long-Range Dependencies:

  • Transformers: Use self-attention mechanisms to capture long-range dependencies more effectively.
  • LSTMs: Struggle with long-range dependencies due to vanishing gradients.

Scalability:

  • Transformers: Scale well with larger datasets and models, making them suitable for tasks requiring extensive data.
  • LSTMs: Become inefficient and harder to train as model size and dataset grow.

Flexibility:

  • Transformers: Can be adapted to various tasks (e.g., translation, summarization, question answering) with minimal changes to the architecture.
  • LSTMs: Often require more task-specific adjustments and tuning.

Attention Mechanism:

  • Transformers: Use self-attention to weigh the importance of different parts of the input sequence, improving context understanding.
  • LSTMs: Rely on hidden states to maintain context, which can be less effective for capturing complex relationships.

Memory and State:

  • Transformers: Do not carry a recurrent hidden state across time steps, which simplifies parallel training, although self-attention memory grows quadratically with sequence length.
  • LSTMs: Must maintain and update sequential hidden states, which limits parallelism.
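The parallel-versus-sequential contrast can be seen directly in code. Below is a small PyTorch sketch (assuming PyTorch is installed) that runs the same toy batch through an LSTM, which updates a hidden state step by step, and through a Transformer encoder layer, whose self-attention processes all positions at once.

import torch
import torch.nn as nn

batch, seq_len, d_model = 2, 16, 64
x = torch.randn(batch, seq_len, d_model)

# LSTM: the hidden state is carried step by step along the sequence
lstm = nn.LSTM(input_size=d_model, hidden_size=d_model, batch_first=True)
lstm_out, _ = lstm(x)            # computed sequentially over the 16 time steps

# Transformer encoder layer: self-attention sees all positions at once,
# so the whole sequence is processed in parallel
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
transformer_out = encoder_layer(x)

print(lstm_out.shape, transformer_out.shape)  # both (2, 16, 64)

Both layers produce the same output shape; the difference lies in how the computation is ordered, which is what drives the speed and scalability gap described above.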

1.3 Predictive/Discriminative AI vs. Generative AI

1.3.1 Predictive/Discriminative AI

  1. Purpose:
    • Focuses on distinguishing between different classes or categories.
    • Used for tasks like classification, regression, and prediction.
  2. Functionality:
    • Learns the boundary between different classes.
    • Given an input, it predicts the probability of the input belonging to a particular class.
  3. Examples:
    • Logistic Regression
    • Support Vector Machines (SVM)
    • Neural Networks (when used for classification tasks)
  4. Training Data:
    • Requires labeled data for training.
    • Learns from the features of the input data to make predictions.
  5. Output:
    • Produces a label or a probability score for the input data.

1.4 Generative AI

  1. Purpose:
    • Focuses on generating new data samples that resemble the training data.
    • Used for tasks like image generation, text generation, and data augmentation.
  2. Functionality:
    • Learns the underlying distribution of the training data.
    • Can generate new, synthetic data samples that are similar to the training data.
  3. Examples:
    • Generative Adversarial Networks (GANs)
    • Variational Autoencoders (VAEs)
    • Language models like GPT-3 and GPT-4
  4. Training Data:
    • Can use both labeled and unlabeled data.
    • Learns to model the distribution of the input data to generate new samples.
  5. Output:
    • Produces new data samples, such as images, text, or audio, that are similar to the training data.

1.4.1 Key Differences

  • Objective:
    • Predictive/Discriminative AI aims to classify or predict outcomes based on input data.
    • Generative AI aims to create new data samples that mimic the training data.
  • Approach:
    • Predictive/Discriminative AI focuses on learning the decision boundary between classes.
    • Generative AI focuses on learning the data distribution to generate new samples.
  • Applications:
    • Predictive/Discriminative AI is used in tasks like spam detection, medical diagnosis, and sentiment analysis.
    • Generative AI is used in tasks like image synthesis, text generation, and creating realistic simulations.

1.5 Large Language Models (LLMs)

1.5.1 What is an LLM?

A Large Language Model (LLM) is a type of machine learning model designed to understand and generate human language. These models are trained on vast amounts of text data and are capable of performing a wide range of natural language processing (NLP) tasks, such as language translation, text summarization, and conversational AI.

1.5.1.1 Key Characteristics:

  • Scale: LLMs are characterized by their large number of parameters, often in the billions or even trillions.
  • Versatility: They can be fine-tuned for specific tasks or used in a general-purpose manner.
  • Architecture: Most modern LLMs are based on the Transformer architecture, which allows for efficient processing of large-scale text data.

1.5.2 How are LLMs Trained?

Training an LLM involves several key steps:

  1. Data Collection:
    • Sources: Text data is collected from diverse sources such as books, articles, websites, and other digital content.
    • Preprocessing: The collected data is cleaned and preprocessed to remove noise and duplicates, filter out low-quality or unwanted content, and tokenize the text into sequences.
  2. Model Configuration:
    • Architecture Setup: The Transformer architecture is configured with specific parameters, such as the number of layers, attention heads, and hidden units.
    • Hyperparameters: Key hyperparameters like learning rate, batch size, and training epochs are defined.
  3. Training Process:
    • Self-Supervised Learning: The model is trained using self-supervised learning, where it predicts the next token in a sequence based on the context provided by the preceding tokens (a minimal sketch of this objective appears after this list).
    • Optimization: The model’s weights are adjusted iteratively to minimize the prediction error using optimization algorithms like Adam.
  4. Model Parallelism:
    • Distributed Training: Due to the large size of LLMs, training is distributed across multiple GPUs or TPUs to handle the computational load efficiently.
  5. Fine-Tuning:
    • Task-Specific Adjustment: After initial training, the model can be fine-tuned on smaller, task-specific datasets to improve performance on particular tasks, such as sentiment analysis or question answering.
  6. Evaluation and Testing:
    • Performance Metrics: The model is evaluated using metrics like perplexity, accuracy, and F1 score to ensure it meets the desired performance criteria.
    • Iterative Refinement: Based on evaluation results, the model may undergo further fine-tuning and adjustments.
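The self-supervised objective in step 3 boils down to next-token prediction with a cross-entropy loss. The sketch below is a deliberately tiny, self-contained illustration of that loop: it uses random token IDs and an embedding plus a linear head as a stand-in model, just to show the mechanics, not a real LLM training setup.

import torch
import torch.nn as nn

vocab_size, d_model, seq_len, batch = 1000, 32, 16, 8

# A stand-in "language model": embedding + linear head
# (a real LLM would put a stack of Transformer blocks in between).
model = nn.Sequential(nn.Embedding(vocab_size, d_model), nn.Linear(d_model, vocab_size))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    tokens = torch.randint(0, vocab_size, (batch, seq_len))   # toy "text" as token IDs
    inputs, targets = tokens[:, :-1], tokens[:, 1:]           # target is the next token
    logits = model(inputs)                                    # (batch, seq_len-1, vocab_size)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Real training runs the same loop over trillions of tokens, distributed across many accelerators as described in step 4.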

1.5.2.1 Example LLMs:

  • GPT-3 and GPT-4: Developed by OpenAI, these models are known for their impressive language generation capabilities.
  • BERT: Developed by Google, BERT is optimized for understanding the context of words in a sentence.

LLMs have revolutionized the field of NLP by enabling more accurate and context-aware language understanding and generation.

1.6 What is a Token in the Language Model?

A token is the smallest unit of text that holds meaning for a language model. Tokens can be words, subwords, characters, or even whole sentences, depending on the tokenization method used. Tokenization is the process of breaking down text into these smaller units, which the model then uses to understand and generate language.

1.6.1 Key Points:

  • Words: In some models, each word is treated as a token.
  • Subwords: More commonly, words are broken down into subwords or morphemes, which helps the model handle rare or unknown words.
  • Characters: In certain cases, individual characters are used as tokens, especially for languages with complex morphology.
  • Sentences: For some applications, entire sentences or phrases can be treated as single tokens.

1.6.2 Example:

For the sentence “The quick brown fox jumps over the lazy dog”:

  • Word tokens: [“The”, “quick”, “brown”, “fox”, “jumps”, “over”, “the”, “lazy”, “dog”]
  • Subword tokens: [“The”, “quick”, “brown”, “fox”, “jump”, “s”, “over”, “the”, “lazy”, “dog”]
  • Character tokens: [“T”, “h”, “e”, “ ”, “q”, “u”, “i”, “c”, “k”, “ ”, “b”, “r”, “o”, “w”, “n”, …] (one token per character, including spaces)
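The exact split depends on the tokenizer. As a quick check, you can inspect a real subword tokenizer, for example GPT-2’s byte-pair-encoding tokenizer via the Hugging Face transformers library (shown here purely as an illustration; token boundaries differ between models):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
text = "The quick brown fox jumps over the lazy dog"

tokens = tokenizer.tokenize(text)   # subword strings (GPT-2 marks leading spaces with 'Ġ')
ids = tokenizer.encode(text)        # corresponding integer token IDs
print(tokens)
print(ids)
print(len(ids), "tokens")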

Tokens are fundamental to how language models process and generate text, enabling them to understand and produce coherent language.

1.7 How to Estimate the Cost of Running SaaS-based and Open Source LLM Models

1.7.1 SaaS-based LLM Models

  1. Pay-per-Token Pricing:
    • Description: Costs are based on the number of tokens processed, including both input and output tokens.
    • Example: If a provider charges $0.02 per 1,000 tokens and you use 500 tokens in a request, the cost would be $0.01 (see the sketch after this list).
  2. Subscription Plans:
    • Description: Monthly or annual subscription plans with different usage limits.
    • Example: A plan might offer 1 million tokens per month for a fixed fee.
  3. Additional Costs:
    • Fine-Tuning: Costs for customizing the model to specific tasks.
    • Storage: Costs for storing data and model checkpoints.
    • API Calls: Charges for each API request made to the model.
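As a sanity check on the pay-per-token arithmetic in item 1, here is a small helper function; the rate and token counts are placeholders, not any particular provider’s actual prices.

def estimate_cost(input_tokens, output_tokens, price_per_1k_tokens):
    """Estimate the cost of a single request under pay-per-token pricing."""
    total_tokens = input_tokens + output_tokens
    return total_tokens / 1000 * price_per_1k_tokens

# Example from above: 500 tokens total at $0.02 per 1,000 tokens
print(estimate_cost(input_tokens=300, output_tokens=200, price_per_1k_tokens=0.02))  # 0.01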

1.7.2 Open Source LLM Models

  1. Hardware Costs:
    • Description: Costs for GPUs or TPUs to run the models.
    • Example: Hosting an open-source model like Llama 3 on AWS might cost around $38 per hour for an ml.p4d.24xlarge instance.
  2. Infrastructure Costs:
    • Description: Costs for cloud services or on-premises infrastructure.
    • Example: Expenses for servers, networking, and storage.
  3. Maintenance Costs:
    • Description: Ongoing costs for maintaining and updating the models.
    • Example: Regular updates, security patches, and performance optimizations.

1.7.3 Factors Influencing Costs

  1. Model Complexity:
    • Description: More complex models with larger parameter sizes require more computational resources.
    • Example: Scaling from a 7 billion parameter model to a 300 billion parameter model increases costs significantly.
  2. Input Size:
    • Description: The number of tokens processed affects the cost.
    • Example: Larger inputs and outputs require more power to process.
  3. Media Type:
    • Description: The type of media (text, audio, video) impacts the cost.
    • Example: Processing audio and video typically demands more resources than text.
  4. Latency Requirements:
    • Description: Faster response times require more computational resources.
    • Example: Low-latency applications may need optimized infrastructure, increasing costs.

By understanding these factors and cost components, you can better estimate and manage the expenses associated with running LLM models.

1.8 Explain the Temperature Parameter and How to Set It

1.8.1 What is the Temperature Parameter?

The temperature parameter in language models controls the randomness of the generated text. It influences the probability distribution over the next possible tokens, affecting how creative or deterministic the output will be.
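Mechanically, temperature divides the model’s raw scores (logits) before the softmax that turns them into probabilities. A minimal NumPy sketch of that rescaling:

import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = np.array(logits) / temperature      # temperature rescales the logits
    exp = np.exp(scaled - scaled.max())          # subtract max for numerical stability
    return exp / exp.sum()

logits = [2.0, 1.0, 0.5, 0.1]
print(softmax_with_temperature(logits, temperature=0.2))  # sharply peaked: nearly deterministic
print(softmax_with_temperature(logits, temperature=1.0))  # flatter: more random sampling

Lower temperatures concentrate probability on the top token, while higher temperatures flatten the distribution, which is exactly the behavior described below.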

1.8.2 How Does It Work?

  • High Temperature (e.g., 0.7 to 1.0):
    • Increases randomness and diversity in the output.
    • The model is more likely to choose less probable tokens, leading to more creative and varied responses.
    • Example: Setting the temperature to 0.9 might produce imaginative and unexpected text.
  • Low Temperature (e.g., 0.1 to 0.3):
    • Decreases randomness, making the output more focused and deterministic.
    • The model tends to choose the most probable tokens, resulting in more predictable and coherent responses.
    • Example: Setting the temperature to 0.2 might produce straightforward and precise text.

1.8.3 How to Set the Temperature?

  1. Determine the Desired Output:
    • Creative Tasks: For tasks requiring creativity, such as story writing or brainstorming, use a higher temperature (e.g., 0.7 to 1.0).
    • Factual Tasks: For tasks requiring accuracy and coherence, such as summarization or answering factual questions, use a lower temperature (e.g., 0.1 to 0.3).
  2. Experiment and Adjust:
    • Start with a moderate temperature (e.g., 0.5) and adjust based on the quality of the output.
    • Fine-tune the temperature setting to balance creativity and coherence according to the specific task.
  3. Consider the Context:
    • The context and nature of the task can influence the optimal temperature setting.
    • For example, conversational AI might benefit from a moderate temperature to maintain engagement while ensuring relevance.

1.8.4 Example Settings:

  • Creative Writing: Temperature = 0.8
  • Technical Explanation: Temperature = 0.2
  • Casual Conversation: Temperature = 0.5

By adjusting the temperature parameter, you can control the balance between creativity and coherence in the generated text, tailoring the output to suit different tasks and contexts.

1.9 Different Decoding Strategies for Picking Output Tokens

  1. Greedy Decoding
  • Description: Selects the token with the highest probability at each step.
  • Pros: Simple and fast.
  • Cons: Can lead to suboptimal and repetitive results.
  • Example: For the input “The cat”, if “sat” has the highest probability, it will be chosen next.
  2. Beam Search
  • Description: Considers multiple candidate sequences (beams) and selects the one with the highest overall probability.
  • Pros: Produces more coherent and higher-quality results than greedy decoding.
  • Cons: Computationally expensive and can still miss some optimal sequences.
  • Example: Keeps track of the top N sequences at each step and expands them.
  3. Top-K Sampling
  • Description: Chooses from the top K most probable tokens.
  • Pros: Introduces randomness while maintaining a focus on high-probability tokens.
  • Cons: The choice of K can significantly affect the output quality.
  • Example: If K=5, the next token is chosen from the top 5 most probable tokens.
  4. Top-P (Nucleus) Sampling
  • Description: Chooses from the smallest set of tokens whose cumulative probability exceeds a threshold P.
  • Pros: Balances randomness and coherence by dynamically adjusting the number of considered tokens.
  • Cons: Requires careful tuning of the threshold P.
  • Example: If P=0.9, tokens are chosen from the smallest set whose cumulative probability is at least 0.9.
  5. Temperature Sampling
  • Description: Adjusts the probability distribution of the next token by scaling the logits before applying softmax.
  • Pros: Controls the randomness of the output.
  • Cons: Needs careful tuning to balance creativity and coherence.
  • Example: A higher temperature (e.g., 1.0) produces more diverse outputs, while a lower temperature (e.g., 0.2) produces more deterministic outputs.
  6. Hybrid Methods
  • Description: Combines multiple strategies to leverage their strengths.
  • Pros: Can produce high-quality and diverse outputs.
  • Cons: More complex to implement and tune.
  • Example: Using beam search with top-K sampling to balance quality and diversity.
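The top-K and top-P strategies above can be sketched in a few lines of NumPy. Real decoders work on logits and batched tensors, but the filtering idea is the same; this is an illustrative sketch over a toy next-token distribution, not a library implementation.

import numpy as np

def top_k_sample(probs, k, rng=np.random.default_rng()):
    # Keep only the k most probable tokens, renormalize, then sample
    top = np.argsort(probs)[-k:]
    p = probs[top] / probs[top].sum()
    return rng.choice(top, p=p)

def top_p_sample(probs, p_threshold, rng=np.random.default_rng()):
    # Keep the smallest set of tokens whose cumulative probability reaches p_threshold
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p_threshold) + 1
    kept = order[:cutoff]
    p = probs[kept] / probs[kept].sum()
    return rng.choice(kept, p=p)

vocab_probs = np.array([0.45, 0.25, 0.15, 0.10, 0.05])  # toy next-token distribution
print(top_k_sample(vocab_probs, k=3))
print(top_p_sample(vocab_probs, p_threshold=0.9))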

By understanding and utilizing these decoding strategies, you can tailor the output of language models to better suit specific tasks and desired outcomes.

1.10 Different Ways to Define Stopping Criteria in Large Language Models

  1. Max Tokens
  • Description: Limits the number of tokens generated.
  • Usage: Set a maximum token limit to prevent overly long outputs.
  • Example: max_tokens=100 stops generation after 100 tokens.
  2. Stop Sequences
  • Description: Specifies sequences that, if generated, will stop the output.
  • Usage: Define specific phrases or tokens that signal the end of generation.
  • Example: stop=["\n", "End of text"] stops generation when a newline or “End of text” is produced.
  3. End-of-Sentence
  • Description: Stops when the model predicts the end of a sentence or a specific delimiter.
  • Usage: Useful for generating coherent and complete sentences.
  • Example: Stops generation when a period, exclamation mark, or question mark is generated.
  4. Contextual Indicators
  • Description: Custom conditions based on the content or context of the generated text.
  • Usage: Implement logic to stop generation based on specific contextual cues.
  • Example: Stop if a certain topic or keyword appears in the output.
  5. Entropy-Based Stopping
  • Description: Uses the entropy of the token distribution to decide when to stop.
  • Usage: Stops generation when the uncertainty (entropy) of the next token falls below a threshold.
  • Example: Implemented in speculative decoding techniques like AdaEDL.
  6. Length-Based Stopping
  • Description: Stops based on the length of the generated text.
  • Usage: Define a minimum and maximum length for the output.
  • Example: min_length=50 and max_length=200 ensure the output is between 50 and 200 tokens long.
  7. Custom Stopping Criteria
  • Description: Define your own stopping criteria using custom logic.
  • Usage: Implement a subclass of the stopping criteria class and pass it to the model.
  • Example: Using custom stopping criteria to stop generation based on specific token patterns.
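For item 7, libraries such as Hugging Face transformers let you plug in custom logic by subclassing StoppingCriteria. The sketch below stops as soon as a chosen token ID is produced; the model, tokenizer, and inputs in the usage comment are assumed to be loaded elsewhere.

from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnToken(StoppingCriteria):
    """Stop generation as soon as a chosen token ID is produced."""
    def __init__(self, stop_token_id):
        self.stop_token_id = stop_token_id

    def __call__(self, input_ids, scores, **kwargs):
        # input_ids: (batch, sequence) tensor of the token IDs generated so far
        return bool((input_ids[:, -1] == self.stop_token_id).all())

# Usage (assuming `model`, `tokenizer`, and tokenized `inputs` already exist):
# criteria = StoppingCriteriaList([StopOnToken(tokenizer.eos_token_id)])
# output = model.generate(**inputs, max_new_tokens=100, stopping_criteria=criteria)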

By using these different stopping criteria, you can control the length and quality of the generated text, ensuring it meets the desired requirements for various applications.

1.11 How to Use Stop Sequences in LLMs

1.11.1 What are Stop Sequences?

Stop sequences are specific strings or tokens that signal the language model to stop generating text. They help manage the length and relevance of the output, ensuring that the generated text meets the desired criteria.

1.11.2 How to Implement Stop Sequences

  1. Define Stop Sequences:
    • Identify the sequences that should trigger the model to stop generating text.
    • Example: Common stop sequences include newline characters, specific phrases, or end-of-sentence markers.
  2. Set Stop Sequences in the API:
    • Most language model APIs allow you to specify stop sequences as part of the request parameters.
    • Example: In OpenAI’s GPT-3 API, you can set stop sequences using the stop parameter.

1.11.3 Example Implementation

Here’s an example of how to use stop sequences in a request to a language model API:

# NOTE: this example uses the legacy OpenAI Python SDK (openai<1.0) and its
# Completions endpoint; newer SDK versions expose a different interface.
import openai

# Define your API key
openai.api_key = 'your-api-key'

# Define the prompt and stop sequences
prompt = "Write a short story about a brave knight."
stop_sequences = ["\n", "The end"]

# Make the API request; generation halts as soon as either stop sequence appears
response = openai.Completion.create(
    engine="davinci",
    prompt=prompt,
    max_tokens=150,
    stop=stop_sequences
)

# Print the generated text (the stop sequence itself is not included in the output)
print(response.choices[0].text.strip())

1.11.4 Benefits of Using Stop Sequences

Control Output Length:

  • Prevents the model from generating excessively long responses.
  • Ensures the output is concise and to the point.

Improve Relevance:

  • Stops generation when the desired information has been provided.
  • Helps maintain the focus and relevance of the output.

Enhance Readability:

  • Ensures the generated text ends at a logical point, improving readability.
  • Avoids abrupt or incomplete sentences.

1.11.5 Tips for Effective Use

Choose Clear Stop Sequences:

  • Select sequences that are unlikely to appear in the middle of the desired output.
  • Example: Use a specific phrase like “End of story” for narrative tasks.

Test and Adjust:

  • Experiment with different stop sequences to find the most effective ones for your task.
  • Adjust based on the model’s performance and the nature of the task.

Combine with Other Parameters:

  • Use stop sequences in conjunction with other parameters like max_tokens and temperature to fine-tune the output.

By effectively using stop sequences, you can better control the output of language models, ensuring it meets your specific needs and criteria.

1.12 Explain the Basic Structure of Prompt Engineering

1.12.1 What is Prompt Engineering?

Prompt engineering involves designing and optimizing prompts to guide a language model in generating desired outputs. It is a crucial technique for improving the performance and accuracy of language models in various tasks.

1.12.2 Basic Structure of Prompt Engineering

  1. Formulation:
    • Clear and Simple Language: Use straightforward and unambiguous language to avoid confusion.
    • Specific Instructions: Clearly state what you want the model to do.
    • Example: “Translate the following English sentence to French: ‘Hello, how are you?’”
  2. Context:
    • Provide Relevant Information: Include necessary context to help the model understand the task.
    • Background Details: Add any background information that might be relevant.
    • Example: “You are a helpful assistant. Translate the following sentence to French.”
  3. Examples:
    • Demonstrate Desired Output: Provide examples of the input and the expected output.
    • Few-Shot Learning: Use a few examples to show the model what kind of responses are expected.
    • Example: “Translate the following sentences to French: ‘Good morning’ -> ‘Bonjour’, ‘Thank you’ -> ‘Merci’”
  4. Instructions:
    • Explicit Commands: Give clear and direct instructions to the model.
    • Step-by-Step Guidance: Break down complex tasks into simpler steps.
    • Example: “First, identify the main topic of the paragraph. Then, summarize it in one sentence.”
  5. Parameters:
    • Define Specifics: Set parameters like length, tone, or point-of-view to tailor the output.
    • Adjust Settings: Use parameters such as temperature, max tokens, and stop sequences to control the generation process.
    • Example: “Summarize the following article in 100 words or less.”

1.12.3 Example Prompt

Here’s an example of a well-structured prompt for a summarization task:

You are a helpful assistant. Summarize the following article in 100 words or less.
Article: "Artificial intelligence is transforming the world. From healthcare to finance, AI is being used to improve efficiency and outcomes. In healthcare, AI algorithms can analyze medical images to detect diseases early. In finance, AI is used to detect fraudulent transactions and manage investments. As AI technology continues to evolve, its impact on various industries will only grow."

Summary:
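One common way to apply this structure in code is to keep the context, instruction, and parameters in a reusable template and fill in the task-specific content. A simple illustrative sketch (the function name and word limit are arbitrary choices, not part of any API):

def build_summary_prompt(article, word_limit=100):
    """Assemble a structured summarization prompt: context + instruction + input."""
    return (
        "You are a helpful assistant. "
        f"Summarize the following article in {word_limit} words or less.\n"
        f'Article: "{article}"\n'
        "Summary:"
    )

article = "Artificial intelligence is transforming the world. From healthcare to finance, ..."
print(build_summary_prompt(article, word_limit=100))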

1.12.4 Tips for Effective Prompt Engineering

  • Iterate and Refine: Continuously test and refine your prompts based on the model’s performance. Experiment with different phrasings and structures to find the most effective prompts.
  • Be Specific: The more specific and detailed your prompt, the better the model can understand and generate the desired output.
  • Use Examples: Providing examples can significantly improve the model’s ability to perform the task correctly.
  • Monitor and Adjust: Regularly monitor the outputs and make adjustments to the prompts as needed to maintain accuracy and relevance.

By following these guidelines, you can effectively design prompts that guide language models to produce high-quality and relevant outputs for various tasks.

1.13 Explain In-Context Learning

1.13.1 What is In-Context Learning?

In-context learning is a capability of large language models where the model learns to perform a task by conditioning on input-output examples provided within the prompt. Instead of requiring explicit training or fine-tuning on a specific task, the model uses the context given in the prompt to understand and execute the task.

1.13.2 How Does In-Context Learning Work?

  1. Providing Examples:
    • The user includes a few examples of the task within the prompt.
    • These examples demonstrate the desired input-output relationship.
  2. Contextual Understanding:
    • The model uses these examples to infer the task and generate appropriate responses for new inputs.
    • The examples serve as a form of implicit training within the prompt itself.
  3. Generating Output:
    • Based on the provided examples, the model generates outputs for new inputs that follow the same pattern or logic.

1.13.3 Example of In-Context Learning

Here’s an example of how in-context learning can be used for a translation task:

Translate the following English sentences to French:

  1. English: “Hello, how are you?” French: “Bonjour, comment ça va?”

  2. English: “What is your name?” French: “Comment tu t’appelles?”

  3. English: “I would like a coffee.” French: “Je voudrais un café.”

  4. English: “Where is the nearest train station?” French: “Où est la gare la plus proche?”

  5. English: “Can you help me?” French:

In this example, the model uses the provided English-French sentence pairs to understand the task of translation. When given a new English sentence, it generates the corresponding French translation based on the pattern observed in the examples.
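Because in-context learning is driven entirely by the prompt, the examples are often assembled programmatically. A small sketch of building such a few-shot translation prompt from example pairs (plain string handling, no particular API assumed):

def build_few_shot_prompt(examples, new_input):
    """Turn (English, French) example pairs into an in-context learning prompt."""
    lines = ["Translate the following English sentences to French:", ""]
    for i, (english, french) in enumerate(examples, start=1):
        lines.append(f'{i}. English: "{english}" French: "{french}"')
    lines.append(f'{len(examples) + 1}. English: "{new_input}" French:')
    return "\n".join(lines)

examples = [
    ("Hello, how are you?", "Bonjour, comment ça va?"),
    ("What is your name?", "Comment tu t'appelles?"),
    ("I would like a coffee.", "Je voudrais un café."),
]
print(build_few_shot_prompt(examples, "Can you help me?"))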

1.13.4 Benefits of In-Context Learning

  1. Flexibility:
    • Allows the model to adapt to various tasks without explicit retraining.
    • Users can quickly switch between different tasks by changing the examples in the prompt.
  2. Efficiency:
    • Reduces the need for extensive fine-tuning or additional training data.
    • Enables rapid prototyping and experimentation with different tasks.
  3. Ease of Use:
    • Simplifies the process of task specification by using natural language examples.
    • Makes it accessible for users without deep technical expertise in machine learning.

1.13.5 Applications of In-Context Learning

  • Language Translation: Translating text between different languages using example pairs.
  • Text Summarization: Summarizing long articles or documents by providing summary examples.
  • Question Answering: Answering questions based on provided context and example answers.
  • Text Generation: Generating creative content like stories or poems based on example prompts.

In-context learning leverages the power of large language models to perform a wide range of tasks efficiently and effectively, making it a valuable tool in natural language processing.

1.14 Types of Prompt Engineering

1.14.1 Zero-Shot Prompting

  • Description: The model is given a task without any examples.
  • Usage: Relies on the model’s pre-trained knowledge to generate responses.
  • Example: “Translate ‘Hello, how are you?’ to French.”

1.14.2 One-Shot Prompting

  • Description: The model is provided with one example of the task.
  • Usage: Helps the model understand the task with a single example.
  • Example:

Translate the following sentence to French:
English: “Hello, how are you?”
French: “Bonjour, comment ça va?”

Translate “What is your name?” to French.

1.14.3 Few-Shot Prompting

  • Description: The model is given a few examples to illustrate the task.
  • Usage: Provides multiple examples to improve the model’s understanding and performance.
  • Example:

Translate the following sentences to French:
  1. English: “Hello, how are you?” French: “Bonjour, comment ça va?”
  2. English: “What is your name?” French: “Comment tu t’appelles?”
  3. English: “I would like a coffee.” French: “Je voudrais un café.”
Translate “Where is the nearest train station?” to French.

1.14.4 Chain-of-Thought Prompting

  • Description: Breaks down the problem into a series of logical steps.
  • Usage: Helps the model follow a structured approach to solve complex tasks.
  • Example:

Solve the math problem step-by-step:
Problem: “What is 15% of 200?”
Step 1: Convert the percentage to a decimal: 15% = 0.15
Step 2: Multiply the decimal by the number: 0.15 * 200 = 30
Answer: 30

1.14.5 Negative Prompting

  • Description: Specifies what the model should avoid generating.
  • Usage: Helps control the model’s output by indicating undesired responses.
  • Example:

Write a short story about a dragon. Avoid using the words “fire” and “castle.”

1.14.6 Contextual Prompting

  • Description: Provides additional context or background information to guide the model.
  • Usage: Enhances the model’s understanding by setting the scene or providing relevant details.
  • Example:

You are a helpful assistant. Translate the following sentence to French:
English: “Can you help me?”

1.14.7 Instruction-Based Prompting

  • Description: Gives explicit instructions on how to perform the task.
  • Usage: Directs the model with clear and detailed commands.
  • Example:

Summarize the following article in one sentence:
Article: “Artificial intelligence is transforming various industries by improving efficiency and outcomes. In healthcare, AI algorithms can analyze medical images to detect diseases early. In finance, AI is used to detect fraudulent transactions and manage investments.”

By using these different types of prompt engineering, you can effectively guide language models to generate accurate and relevant outputs for a wide range of tasks.