AWS Certified AI Practitioner (AIF-C01)
AWS Certified AI Practitioner (AIF-C01) certification study notes, this guide will help you with quick revision before the exam. it can use as study notes for your preparation.
DashboardPractice Test 2
- A company needs to choose a model from Amazon Bedrock to use internally. The company must identify a model that generates responses in a style that the company’s employees prefer. What should the company do to meet these requirements?
- A. Evaluate the models by using built-in prompt datasets.
- B. Evaluate the models by using a human workforce and custom prompt datasets.
- C. Use public model leaderboards to identify the model.
- D. Use the model InvocationLatency runtime metrics in Amazon CloudWatch when trying models.
Answer
Correct answer: B
Explanation: Evaluating models using a human workforce and custom prompt datasets ensures that the model generates responses in the style that aligns with the company’s preferences. The other options either do not provide direct feedback on style preferences or are not specific enough for determining suitability based on employee preferences.
- A student at a university is copying content from generative AI to write essays. Which challenge of responsible generative AI does this scenario represent?
- A. Toxicity
- B. Hallucinations
- C. Plagiarism
- D. Privacy
Answer
Correct answer: C
Explanation: Copying content from generative AI to write essays without proper attribution constitutes plagiarism, which is a key challenge of responsible generative AI. The other options are unrelated to this specific issue.
- A company needs to build its own large language model (LLM) based on only the company’s private data. The company is concerned about the environmental effect of the training process. Which Amazon EC2 instance type has the LEAST environmental effect when training LLMs?
- A. Amazon EC2 C series
- B. Amazon EC2 G series
- C. Amazon EC2 P series
- D. Amazon EC2 Trn series
Answer
Correct answer: D
Explanation: Amazon EC2 Trn series instances (powered by AWS Trainium chips) are designed to provide efficient and environmentally friendly training of large machine learning models. They are optimized for energy efficiency, which reduces the environmental impact of the training process. The other instance types are not specifically optimized for minimizing environmental effects during training.
- A company wants to build an interactive application for children that generates new stories based on classic stories. The company wants to use Amazon Bedrock and needs to ensure that the results and topics are appropriate for children. Which AWS service or feature will meet these requirements?
- A. Amazon Rekognition
- B. Amazon Bedrock playgrounds
- C. Guardrails for Amazon Bedrock
- D. Agents for Amazon Bedrock
Answer
Correct answer: C
Explanation: Guardrails for Amazon Bedrock can help ensure that the output generated by Amazon Bedrock is appropriate for children. Guardrails are used to apply content moderation, guidelines, and ensure safety by filtering potentially harmful or inappropriate content, which is essential when building an interactive application for children.
- A company is building an application that needs to generate synthetic data that is based on existing data. Which type of model can the company use to meet this requirement?
- A. Generative adversarial network (GAN)
- B. XGBoost
- C. Residual neural network
- D. WaveNet
Answer
Correct answer: A
Explanation: GANs are specifically designed for generating synthetic data. They consist of two neural networks: a generator and a discriminator. The generator creates synthetic data that resembles the real data, and the discriminator tries to distinguish between real and generated data. This process enables GANs to generate realistic synthetic data, making them ideal for use cases where synthetic data is needed based on existing data.
- A digital devices company wants to predict customer demand for memory hardware. The company does not have coding experience or knowledge of ML algorithms and needs to develop a data-driven predictive model. The company needs to perform analysis on internal data and external data. Which solution will meet these requirements?
- A. Store the data in Amazon S3. Create ML models and demand forecast predictions by using Amazon
- B. Import the data into Amazon SageMaker Data Wrangler. Create ML models and demand forecast
- C. Import the data into Amazon SageMaker Data Wrangler. Build ML models and demand forecast
- D. Import the data into Amazon SageMaker Canvas. Build ML models and demand forecast predictions by
Answer
Correct answer: D
Explanation: Amazon SageMaker Canvas is a no-code tool that allows users to build ML models and make predictions without requiring programming knowledge. It is ideal for users with no coding experience, providing an easy interface for importing data and generating predictive models. The other options require more technical expertise or are not designed for no-code model building.
- A company has installed a security camera. The company uses an ML model to evaluate the security camera footage for potential thefts. The company has discovered that the model disproportionately flags people who are members of a specific ethnic group. Which type of bias is affecting the model output?
- A. Measurement bias
- B. Sampling bias
- C. Observer bias
- D. Confirmation bias
Answer
Correct answer: B
Explanation: Sampling bias occurs when the training data is not representative of the overall population, leading to disproportionate flagging of specific groups. In this case, the model may have been trained on biased data that did not adequately represent all ethnic groups, resulting in skewed predictions. The other types of bias do not directly apply to the selection of training data or its representativeness.
- A company is building a customer service chatbot. The company wants the chatbot to improve its responses by learning from past interactions and online resources. Which AI learning strategy provides this self-improvement capability?
- A. Supervised learning with a manually curated dataset of good responses and bad responses
- B. Reinforcement learning with rewards for positive customer feedback
- C. Unsupervised learning to find clusters of similar customer inquiries
- D. Supervised learning with a continuously updated FAQ database
Answer
Correct answer: B
Explanation: Reinforcement learning allows the chatbot to learn from interactions by receiving rewards for positive customer feedback, which helps the model self-improve over time. The other options do not directly provide a mechanism for continuous self-improvement based on interactions.
- An AI practitioner has built a deep learning model to classify the types of materials in images. The AI practitioner now wants to measure the model performance. Which metric will help the AI practitioner evaluate the performance of the model?
- A. Confusion matrix
- B. Correlation matrix
- C. R2 score
- D. Mean squared error (MSE)
Answer
Correct answer: A
Explanation: A confusion matrix provides detailed insights into the performance of a classification model by showing the true positives, false positives, true negatives, and false negatives. This metric helps evaluate how well the model classifies the different types of materials in images. The other metrics are not as suitable for evaluating a classification model.
- A company has built a chatbot that can respond to natural language questions with images. The company wants to ensure that the chatbot does not return inappropriate or unwanted images. Which solution will meet these requirements?
- A. Implement moderation APIs.
- B. Retrain the model with a general public dataset.
- C. Perform model validation.
- D. Automate user feedback integration.
Answer
Correct answer: A
Explanation: Implementing moderation APIs can help filter and block inappropriate or unwanted images before they are returned by the chatbot. The other options do not directly address ensuring that the chatbot avoids returning inappropriate images.
- An AI practitioner is using an Amazon Bedrock base model to summarize session chats from the customer service department. The AI practitioner wants to store invocation logs to monitor model input and output data. Which strategy should the AI practitioner use?
- A. Configure AWS CloudTrail as the logs destination for the model.
- B. Enable model invocation logging in Amazon Bedrock.
- C. Configure AWS Audit Manager as the logs destination for the model.
- D. Configure model invocation logging in Amazon EventBridge.
Answer
Correct answer: B
Explanation: Enabling invocation logging in Amazon Bedrock allows the AI practitioner to monitor and store the input and output data for model invocations. The other options are not directly used for logging model invocations in Amazon Bedrock.
- A company is building an ML model to analyze archived data. The company must perform inference on large datasets that are multiple GBs in size. The company does not need to access the model predictions immediately. Which Amazon SageMaker inference option will meet these requirements?
- A. Batch transform
- B. Real-time inference
- C. Serverless inference
- D. Asynchronous inference
Answer
Correct answer: A
Explanation: Batch transform is ideal for processing large datasets that do not require real-time predictions. It allows the company to perform inference on multiple GBs of data efficiently without needing immediate results. The other options are more suitable for scenarios requiring real-time or near real-time access.
- Which term describes the numerical representations of real-world objects and concepts that AI and natural language processing (NLP) models use to improve understanding of textual information?
- A. Embeddings
- B. Tokens
- C. Models
- D. Binaries
Answer
Correct answer: A
Explanation: Embeddings are numerical representations of real-world objects and concepts that help AI and NLP models understand and work with textual information more effectively by capturing relationships and similarities between words or phrases. The other options do not describe this concept.
- A research company implemented a chatbot by using a foundation model (FM) from Amazon Bedrock. The chatbot searches for answers to questions from a large database of research papers. After multiple prompt engineering attempts, the company notices that the FM is performing poorly because of the complex scientific terms in the research papers. How can the company improve the performance of the chatbot?
- A. Use few-shot prompting to define how the FM can answer the questions.
- B. Use domain adaptation fine-tuning to adapt the FM to complex scientific terms.
- C. Change the FM inference parameters.
- D. Clean the research paper data to remove complex scientific terms.
Answer
Correct answer: B
Explanation: Domain adaptation fine-tuning allows the FM to better understand the complex scientific terms by training it with domain-specific data, improving its performance on such specialized content. The other options are either insufficient or not directly related to handling complex terminology effectively.
- A company wants to use a large language model (LLM) on Amazon Bedrock for sentiment analysis. The company needs the LLM to produce more consistent responses to the same input prompt. Which adjustment to an inference parameter should the company make to meet these requirements?
- A. Decrease the temperature value.
- B. Increase the temperature value.
- C. Decrease the length of output tokens.
- D. Increase the maximum generation length.
Answer
Correct answer: A
Explanation: Decreasing the temperature value makes the model’s output more deterministic and consistent by reducing randomness in response generation. The other adjustments do not directly ensure consistent responses.
- A company wants to develop a large language model (LLM) application by using Amazon Bedrock and customer data that is uploaded to Amazon S3. The company’s security policy states that each team can access data for only the team’s own customers. Which solution will meet these requirements?
- A. Create an Amazon Bedrock custom service role for each team that has access to only the team’s customer data.
- B. Create a custom service role that has Amazon S3 access. Ask teams to specify the customer name on each Amazon Bedrock request.
- C. Redact personal data in Amazon S3. Update the S3 bucket policy to allow team access to customer data.
- D. Create one Amazon Bedrock role that has full Amazon S3 access. Create IAM roles for each team that have access to only each team’s customer folders.
Answer
Correct answer: A
Explanation: Creating a custom Amazon Bedrock service role for each team with restricted access to only the team’s customer data ensures compliance with the security policy, providing the necessary data segregation and access control. The other options do not effectively enforce team-specific access or may pose risks of broader access than allowed by the policy.
- A medical company deployed a disease detection model on Amazon Bedrock. To comply with privacy policies, the company wants to prevent the model from including personal patient information in its responses. The company also wants to receive notification when policy violations occur. Which solution meets these requirements?
- A. Use Amazon Macie to scan the model’s output for sensitive data and set up alerts for potential
- B. Configure AWS CloudTrail to monitor the model’s responses and create alerts for any detected personal
- C. Use Guardrails for Amazon Bedrock to filter content. Set up Amazon CloudWatch alarms for notification
- D. Implement Amazon SageMaker Model Monitor to detect data drift and receive alerts when model quality
Answer
Correct answer: C
Explanation: Guardrails for Amazon Bedrock can be used to filter content and ensure that personal patient information is not included in model responses. Setting up Amazon CloudWatch alarms allows the company to receive notifications when policy violations occur. The other options are not specifically designed for filtering model output and monitoring policy compliance.
- A company manually reviews all submitted resumes in PDF format. As the company grows, the company expects the volume of resumes to exceed the company’s review capacity. The company needs an automated system to convert the PDF resumes into plain text format for additional processing. Which AWS service meets this requirement?
- A. Amazon Textract
- B. Amazon Personalize
- C. Amazon Lex
- D. Amazon Transcribe
Answer
Correct answer: A
Explanation: Amazon Textract can extract text from PDF documents, making it suitable for converting resumes into plain text for further processing. The other services do not provide functionality to extract text from PDFs.
- An education provider is building a question and answer application that uses a generative AI model to explain complex concepts. The education provider wants to automatically change the style of the model response depending on who is asking the question. The education provider will give the model the age range of the user who has asked the question. Which solution meets these requirements with the LEAST implementation effort?
- A. Fine-tune the model by using additional training data that is representative of the various age ranges
- B. Add a role description to the prompt context that instructs the model of the age range that the response
- C. Use chain-of-thought reasoning to deduce the correct style and complexity for a response suitable for
- D. Summarize the response text depending on the age of the user so that younger users receive shorter
Answer
Correct answer: B
Explanation: Adding a role description to the prompt is the simplest and most effective way to adjust the model’s response style based on the user’s age range. It requires minimal implementation effort and effectively tailors the output. The other options involve more complex processes, such as fine-tuning or additional reasoning steps.
- Which strategy evaluates the accuracy of a foundation model (FM) that is used in image classification tasks?
- A. Calculate the total cost of resources used by the model.
- B. Measure the model’s accuracy against a predefined benchmark dataset.
- C. Count the number of layers in the neural network.
- D. Assess the color accuracy of images processed by the model.
Answer
Correct answer: B
Explanation: Evaluating a foundation model’s accuracy by measuring its performance against a predefined benchmark dataset is the standard approach for assessing accuracy in image classification tasks. The other options do not provide an appropriate measure of classification accuracy.
- An accounting firm wants to implement a large language model (LLM) to automate document processing. The firm must proceed responsibly to avoid potential harms. What should the firm do when developing and deploying the LLM? (Choose 2)
- A. Include fairness metrics for model evaluation.
- B. Adjust the temperature parameter of the model.
- C. Modify the training data to mitigate bias.
- D. Avoid overfitting on the training data.
- E. Apply prompt engineering techniques.
Answer
Correct answer: A, C
Explanation: Include fairness metrics for model evaluation: Fairness metrics help ensure that the LLM is unbiased and treats all cases equitably, which is essential for responsible AI use. Modify the training data to mitigate bias: Adjusting the training data helps reduce any inherent bias that might exist, contributing to a more fair and responsible LLM. The other options are related to general model optimization but do not directly address responsible AI practices regarding potential harms like bias and fairness.
- A company is building an ML model. The company collected new data and analyzed the data by creating a correlation matrix, calculating statistics, and visualizing the data. Which stage of the ML pipeline is the company currently in?
- A. Data pre-processing
- B. Feature engineering
- C. Exploratory data analysis
- D. Hyperparameter tuning
Answer
Correct answer: C
Explanation: The company is currently in the exploratory data analysis (EDA) stage, which involves summarizing data through statistics, visualizations, and correlation matrices to understand the dataset before moving on to modeling. The other options are subsequent steps in the ML pipeline.
- A company has documents that are missing some words because of a database error. The company wants to build an ML model that can suggest potential words to fill in the missing text. Which type of model meets this requirement?
- A. Topic modeling
- B. Clustering models
- C. Prescriptive ML models
- D. BERT-based models
Answer
Correct answer: D
Explanation: BERT-based models are well-suited for natural language understanding tasks, including filling in missing words, because they use contextual information to predict missing tokens in a text. The other types of models are not designed for this type of text completion task.
- A company wants to display the total sales for its top-selling products across various retail locations in the past 12 months. Which AWS solution should the company use to automate the generation of graphs?
- A. Amazon Q in Amazon EC2
- B. Amazon Q Developer
- C. Amazon Q in Amazon QuickSight
- D. Amazon Q in AWS Chatbot
Answer
Correct answer: C
Explanation: Amazon Q in Amazon QuickSight allows users to ask questions in natural language and automatically generate graphs and visualizations to display insights, such as total sales for top-selling products. The other options do not provide the same functionality for generating visual analytics.
- A company is building a chatbot to improve user experience. The company is using a large language model (LLM) from Amazon Bedrock for intent detection. The company wants to use few-shot learning to improve intent detection accuracy. Which additional data does the company need to meet these requirements?
- A. Pairs of chatbot responses and correct user intents
- B. Pairs of user messages and correct chatbot responses
- C. Pairs of user messages and correct user intents
- D. Pairs of user intents and correct chatbot responses
Answer
Correct answer: C
Explanation: Few-shot learning involves providing the model with a few examples to help it understand how to perform the task. For intent detection, the company needs pairs of user messages and the correct user intents, which will help the LLM improve its accuracy in detecting user intents. The other options do not provide the necessary pairing for improving intent detection.
- A company is using few-shot prompting on a base model that is hosted on Amazon Bedrock. The model currently uses 10 examples in the prompt. The model is invoked once daily and is performing well. The company wants to lower the monthly cost. Which solution will meet these requirements?
- A. Customize the model by using fine-tuning.
- B. Decrease the number of tokens in the prompt.
- C. Increase the number of tokens in the prompt.
- D. Use Provisioned Throughput.
Answer
Correct answer: B
Explanation: Decreasing the number of tokens in the prompt reduces the amount of data being processed, thereby lowering the cost of using the model. Since the model is performing well, reducing the prompt size is a costeffective way to maintain performance while lowering expenses. The other options either increase costs or are unrelated to prompt size.
- An AI practitioner is using a large language model (LLM) to create content for marketing campaigns. The generated content sounds plausible and factual but is incorrect. Which problem is the LLM having?
- A. Data leakage
- B. Hallucination
- C. Overfitting
- D. Underfitting
Answer
Correct answer: B
Explanation: Hallucination occurs when a large language model generates content that appears plausible and factual but is incorrect or fabricated. This is a common issue with LLMs. The other options do not describe this particular behavior.
- An AI practitioner trained a custom model on Amazon Bedrock by using a training dataset that contains confidential data. The AI practitioner wants to ensure that the custom model does not generate inference responses based on confidential data. How should the AI practitioner prevent responses based on confidential data?
- A. Delete the custom model. Remove the confidential data from the training dataset. Retrain the custom
- B. Mask the confidential data in the inference responses by using dynamic data masking.
- C. Encrypt the confidential data in the inference responses by using Amazon SageMaker.
- D. Encrypt the confidential data in the custom model by using AWS Key Management Service (AWS KMS).
Answer
Correct answer: A
Explanation: To ensure that the custom model does not generate responses based on confidential data, the best approach is to retrain the model without including the confidential data. This prevents the model from learning patterns associated with that sensitive information, thereby avoiding its use in inference. The other options do not address the root cause of the issue—removing confidential data from the training process.
- A company has built a solution by using generative AI. The solution uses large language models (LLMs) to translate training manuals from English into other languages. The company wants to evaluate the accuracy of the solution by examining the text generated for the manuals. Which model evaluation strategy meets these requirements?
- A. Bilingual Evaluation Understudy (BLEU)
- B. Root mean squared error (RMSE)
- C. Recall-Oriented Understudy for Gisting Evaluation (ROUGE)
- D. F1 score
Answer
Correct answer: A
Explanation: The BLEU (Bilingual Evaluation Understudy) score is a common metric used to evaluate the accuracy of machine translation by comparing the generated translation with reference translations. It is specifically designed for translation tasks, whereas the other metrics are not suitable for evaluating translation quality.
- A large retailer receives thousands of customer support inquiries about products every day. The customer support inquiries need to be processed quickly. The company wants to implement Agents for Amazon Bedrock. What are the key benefits of using Amazon Bedrock agents that could help this retailer?
- A. Generation of custom foundation models (FMs) to predict customer needs
- B. Automation of repetitive tasks and orchestration of complex workflows
- C. Automatically calling multiple foundation models (FMs) and consolidating the results
- D. Selecting the foundation model (FM) based on predefined criteria and metrics
Answer
Correct answer: B
Explanation: Amazon Bedrock agents help automate repetitive tasks and orchestrate complex workflows, which is ideal for handling thousands of customer support inquiries efficiently. This helps reduce response times and improves productivity. The other options do not directly address automation and orchestration of tasks for customer support.
- Which option is a benefit of ongoing pre-training when fine-tuning a foundation model (FM)?
- A. Helps decrease the model’s complexity
- B. Improves model performance over time
- C. Decreases the training time requirement
- D. Optimizes model inference time
Answer
Correct answer: B
Explanation: Ongoing pre-training helps enhance a foundation model’s performance by continuously updating it with new data, thereby improving its ability to generalize and perform well on different tasks. The other options do not directly relate to the benefits of ongoing pre-training.
- What are tokens in the context of generative AI models?
- A. Tokens are the basic units of input and output that a generative AI model operates on, representing words, subwords, or other linguistic units.
- B. Tokens are the mathematical representations of words or concepts used in generative AI models.
- C. Tokens are the pre-trained weights of a generative AI model that are fine-tuned for specific tasks.
- D. Tokens are the specific prompts or instructions given to a generative AI model to generate output.
Answer
Correct answer: A
Explanation: Tokens are the smallest units (e.g., words, subwords, or characters) that generative AI models use to process text. They form the basis of both the input given to and the output generated by the model. The other options do not accurately describe tokens in this context.
- A company wants to assess the costs that are associated with using a large language model (LLM) to generate inferences. The company wants to use Amazon Bedrock to build generative AI applications. Which factor will drive the inference costs?
- A. Number of tokens consumed
- B. Temperature value
- C. Amount of data used to train the LLM
- D. Total training time
Answer
Correct answer: A
Explanation: Inference costs for large language models are typically driven by the number of tokens processed during input and output, as each token incurs computational resources. The other factors (temperature value, training data, and training time) do not directly impact inference costs.
- A company is using Amazon SageMaker Studio notebooks to build and train ML models. The company stores the data in an Amazon S3 bucket. The company needs to manage the flow of data from Amazon S3 to SageMaker Studio notebooks. Which solution will meet this requirement?
- A. Use Amazon Inspector to monitor SageMaker Studio.
- B. Use Amazon Macie to monitor SageMaker Studio.
- C. Configure SageMaker to use a VPC with an S3 endpoint.
- D. Configure SageMaker to use S3 Glacier Deep Archive.
Answer
Correct answer: C
Explanation: Configuring Amazon SageMaker to use a VPC with an S3 endpoint ensures secure, direct, and managed data flow between Amazon S3 and SageMaker Studio notebooks. This setup avoids public internet exposure and maintains data integrity during transfers. The other options do not provide a solution for managing the data flow in this context.
- A company has a foundation model (FM) that was customized by using Amazon Bedrock to answer customer queries about products. The company wants to validate the model’s responses to new types of queries. The company needs to upload a new dataset that Amazon Bedrock can use for validation. Which AWS service meets these requirements?
- A. Amazon S3
- B. Amazon Elastic Block Store (Amazon EBS)
- C. Amazon Elastic File System (Amazon EFS)
- D. AWS Snowcone
Answer
Correct answer: A
Explanation: Amazon S3 is the most suitable AWS service for uploading and storing datasets used for validation purposes. It is highly scalable and integrated with Amazon Bedrock, allowing easy access to data for model validation. The other options do not provide the same level of integration or suitability for managing datasets in this context.
- Which prompting attack directly exposes the configured behavior of a large language model (LLM)?
- A. Prompted persona switches
- B. Exploiting friendliness and trust
- C. Ignoring the prompt template
- D. Extracting the prompt template
Answer
Correct answer: D
Explanation: “Extracting the prompt template” is a type of prompting attack that involves directly exposing the configured behavior or the underlying system prompt of a large language model (LLM). This attack can reveal sensitive details about how the model operates, including its internal instructions or restrictions, which are typically not intended to be disclosed to the user. Such an exposure can compromise the security and reliability of the LLM by making it vulnerable to further exploitation or misuse.
- A company wants to use Amazon Bedrock. The company needs to review which security aspects the company is responsible for when using Amazon Bedrock. Which security aspect will the company be responsible for?
- A. Patching and updating the versions of Amazon Bedrock
- B. Protecting the infrastructure that hosts Amazon Bedrock
- C. Securing the company’s data in transit and at rest
- D. Provisioning Amazon Bedrock within the company network
Answer
Correct answer: C
Explanation: According to the AWS Shared Responsibility Model, AWS manages the security of the cloud, including the infrastructure and services like Amazon Bedrock. Customers are responsible for security in the cloud, which encompasses protecting their data, managing access controls, and configuring security settings for their applications. Reference: https://aws.amazon.com/compliance/shared-responsibility-model/?nc1=h_ls
- A social media company wants to use a large language model (LLM) to summarize messages. The company has chosen a few LLMs that are available on Amazon SageMaker JumpStart. The company wants to compare the generated output toxicity of these models. Which strategy gives the company the ability to evaluate the LLMs with the LEAST operational overhead?
- A. Crowd-sourced evaluation
- B. Automatic model evaluation
- C. Model evaluation with human workers
- D. Reinforcement learning from human feedback (RLHF)
Answer
Correct answer: B
Explanation: Automatic model evaluation is the strategy that allows the company to evaluate the LLMs with the least operational overhead. This method leverages automated tools and processes to assess the toxicity or quality of the generated output without the need for manual intervention or crowd-sourced input. By using pre-built evaluation metrics or toxicity detection models, the company can quickly and efficiently evaluate multiple models without the complexity and time required for human evaluations.
- A company is testing the security of a foundation model (FM). During testing, the company wants to get around the safety features and make harmful content. Which security technique is this an example of?
- A. Fuzzing training data to find vulnerabilities
- B. Denial of service (DoS)
- C. Penetration testing with authorization
- D. Jailbreak
Answer
Correct answer: D
Explanation: “Jailbreaking” refers to attempts to bypass or disable the built-in safety features and restrictions of a system, in this case, a foundation model (FM). This technique involves trying to circumvent the safeguards that prevent the model from generating harmful or unsafe content. Jailbreaking is often performed to exploit vulnerabilities in a model’s filtering or safety protocols, making it a direct attempt to undermine its protections.
- A company needs to use Amazon SageMaker for model training and inference. The company must comply with regulatory requirements to run SageMaker jobs in an isolated environment without internet access. Which solution will meet these requirements?
- A. Run SageMaker training and inference by using SageMaker Experiments.
- B. Run SageMaker training and inference by using network isolation.
- C. Encrypt the data at rest by using encryption for SageMaker geospatial capabilities.
- D. Associate appropriate AWS Identity and Access Management (IAM) roles with the SageMaker jobs.
Answer
Correct answer: B
Explanation: Network isolation in Amazon SageMaker allows you to run training and inference jobs in an environment that does not have access to the internet. This helps ensure that the data and the model do not inadvertently access external resources, meeting regulatory compliance requirements for isolated environments.
- An ML research team develops custom ML models. The model artifacts are shared with other teams for integration into products and services. The ML team retains the model training code and data. The ML team wants to build a mechanism that the ML team can use to audit models. Which solution should the ML team use when publishing the custom ML models?
- A. Create documents with the relevant information. Store the documents in Amazon S3.
- B. Use AWS AI Service Cards for transparency and understanding models.
- C. Create Amazon SageMaker Model Cards with intended uses and training and inference details.
- D. Create model training scripts. Commit the model training scripts to a Git repository.
Answer
Correct answer: C
Explanation: Amazon SageMaker Model Cards are designed to document the essential details about machine learning models, including their intended uses, training datasets, training parameters, evaluation metrics, and inference environment. This provides a centralized mechanism to store and audit the metadata of the models, which is ideal for the ML team’s need to share and audit models effectively. T Key benefits of Amazon Model Cards: SageMaker Standardized documentation of models’ characteristics and intended use cases. for auditing purposes. Transparency and traceability Integration with other AWS services for lifecycle management.
- A software company builds tools for customers. The company wants to use AI to increase software development productivity. Which solution will meet these requirements?
- A. Use a binary classification model to generate code reviews.
- B. Install code recommendation software in the company’s developer tools.
- C. Install a code forecasting tool to predict potential code issues.
- D. Use a natural language processing (NLP) tool to generate code.
Answer
Correct answer: D
Explanation: Natural language processing (NLP) tools can be used to generate code from high-level descriptions or suggestions, which can greatly enhance software development productivity. By leveraging NLP models, developers can automate repetitive coding tasks, generate code snippets, or even complete blocks of code based on natural language inputs. This can speed up development and reduce errors.
- A retail store wants to predict the demand for a specific product for the next few weeks by using the Amazon SageMaker DeepAR forecasting algorithm. Which type of data will meet this requirement?
- A. Text data
- B. Image data
- C. Time series data
- D. Binary data
Answer
Correct answer: C
Explanation: The Amazon SageMaker DeepAR forecasting algorithm is specifically designed for forecasting scalar (onedimensional) time series data using recurrent neural networks (RNNs). It excels when trained on datasets containing hundreds of related time series, enabling it to learn patterns across multiple series and provide accurate forecasts. Reference:
- A large retail bank wants to develop an ML system to help the risk management team decide on loan allocations for different demographics. What must the bank do to develop an unbiased ML model?
- A. Reduce the size of the training dataset.
- B. Ensure that the ML model predictions are consistent with historical results.
- C. Create a different ML model for each demographic group.
- D. Measure class imbalance on the training dataset. Adapt the training process accordingly.
Answer
Correct answer: D
Explanation: In machine learning, class imbalance occurs when certain classes are underrepresented in the training dataset, leading to biased model predictions. To develop an unbiased model, it’s crucial to assess the class distribution and adjust the training process to address any imbalances. This can be achieved through techniques such as oversampling the minority class, undersampling the majority class, or applying class weights to ensure the model treats all classes equitably.
- Which prompting technique can protect against prompt injection attacks?
- A. Adversarial prompting
- B. Zero-shot prompting
- C. Least-to-most prompting
- D. Chain-of-thought prompting
Answer
Correct answer: A
Explanation: Adversarial prompting is a technique used to defend against prompt injection attacks by crafting inputs that are specifically designed to identify and neutralize malicious prompts. This approach involves generating prompts that can detect and mitigate adversarial inputs, thereby enhancing the robustness of language models against such attacks.
- A company has fine-tuned a large language model (LLM) to answer questions for a help desk. The company wants to determine if the fine-tuning has enhanced the model’s accuracy. Which metric should the company use for the evaluation?
- A. Precision
- B. Time to first token
- C. F1 score
- D. Word error rate
Answer
Correct answer: C
Explanation: The F1 score is a metric that combines precision and recall into a single value, providing a balance between the two. It is particularly useful in evaluating models where there is an uneven class distribution, as it considers both false positives and false negatives. The F1 score is calculated as the harmonic mean of precision and recall: This metric ranges from 0 to 1, with 1 indicating perfect precision and recall. In the context of evaluating a fine-tuned large language model (LLM) for a help desk application, the F1 score is appropriate because it assesses the model’s ability to provide accurate and relevant responses, balancing the trade-off between precision (correctness of responses) and recall (completeness of relevant responses). Reference:
- A company is using Retrieval Augmented Generation (RAG) with Amazon Bedrock and Stable Diffusion to generate product images based on text descriptions. The results are often random and lack specific details. The company wants to increase the specificity of the generated images. Which solution meets these requirements?
- A. Increase the number of generation steps.
- B. Use the MASK_IMAGE_BLACK mask source option.
- C. Increase the classifier-free guidance (CFG) scale.
- D. Increase the prompt strength.
Answer
Correct answer: C
Explanation: In Stable Diffusion, the classifier-free guidance (CFG) scale parameter controls how closely the generated image adheres to the provided text prompt. By increasing the CFG scale, the model places more emphasis on the prompt, leading to images that more accurately reflect the specified details. However, it’s important to balance this setting, as excessively high values can result in less diverse and potentially lowerquality images.
- A company wants to implement a large language model (LLM) based chatbot to provide customer service agents with real-time contextual responses to customers’ inquiries. The company will use the company’s policies as the knowledge base. Which solution will meet these requirements MOST cost-effectively?
- A. Retrain the LLM on the company policy data.
- B. Fine-tune the LLM on the company policy data.
- C. Implement Retrieval Augmented Generation (RAG) for in-context responses.
- D. Use pre-training and data augmentation on the company policy data.
Answer
Correct answer: C
Explanation: Retrieval Augmented Generation (RAG) integrates external data sources with LLMs to produce accurate and contextually relevant outputs without the need for extensive retraining. By connecting the chatbot to the company’s policy documents, RAG enables the model to retrieve pertinent information in real-time, ensuring responses are both accurate and up-to-date. This approach is cost-effective as it leverages existing data without the computational expenses associated with retraining or fine-tuning large models. Reference:
- A company wants to create a new solution by using AWS Glue. The company has minimal programming experience with AWS Glue. Which AWS service can help the company use AWS Glue?
- A. Amazon Q Developer
- B. AWS Config
- C. Amazon Personalize
- D. Amazon Comprehend
Answer
Correct answer: A
Explanation: Amazon Q Developer is a tool designed to help users with minimal programming experience to work with AWS Glue. It provides a graphical user interface and simplifies the creation of data transformation and extraction workflows, allowing users to perform tasks like querying and working with data without needing deep coding skills.
- A company is developing a mobile ML app that uses a phone’s camera to diagnose and treat insect bites. The company wants to train an image classification model by using a diverse dataset of insect bite photos from different genders, ethnicities, and geographic locations around the world. Which principle of responsible AI does the company demonstrate in this scenario?
- A. Fairness
- B. Explainability
- C. Governance
- D. Transparency
Answer
Correct answer: A
Explanation: The company is actively seeking to ensure that the image classification model is trained on a diverse dataset that includes insect bite photos from various genders, ethnicities, and geographic locations. This reflects the fairness principle of responsible AI, which emphasizes creating models that make unbiased decisions across all demographic groups. By including a diverse range of data, the company is aiming to prevent biases that could lead to inaccurate diagnoses or treatments for certain groups of people. Fairness ensures that AI systems do not discriminate based on race, gender, geography, or other characteristics.