Skip to the content.
N
Notezio
/
AWS Certified AI Practitioner (AIF-C01)
Azure Certification Notes
GenAI Capabilities and Challenges
Capabilities of Generative AI
Adaptability : Can adjust responses across tasks and domains
Responsiveness : Generates outputs in real time
Simplicity : Abstracts complex logic behind simple interfaces
Creativity : Produces novel ideas, content, and solutions
Data efficiency : Learns effectively from smaller datasets
Personalization : Tailors outputs to individual users
Scalability : Serves millions of users simultaneously
Challenges of Generative AI
Regulatory violations : May violate laws if not governed properly
Social risks : Can amplify bias or misinformation
Data security & privacy : Risk of leaking sensitive information
Toxicity : Can generate harmful or offensive content
Hallucinations : Produces confident but incorrect outputs
Interpretability : Hard to explain complex models
Nondeterminism : Same input may produce different outputs
Plagiarism & cheating : Risk of copying copyrighted content
Toxicity
Toxicity refers to AI-generated content that is offensive, disturbing, abusive, hateful, or inappropriate.
Defining what qualifies as “toxic” is challenging because it often depends on: Cultural context, Audience, Intent
There is a thin boundary between filtering toxic content and censorship, especially when free expression is involved.
Also, questions to be considered: what about quotations of someone that can be considered toxic? Should they be included?
Mitigation Strategies
Training data curation
Identify and remove toxic or offensive phrases before model training
Balance datasets to reduce bias amplification
Guardrails and moderation models
Automatically detect and block harmful content
Filter outputs based on predefined safety categories
Human review
Use human-in-the-loop workflows for ambiguous or borderline cases
Hallucinations
Hallucinations occur when a model generates confident-sounding but factually incorrect information.
This happens because large language models:
Predict the next most likely word
Do not truly understand factual correctness
As a result, models may:
Invent non-existent facts
Cite fake sources
Provide incorrect explanations that appear plausible
Mitigation Strategies
User education : Inform users that generated content is not guaranteed to be correct
Verification requirements : Cross-check outputs against trusted or authoritative sources
Output labeling : Clearly mark AI-generated content as unverified or machine-generated
Retrieval-based grounding : Use external knowledge sources (e.g., search engines, databases) to anchor responses
Plagiarism and Cheating
Worries that Gen AI can be used to write college essays, writing samples for job applications, and other forms of cheating or illicit copying
Debates on this topic are actively happening
Some are saying the new technologies should be accepted and others are saying it should be prohibited
Difficulties in tracing the source of a specific output of an LLM
Rise of technologies to detect if text or image have been generated by with AI
Prompt Misuses
Poisoning :
Poisoning involves intentionally injecting malicious, biased, or misleading data into training datasets.
This can cause the model to: Produce biased outputs, Generate harmful or offensive content
Often difficult to detect without careful data governance.
Hijacking and Prompt Injection :
Prompt injection embeds hidden or manipulative instructions inside user prompts.
The goal is to Override system instructions and Alter model behavior
This can hijack model’s behavior to produce outputs that align with the attacker’s intentions (e.g., generate misinformation or run malicious code).
Exposure of Sensitive Information :
The risk of exposing sensitive and confidential information to a model during training or inference
The model can then reveal this sensitive data from their training corpus, leading to potential data leaks or privacy violations
Prompt Leaking :
The unintentional disclosure or leakage of the prompts or inputs used within a model
It can expose protected data or other data used by the model, such as how the model works
Jailbreaking :
AI models are typically trained with certain ethical and safety constraints in place to prevent misuse or harmful output
Jailbreaking is a way to circumvent the constraints and safety measures implemented in a generative model to gain unauthorized access or functionality
Key Takeaway
Toxicity → harmful content
Hallucinations → confident but false information
Plagiarism → ethical and legal risk
Prompt misuse → security vulnerability
Mitigation = guardrails + human review + governance