
    Builders Guide to the AI SDK


    Data Extraction

    Now that you've learned some theory and gotten your project set up, it's time to ship some code. You will build and run a script that extracts information from text using the AI SDK's generateText function. This will show you firsthand how tweaking your prompt or swapping models instantly changes your results.

    How Your Script Works

    [Diagram: how the script works]

    Analyzing the Starter Script

    Open your project code. Look for app/(1-extraction)/extraction.ts and essay.txt.

    Update the contents of extraction.ts with this code that extracts names from the essay:


    Run Your First AI Script!

    From your terminal, run:

    ```bash
    pnpm extraction
    ```

    You'll see the AI extracting names from the essay. Your first feature works. Nice!


    Verification Task

    Check app/(1-extraction)/essay.txt and use search (Cmd+F/Ctrl+F) to verify the names. Did the AI nail it or miss some?
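If you prefer to verify programmatically rather than with Cmd+F, a quick sketch (the `extracted` array is a hypothetical placeholder; replace it with the names your script actually printed):

```typescript
import { existsSync, readFileSync } from "node:fs";

// Case-insensitive check that a name actually appears in the essay text.
export function nameAppears(essay: string, name: string): boolean {
  return essay.toLowerCase().includes(name.toLowerCase());
}

// Hypothetical output; substitute the names your model returned.
const extracted = ["Ada Lovelace", "Alan Turing"];

const path = "app/(1-extraction)/essay.txt";
if (existsSync(path)) {
  const essay = readFileSync(path, "utf8");
  for (const name of extracted) {
    console.log(`${nameAppears(essay, name) ? "found" : "MISSING"}: ${name}`);
  }
}
```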

    LLMs process text as 'tokens' (roughly 4 characters each in English). Understanding tokens helps you optimize for speed and cost:

    • Visualize tokenization at tiktokenizer.vercel.app
    • Count tokens programmatically with tiktoken: pnpm add tiktoken
    • Monitor usage to estimate costs and stay within context limits
    • Try pasting different prompts into Tiktokenizer to see surprising patterns (spaces matter!).
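The ~4-characters-per-token rule of thumb is enough for a quick back-of-the-envelope estimator (a rough sketch only; use a real tokenizer like tiktoken for exact counts, and note that per-token prices vary by model):

```typescript
// Rough token estimate using the ~4 characters/token heuristic for English text.
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Estimate input cost given a (hypothetical) price per million input tokens.
export function estimateCostUSD(text: string, pricePerMillionTokens: number): number {
  return (estimateTokens(text) / 1_000_000) * pricePerMillionTokens;
}

const essay = "x".repeat(8000); // stand-in for essay.txt (~8k characters)
console.log(estimateTokens(essay)); // → 2000
```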

    Iteration is Everything

    Running the script once is just the start. Working with LLMs is all about iteration. Play with the prompt and see for yourself:

    Challenge 1: Prompt Engineering – Change the Task

    • Task: Swap the prompt from name extraction to a summarization task, e.g. "Summarize this essay in a few sentences."
    • Action: Save and re-run pnpm extraction
    • Observe: See how one prompt change completely transforms what your app does

    Challenge 2: Model Swapping – Upgrade the Brain

    • Task: Keep the summary prompt, but change the model string passed to generateText (pick one from the lists below)
    • Action: Save and run again
    • Observe: Compare results. Better quality? Worth the extra cost/time?

    Available OpenAI Models via Vercel AI Gateway:

    • openai/gpt-5 - Most capable for complex reasoning
    • openai/gpt-4.1 - Fast & cost-effective for most tasks (non-reasoning)
    • openai/gpt-5-nano - Fastest for simple tasks
    • openai/gpt-4.1-mini - Smaller, cheaper variant of GPT-4.1 (non-reasoning)

    Available Anthropic Models via Vercel AI Gateway:

    • anthropic/claude-sonnet-4 - Strong reasoning & analysis

    Available Google Models via Vercel AI Gateway:

    • google/gemini-2.5-pro - Advanced multimodal capabilities
    • google/gemini-2.5-flash - Fast responses, good balance
    • google/gemini-2.5-flash-lite - Lightweight & quick
    • google/gemini-2.0-flash - Previous flash version

    See the Vercel AI Gateway models for pricing & details, or the OpenAI models documentation for OpenAI-specific info.

    Simply swap the model string to experiment; the AI SDK handles the provider differences for you!

    Real-World Applications

    This simple extraction pattern powers serious production features like:

    • Content Moderation: Finding problematic content
    • Research Tools: Pulling key data from papers
    • Data Pipelines: Converting messy text to clean data
    • Compliance Systems: Identifying PII/sensitive info

    It's the same pattern: send content + instructions, process the response.
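That pattern is easy to wrap in a reusable helper. A sketch in which `askModel` and `Complete` are hypothetical names, and `complete` stands in for any text-generation call (e.g. the AI SDK's generateText):

```typescript
// Any function that turns a prompt into generated text.
type Complete = (prompt: string) => Promise<string>;

// Combine instructions and content into a single prompt string.
export function buildTaskPrompt(instructions: string, content: string): string {
  return `${instructions}\n\nContent:\n${content}`;
}

// Hypothetical reusable wrapper: send content + instructions, return the text.
export async function askModel(
  complete: Complete,
  instructions: string,
  content: string,
): Promise<string> {
  return complete(buildTaskPrompt(instructions, content));
}

// Usage with a fake completer, just to show the shape:
const echo: Complete = async (p) => `(${p.length} chars received)`;
askModel(echo, "List any PII in this text.", "Call me at 555-0100.").then(console.log);
```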

    Key things to remember

    • generateText = your basic AI workhorse
    • The prompt = what guides the AI
    • The model = power/speed/cost tradeoff
    • Iteration = the key to success

    Troubleshooting Guide

    • API Key Errors (401): Check your .env.local file. Key spelled right? Pasted fully? Account has credits?
    • Rate Limiting (429): Hit usage limits. Wait a bit or upgrade your plan.
    • Module Errors: Run pnpm install again. Maybe clear node_modules first.
    • Timeouts: Larger models are slower; that's normal. Check your internet connection if requests fail consistently.
    • Command not found: Make sure pnpm is installed globally and run pnpm install in project root.

    Further Reading (Optional)

    • AI SDK Documentation: Official documentation for generateText, the core function used in this lesson. Explore all available parameters and options.
    • Tiktokenizer: Interactive tokenization visualizer built with Next.js. See exactly how your text breaks down into tokens across different models. (Open source on GitHub)
    • Prompt Engineering Guide: Explore advanced prompting techniques to further improve your AI interactions beyond the basics covered in this lesson.
    • Vercel AI Gateway Model Library: Understand the capabilities, strengths, cost, and trade-offs of different models to make informed choices for your applications.

    What's Next: Model Types and Performance

    You've built your first AI script and experienced the power of prompt engineering. In the next lesson, you'll learn about different model types and their performance characteristics. Understanding when to use fast models vs reasoning models is crucial for building AI features that deliver the right user experience.

    After that, you'll be ready for "invisible AI" - behind-the-scenes features that enhance your product's UX using the patterns you've learned here.


    © 2025 Hacklab
