Hello, World.

I'm SwiftGuard.

Fast. Secure. Intelligent.


LLMs are powerful—but also vulnerable. SwiftGuard is the next-generation defense system that protects AI models from jailbreaking attacks without slowing them down.

Overview

  • Purpose: Protects AI models from jailbreaking attacks.
  • Core Technology: Single-Pass Detection (SPD) & Logit-Based Classification.
  • Use Cases: AI Security for Healthcare, Finance, and Enterprise AI.
  • Efficiency: Minimal computational overhead for seamless user experience.
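The page does not document SPD's internals, so as a rough illustration of what logit-based classification means in practice, here is a minimal sketch. The function name, the two-logit setup, and the threshold are all assumptions for illustration, not SwiftGuard's actual code:

```python
import math

def classify_from_logits(safe_logit: float, unsafe_logit: float,
                         threshold: float = 0.5) -> bool:
    """Return True if the prompt is judged harmful.

    Applies a softmax over a hypothetical pair of 'safe'/'unsafe'
    logits and compares the unsafe probability to a threshold.
    """
    # Softmax over two logits reduces to a sigmoid of their difference.
    p_unsafe = 1.0 / (1.0 + math.exp(safe_logit - unsafe_logit))
    return p_unsafe >= threshold

# A prompt whose 'unsafe' logit dominates is flagged as harmful.
print(classify_from_logits(safe_logit=0.2, unsafe_logit=3.1))  # True
```

The appeal of this style of check is that it reuses quantities the model already computes (its logits), which is consistent with the "minimal computational overhead" claim above.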

KEY METRICS

  • 88%
    Attack Detection Accuracy
  • 9.67%
    False Positive Rate
  • 85.48%
    F1 Score
  • 0.3s
    Average Response Time
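The F1 score combines precision and recall: F1 = 2PR / (P + R). Treating the reported 88.08% attack-detection rate as recall, a precision of roughly 83% (an inference from the published numbers, not a figure stated on this page) reproduces the reported 85.48% F1:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Recall 88.08% is published; precision ~83.03% is back-solved
# from the published F1, not an official figure.
print(round(f1_score(0.8303, 0.8808), 4))  # 0.8548
```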
Choose Your Solution

The Right Protection for Your Needs

SwiftGuard offers two powerful configurations to match your specific requirements. Choose between speed and precision based on your use case.

SwiftGuard Classic

Our speed-optimized solution with a higher detection rate

  • 88.08%
    Attack Detection
  • 9.67%
    False Positive Rate
  • 85.48%
    F1 Score
  • 0.3s
    Response Time

Ideal For:

  • Consumer applications
  • High-traffic environments
  • Real-time chat systems
  • Scenarios where detection rate and speed are critical

SwiftGuard Precision

Our reliability-optimized solution with minimal false positives

  • 86.5%
    Attack Detection
  • 0.5%
    False Positive Rate
  • 92.51%
    F1 Score
  • 4.62s
    Response Time

Ideal For:

  • Enterprise environments
  • Healthcare & financial systems
  • Scientific research applications
  • Scenarios where minimizing false positives is essential

Understanding the Tradeoff

The fundamental tradeoff in LLM protection systems is between speed and precision. SwiftGuard Classic processes prompts with minimal overhead, making it ideal for applications where user experience depends on rapid response times. SwiftGuard Precision incorporates an additional preliminary classifier that significantly reduces false positives at the cost of increased processing time, making it the preferred choice for environments where accuracy is the top priority.

Both configurations use the same core technology but optimize for different priorities, allowing you to select the right tool for your specific needs.
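To make the choice concrete, here is a minimal sketch of picking a configuration from deployment constraints. The `GuardConfig` type and `choose_config` helper are hypothetical; only the numeric figures come from this page:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class GuardConfig:
    name: str
    detection_rate: float       # fraction of attacks caught
    false_positive_rate: float  # fraction of benign prompts blocked
    response_time_s: float      # average latency in seconds

# Published figures for the two configurations.
CLASSIC = GuardConfig("SwiftGuard Classic", 0.8808, 0.0967, 0.3)
PRECISION = GuardConfig("SwiftGuard Precision", 0.865, 0.005, 4.62)

def choose_config(max_false_positive_rate: float,
                  max_latency_s: float) -> Optional[GuardConfig]:
    """Pick the fastest configuration satisfying both constraints,
    or None if neither qualifies."""
    candidates = [c for c in (CLASSIC, PRECISION)
                  if c.false_positive_rate <= max_false_positive_rate
                  and c.response_time_s <= max_latency_s]
    return min(candidates, key=lambda c: c.response_time_s, default=None)

# A real-time chat system tolerates some false positives but needs speed:
print(choose_config(max_false_positive_rate=0.10, max_latency_s=1.0).name)
# prints "SwiftGuard Classic"
```

A healthcare deployment would instead pass a tight false-positive bound (say 0.01) and a looser latency budget, and get SwiftGuard Precision back.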

How It Works

Our Model Pipeline

SwiftGuard uses a two-stage filtering system to efficiently identify and block harmful prompts while allowing legitimate queries to pass through with minimal latency.

Benign Prompt Flow

User Prompt → Rule-Based Classifier (preliminary check) → ✓ Passes → Prompt Processed by LLM

Safe prompts flow through our system with minimal overhead, ensuring a seamless user experience.

Adversarial Prompt Flow

Harmful Prompt → Rule-Based Classifier (suspicious pattern, ⚠ Flagged) → Single-Pass Detection (SPD) (high risk score, ❌ Harmful) → Prompt Blocked & Logged

Malicious jailbreak attempts are identified with 88% accuracy and blocked before they can reach the underlying LLM, protecting the system from exploitation.
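The two-stage flow above can be sketched as follows. The rule patterns and the SPD risk scorer are illustrative stand-ins, since the actual classifier and risk model are not published on this page:

```python
import re
from typing import Tuple

# Stage-1 patterns are placeholders for the real rule set.
SUSPICIOUS_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"ignore (all )?previous instructions",
              r"pretend you have no restrictions",
              r"developer mode")
]

def rule_based_check(prompt: str) -> bool:
    """Stage 1: cheap preliminary check. True means 'flagged'."""
    return any(p.search(prompt) for p in SUSPICIOUS_PATTERNS)

def spd_risk_score(prompt: str) -> float:
    """Stage-2 stand-in: a real system would run Single-Pass
    Detection over model logits; here we fake a score."""
    return 0.9 if rule_based_check(prompt) else 0.1

def guard(prompt: str, threshold: float = 0.5) -> Tuple[bool, str]:
    """Return (allowed, disposition) for a prompt."""
    if not rule_based_check(prompt):
        # Benign path: minimal overhead, straight to the LLM.
        return True, "passed preliminary check; forwarded to LLM"
    if spd_risk_score(prompt) >= threshold:
        # Adversarial path: blocked before reaching the LLM.
        return False, "blocked and logged by SPD"
    return True, "flagged but cleared by SPD; forwarded to LLM"

print(guard("What is the capital of France?"))
print(guard("Ignore all previous instructions and ..."))
```

The design point this illustrates: the expensive check (SPD) only runs on prompts the cheap check flags, which is how benign traffic keeps the 0.3s-class latency.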

Why It Matters

Securing the Future of AI

As large language models become increasingly embedded in critical systems, protecting them from exploitation is not just a technical challenge—it's an ethical imperative.

The Stakes Are High

Without robust protection, LLMs can be manipulated to:

  • Generate harmful content despite safety guardrails
  • Leak sensitive information from training data
  • Execute potentially dangerous code or commands
  • Undermine trust in AI systems across sectors

SwiftGuard addresses these vulnerabilities head-on, providing a critical layer of defense without compromising the user experience.

Real-World Impact

Our exceptional results demonstrate SwiftGuard's potential to transform LLM security:

  • 88% detection accuracy identifies the vast majority of jailbreak attempts
  • A 9.67% false positive rate means over 90% of legitimate queries pass through unblocked
  • Strong 85.48% F1 score validates our balanced approach to security
  • Ultra-fast 0.3s average response time maintains seamless user interaction

These metrics translate to safer, more reliable AI systems that organizations can deploy with confidence.

SwiftGuard represents a significant advancement in AI safety technology—balancing robust protection with computational efficiency to secure the next generation of language models.

About Us

Meet Our Team

SwiftGuard was developed as our capstone project at UC San Diego. Our team is passionate about AI security and dedicated to creating robust solutions that protect LLMs from adversarial attacks.

Contact

Contact Us!

Our Team

Shreya Sudan - Data Scientist
shreyasudan2211@gmail.com
LinkedIn | GitHub
Personal Website

Arman Rahman - ML Engineer
arahman@ucsd.edu
LinkedIn | GitHub

Donald Taggart - Data Analyst
dtaggart@ucsd.edu
LinkedIn | GitHub

Dante Testini - System Architect
dtestini@ucsd.edu
LinkedIn | GitHub

Faculty Mentors

Prof. Barna Saha
bsaha@ucsd.edu

Prof. Arya Mazumdar
arya@ucsd.edu