TrustCart is a real-time fraud detection system for e-commerce listings. It combines a trained XGBoost classifier, statistical anomaly scoring, Groq LLM reasoning, and semantic duplicate detection to surface risky listings from Google Shopping and eBay — before you buy.
Online shopping fraud isn't a website problem — it's a listing problem. A legitimate site like eBay or Google Shopping can host thousands of fraudulent listings side by side with genuine ones. Existing tools don't solve this.
In July 2025, Mozilla shut down Fakespot — the closest tool to what I wanted. What remained were tools that either check whether a website is trustworthy (ScamAdviser, F-Secure) or analyse reviews for fakery (ReviewMeta) — neither of which tells you whether the specific iPhone listing you're about to click is a scam.
I wanted a tool that answers one question: is this particular listing safe to buy from? TrustCart is that tool.
| Tool | What it checks | Platform coverage | Still active? |
|---|---|---|---|
| Fakespot | Fake reviews | Amazon, eBay, Walmart | ❌ Shut down Jul 2025 |
| ReviewMeta | Fake reviews | Amazon only | ✅ |
| Camelizer | Price history / fake discounts | Amazon only | ✅ |
| Honey | Price comparison, coupons | Multi-platform | ✅ |
| ScamAdviser | Website reputation | Site-level only | ✅ |
| F-Secure | Website safety | Site-level only | ✅ |
| Counterfake | Counterfeit sellers | Enterprise SaaS | ✅ |
Every buyer-facing tool on this list has at least one of these fundamental limitations:
- Site-level, not listing-level — ScamAdviser tells you eBay.com is safe. It says nothing about the seller charging $49 for an "iPhone 15 Pro" in the listings.
- Single platform — ReviewMeta and Camelizer only cover Amazon, where most complaints actually originate, but leave eBay and Google Shopping completely unchecked.
- Reviews only — A listing can have zero reviews and still be fraudulent. Review-based tools are blind to brand-new scam listings.
- No ML risk scoring — Price history trackers and coupon finders are savings tools, not fraud detectors. None of them run a trained classifier against listing features.
Among the tools compared above, TrustCart is the only consumer-facing one that operates at the individual listing level with a full ML pipeline, rather than relying on review sentiment, site reputation, or price history alone.
Three signal types, combined:
| Signal | How TrustCart uses it | What others do |
|---|---|---|
| Statistical | Price percentile rank, outlier scoring, seller trust weights | Camelizer does price history (Amazon only) |
| ML Classification | XGBoost on 17 features — seller rating, sales volume, price percentile, condition, platform | Nobody else applies a trained classifier to individual listings |
| LLM Reasoning | Groq/LLaMA generates plain-English red flags and a buy recommendation | Fakespot had NLP for reviews; no tool explains listing-level fraud in plain English |
What no competitor does at all:
- Cross-platform duplicate detection — TF-IDF cosine similarity flags when the same listing appears across Google Shopping and eBay at different prices, exposing arbitrage scams and price manipulation.
- Calibrated trust scoring — The LLM verdict gates the safety score. An AVOID recommendation hard-caps the score at 20%, preventing the model from contradicting itself.
- Multi-platform scraping in one query — One search returns ranked, risk-scored results from both Google Shopping and eBay simultaneously.
The gap Fakespot's shutdown left is real. TrustCart fills it — not for review analysis, but for the harder problem of per-listing fraud risk at the point of search.
A search query triggers a four-stage pipeline:
- Scraping — Listings fetched live from Google Shopping and eBay via SerpAPI
- Statistical Scoring — Percentile-based price analysis, seller trust signals, and a weighted 4-factor rule model
- XGBoost Classification — 17-feature gradient-boosted classifier assigns a fraud probability to each listing
- Groq LLM Explanation — Plain-English fraud reasoning, red flags, and a buy recommendation for the top risky items
Results are deduplicated using TF-IDF cosine similarity to surface cross-platform price comparisons.
Benchmarked on 714 real scraped listings (Google Shopping + eBay), labeled via Groq LLM and validated against rule-based scores.
| Metric | XGBoost | Rule-Based | Delta (pp) |
|---|---|---|---|
| F1 Score | 91.6% | 84.5% | +7.1 |
| Recall | 98.0% | 77.3% | +20.7 |
| Precision | 86.0% | 93.2% | −7.2 |
| Accuracy | 88.8% | 82.4% | +6.4 |
| AUC (ROC) | 92.4% | 93.6% | −1.2 |
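High recall (98%) is the priority: catching fraudulent listings matters more than the occasional false positive, so the lower precision of XGBoost compared with the rule-based model (86.0% vs 93.2%) is an accepted trade-off.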
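As a quick sanity check on the table above, F1 is the harmonic mean of precision and recall, so the reported F1 values can be recomputed from the other two rows. The snippet below is a small illustrative sketch; the function name `f1_score` is ours and is not taken from the TrustCart codebase.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the benchmark table.
xgboost_f1 = f1_score(precision=0.860, recall=0.980)
rule_based_f1 = f1_score(precision=0.932, recall=0.773)

print(f"XGBoost F1:    {xgboost_f1:.1%}")     # 91.6%
print(f"Rule-based F1: {rule_based_f1:.1%}")  # 84.5%
```

Both recomputed values agree with the F1 row of the table.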
| Category | F1 | Category | F1 |
|---|---|---|---|
| Used Cars | 100% | Gaming Laptop | 95.2% |
| Luxury Watch | 100% | Headphones | 96.6% |
| Luxury Handbag | 100% | PS5 / Console | 96.6% |
| iPhone | 91.6% | Books | 85.6% |
| Hair Dryer | 80.0% | Furniture | 69.6% |
| Rank | Feature | Importance |
|---|---|---|
| 1 | Seller Rating | 60.0% |
| 2 | Quantity Sold (log) | 14.8% |
| 3 | Product Rating | 10.9% |
| 4 | Price Percentile | 3.2% |
| 5 | Review Count (log) | 2.8% |
| 6 | Log Price | 2.4% |
| 7 | Condition (New) | 1.9% |
| 8 | Seller Feedback % | 1.8% |
| 9 | Platform (eBay) | 1.1% |
User Query → FastAPI Backend, which connects three parts:

- SerpAPI Scraping
  - Google Shopping
  - eBay
- Fraud Detection Pipeline
  1. Statistical Scoring: Price 50%, Seller 25%, Attributes 15%, History 10%
  2. XGBoost Classifier (17 features): seller_rating, quantity_sold, price_percentile, rating, reviews, seller_feedback_pct, platform, condition, dynamic_trust flags
  3. Groq LLM Explanation: llama-3.1-8b-instant · Structured JSON · Cached
  4. Duplicate Detection: TF-IDF cosine similarity (0.82), cross-platform pair matching
- Frontend
  - Tailwind CSS / Vanilla JS
  - Glass-morphism UI
  - Real-time animated stages
| Layer | Technology |
|---|---|
| Backend | FastAPI 0.128 (Python 3.11+) |
| ML Model | XGBoost 1.7+, scikit-learn |
| LLM | Groq API — LLaMA 3.1-8B Instant |
| Duplicate Detection | TF-IDF cosine similarity |
| Data Collection | SerpAPI (Google Shopping + eBay) |
| Frontend | HTML5, Tailwind CSS, Vanilla JavaScript |
| Deployment | Railway (CI/CD from GitHub) |
| Component | Weight | Signals |
|---|---|---|
| Price Analysis | 50% | Percentile rank, outlier removal (>10× median), trusted seller discount |
| Seller Reputation | 25% | Item rating, review count, seller rating, eBay feedback %, dynamic trust |
| Product Attributes | 15% | Condition, title length and quality |
| Historical Patterns | 10% | Platform risk, category-specific baselines |
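To make the price-analysis row concrete, here is a minimal sketch of a percentile rank with outlier removal. It assumes that "outlier removal (>10× median)" means dropping comparable prices above ten times the median before ranking; the function name and the fallback value for an empty list are illustrative assumptions, not the project's actual code.

```python
from statistics import median


def price_percentile(price: float, all_prices: list[float]) -> float:
    """Return the share (0 to 1) of comparable listings priced below `price`.

    Prices above 10x the median are dropped first so that a few extreme
    listings do not distort the ranking. Assumes non-negative prices.
    """
    if not all_prices:
        return 0.5  # assumption: no comparables means a neutral price signal
    cutoff = 10 * median(all_prices)
    kept = [p for p in all_prices if p <= cutoff]
    below = sum(1 for p in kept if p < price)
    return below / len(kept)


prices = [899, 949, 999, 1049, 1099, 25000]  # 25000 exceeds 10x the median
print(price_percentile(49, prices))    # 0.0: cheaper than every comparable
print(price_percentile(1000, prices))  # 0.6: three of five kept prices are lower
```

A listing that lands at or near percentile 0 for its category is the kind of price signal the 50% weight is meant to capture.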
Dynamic trusted seller logic: eBay sellers with ≥1,000 ratings and ≥98% positive feedback are automatically trusted — no hardcoding needed.
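The rule above fits in a single predicate. This is a sketch under the stated thresholds; the function and parameter names are hypothetical.

```python
def is_trusted_ebay_seller(rating_count: int, positive_feedback_pct: float) -> bool:
    """True when a seller meets both thresholds: >=1,000 ratings and >=98% positive."""
    return rating_count >= 1_000 and positive_feedback_pct >= 98.0


print(is_trusted_ebay_seller(5_400, 99.2))  # True
print(is_trusted_ebay_seller(120, 100.0))   # False: too few ratings
```

Because the check depends only on live seller statistics, no allow-list of seller names has to be maintained.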
Risk thresholds: LOW (< 0.25) · MEDIUM (0.25–0.55) · HIGH (≥ 0.55)
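The component weights and risk thresholds combine roughly as in the sketch below. It assumes each component produces a risk value between 0 and 1 and that the final score is a plain weighted sum; the dictionary keys and function names are illustrative, not the project's identifiers.

```python
# Weights from the component table: price, seller, attributes, history.
WEIGHTS = {"price": 0.50, "seller": 0.25, "attributes": 0.15, "history": 0.10}


def combined_risk(component_risks: dict[str, float]) -> float:
    """Weighted sum of per-component risk values, each expected in [0, 1]."""
    return sum(WEIGHTS[name] * component_risks[name] for name in WEIGHTS)


def risk_level(score: float) -> str:
    """Map a combined score to LOW (<0.25), MEDIUM (0.25 to 0.55), or HIGH (>=0.55)."""
    if score < 0.25:
        return "LOW"
    if score < 0.55:
        return "MEDIUM"
    return "HIGH"


example = {"price": 0.9, "seller": 0.6, "attributes": 0.2, "history": 0.1}
score = combined_risk(example)
print(f"{score:.2f} -> {risk_level(score)}")  # 0.64 -> HIGH
```

With price carrying half the weight, a listing far below its category's typical price can reach HIGH even when its other signals are moderate.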
Trained on 10,000 synthetic listings (35% fraud / 45% legitimate / 20% edge cases) and validated on 714 real scraped listings labeled via Groq LLM.
17 input features across price, seller trust, platform, condition, and sales volume. quantity_sold emerged as the #2 most important feature (14.8%) — high-volume listings are a strong legitimacy signal.
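As an illustration of how a listing might be turned into model input, the sketch below encodes the nine features named in the importance table. The dictionary keys, default values, and the use of `log1p` for the log-scaled features are assumptions; the remaining eight features are not listed in this README and are left out.

```python
import math


def build_features(listing: dict) -> list[float]:
    """Encode one listing as a numeric vector (9 of the 17 features)."""
    return [
        float(listing.get("seller_rating", 0.0)),            # seller rating
        math.log1p(listing.get("quantity_sold", 0)),         # quantity sold (log)
        float(listing.get("rating", 0.0)),                   # product rating
        float(listing.get("price_percentile", 0.5)),         # price percentile
        math.log1p(listing.get("reviews", 0)),               # review count (log)
        math.log1p(listing.get("price", 0.0)),               # log price
        1.0 if listing.get("condition") == "new" else 0.0,   # condition (new)
        float(listing.get("seller_feedback_pct", 0.0)),      # seller feedback %
        1.0 if listing.get("platform") == "ebay" else 0.0,   # platform (eBay)
    ]


example = {
    "seller_rating": 4.8, "quantity_sold": 1200, "rating": 4.6,
    "price_percentile": 0.4, "reviews": 310, "price": 799.0,
    "condition": "new", "seller_feedback_pct": 99.1, "platform": "ebay",
}
features = build_features(example)
print(len(features))  # 9
```

Log-scaling the count features keeps a seller with 50,000 sales from dwarfing one with 500 while preserving their order.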
Generates structured fraud analysis for the top 5 risky items per search: scam probability, specific red flags, plain-English reasoning, and a buy recommendation. Trust score is capped by the LLM output — an AVOID verdict limits the safety score to 20%, preventing contradictory results.
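The capping rule can be sketched as follows. It assumes the trust score is a fraction between 0 and 1 and that the LLM returns its recommendation as a string; `"BUY"` is a hypothetical label used for contrast, since only `AVOID` is named here.

```python
def calibrated_trust_score(model_trust: float, recommendation: str) -> float:
    """Cap the trust score at 0.20 when the LLM recommendation is AVOID."""
    if recommendation.strip().upper() == "AVOID":
        return min(model_trust, 0.20)
    return model_trust


print(calibrated_trust_score(0.85, "AVOID"))  # 0.2: capped by the verdict
print(calibrated_trust_score(0.10, "AVOID"))  # 0.1: already below the cap
print(calibrated_trust_score(0.85, "BUY"))    # 0.85: left unchanged
```

Using `min` rather than overwriting the score means an already low score is never raised by the cap.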
TF-IDF vectorization with cosine similarity threshold of 0.82 and Union-Find clustering. Flags when the same item appears across multiple sellers or platforms so users can compare before buying.
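The sketch below shows the general technique with the standard library only: a simplified TF-IDF (smoothed IDF, L2-normalised), pairwise cosine similarity against the 0.82 threshold, and Union-Find to merge matching pairs into clusters. It is not scikit-learn's `TfidfVectorizer` and not the project's code; the tokeniser and all function names are assumptions.

```python
import math
import re
from collections import Counter


def tfidf_vectors(titles: list[str]) -> list[dict[str, float]]:
    """Return one L2-normalised sparse TF-IDF vector (term -> weight) per title."""
    docs = [Counter(re.findall(r"[a-z0-9]+", t.lower())) for t in titles]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)  # document frequency
    vectors = []
    for doc in docs:
        weights = {t: tf * (math.log((1 + n) / (1 + df[t])) + 1)
                   for t, tf in doc.items()}
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Dot product of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def cluster_duplicates(titles: list[str], threshold: float = 0.82) -> list[list[int]]:
    """Group title indices whose similarity chains meet the threshold."""
    vectors = tfidf_vectors(titles)
    parent = list(range(len(titles)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(titles)):
        for j in range(i + 1, len(titles)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)  # union the two clusters

    groups: dict[int, list[int]] = {}
    for i in range(len(titles)):
        groups.setdefault(find(i), []).append(i)
    return [members for members in groups.values() if len(members) > 1]


titles = [
    "Apple iPhone 15 Pro 128GB",   # e.g. from Google Shopping
    "apple iphone 15 pro 128gb",   # e.g. from eBay
    "Sony WH-1000XM5 headphones",
]
print(cluster_duplicates(titles))  # [[0, 1]]
```

Union-Find makes the grouping transitive: if A matches B and B matches C, all three end up in one cluster even when A and C fall just under the threshold. The pairwise loop is quadratic in the number of listings, which is acceptable for a single search result page but would need blocking or approximate search at larger scale.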
- Python 3.11+
- libomp (macOS only): `brew install libomp`
- SerpAPI key · Groq API key
git clone https://github.com/Msundara19/Trustcart.git
cd Trustcart
brew install libomp # macOS only
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env # Add SERPAPI_KEY and GROQ_API_KEY
uvicorn main:app --reload # → http://localhost:8000

- Multi-platform scraping: Google Shopping + eBay
- Statistical fraud scoring: weighted 4-factor model
- XGBoost classifier: 17 features, 91.6% F1 / 98.0% recall
- Groq LLM explanations with calibrated risk thresholds
- Semantic duplicate detection: cross-platform pairing
- Prediction logging for continuous dataset growth
- Production deployment on Railway (CI/CD from GitHub)
- Browser extension (Chrome / Firefox)
- Historical price tracking
- Amazon + AliExpress integration
- Image-based counterfeit detection
- No user data collected — stateless API, no tracking or storage
- API keys stored in environment variables, never committed
- HTTPS enforced in production (Railway)
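One way to read API keys from the environment and fail fast when one is missing is sketched below. The variable names `SERPAPI_KEY` and `GROQ_API_KEY` come from the setup steps; the helper itself is a hypothetical example, not the project's loader.

```python
import os

REQUIRED_KEYS = ("SERPAPI_KEY", "GROQ_API_KEY")


def load_api_keys(env=None) -> dict[str, str]:
    """Return the required keys from `env` (defaults to os.environ) or raise."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_KEYS}
```

Calling `load_api_keys()` once at startup surfaces a missing key immediately instead of on the first search request.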
Meenakshi Sridharan
📧 msridharansundaram@hawk.illinoistech.edu
| Tool | Role |
|---|---|
| Groq | Ultra-fast LLM inference (LPU hardware) |
| SerpAPI | Google Shopping + eBay scraping |
| XGBoost | Gradient boosting classifier |
| FastAPI | Async Python web framework |
| Railway | Zero-config cloud deployment |