A Rust library to shield your system from malicious and unwanted websites by categorizing and blocking them.
Add spider_firewall to your Cargo project with:
cargo add spider_firewall

The small tier is enabled by default. Enable medium or large for broader coverage; each tier includes all sources from the tier(s) below it.
| Tier | FST Size | Focus | Feature Flag |
|---|---|---|---|
| small (default) | ~13 MB | Ads, tracking, malware, phishing, scams, adult/porn | small |
| medium | ~26 MB | + ransomware, fraud, abuse, threat intel, extended phishing | medium |
| large | ~52 MB | + redirect/typosquatting, extended ads/tracking, full URLhaus | large |
# Default — small tier, all categories:
spider_firewall = "2.35"
# Medium tier:
spider_firewall = { version = "2.35", features = ["medium"] }
# Large tier:
spider_firewall = { version = "2.35", features = ["large"] }
# Small tier, only bad + ads (no tracking/gambling):
spider_firewall = { version = "2.35", default-features = false, features = ["default-tls", "bad", "ads", "small"] }

Categories can be toggled independently (all enabled by default, except the opt-in ip feature):
| Feature | Description |
|---|---|
| bad | Malware, phishing, scams, fraud, ransomware, abuse |
| ads | Advertising domains |
| tracking | Tracking and analytics domains |
| gambling | Gambling domains |
| ip | Known-bad IPv4 network ranges (Spamhaus DROP); opt-in, see IP blocking |
You can check if a website is part of the bad websites list using the is_bad_website_url function.
use spider_firewall::is_bad_website_url;
fn main() {
let u = url::Url::parse("https://badwebsite.com").expect("parse");
let blocked = is_bad_website_url(u.host_str().unwrap_or_default());
println!("Is blocked: {}", blocked);
}

You can add your own websites to the block list using the define_firewall! macro. This allows you to categorize new websites under a predefined or new category.
use spider_firewall::{define_firewall, is_bad_website_url};
// Add "bad.com" to a custom category.
define_firewall!("unknown", "bad.com");
fn main() {
let u = url::Url::parse("https://bad.com").expect("parse");
let blocked = is_bad_website_url(u.host_str().unwrap_or_default());
println!("Is blocked: {}", blocked);
}

You can specify websites to be blocked under specific categories such as "ads".
use spider_firewall::{define_firewall, is_ad_website_url};
// Add "ads.com" to the ads category.
define_firewall!("ads", "ads.com");
fn main() {
let u = url::Url::parse("https://ads.com").expect("parse");
let blocked = is_ad_website_url(u.host_str().unwrap_or_default());
println!("Is blocked: {}", blocked);
}

Enable the opt-in ip feature to also block known-bad IPv4 network ranges. The ranges are
embedded at build time from the Spamhaus DROP list and matched
via longest-prefix (binary) search. IPv6 currently always returns false.
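
To illustrate the kind of lookup described above, here is a minimal, standard-library-only sketch of matching an IPv4 address against a sorted table of CIDR ranges with a binary search. It is not the crate's implementation: the helper names (`parse_cidr`, `build_table`, `contains`) are hypothetical, the sample ranges are RFC 5737 documentation networks rather than real DROP entries, and the sketch assumes the ranges do not overlap, in which case a single binary search gives the same answer as a longest-prefix match.

```rust
use std::net::Ipv4Addr;

/// Inclusive IPv4 range as (first, last), both as host-order u32 values.
type Range = (u32, u32);

/// Parse "a.b.c.d/len" into an inclusive range. Returns None on malformed input.
/// (Hypothetical helper for this sketch, not part of spider_firewall.)
fn parse_cidr(cidr: &str) -> Option<Range> {
    let (addr, len) = cidr.split_once('/')?;
    let addr: Ipv4Addr = addr.parse().ok()?;
    let len: u32 = len.parse().ok()?;
    if len > 32 {
        return None;
    }
    // A /0 prefix is special-cased because shifting a u32 by 32 bits overflows.
    let mask = if len == 0 { 0 } else { u32::MAX << (32 - len) };
    let first = u32::from(addr) & mask;
    Some((first, first | !mask))
}

/// Build a lookup table sorted by range start, skipping entries that fail to parse.
fn build_table(cidrs: &[&str]) -> Vec<Range> {
    let mut table: Vec<Range> = cidrs.iter().filter_map(|c| parse_cidr(c)).collect();
    table.sort_unstable();
    table
}

/// Binary-search the table. Assumes the ranges do not overlap.
fn contains(table: &[Range], ip: Ipv4Addr) -> bool {
    let ip = u32::from(ip);
    // Index of the first range whose start is greater than `ip`;
    // the only candidate that can contain `ip` is the one just before it.
    let idx = table.partition_point(|&(first, _)| first <= ip);
    idx > 0 && ip <= table[idx - 1].1
}

fn main() {
    // Stand-in data: documentation-reserved networks, not real block-list entries.
    let table = build_table(&["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/25"]);
    println!("{}", contains(&table, Ipv4Addr::new(198, 51, 100, 7))); // true
    println!("{}", contains(&table, Ipv4Addr::new(8, 8, 8, 8))); // false
}
```

Storing each range as a pair of integers keeps the table compact and makes each lookup O(log n) in the number of ranges, which suits data embedded at build time.
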
spider_firewall = { version = "2.35", features = ["ip"] }

use spider_firewall::{is_bad_ip, is_bad_ip_str};
fn main() {
assert!(!is_bad_ip("8.8.8.8".parse().unwrap())); // legitimate hosts are not blocked
let blocked = is_bad_ip_str("1.2.3.4"); // convenience: parses the string
println!("Is blocked: {}", blocked);
}

The feed is rate-limited (~1 download/day) and revocable, so it is fetched non-fatally at build
time: a failed or rate-limited fetch yields zero ranges rather than breaking the build. In either case the build emits a
cargo:warning reporting the embedded range count (or that IP blocking is inactive).
For production builds where IP blocking must not be silently disabled by a rate-limited fetch, set
SPIDER_FIREWALL_IP_STRICT=1. The build then fails loudly if the DROP fetch returns zero ranges
instead of shipping with IP blocking inactive. Retry once the ~1/day limit resets.
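
The decision described above (warn by default, fail when strict mode is set) could look roughly like the following sketch. This is an assumption about the shape of the logic, not the crate's actual build script: `Outcome` and `check_ranges` are hypothetical names, and the range count is hard-coded to simulate a failed fetch. Only the SPIDER_FIREWALL_IP_STRICT variable and Cargo's `cargo:warning=` directive come from the text and from Cargo itself.

```rust
/// Result of the build-time check: a warning to print, or a fatal error.
/// (Hypothetical type for this sketch.)
#[derive(Debug, PartialEq)]
enum Outcome {
    Warn(String),
    Fail(String),
}

/// Decide what the build should do given how many ranges were embedded.
fn check_ranges(range_count: usize, strict: bool) -> Outcome {
    match (range_count, strict) {
        // Strict mode: refuse to ship with IP blocking inactive.
        (0, true) => Outcome::Fail(
            "DROP fetch returned zero ranges and SPIDER_FIREWALL_IP_STRICT=1 is set".to_string(),
        ),
        // Default mode: keep building, but say that IP blocking is inactive.
        (0, false) => Outcome::Warn("IP blocking is inactive (zero ranges embedded)".to_string()),
        // Normal case: report the embedded range count.
        (n, _) => Outcome::Warn(format!("embedded {} IP ranges", n)),
    }
}

fn main() {
    // Strict mode is on only when the variable is exactly "1".
    let strict = std::env::var("SPIDER_FIREWALL_IP_STRICT")
        .map(|v| v == "1")
        .unwrap_or(false);

    // Simulated result of a failed or rate-limited fetch.
    let range_count = 0;

    match check_ranges(range_count, strict) {
        // Cargo surfaces lines of this form from a build script as warnings.
        Outcome::Warn(msg) => println!("cargo:warning={}", msg),
        Outcome::Fail(msg) => {
            eprintln!("{}", msg);
            std::process::exit(1);
        }
    }
}
```

Setting SPIDER_FIREWALL_IP_STRICT=1 in CI or release pipelines and leaving it unset for local development keeps everyday builds resilient while preventing a release from shipping without IP ranges.
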
Attribution: IP range data is provided by The Spamhaus Project under the Spamhaus DROP terms (free for any use, attribution required). © The Spamhaus Project.
Sources in the small tier (default):

| Source | Categories | License |
|---|---|---|
| ShadowWhisperer/BlockLists | bad, ads, tracking, gambling | MIT |
| badmojr/1Hosts Lite | ads, tracking | MPL-2.0 |
| spider-rs/bad_websites | bad | MIT |
| Steven Black Unified Hosts | bad | MIT |
| Block List Project — Malware | bad | MIT |
| Block List Project — Phishing | bad | MIT |
| Block List Project — Scam | bad | MIT |
| URLhaus Filter (domains) | bad | CC0/MIT |
| Steven Black Hosts — Porn | bad (adult/porn) | MIT |
| malware-filter — Phishing | bad (phishing) | CC0/MIT |

Additional sources in the medium tier:

| Source | Categories | License |
|---|---|---|
| Block List Project — Ransomware | bad | MIT |
| Block List Project — Fraud | bad | MIT |
| Block List Project — Abuse | bad | MIT |
| Phishing.Database — Active Domains | bad | MIT |
| Stamparm/maltrail — Suspicious | bad | MIT |
| phishdestroy/destroylist — Primary Active | bad (phishing/scam) | MIT |

Additional sources in the large tier:

| Source | Categories | License |
|---|---|---|
| Block List Project — Redirect | bad | MIT |
| Block List Project — Tracking | tracking | MIT |
| Block List Project — Ads | ads | MIT |
| Stamparm/maltrail — Malware | bad | MIT |
| abuse.ch URLhaus Hostfile | bad | CC0 |

The initial build can take approximately 5-10 minutes, because it may need to compile dependencies and generate the required data files.
Contributions and improvements are welcome. Feel free to open issues or submit pull requests on the GitHub repository.
This project is licensed under the MIT License.