Pi-hole: A Raspberry Pi Ad-Blocker with DNS Caching (Ultra-fast)

This version of the Pi-hole is outdated


Inspired by the AdTrap, I wanted to make a low-cost, roll-your-own alternative that would neutralize ads before they reach your device.  The Raspberry Pi fit this need.  The Pi’s Ethernet port is limited to 100BASE-T (though there are alternatives), which isn’t ideal, but it still ran very fast for me.

The Raspberry Pi runs as a DNS server and redirects queries for advertisements to a local Web server, which will display a 1×1 transparent image instead of the ad.

How It Works

Block online ads by using your Raspberry Pi to manipulate advertising URLs.

[Figure: Network map of the Raspberry Pi ad-blocking Pi-hole]

Another benefit of this is that your other network devices can use it instead of a browser plugin ad-blocker.  This way, your computer doesn’t have to do the processing of blocking the ads, which frees up resources for other applications.

Requirements For This Walkthrough

Materials

  1. Local network
  2. Mac or PC
  3. Raspberry Pi running Raspbian “wheezy” and a lighttpd Web server
  4. HDMI Cable (*optional)
  5. Keyboard (*optional)
  6. Mouse (*optional)
  7. Monitor with HDMI input (*optional)
  *If the Raspberry Pi is set up as a headless machine, you will not need a monitor, keyboard, or mouse, just another computer to access it remotely over the network via SSH.  This is the recommended method.

Downloads

I have copies of all the files on GitHub, but downloading them is not required: the walkthrough shows how to make each one.

  1. Pi-hole

Resources

Automated Setup

Instructions are available.  They install a newer version of the Pi-hole, which differs from what is described in this article.

Alternative Methods

In my research, I found a few different ways to do this.  I wanted a way that would be the least disruptive to normal network activity while still providing a cool feature.  Below are some of these methods:

  • Pixelserv (too slow)
  • Apache (too heavy-duty for the Pi)
  • Access point with DNS, DHCP, and Apache (inconvenient, and bloated)

I decided to go with the following for this setup:

  • lighttpd Web server (lighter weight than Apache–better suited for the Pi)
  • dnsmasq  running only DNS (lighter-weight and faster because caching will be enabled)
  • serving requests over eth0  and not wlan0  (faster and more reliable.  It can also be hooked up directly to a router)

Conceptual Overview

  1. Set up a lighttpd Web server to which ads will be redirected
  2. Install the DNS server software and utilities
  3. Configure DNS
  4. Create a script to redirect ad servers/URLs back to the Pi (and not your device)
  5. Point your device’s DNS server at the Pi-hole [Critical Step]
  6. Test it out!

Pre-requisites

Install and enable a lighttpd Web server on the Raspberry Pi.

Prepare the Pi

To ensure everything is up-to-date,  run:

sudo apt-get -y update
sudo apt-get -y upgrade

Install DNS

sudo apt-get -y install dnsutils dnsmasq

This will install some DNS troubleshooting utilities like dig, as well as dnsmasq (the DNS server).

Stop DNS

To make certain it is not running while we modify the files, manually stop the service.

sudo service dnsmasq stop

Edit the DNS Config File

To get a quick view of what options are enabled by default for the dnsmasq  service, run this command:

grep -Ev '^[[:space:]]*(#|$)' /etc/dnsmasq.conf

This will show all the options that are enabled by parsing out all the comments and blank lines (if it returns nothing, then no options are enabled, which might be the case if this is a brand new install).

We will be making our own file, but this is a nice way to see what is enabled if you already have dnsmasq running.  The command above can also be applied to other config files that contain a lot of comments.
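
As a small sketch, the same kind of filter can be wrapped in a shell function so it is easy to reuse on other config files.  The function name active_options is made up for this example; it simply drops lines that are blank or that start with a # comment.

```shell
# active_options FILE: print only the lines of FILE that are not blank
# and do not start with a comment character (#).
active_options() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example (the path is the default dnsmasq config location):
# active_options /etc/dnsmasq.conf
```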

Since we are going to make our own DNS config file, rename the default one so it is available as a backup in case something goes wrong:

sudo mv /etc/dnsmasq.conf /etc/dnsmasq.conf.orig

Then, create a new file, which will have only the options we want:

sudo vi /etc/dnsmasq.conf

Paste the following into the new /etc/dnsmasq.conf file, adjusting any values if necessary:

domain-needed
interface=eth0
min-port=4096
cache-size=10000
log-queries
bogus-priv
# Uncomment the next line and comment out the remaining lines if you want to use /etc/resolv.conf
#strict-order
no-resolv
server=127.0.0.1
server=8.8.8.8
server=8.8.4.4

The options above are as follows:

  • domain-needed: never forward names without a dot or domain part
  • interface: listen only on the given interface (eth0, the Ethernet port)
  • min-port: source ports used for outbound queries will never be lower than the value specified.  This is useful for systems behind firewalls
  • cache-size: number of names to cache, set as large as possible (10,000) to allow for super-fast repeat queries
  • log-queries: logs all name resolutions to /var/log/daemon.log
  • bogus-priv: never forward reverse-lookup queries for private IP ranges to upstream servers
  • no-resolv: do not read upstream servers from /etc/resolv.conf (see below; this method is a bit easier than using a separate file)
  • server: upstream nameservers to use
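
Before starting the service again, it can help to let dnsmasq check the new file for syntax errors.  The sketch below relies on dnsmasq’s --test and --conf-file options; the wrapper function check_dnsmasq_conf and its return codes are my own convention, not something dnsmasq provides.

```shell
# check_dnsmasq_conf FILE: syntax-check a dnsmasq config file without
# starting the service. Returns non-zero if the file cannot be read,
# dnsmasq is not installed, or the syntax check fails.
check_dnsmasq_conf() {
    conf="$1"
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi
    if ! command -v dnsmasq >/dev/null 2>&1; then
        echo "dnsmasq is not installed" >&2
        return 2
    fi
    dnsmasq --test --conf-file="$conf"
}

# Example:
# check_dnsmasq_conf /etc/dnsmasq.conf
```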

Edit the Resolver File

This file sets what servers to use to try to resolve domain names.  dnsmasq  checks this file by default unless you use the no-resolv  option.  Since we are setting up our own DNS server, we can set ourselves as the first one–this will decrease query times.  Edit /etc/resolv.conf:

sudo vi /etc/resolv.conf

There can only be three entries in this file for nameservers, so just add yourself (127.0.0.1) and Google’s public DNS servers (for resolving non-ad queries):

nameserver 127.0.0.1
nameserver 8.8.8.8
nameserver 8.8.4.4

This will allow all queries to go through the Raspberry Pi first, and if it cannot resolve the name, it will use Google’s DNS servers.  You may not need this file at all: if it has no entries, the machine will just use its default nameserver.
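
A quick way to confirm the file still looks the way you left it (for example, after a reboot) is to count its nameserver entries and check which one comes first.  This is only a sketch; the function name check_resolv and the message format are invented for this example.

```shell
# check_resolv FILE: report the nameserver entries in a resolv.conf-style
# file. Fails if there are none, more than three, or if the first entry
# is not 127.0.0.1.
check_resolv() {
    awk '
        $1 == "nameserver" { n++; if (n == 1) first = $2 }
        END {
            printf "%d nameserver(s), first is %s\n", n, (n ? first : "none")
            exit !(n >= 1 && n <= 3 && first == "127.0.0.1")
        }
    ' "$1"
}

# Example:
# check_resolv /etc/resolv.conf
```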

I have found that this file can get overwritten at reboot, which causes the Pi-hole to stop working.  If you do want to use the /etc/resolv.conf  file, use the following solution from Jeff:

  1. Uncomment and modify the line in /etc/dhcp/dhclient.conf to make it read: prepend domain-name-servers 127.0.0.1;
  2. Add this additional line in /etc/dnsmasq.conf: strict-order

Open the Black Hole

Next, create a script that will pull known ad URLs from a Website and save them in a file.  This file will tell DNS to re-route any queries to those domains to the IP specified, which for this setup, will be the Raspberry Pi (and not the other devices on the network).  The script is appropriately named gravity.sh, as it pulls in ad URLs.  A trimmed-down version is below, or you can check out the advanced version, which pulls URLs from multiple sources.

Make sure to modify the value for the $piholeIP  variable to be the IP address of your Raspberry Pi (192.168.1.101 in my example):

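sudo vi /usr/local/bin/gravity.sh

Add the following content:

#!/bin/bash
# Address to send ads to (your Raspberry Pi's IP)
piholeIP="192.168.1.101"
# File that dnsmasq loads the block rules from
eventHorizon="/etc/dnsmasq.d/adList.conf"
# Download the ad-server list, sort it, drop blank lines, strip carriage
# returns, and write one "address=/domain/IP" rule per line
curl -s -d mimetype=plaintext -d hostformat=unixhosts 'http://pgl.yoyo.org/adservers/serverlist.php?' | sort | sed '/^$/d' | awk -v "IP=$piholeIP" '{sub(/\r$/,""); print "address=/"$0"/"IP}' > "$eventHorizon"
# Restart dnsmasq so it loads the new rules
service dnsmasq restart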

Make the script executable and then manually run it once to verify functionality:

sudo chmod 755 /usr/local/bin/gravity.sh
sudo /usr/local/bin/gravity.sh

If it completed successfully, there will be a new file named /etc/dnsmasq.d/adList.conf.  If you look at this file, you will see a long list of ad servers, each paired with the IP address of the Raspberry Pi (192.168.1.101 in this example, or whatever you set it to).

address=/101com.com/192.168.1.101
address=/101order.com/192.168.1.101
address=/123found.com/192.168.1.101
address=/180hits.de/192.168.1.101
address=/180searchassistant.com/192.168.1.101
address=/1x1rank.com/192.168.1.101
.....
..... 
.....
address=/zeus.developershed.com/192.168.1.101
address=/zeusclicks.com/192.168.1.101
address=/zintext.com/192.168.1.101
address=/zmedia.com/192.168.1.101
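
To see what the awk step does without touching the network or any system file, the transformation can be tried on a few lines of input.  This is a sketch that assumes one domain per line; the function name domains_to_rules is hypothetical, and unlike the script above it also removes duplicate entries.

```shell
# domains_to_rules IP: read one domain per line on stdin and print a
# dnsmasq "address=/domain/IP" rule for each, stripping carriage returns,
# skipping blank lines, and removing duplicates.
domains_to_rules() {
    awk -v ip="$1" '{ sub(/\r$/, "") } NF { print "address=/" $1 "/" ip }' | sort -u
}

# Example with two domains from the listing above (one repeated, one
# with a Windows line ending, plus a blank line):
printf '101com.com\r\n\nzmedia.com\n101com.com\n' | domains_to_rules 192.168.1.101
```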

Now What?

So how does this block ads?  Any file in the /etc/dnsmasq.d  directory is considered a config file and loaded up when dnsmasq  starts.  The syntax of the example line below says “if the address queried is 101com.com (an ad server) send it to this IP address (the Pi–192.168.1.101 or whatever you set it to).”

address=/101com.com/192.168.1.101

This file is full of these rules, which are loaded when dnsmasq  is.  If a DNS query matches one of the rules, it will send the request to 192.168.1.101, which is the Raspberry Pi.  Since the request gets redirected there, it won’t reach your other device(s).
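
One detail worth knowing is that a dnsmasq address rule applies to the named domain and to every host beneath it, so a rule for 101com.com also catches ads.101com.com.  The function below is only a rough model of that matching, useful for checking whether a name would be covered by your list; match_rule is a made-up helper and does not query dnsmasq itself.

```shell
# match_rule RULES_FILE NAME: print the IP that an "address=/domain/IP"
# rule in RULES_FILE would return for NAME, trying NAME itself and then
# each parent domain. Returns non-zero if no rule applies.
match_rule() {
    rules="$1"
    name="$2"
    while :; do
        if awk -F/ -v d="$name" '
            $1 == "address=" && $2 == d { print $3; hit = 1; exit }
            END { exit !hit }
        ' "$rules"; then
            return 0
        fi
        case "$name" in
            *.*) name="${name#*.}" ;;   # drop the leftmost label and retry
            *)   return 1 ;;
        esac
    done
}

# Example:
# match_rule /etc/dnsmasq.d/adList.conf ads.101com.com
```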

DNS is set up to accept queries and re-route ad URLs to an IP address.  The service should have been started when you ran the script.

Create A Recurring Task to Update the Ad List

In case the Website that compiles the list of ad servers updates it, set gravity.sh to run once a week using crontab.  This is completely optional, and may be a bit of overkill, but here it is anyway:

sudo crontab -e

Append the following line to the bottom of the file and save it:

@weekly /usr/local/bin/gravity.sh

Now the ad list will be refreshed weekly and pick up any changes made by the Website it is pulled from.
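
One thing an unattended run might trip over is a failed download: because the script redirects straight into adList.conf, an empty response would leave an empty list.  A possible safeguard, sketched here with a hypothetical helper named replace_if_nonempty, is to download to a temporary file first and swap it in only if it contains data.

```shell
# replace_if_nonempty NEW CURRENT: move NEW over CURRENT only when NEW
# contains data; otherwise delete NEW, keep CURRENT, and report failure.
replace_if_nonempty() {
    if [ -s "$1" ]; then
        mv "$1" "$2"
    else
        echo "new list is empty, keeping $2" >&2
        rm -f "$1"
        return 1
    fi
}

# Example (the temporary path is an assumption for illustration):
# replace_if_nonempty /tmp/adList.new /etc/dnsmasq.d/adList.conf && service dnsmasq restart
```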

Verify DNS Functionality

Set Clients To Use The Pi As Their DNS Server

From another device on the network, change your DNS server to point to the Raspberry Pi.  On the Mac, you can change this in the Advanced Network Preferences.

Then run the dig  command from that device.  Any ad servers on the list should point to the Raspberry Pi instead of the real address.

Run the dig  command on a Website you have never visited (just to ensure the response is not already cached).  In the example below, I dig a site that I have not been to and is not an ad server.

dig theupabcs.com

Take note of two things in the screen shot or the results below.

  1. the Website’s real address is shown
  2. the response time is 64ms.

[Screenshot: dig output for the first, uncached query]

Now, take a look at the results when I run the same command again.  Thanks to the caching we enabled, the response time is now 6ms; much better!  This happens because we set the option cache-size=10000 in the /etc/dnsmasq.conf file.  This cache is stored in RAM and will be flushed whenever the service is restarted.

[Screenshot: dig output for the repeated, cached query]

Next, I dig a site that is an ad server (and in the adList.conf file), and the response gets answered by the Raspberry Pi, but the IP that gets returned is not the real IP address of the Website.  Instead, it is the IP address of the Raspberry Pi.

dig a-ads.com

[Screenshot: dig output for an ad server, answered with the Raspberry Pi’s IP]
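
The same check can be scripted so you do not have to read the full dig output each time.  This sketch uses dig’s +short option and the @server syntax; the function name is_blocked is made up, and the IP address is the example address used throughout this walkthrough.

```shell
# is_blocked DOMAIN PI_IP: ask the DNS server at PI_IP for DOMAIN and
# succeed only if the answer is PI_IP itself, i.e. the query was
# redirected to the Pi.
is_blocked() {
    answer=$(dig +short "$1" "@$2" | tail -n 1)
    [ "$answer" = "$2" ]
}

# Example:
# is_blocked a-ads.com 192.168.1.101 && echo "blocked"
```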

Create A Webpage That Will Pose As the Ad URLs Website

Now that DNS is working and rerouting ad URLs to the Raspberry Pi, we can set up a blank Webpage that will be served (instead of responding with an error message).  After some more testing, I determined the response time is about the same whether you run the Web server or not, so you can decide whether or not to employ it.  However, if you want to replace the ad content with something else, you will want to have the server set up.  Plus, it seems better to serve something instead of continually erroring out.

First, set up a lighttpd Web server.  The service basically just needs to be turned on and one line added to the config file.

Once your Web server is up and running, create a new directory for the page and download the image:

sudo mkdir -p /var/www/pihole
sudo curl -o /var/www/pihole/pihole.png https://dl.dropboxusercontent.com/u/16366947/Photos/pihole.png

Then, edit the index file so that it contains the 1×1 image:

sudo vi /var/www/pihole/index.html

And add the following content:

<html><body><img src="pihole/pihole.png"></body></html>

Now you have a Webpage that serves up a tiny, transparent image, which will take place of the ads.

Edit the lighttpd Config File

Add the following to the end of /etc/lighttpd/lighttpd.conf :

$HTTP["host"] =~ ".*" {
     url.rewrite = (".*" => "pihole/index.html")
}

This will rewrite all URLs to point to the blank page instead of a page that doesn’t exist.  You can view the entire file here.

Alternatively…

Just send all ads directly to the transparent image file by editing /etc/lighttpd/lighttpd.conf to read:

$HTTP["host"] =~ ".*" {
     url.rewrite = (".*" => "pihole/pihole.png")
}

Restart Services and Test

If you made a lot of changes while editing the files, restart everything:

sudo service dnsmasq restart
sudo service lighttpd restart

Now, use the curl  command to try to download an advertising domain:

curl -I doubleclick.com

The command above will download just the header information from the site (the -I option).  The name will resolve to the Raspberry Pi (192.168.1.101) and not the real domain.  The output of the command should be similar to the following:

HTTP/1.1 200 OK
Content-Type: text/html
Accept-Ranges: bytes
ETag: "3978160023"
Last-Modified: Wed, 11 Jun 2014 01:10:23 GMT
Date: Wed, 11 Jun 2014 01:52:40 GMT
Server: lighttpd/1.4.31

The main line you should be interested in is the HTTP/1.1 200 OK , which is a message saying that the content was delivered OK.  You can also take note of the server, which is the lighttpd  server that you set up.

If you decided not to use the Web server, the result of your curl command would look like this (captured here for a different ad domain, ads.hulu.com):

curl: (7) Failed to connect to ads.hulu.com port 80: Connection refused

Serving A Blank Page vs. Blank Image vs. Not Using the lighttpd Server

You can easily determine which is the fastest method by using the time command, which records how long a command takes to run.  The one-liner below runs curl against the URL ten times and reports how long each attempt took.

for i in {1..10}; do time curl -sI doubleclick.net >/dev/null; done 2>&1 | awk '/real/ {print $2}'

With the Web server turned off, you get results like this:

0m0.247s
0m0.212s
0m0.197s
0m0.198s
0m0.204s
0m0.206s
0m0.204s
0m0.217s
0m0.232s
0m0.198s

Then, with it turned on (serving up a blank page instead of nothing):

0m0.234s
0m0.192s
0m0.193s
0m0.195s
0m0.200s
0m0.205s
0m0.205s
0m0.204s
0m0.205s
0m0.205s

So the processing time is pretty negligible either way you go.  But I still opt for the error-free method.
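
To compare the two runs with a single number each, the timings can be averaged.  This is a small sketch; the function name average_real is made up, and it assumes input in the NmS.SSSs form that bash’s time prints.

```shell
# average_real: read times such as 0m0.247s on stdin (one per line)
# and print their mean in seconds.
average_real() {
    awk '
        NF {
            split($1, t, "m")        # t[1] = minutes, t[2] = seconds plus "s"
            sub(/s$/, "", t[2])
            total += t[1] * 60 + t[2]
            n++
        }
        END { if (n) printf "%.4f\n", total / n }
    '
}

# Example: pipe the timing one-liner from above into it
# for i in {1..10}; do time curl -sI doubleclick.net >/dev/null; done 2>&1 | awk '/real/ {print $2}' | average_real
```

Fed the two result sets above, this works out to about 0.21 seconds with the Web server off and about 0.20 seconds with it on.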

Also, some people have reported some sites display empty boxes where the ads used to be, while other sites remove the ad space completely.  See examples of this below.

[Screenshots: pi-hole-example1 and pi-hole-example2]

I think the transparent 1×1 image works the best for most situations, but you can try it out and see how things work for you.  It’s likely that the Pi-hole won’t be able to block every ad perfectly, but it does work for the most part.  The Internet is organic and advertisers will always find a way to inject ads, but this solution seems to work!

Success!

Now your device should be blocking ads from the list as long as it has the Raspberry Pi set as its DNS server.  You can test it here or here.

Advanced Improvements

Watching the Log File

When we edited the dnsmasq.conf  file, we added the debug option (log-queries) in there.  This is very useful for finding out what URLs ads might be coming from.  You can simply watch the log file as you navigate to a site on your computer.  In the example below, I had my Apple TV using the Raspberry Pi as its DNS server.  This allowed me to watch what URLs Hulu was using when the ads appeared on the TV.

grep -i hulu /var/log/daemon.log

Using grep, I was able to parse out just the entries relating to hulu.

ads-v-darwin.hulu.com
ads-v-darwin.hulu.com.c.footprint.net
appletv.app.hulu.com
assets.huluim.com
c.p.hulu.com
general.hulu.com.c.footprint.net
httpls-1.hulu.com
httpls-e.hulu.com
ib.huluim.com
play.hulu.com
play.hulu.com.akadns.net
play.hulu.com.akadns.net
pt.hulu.com
s.hulu.com
secure.hulu.com
star.app.hulu.com.akadns.net
t.hulu.com
track.hulu.com
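
To boil the raw log lines down to a list of unique names like the one above, something along these lines could be used.  It is a sketch that assumes dnsmasq’s query lines have the form "query[A] name from client-IP"; the function name queried_names is hypothetical.

```shell
# queried_names PATTERN [LOGFILE]: list the unique domain names that
# appear in dnsmasq "query[...]" log lines and contain PATTERN
# (case-insensitive). LOGFILE defaults to /var/log/daemon.log.
queried_names() {
    awk -v pat="$1" '
        BEGIN { pat = tolower(pat) }
        {
            for (i = 1; i < NF; i++)
                if ($i ~ /^query\[/ && index(tolower($(i + 1)), pat))
                    print $(i + 1)
        }
    ' "${2:-/var/log/daemon.log}" | sort -u
}

# Example:
# queried_names hulu
```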

If you just want to watch the log file in real time, use this command.  It is kind of fun to see what pops up when you navigate to a site.

tail -f /var/log/daemon.log

Once you find a URL you want to block, you can just append it to your adList.conf  file.
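
Appending a rule by hand might look like the sketch below.  Keep in mind that gravity.sh rewrites adList.conf every time it runs, so manual additions there are lost on the next update; keeping them in a separate file in /etc/dnsmasq.d is one way around that.  The function name add_block and the file name custom.conf are assumptions made for this example.

```shell
# add_block DOMAIN IP FILE: append an "address=/DOMAIN/IP" rule to FILE
# unless an identical rule is already present.
add_block() {
    rule="address=/$1/$2"
    if grep -qxF "$rule" "$3" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$rule" >> "$3"
}

# Example (run as root, then restart dnsmasq to load the new rule):
# add_block ads-v-darwin.hulu.com 192.168.1.101 /etc/dnsmasq.d/custom.conf
# service dnsmasq restart
```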

Blocking Even More Ads

The advanced setup gets ad URLs from different locations and compiles them into one place for even more ad-blocking power.  I am also close to skipping Hulu plus video ads, but seem to be stuck.

Open Source

The Pi-hole is completely open source and free.  Any feedback, suggestions, or improvements are welcomed.

https://github.com/jacobsalmela/pi-hole