We want to place sslip.io on the Public Suffix List so we don't need to
pester Let's Encrypt for rate limit increases.
According to https://publicsuffix.org/submit/:
> owners of privately-registered domains who themselves issue subdomains
to mutually-untrusting parties may wish to be added to the PRIVATE
section of the list.
References:
- https://publicsuffix.org/
- https://github.com/publicsuffix/list/pull/2206
[Fixes #57]
Previously when the NS records were returned, ns-aws was always returned
first. Coincidentally, 64% of the queries were directed to ns-aws. And
once I exceeded AWS's 10 TB bandwidth limit, AWS began gouging me for
bandwidth charges, and $12.66/month rapidly climbed to $62.30.
I'm hoping that by randomly rotating the order of nameservers, the
traffic will balance across the nameservers.
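A minimal sketch of the rotation idea, assuming the nameservers live in a
plain slice of strings; `rotateNameservers` is a hypothetical helper, not
necessarily how the server does it:

```go
package main

import (
	"fmt"
	"math/rand"
)

// rotateNameservers returns a copy of ns rotated left by offset positions,
// so that a different nameserver leads the list for each offset.
// (Hypothetical helper for illustration.)
func rotateNameservers(ns []string, offset int) []string {
	n := len(ns)
	if n == 0 {
		return nil
	}
	offset = ((offset % n) + n) % n // normalize, tolerating negative offsets
	rotated := make([]string, 0, n)
	rotated = append(rotated, ns[offset:]...)
	return append(rotated, ns[:offset]...)
}

func main() {
	ns := []string{
		"ns-aws.sslip.io.",
		"ns-azure.sslip.io.",
		"ns-gce.sslip.io.",
		"ns-ovh.sslip.io.",
	}
	// A fresh random offset per response means each nameserver leads
	// the list about a quarter of the time.
	fmt.Println(rotateNameservers(ns, rand.Intn(len(ns))))
}
```

Rotating (rather than fully shuffling) keeps the relative order intact
and only changes which nameserver comes first.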
Current snapshot (already ns-ovh is helping):
ns-aws.sslip.io
"Queries: 237744377 (1800.6/s)"
"Answered Queries: 63040894 (477.5/s)"
ns-azure.sslip.io
"Queries: 42610823 (323.4/s)"
"Answered Queries: 14660603 (111.3/s)"
ns-gce.sslip.io
"Queries: 59734371 (454.1/s)"
"Answered Queries: 17636444 (134.1/s)"
ns-ovh.sslip.io
"Queries: 135897332 (1034.4/s)"
"Answered Queries: 36010164 (274.1/s)"
- located in Warsaw, Poland
- IPv4: 51.75.53.19
- IPv6: 2001:41d0:602:2313::1
The crux of this is to take the load off ns-aws, which jumped from
$12.66 → $20.63 → $38.51 → $62.30 in the last four months due to
bandwidth charges exceeding 10 TB.
The real fix is to randomize the order in which the nameservers are
returned.
I'm no longer engaged on setting up k-v.io; I thought it'd be cool to
have a DNS-backed etcd implementation, but now I don't care anymore.
There were technical challenges, too: Specifically, updating values did
not play well with DNS caching — you'd get the old value after updating.
If the service became popular, I'd quickly run out of disk space on my
tiny cloud VMs.
The service would most likely be used by people doing data exfiltration
via DNS. I already have enough problems with sslip.io scammers — the
last thing I want is to sign up for dealing with k-v.io scammers.
This commit removes the etcd configuration, certificates, and pipelines.
I had big plans for feeding in the configuration of the DNS server with
a JSON file, but since then I've come to consider command-line flags
good enough, so there's no reason to leave this useless file lingering —
it'll only serve to confuse.
Meant for obtaining wildcard certs from Let's Encrypt using the DNS-01
challenge.
- introduce a variant of `blocklist.txt` to be used for testing
(`blocklist-test.txt`) because the blocklist has grown so large it
clutters the test output
- more rigorous about lowercasing hostnames when matching against
customized records. This needs to be extended when we parse _any_
arguments
TODOs:
- remove the wildcard DNS servers
- update instructions
The sslip.io service has been abused by scammers and phishers who create
sites that masquerade as legitimate sites. For example,
<https://nf-43-134-66-67.sslip.io/sg> masqueraded as Netflix.
To combat this, we've undertaken to block all sites that masquerade as
legitimate sites, but this had the unfortunate consequence of ensnaring
a legitimate staging site (th-ab.de).
This commit updates the documentation to warn developers not to index
their staging sites.
[#53]
When we promoted the Golang code to the root of the repo, we neglected
to update the paths in the documentation, helper scripts, and pipelines.
This commit addresses that oversight by updating the paths.
The last BOSH Release was cut over two years ago (Feb 26, 2022), and I
don't think we're ever gonna cut another one, so I'm clearing out the
BOSH-related files.
I deployed to BOSH until I decided k8s was the way to go, and then later
decided to deploy to standalone VMs.
- That's where the code is expected to be
- The only reason the code was buried two directories down was because
it was originally a BOSH release
- There hasn't been a BOSH release in over two years; last one was Feb
26, 2022
- Other than a slight adjustment to the relative location of
`blocklist.txt` file in the integration tests, there were no other
changes
Companies who run their own sslip.io DNS nameservers may want to restrict
the resolution of public IPs to prevent bad actors from impersonating
them. For example, the corporation Pivotal, which owns the domain
pivotal.io, may want to set `-public=false` when they delegate the
domain `xip.pivotal.io` to their internal instances of sslip.io
nameservers, which enables their developers to use their internal IPs
(e.g. 10-9-9-31.xip.pivotal.io) while preventing a bad actor from using
a public IP (e.g. 52-0-56-137.xip.pivotal.io) to trick users.
- `-public` defaults to `true`
- `-public=true` enables resolution of all hostnames with embedded IP
addresses
- `-public=false` restricts resolution to hostnames with private IP
addresses
The following ranges are not considered public and always resolve:
- 10/8, 172.16/12, 192.168/16 — RFC 1918
- fc/7 — RFC 4193
- 100.64/10 — CG-NAT
- 127/8, ::1 — loopback
- 169.254/16 — IPv4 link local
- fe80/10 — IPv6 link local
- 64:ff9b:1/48 — IPv4/IPv6 translation private internet
- 2001:20/28 — ORCHIDv2
- 2001:db8/32 — Documentation
- This IPv6 address is "ephemeral" in the sense that if a `terraform
destroy` and `terraform apply` are run I'll get a different address
(not `2600:1900:4000:4d12::`)
- I don't plan on updating the WHOIS information because the address is
somewhat ephemeral
Previously the GCP NS was a k8s container, but now it's a standalone VM
(for, believe it or not, cost reasons: it was cheaper to assign a static
IP to a VM than to a load balancer).
The instructions now include the procedure to update the GCP VM.
Also, we used to check twice that all servers had the same version
number; now we check only once, and we fold that check into another
step, so there are two fewer steps to follow.
I don't need this k8s configuration for sslip.io (DNS, NTP) because I'm
no longer hosting on GKE. The cluster now has an ephemeral IP instead of
a reserved IP; keeping the reserved IP would have cost me $360 extra per
year for a premium-tier load balancer.
The GKE cluster's IP address is now an ephemeral IP because otherwise
I'd have to pay $360 extra per year for a premium-tier load balancer.
I don't want my website to point to an ephemeral address that quickly
becomes stale, so I'm pointing from what previously was the GKE
cluster's address to the AWS NS server's address.
My integration tests would randomly fail 5% of the time (based on a sample size
of 59 tests), and the reason was that my algorithm for choosing a random
port was flawed. I was very proud of that algorithm, so accepting that
it was flawed was a bitter pill.
One of the problems was that it had unnecessarily limited the range of
available ports to pick from to 1,000. This change expands that
selection to 64,511.
I changed to a less-clever-but-more-reliable algorithm, and the result
is a stunning order-of-magnitude increase in reliability: a 0.5% failure
rate (based on 210 tests).
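The replacement algorithm isn't spelled out here. One plausible shape, as
a sketch only: pick uniformly at random from the unprivileged ports. The
lower bound of 1024 and the name `randomPort` are assumptions; the count
of 64,511 is the figure quoted above.

```go
package main

import (
	"fmt"
	"math/rand"
)

const (
	minPort   = 1024  // assumption: first unprivileged port
	portCount = 64511 // size of the selection quoted above
)

// randomPort returns a uniformly random port in
// [minPort, minPort+portCount). (Hypothetical helper.)
func randomPort() int {
	return minPort + rand.Intn(portCount)
}

func main() {
	fmt.Println("test server port:", randomPort())
}
```

With a pool this large, two parallel test processes rarely pick the same
port, whereas a pool of 1,000 makes collisions far more likely.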
Fixes, when running `ginkgo -r -p -until-it-fails .`
```
Got stuck at:
...
2023/11/19 21:52:15 I couldn't bind via UDP to any IPs on port 1941, so I'm exiting
Waiting for:
Ready to answer queries
```
I'd assert that the server had exited with a 1 (error condition) when it
couldn't bind via UDP to any addresses; however, I wrote the expectation
wrong, and sometimes the server hadn't exited by the time I made the
assertion, resulting in an exit code of -1 (not yet exited) instead of
1.
Switching to the async assertion `Eventually()`, and from `ExitCode()` to
`Exit()`, fixes that problem.
Fixes, during `ginkgo -r -p`:
```
[FAILED] Expected
<int>: -1
to equal
<int>: 1
In [It] at: /home/cunnie/workspace/sslip.io/src/sslip.io-dns-server/integration_test.go:400 @ 10/02/23 03:58:15.824
```
- If it can't bind to all addresses via TCP, log the ones it could &
couldn't bind to & keep running
- If it can't bind to any address via TCP, keep running (unlike UDP
which must fail)
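A rough sketch of that TCP policy, assuming a list of "host:port"
strings; `bindTCP` is a hypothetical helper and the log wording is
illustrative only:

```go
package main

import (
	"log"
	"net"
)

// bindTCP tries to listen on every address and returns the listeners it
// obtained plus the addresses it could not bind to. It never exits:
// the caller keeps running even if every bind failed.
func bindTCP(addrs []string) (listeners []net.Listener, failed []string) {
	for _, addr := range addrs {
		l, err := net.Listen("tcp", addr)
		if err != nil {
			failed = append(failed, addr)
			continue
		}
		listeners = append(listeners, l)
	}
	return listeners, failed
}

func main() {
	// Port 99999 is out of range, which forces a failure for demonstration.
	listeners, failed := bindTCP([]string{"127.0.0.1:0", "127.0.0.1:99999"})
	for _, l := range listeners {
		defer l.Close()
		log.Printf("bound via TCP to %s", l.Addr())
	}
	for _, addr := range failed {
		log.Printf("couldn't bind via TCP to %s", addr)
	}
	log.Print("still running")
}
```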
The big challenge in writing these tests is that the binding behavior is
different for macOS (Ventura 13.6 (22G120)) than for Linux.
Specifically, to "squat" on an address, macOS must listen on ALL TCP
addresses (INADDR_ANY) plus the specific address. Linux only needs to
listen to the specific address.
I have no idea what the behavior on Windows is.
I also removed listenPort as a top-level variable; it didn't need to be
top level.
I was printing out the throughput (queries/second) in the middle of the
ginkgo tests, and it was unseemly and didn't belong.
I changed the test to make sure that the throughput was > 1,000 queries
per second. No unnecessary output.
I've wanted sslip.io to bind to both UDP & TCP, mostly because TCP is
more secure (at least with regard to DNS cache poisoning).
In general, the process to receive a packet, whether TCP or UDP, is
similar.
- UDP uses `net.UDPConn`, TCP uses `net.TCPListener`
- Once bound, UDP uses `ReadFromUDP()` to get the data; TCP first
requires an `AcceptTCP()` followed by a `Read()`
- Technically you can ask several queries over a single TCP socket, but
I close the connection after the first query.
- DNS TCP packet has a two-byte length field that has no counterpart in
the DNS UDP packet.
- The TCP integration tests are lacking.
The integration test which worked fine on my dual-stack laptop failed on
my IPv4-only Concourse.
Fixes, when running `ginkgo -r -p .` on an IPv4-only machine:
```
sslip.io-dns-server When it can't bind to a port on loopback [BeforeEach] prints an informative message and continues
[BeforeEach] /tmp/build/b4e0c68a/sslip.io/src/sslip.io-dns-server/integration_test.go:399
[It] /tmp/build/b4e0c68a/sslip.io/src/sslip.io-dns-server/integration_test.go:409
[FAILED] Unexpected error:
<*net.OpError | 0xc000310d20>:
listen udp [::1]:1918: socket: address family not supported by protocol
```