The torrent of traffic I'm receiving has caused my AWS bill to spike
from $9 to $148, all of the increase due to bandwidth charges.
I'm still maintaining ns-aws; the VM continues to run, continues to
serve web traffic, and keeps its hostname and IP addresses;
however, it will no longer be in the list of NS records for sslip.io.
There are much less expensive hosting providers. OVH is my current
favorite.
Rather than bloating the code with yet another flag, one that only I
would use, and in only one specific case (ns-aws.sslip.io), it would be
better to simply take ns-aws.sslip.io out of the NS list.
AWS is gouging me on bandwidth costs. Last month's bill was $148, and
all but $9 of it was for bandwidth.
My bandwidth has been inexplicably climbing since February:
Billing:

Month      Total GB   % increase
2024/2       37.119
2024/3       52.953       42.66%
2024/4       58.745       10.94%
2024/5       69.307       17.98%
2024/6      173.371      150.15%
2024/7      334.064       92.69%
2024/8      539.343       61.45%
2024/9      568.745        5.45%
2024/10    1365.305      140.06%
The new flag will allow me to throttle the AWS bandwidth to ~287
queries/second which, according to my calculations, will max out the
free 100 GB of bandwidth without dipping into the for-pay tier.
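As a sanity check on that figure, here's a minimal sketch of the
arithmetic in Go. The 100 GB free tier is from this commit; the
~134-byte average reply size is my assumption, back-derived so the
numbers meet:

```go
package main

import "fmt"

func main() {
	// Assumption: an average DNS reply is ~134 bytes on the wire; the
	// real figure depends on the query mix.
	const freeBytesPerMonth = 100e9 // AWS free tier: 100 GB of egress/month
	const avgReplyBytes = 134.0
	const secondsPerMonth = 30 * 24 * 60 * 60

	maxQPS := freeBytesPerMonth / avgReplyBytes / secondsPerMonth
	fmt.Printf("max sustainable rate: ~%.0f queries/second\n", maxQPS) // ~288
}
```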
We want to place sslip.io on the Public Suffix List so we don't need to
pester Let's Encrypt for rate limit increases.
According to https://publicsuffix.org/submit/:
> owners of privately-registered domains who themselves issue subdomains
> to mutually-untrusting parties may wish to be added to the PRIVATE
> section of the list.
References:
- https://publicsuffix.org/
- https://github.com/publicsuffix/list/pull/2206
[Fixes #57]
Previously, when the NS records were returned, ns-aws was always listed
first. Coincidentally, 64% of the queries were directed to ns-aws. And
once I exceeded AWS's 10 TB bandwidth limit, AWS began gouging me for
bandwidth charges: $12.66/month rapidly climbed to $62.30.
I'm hoping that by randomly rotating the order of the nameservers, the
traffic will balance across them.
Current snapshot (ns-ovh is already helping):
ns-aws.sslip.io
  "Queries: 237744377 (1800.6/s)"
  "Answered Queries: 63040894 (477.5/s)"
ns-azure.sslip.io
  "Queries: 42610823 (323.4/s)"
  "Answered Queries: 14660603 (111.3/s)"
ns-gce.sslip.io
  "Queries: 59734371 (454.1/s)"
  "Answered Queries: 17636444 (134.1/s)"
ns-ovh.sslip.io
  "Queries: 135897332 (1034.4/s)"
  "Answered Queries: 36010164 (274.1/s)"
- located in Warsaw, Poland
- IPv4: 51.75.53.19
- IPv6: 2001:41d0:602:2313::1
The crux of this is to take the load off ns-aws, whose bill jumped from
$12.66 → $20.63 → $38.51 → $62.30 over the last four months due to
bandwidth usage exceeding 10 TB.
The real fix is to randomize the order in which the nameservers are
returned.
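A minimal sketch of what that randomization could look like in Go; the
slice of nameserver names is illustrative, not sslip.io's actual data
structure:

```go
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	nameservers := []string{
		"ns-aws.sslip.io.",
		"ns-azure.sslip.io.",
		"ns-gce.sslip.io.",
		"ns-ovh.sslip.io.",
	}
	// Shuffle before building each reply so that no single server is
	// consistently listed first (many resolvers favor the first entry).
	rand.Shuffle(len(nameservers), func(i, j int) {
		nameservers[i], nameservers[j] = nameservers[j], nameservers[i]
	})
	fmt.Println(nameservers)
}
```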
I'm no longer working on setting up k-v.io; I thought it'd be cool to
have a DNS-backed etcd implementation, but I've since lost interest.
There were technical challenges, too: Specifically, updating values did
not play well with DNS caching — you'd get the old value after updating.
If the service became popular, I'd quickly run out of disk space on my
tiny cloud VMs.
The service would most likely be used by people doing data exfiltration
via DNS. I already have enough problems with sslip.io scammers — the
last thing I want is to sign up for dealing with k-v.io scammers.
This commit removes the etcd configuration, certificates, and pipelines.
I had big plans for feeding in the configuration of the DNS server with
a JSON file, but since then I've come to consider command-line flags
good enough, so there's no reason to leave this useless file lingering —
it'll only serve to confuse.
Meant for obtaining wildcard certs from Let's Encrypt using the DNS-01
challenge.
- introduce a variant of `blocklist.txt` to be used for testing
(`blocklist-test.txt`) because the blocklist has grown so large it
clutters the test output
- be more rigorous about lowercasing hostnames when matching against
  customized records (see the sketch after this list); this needs to be
  extended when we parse _any_ arguments
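Here's a minimal sketch of that lowercasing step, assuming the
customized records are keyed by lowercase FQDN; the map contents and
function name are illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative customized records, keyed by lowercase FQDN.
var customizedRecords = map[string]string{
	"ns-aws.sslip.io.": "52.0.56.137",
}

// lookupCustomized normalizes the queried name before matching, because
// DNS names are case-insensitive (RFC 4343).
func lookupCustomized(fqdn string) (string, bool) {
	ip, ok := customizedRecords[strings.ToLower(fqdn)]
	return ip, ok
}

func main() {
	fmt.Println(lookupCustomized("NS-AWS.SSLIP.IO.")) // 52.0.56.137 true
}
```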
TODOs:
- remove the wildcard DNS servers
- update instructions
The sslip.io service has been abused by scammers and phishers who create
sites that masquerade as legitimate sites. For example,
<https://nf-43-134-66-67.sslip.io/sg> masqueraded as Netflix.
To combat this, we've undertaken to block all sites that masquerade as
legitimate sites, but this had the unfortunate consequence of ensnaring
a legitimate staging site (th-ab.de).
This commit updates the documentation to warn developers not to index
their staging sites.
[#53]
When we promoted the Golang code to the root of the repo, we neglected
to update the paths in the documentation, helper scripts, and pipelines.
This commit addresses that oversight by updating the paths.
The last BOSH Release was cut over two years ago (Feb 26, 2022), and I
don't think we're ever gonna cut another one, so I'm clearing out the
BOSH-related files.
I deployed with BOSH until I decided k8s was the way to go, and then
later decided to deploy to standalone VMs.
- That's where the code is expected to be
- The only reason the code was buried two directories down was because
it was originally a BOSH release
- There hasn't been a BOSH release in over two years; last one was Feb
26, 2022
- Other than a slight adjustment to the relative location of the
  `blocklist.txt` file in the integration tests, there were no changes
Companies that run their own sslip.io DNS nameservers may want to restrict
the resolution of public IPs to prevent bad actors from impersonating
them. For example, the corporation Pivotal, which owns the domain
pivotal.io, may want to set `-public=false` when they delegate the
domain `xip.pivotal.io` to their internal instances of sslip.io
nameservers, which enables their developers to use their internal IPs
(e.g. 10-9-9-31.xip.pivotal.io) while preventing a bad actor from using
a public IP (e.g. 52-0-56-137.xip.pivotal.io) to trick users.
- `-public` defaults to `true`
- `-public=true` enables resolution of all hostnames with embedded IP
addresses
- `-public=false` restricts resolution to hostnames with private IP
addresses
The following ranges are not considered public and always resolve:
- 10/8, 172.16/12, 192.168/16 — RFC 1918
- fc/7 — RFC 4193
- 100.64/10 — CG-NAT
- 127/8, ::1 — loopback
- 169.254/16 — IPv4 link local
- fe80/10 — IPv6 link local
- 64:ff9b:1/48 — IPv4/IPv6 translation private internet
- 2001:20/28 — ORCHIDv2
- 2001:db8/32 — Documentation
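A minimal sketch of how such a check might work, assuming the ranges
above are held as `netip` prefixes; the names and structure are
illustrative, not sslip.io's actual implementation:

```go
package main

import (
	"fmt"
	"net/netip"
)

// nonPublic mirrors the ranges listed above.
var nonPublic = []netip.Prefix{
	netip.MustParsePrefix("10.0.0.0/8"),     // RFC 1918
	netip.MustParsePrefix("172.16.0.0/12"),  // RFC 1918
	netip.MustParsePrefix("192.168.0.0/16"), // RFC 1918
	netip.MustParsePrefix("fc00::/7"),       // RFC 4193
	netip.MustParsePrefix("100.64.0.0/10"),  // CG-NAT
	netip.MustParsePrefix("127.0.0.0/8"),    // loopback
	netip.MustParsePrefix("::1/128"),        // loopback
	netip.MustParsePrefix("169.254.0.0/16"), // IPv4 link local
	netip.MustParsePrefix("fe80::/10"),      // IPv6 link local
	netip.MustParsePrefix("64:ff9b:1::/48"), // IPv4/IPv6 translation
	netip.MustParsePrefix("2001:20::/28"),   // ORCHIDv2
	netip.MustParsePrefix("2001:db8::/32"),  // documentation
}

// resolvable reports whether addr may be answered when -public=false.
func resolvable(addr netip.Addr) bool {
	for _, p := range nonPublic {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(resolvable(netip.MustParseAddr("10.9.9.31")))   // true
	fmt.Println(resolvable(netip.MustParseAddr("52.0.56.137"))) // false
}
```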
- This IPv6 address is "ephemeral" in the sense that if a `terraform
destroy` and `terraform apply` are run I'll get a different address
(not `2600:1900:4000:4d12::`)
- I don't plan on updating the WHOIS information because the address is
somewhat ephemeral
Previously the GCP NS was a k8s container, but now it's a standalone VM
(for, believe it or not, cost reasons: it was cheaper to assign a static
IP to a VM than to a load balancer).
The instructions now include the procedure to update the GCP VM.
Also, we used to double-check that all servers had the same version
number twice; now we check only once, and we've folded that check into
another step, so there are two fewer steps to follow.
I don't need this k8s configuration for sslip.io (DNS, NTP) anymore:
I'm no longer hosting on GKE now that the cluster has an ephemeral IP
instead of a reserved one (keeping a reserved IP would have cost $360
extra per year for a premium-tier load balancer).
The GKE cluster's IP address is now an ephemeral IP because otherwise
I'd have to pay $360 extra per year for a premium-tier load balancer.
I don't want my website to point to an ephemeral address that quickly
becomes stale, so I'm repointing it from what was previously the GKE
cluster's address to the AWS NS server's address.
My integration tests would randomly fail 5% of the time (based on a sample size
of 59 tests), and the reason was that my algorithm for choosing a random
port was flawed. I was very proud of that algorithm, so accepting that
it was flawed was a bitter pill.
One of the problems was that it unnecessarily limited the pool of
candidate ports to 1,000. This change expands that pool to 64,511
ports.
I changed to a less-clever-but-more-reliable algorithm, and the results
are stunning: an order-of-magnitude increase in reliability, with a 0.5%
failure rate (based on 210 tests).
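A minimal sketch of the simpler approach, assuming we pick a random
unprivileged port and retry on bind failure; this is illustrative, not
the test suite's actual code:

```go
package main

import (
	"fmt"
	"math/rand"
	"net"
)

// randomPort draws from the full 64,511-port unprivileged range
// (1025-65535) and retries until the port is actually bindable.
func randomPort() int {
	for {
		port := rand.Intn(64511) + 1025
		conn, err := net.ListenUDP("udp", &net.UDPAddr{Port: port})
		if err != nil {
			continue // port already in use; pick another
		}
		conn.Close()
		return port
	}
}

func main() {
	fmt.Println(randomPort())
}
```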
Fixes the following failure when running `ginkgo -r -p -until-it-fails .`:
```
Got stuck at:
...
2023/11/19 21:52:15 I couldn't bind via UDP to any IPs on port 1941, so I'm exiting
Waiting for:
Ready to answer queries
```