Commit Graph

600 Commits

Author SHA1 Message Date
Brian Cunnie
bcbf75a25b Shorten TTL for publicly-accessible A & AAAA records
If we have IPs that we need to block, I want them to time-out within the
hour.

TTL: 604800 → 3600 (1 week → 1 hour)
2023-11-19 13:50:28 -08:00
Brian Cunnie
aacd566ab4 3.0.0: enable TCP binding in addition to UDP 3.0.0 2023-10-04 08:07:03 -07:00
Brian Cunnie
85991a0793 Document the "TCP/UDP" metrics
Also, the README points out that we now bind to both UDP & TCP;
previously it said that we only bound to UDP.
2023-10-04 07:52:52 -07:00
Brian Cunnie
2df94a4352 🐞 Make integration tests more robust
I'd assert that the server had exited with a 1 (error condition) when it
couldn't bind via UDP to any addresses; however, I wrote the expectation
wrong, and sometimes the server hadn't exited by the time I made the
assertion, resulting in an exit code of -1 (not yet exited) instead of
1.

Using an async assertion `Eventually()`, with a switch `ExitCode()` →
`Exit()`, fixes that problem.

Fixes, during `ginkgo -r -p`:
```
  [FAILED] Expected
      <int>: -1
  to equal
      <int>: 1
  In [It] at: /home/cunnie/workspace/sslip.io/src/sslip.io-dns-server/integration_test.go:400 @ 10/02/23 03:58:15.824
```
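
The race described in this message can be sketched with the standard library alone. This is not Gomega's `Eventually()` and not the project's test code: `eventually` is a hypothetical polling helper, and the exiting server is simulated by a goroutine that flips an exit code from -1 to 1.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// eventually is a hypothetical stand-in for an async assertion: it polls
// cond every interval until cond returns true or the timeout elapses.
func eventually(cond func() bool, timeout, interval time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	// exitCode mimics a session that reports -1 until the process exits.
	var exitCode int32 = -1
	go func() {
		time.Sleep(50 * time.Millisecond) // the "server" takes a moment to exit
		atomic.StoreInt32(&exitCode, 1)
	}()

	// A one-shot check made right away races the exit and may still see -1.
	fmt.Println("immediate check:", atomic.LoadInt32(&exitCode))

	// Polling waits for the exit instead of racing it.
	ok := eventually(func() bool {
		return atomic.LoadInt32(&exitCode) == 1
	}, 2*time.Second, 10*time.Millisecond)
	fmt.Println("eventually exited with 1:", ok)
}
```

The one-shot check is the flaky part; the polling check only fails if the exit never happens within the timeout.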
2023-10-02 04:51:55 -07:00
Brian Cunnie
ce94dfc20b Metrics: track the use of TCP vs. UDP
Why implement a feature w/out measuring how much it gets used?

I want to know who, if anyone, is using TCP queries.

TODO: update the documentation.
2023-10-01 18:35:49 -07:00
Brian Cunnie
358d85bb04 Test TCP binding failure modes
- If it can't bind to all addresses via TCP, log the ones it could &
  couldn't bind to & keep running
- If it can't bind to any address via TCP, keep running (unlike UDP
  which must fail)

The big challenge in writing these tests is that the binding behavior is
different for macOS (Ventura 13.6 (22G120)) than for Linux.
Specifically, to "squat" on an address, macOS must listen on ALL TCP
addresses (INADDR_ANY) plus the specific address. Linux only needs to
listen to the specific address.

I have no idea what the behavior on Windows is.

I also removed listenPort as a top-level variable; it didn't need to be
top level.
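
A minimal sketch of the "bind where you can, keep running" behavior for TCP follows. `bindTCP` is a hypothetical helper, not the server's actual function, and the demo squats on one exact loopback address and port, which sidesteps the macOS vs. Linux difference noted above.

```go
package main

import (
	"fmt"
	"net"
)

// bindTCP tries to listen on every address and reports which ones worked.
// A failure is recorded, never treated as fatal; the caller decides.
func bindTCP(addrs []string) (bound []net.Listener, failed map[string]error) {
	failed = make(map[string]error)
	for _, addr := range addrs {
		l, err := net.Listen("tcp", addr)
		if err != nil {
			failed[addr] = err
			continue
		}
		bound = append(bound, l)
	}
	return bound, failed
}

func main() {
	// Squat on an ephemeral loopback port so that a second bind to it fails.
	squatter, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer squatter.Close()

	taken := squatter.Addr().String()
	bound, failed := bindTCP([]string{"127.0.0.1:0", taken})
	for _, l := range bound {
		fmt.Println("bound to", l.Addr())
		l.Close()
	}
	for addr, err := range failed {
		fmt.Println("could not bind to", addr+":", err)
	}
	// For TCP the program carries on even if bound is empty; the UDP
	// equivalent would exit non-zero in that case.
}
```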
2023-10-01 14:18:03 -07:00
Brian Cunnie
2eab823fc1 Less verbose testing
I was printing out the throughput (queries/second) in the middle of the
ginkgo tests, and it was unseemly and didn't belong.

I changed the test to make sure that the throughput was > 1,000 queries
per second. No unnecessary output.
2023-10-01 13:59:53 -07:00
Brian Cunnie
7bf617996d Bump serial 2023031500 → 2023093000 2023-09-30 12:56:25 -07:00
Brian Cunnie
9873f4f3f2 Bump dependencies 2023-09-30 12:49:49 -07:00
Brian Cunnie
81f9f6c9a3 Listen on TCP, not solely UDP
I've wanted sslip.io to bind to both UDP & TCP, mostly because TCP is
more secure (at least with regards to DNS cache poisoning).

In general, the process to receive a packet, whether TCP or UDP, is
similar.

- UDP uses `net.UDPConn`, TCP uses `net.TCPListener`
- Once bound, UDP uses `ReadFromUDP()` to get the data; TCP first
  requires an `AcceptTCP()` followed by a `Read()`
- Technically you can send several queries over a single TCP connection,
  but I close the connection after the first query.
- A DNS message sent over TCP is prefixed with a two-byte length field
  that has no counterpart in the DNS UDP packet.
- The TCP integration tests are lacking.
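
The two-byte length field can be sketched as follows. This is an illustration, not the server's code: `frameTCP` and `readTCPMessage` are hypothetical names, and the sketch uses `io.ReadFull` rather than a single `Read()` so that a message split across TCP segments is still read whole.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// frameTCP prepends the two-byte, big-endian length that DNS over TCP
// requires (RFC 1035, section 4.2.2). Over UDP, msg is sent as-is.
func frameTCP(msg []byte) ([]byte, error) {
	if len(msg) > 0xFFFF {
		return nil, fmt.Errorf("message too long for TCP framing: %d bytes", len(msg))
	}
	framed := make([]byte, 2+len(msg))
	binary.BigEndian.PutUint16(framed, uint16(len(msg)))
	copy(framed[2:], msg)
	return framed, nil
}

// readTCPMessage reads one length-prefixed DNS message from r.
func readTCPMessage(r io.Reader) ([]byte, error) {
	var length uint16
	if err := binary.Read(r, binary.BigEndian, &length); err != nil {
		return nil, err
	}
	msg := make([]byte, length)
	if _, err := io.ReadFull(r, msg); err != nil {
		return nil, err
	}
	return msg, nil
}

func main() {
	query := []byte("placeholder for a raw DNS query")
	framed, err := frameTCP(query)
	if err != nil {
		panic(err)
	}
	got, err := readTCPMessage(bytes.NewReader(framed))
	if err != nil {
		panic(err)
	}
	// The framed form is two bytes longer and round-trips unchanged.
	fmt.Println(len(framed)-len(query), bytes.Equal(got, query)) // prints "2 true"
}
```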
2023-09-30 09:14:20 -07:00
Brian Cunnie
b09bccdd86 🐞 Make integration tests IPv4/IPv6 stack-agnostic
The integration test which worked fine on my dual-stack laptop failed on
my IPv4-only Concourse.

Fixes, when running `ginkgo -r -p .` on an IPv4-only machine:
```
sslip.io-dns-server When it can't bind to a port on loopback [BeforeEach] prints an informative message and continues
  [BeforeEach] /tmp/build/b4e0c68a/sslip.io/src/sslip.io-dns-server/integration_test.go:399
  [It] /tmp/build/b4e0c68a/sslip.io/src/sslip.io-dns-server/integration_test.go:409
  [FAILED] Unexpected error:
      <*net.OpError | 0xc000310d20>:
      listen udp [::1]:1918: socket: address family not supported by protocol
```
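
One way to make such a test stack-agnostic is to probe which loopback addresses the host supports before using them. The commit message doesn't say how the fix was made, so this is only a sketch under that assumption; `canBindUDP` and `loopbackAddrs` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"net"
)

// canBindUDP reports whether the host can open a UDP socket on addr.
// On an IPv4-only machine, "[::1]:0" is expected to fail.
func canBindUDP(addr string) bool {
	conn, err := net.ListenPacket("udp", addr)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// loopbackAddrs returns only the loopback addresses this host supports,
// so a test can iterate over them instead of assuming dual-stack.
func loopbackAddrs() []string {
	var addrs []string
	for _, candidate := range []string{"127.0.0.1:0", "[::1]:0"} {
		if canBindUDP(candidate) {
			addrs = append(addrs, candidate)
		}
	}
	return addrs
}

func main() {
	fmt.Println("usable loopback addresses:", loopbackAddrs())
}
```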
2023-09-30 09:01:01 -07:00
Brian Cunnie
4f81fccb7f Get rid of WaitGroups
I wasn't using them the way they're supposed to be used. I was using
them because they were "cool" and I wanted to force-fit them.
Specifically, I never called `WaitGroup.Done()`. Instead of using
WaitGroups to keep from exiting, I now dive into a readFrom(), which
never returns.
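
For contrast, here is a minimal sketch of the pairing a WaitGroup expects: every `Add(1)` is matched by a `Done()`, otherwise `Wait()` blocks forever. `handleAll` is a hypothetical function, not code from this repository.

```go
package main

import (
	"fmt"
	"sync"
)

// handleAll runs one goroutine per item and waits for all of them.
// Each Add(1) is paired with a deferred Done(); dropping the Done()
// would leave Wait() blocked indefinitely.
func handleAll(items []string) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		count int
	)
	for range items {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

func main() {
	fmt.Println(handleAll([]string{"a", "b", "c"})) // prints "3"
}
```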
2023-09-25 10:53:46 -07:00
Brian Cunnie
5e81d41637 Test when it can't bind to INADDR_ANY 2023-09-25 10:52:47 -07:00
Brian Cunnie
81b142925f Clarify UDP binding code
In preparation for TCP binding, I re-worked the UDP binding process so
that it could be more understandable and more easily replicated.

I don't know that it's more understandable. I may have failed.
2023-09-24 10:07:12 +02:00
Brian Cunnie
4a095ca2b6 Test when it's unable to bind to _any_ port
Spoiler alert: it prints an error message and then exits non-zero.
2023-09-19 16:41:04 +02:00
Brian Cunnie
78e9ddf229 New test includes speed benchmarks: queries/second
I was worried that the DNS server had no headroom left after one
incident where the CI was red and the responses were "choppy".
Rebooting (restarting?) fixed the problem.

- ~19k Apple M2
- ~8k vSphere Xeon D-1736 2.7GHz
- ~6k AWS Graviton T2
- ~5k Azure Xeon E5-2673 v4 @ 2.30GHz

The busiest server, ns-aws.nono.io, handles ~132 queries/second.

It seems there's enough headroom for 37x (5000/132) the current traffic
on the slowest server.
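
The headroom arithmetic can be reproduced for every benchmarked machine. The figures are the rounded ones listed above, and `headroom` is a hypothetical helper that rounds down.

```go
package main

import "fmt"

// headroom returns how many times the current load fits into the
// benchmarked capacity, rounded down.
func headroom(capacityQPS, loadQPS int) int {
	return capacityQPS / loadQPS
}

func main() {
	const busiest = 132 // queries/second on ns-aws.nono.io
	benchmarks := []struct {
		name string
		qps  int
	}{
		{"Apple M2", 19000},
		{"vSphere Xeon D-1736", 8000},
		{"AWS Graviton T2", 6000},
		{"Azure Xeon E5-2673 v4", 5000},
	}
	for _, b := range benchmarks {
		fmt.Printf("%-22s %3dx\n", b.name, headroom(b.qps, busiest))
	}
}
```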
2023-09-18 09:55:53 +02:00
Brian Cunnie
49ea1d36a8 Lay groundwork for allowing TCP Connections
- enhance logging messages to call out UDP as appropriate
- make the variable name UDP-specific
2023-09-13 15:11:52 +02:00
Brian Cunnie
d7526c6ca2 tools.go: go generate installs ginkgo
Works in conjunction with `go mod tidy`.

I probably did this wrong, but I don't care:

<https://www.jvt.me/posts/2022/06/15/go-tools-dependency-management/>
2023-09-13 14:58:32 +02:00
Brian Cunnie
e24797ff79 Bump dependencies go get -u -t; go mod tidy 2023-09-07 17:54:12 -07:00
Brian Cunnie
97b2a2936f Bump dependencies 2023-08-12 13:49:36 -07:00
Brian Cunnie
b9130c130a Leave a TODO breadcrumb for IDNA2008
If I ever want to make sure the results are IDNA2008-compliant, I'll
know which test to start with.

One of the things that held me back was that I couldn't find a spec for
what constitutes IDNA2008 compliance.

[#30]
2023-08-12 13:44:30 -07:00
Brian Cunnie
fd1665120e Update GKE node public IP addrs (etcd.pem) 2023-08-12 12:33:00 -07:00
Brian Cunnie
431cac2692 Fuzz-test the IPv6 PTR integration tests
This commit introduces fuzz-testing for the PTR lookups' integration
test.

This commit does NOT successfully surface the following error condition.
In that sense, this commit is a failure:
```
/usr/bin/dig @ns.sslip.io -x ::11b7:bf0a:0:0:d410 +short
  /usr/bin/dig: '--11b7-bf0a-0-0-d410.sslip.io.' is not a legal IDNA2008 name (string start/ends with forbidden hyphen), use +noidnout
```

- moves helper functions for test into a separate package,
  `xip/testhelper`.
- uses `dig`'s `-x` flag to make PTR lookup tests more readable, e.g.
  `dig -x ::1`

This IDN complaint has at least one related commit
([06f1556](06f1556699)).

[#30]
2023-07-12 11:45:57 -04:00
Brian Cunnie
2f9c75891d Bump dependencies go get -u -t; go mod tidy 2023-07-12 07:06:21 -04:00
Brian Cunnie
549d6713b8 Update GKE node public IP addrs (etcd.pem) 2023-07-10 06:21:38 -04:00
Brian Cunnie
92531f9460 Update GKE node public IP addrs (etcd.pem) 2023-06-02 08:14:30 -07:00
Brian Cunnie
55fc17559f Concourse: docker-image → registry-image
This allows our Concourse CI to pull the new multi-platform OCI Docker
images instead of pulling very stale, old Docker images.

Fixes, from <https://ci.nono.io/teams/main/pipelines/sslip.io/jobs/unit/builds/97>:
```
Ginkgo detected a version mismatch between the Ginkgo CLI and the version of Ginkgo imported by your packages:
  Ginkgo CLI Version:
    2.5.0
  Mismatched package versions found:
    2.8.4 used by sslip.io-dns-server, xip
```
2023-04-18 08:56:31 -07:00
Brian Cunnie
2356dbb451 Remove key-value from sslip.io home page
We've removed the key-value feature, so there's no need to describe it
on the home page. I also updated the examples.
2023-03-15 13:58:35 -07:00
Brian Cunnie
fb755b89a1 Update SOA to the Ides of March (3/15)
"Beware the ides of March."

William Shakespeare, _Julius Caesar_, Act 1, Scene 2

I should have bumped the SOA _before_ I cut release 2.7.0.
2023-03-15 13:53:50 -07:00
Brian Cunnie
463071ff90 Pipeline: DNS servers are tested against HEAD
...instead of latest release. This happens, for example, if I didn't fix
the specs before rolling out a new release. I may change this back in
the future.
2023-03-14 10:56:53 -04:00
Brian Cunnie
3e688e61de dns-servers test: remove key-value tests
We are no longer doing key-value-over-DNS.

Fixes <https://ci.nono.io/teams/main/pipelines/sslip.io/jobs/dns-servers/builds/1097>
```
rspec './spec/check-dns_spec.rb[1:17:1]' # sslip.io k-v.io tested on the ns-aws.sslip.io. nameserver sets a value, 1678804743, on the key sslipio-spec.k-v.io
rspec './spec/check-dns_spec.rb[1:17:2]' # sslip.io k-v.io tested on the ns-aws.sslip.io. nameserver gets the newly-set value, 1678804743, from the key, sslipio-spec.k-v.io
rspec './spec/check-dns_spec.rb[1:33:1]' # sslip.io k-v.io tested on the ns-azure.sslip.io. nameserver sets a value, 1678804743, on the key sslipio-spec.k-v.io
rspec './spec/check-dns_spec.rb[1:33:2]' # sslip.io k-v.io tested on the ns-azure.sslip.io. nameserver gets the newly-set value, 1678804743, from the key, sslipio-spec.k-v.io
rspec './spec/check-dns_spec.rb[1:49:1]' # sslip.io k-v.io tested on the ns-gce.sslip.io. nameserver sets a value, 1678804743, on the key sslipio-spec.k-v.io
rspec './spec/check-dns_spec.rb[1:49:2]' # sslip.io k-v.io tested on the ns-gce.sslip.io. nameserver gets the newly-set value, 1678804743, from the key, sslipio-spec.k-v.io
```
2023-03-14 10:40:04 -04:00
Brian Cunnie
1cf277c706 🐞 Tweak
Fixes, `fly trigger-job ...`:
```
error: resource not found
```
Fixes, `kubectl logs ...`:
```
flag provided but not defined: -etcdHost
Usage of /usr/sbin/sslip.io-dns-server:
```
2023-03-13 19:56:56 -04:00
Brian Cunnie
451ad0ef5f 2.7.0: remove key-value store 2.7.0 2023-03-13 16:46:20 -04:00
Brian Cunnie
a62a797fe5 Disable DNS-backed key-value store
I'm disabling the key-value store because no one was using it.

There are other reasons, too:

- The removal of the `etcd` library dropped the executable size by over
  half from 17MB to 7MB
- I didn't want users who've deployed it internally to be "surprised" by
  unexpected key-value features
- Key-value-over-DNS has a seamy side to it: "data exfiltration". I know
  there are legitimate uses for it, but I've come to believe that a
  Key-value-over-HTTP solution is preferable, not only because it's more
  legitimate but also because it eliminates the DNS caching problem.
2023-03-13 16:44:30 -04:00
Brian Cunnie
326b717eb7 Bump dependencies go get -u -t 2023-02-28 17:41:38 -08:00
Brian Cunnie
e858c69248 Google Analytics: switch from UA to GA4
From
<https://support.google.com/analytics/answer/10759417>:

> Google Analytics 4 is replacing Universal Analytics. On July 1, 2023
> all standard Universal Analytics properties will stop processing new
> hits.

I wonder if Google Analytics is worth the trouble.
2023-02-05 17:52:56 -08:00
Brian Cunnie
7fa7a51453 Better etcd instructions when GKE re-IPs 2023-01-27 06:47:38 -08:00
Brian Cunnie
1ac38b2544 etcd.pem has updated GKE node public IP addrs
Fixes:
```
Jan 26 21:17:42 ns-aws etcd[508]: rejected connection from "34.121.219.144:39244" (error "tls: \"34.121.219.144\" does not match any of DNSNames [\"ns-aws.sslip.io\" \"ns-azure.sslip.io\" \"ns-gce.sslip.io\" \"ns-aws\" \"ns-azure\" \"ns-gce\"] (lookup ns-gce: Temporary failure in name resolution)", ServerName "ns-aws.sslip.io", IPAddresses ["127.0.0.1" "52.0.56.137" "52.187.42.158" "104.155.144.4" "34.71.136.235" "104.198.25.221" "35.223.15.132" "::1" "2600:1f18:aaf:6900::a"], DNSNames ["ns-aws.sslip.io" "ns-azure.sslip.io" "ns-gce.sslip.io" "ns-aws" "ns-azure" "ns-gce"])
```
2023-01-27 05:58:16 -08:00
Brian Cunnie
81dde02fdc Bump dependencies go get -u -t
Fixes:
```
Ginkgo detected a version mismatch between the Ginkgo CLI and the version of Ginkgo imported by your packages:
```
2023-01-11 14:12:19 -08:00
Brian Cunnie
a6defe5d33 Log executable path, version when starting 2023-01-11 14:04:21 -08:00
Brian Cunnie
94e0bb7abd 🐞 k8s deployment: flag is -quiet not quiet
This one cost me, in Google Cloud, $21.63. Sigh.
2023-01-02 13:58:08 -08:00
Brian Cunnie
0623523a6d Wildcard certs: show people an easier way 2022-12-11 18:21:17 -08:00
Brian Cunnie
0fdb9f27bc Dockerfile: deprecate custom registry-image
Now that we're on Concourse 7.9, we no longer need the custom Concourse
registry-image resource, so we use the stock resource instead.
2022-12-11 18:00:16 -08:00
Stefan Sundin
c2f4b9c80c Update module path. 2022-12-07 06:34:46 -08:00
Stefan Sundin
29a8ba0777 Fix broken links. 2022-12-07 06:34:46 -08:00
Brian Cunnie
2422c73a1b 🐞 k8s: DNS server args are args, not command
Fixes, when `kubectl describe pod sslip.io-xxx-yy`:

> Warning  Failed     72s (x4 over 113s)  kubelet            Error:
> failed to create containerd task: failed to create shim task: OCI
> runtime create failed: runc create failed: unable to start container
> process: exec: "-etcdHost": executable file not found in $PATH: unknown
2022-11-27 06:53:39 -08:00
Brian Cunnie
37e4ab7537 🐞 GKE deployment saves $$ by not logging so much
Also, it uses the new ENTRYPOINT instead of the old CMD.
2022-11-26 18:42:16 -08:00
Brian Cunnie
8052a84428 🐞 Docker build fails when curl fails
Our CI sometimes builds "broken" docker images because it fails
downloading the proper executable (because I haven't populated the
GitHub release yet).

I'd like it to fail rather than publish broken images.

Fixes, during `docker run -it --rm cunnie/sslip.io-dns-server`:
```
exec /usr/sbin/sslip.io-dns-server: exec format error
```
2022-11-26 18:39:30 -08:00
Brian Cunnie
036b7a0c3a Instructions for fixing GKE's etcd's certs
...because they're so volatile. It's super annoying.
2022-11-26 17:18:23 -08:00
Brian Cunnie
513ef81acb etcd.pem has updated GKE node public IP addrs
Fixes:
```
Nov 26 16:50:36 ns-aws etcd[508]: rejected connection from "35.223.15.132:56234" (error "tls: \"35.223.15.132\" does not match any of DNSNames [\"ns-aws.sslip.io\" \"ns-azure.sslip.io\" \"ns-gce.sslip.io\" \"ns-aws\" \"ns-azure\" \"ns-gce\"] (lookup ns-gce: Temporary failure in name resolution)", ServerName "ns-aws.sslip.io", IPAddresses ["127.0.0.1" "52.0.56.137" "52.187.42.158" "104.155.144.4" "34.123.7.26" "34.121.225.254" "34.70.136.153" "::1" "2600:1f18:aaf:6900::a"], DNSNames ["ns-aws.sslip.io" "ns-azure.sslip.io" "ns-gce.sslip.io" "ns-aws" "ns-azure" "ns-gce"])
```
2022-11-26 16:50:15 -08:00