Confession: I have no idea why I didn't use the global variable `port`
instead of deciding to thread `port` as a parameter.
But for some reason I felt that it was a good idea. Oh well. Committing
these changes before they're lost.
Parallelizable tests (`ginkgo -r -p .`) were failing on my 20-core
(`-nodes=20`) Mac Studio. We narrowed this down to two causes:
1. The servers sometimes took longer than the hard-coded 3-second delay
to become ready to answer queries.
2. The blocklist was downloaded asynchronously and sometimes wasn't
ready by the time the queries were run.
To address these, we did the following:
1. Rather than hard-code a 3-second delay, we modified the server to
signal that it's ready to answer queries (by printing "Ready to
answer queries" to the log). We now wait for that string to appear
before we begin testing the server. IMHO, this is a much better
solution than a hard-coded delay.
2. The initial download of the blocklist now occurs synchronously;
subsequent downloads occur asynchronously.
Drive-bys:
- If the server can't bind to even one address, it exits.
- Refactored the blocklist code; the if-then-else statements were
  nested too deeply
Fixes:
```
Expected
<string>: 43.134.66.67
to match regular expression
<string>: \A52.0.56.137\n\z
In [It] at: /Users/cunnie/workspace/sslip.io/src/sslip.io-dns-server/integration_test.go:421
```
We'd like to parallelize the tests to lay the foundation for the
upcoming expansion of flags passed to the executable (e.g.
`-nameservers`). Testing those flags will spawn a series of
executables, each of which takes 3 seconds to spin up; running them
sequentially would make testing tiresome.
- We've migrated away from `serverSession.Err).Should(Say())`
to `serverSession.Err.Contents())).Should(MatchRegexp())`. `Say()`
depends on ordering, `MatchRegexp()` doesn't.
- We introduce a short, 50-millisecond `Sleep()` in `isPortFree()` to
eliminate a race condition introduced by parallelization where the
same port is returned twice.
- Some of our `DescribeTable` tests were order-dependent; we moved them
outside the table.
- We parallelize our pipeline's unit tests.
- For the `k-v.io` tests, we used different keys for each `It()` block
to avoid pollution. We are also more careful about waiting for the
setup to complete before running the actual test.
As a side-effect of parallelizing the tests, we no longer require `sudo`
on Linux to run the tests, for we no longer attempt to bind to port 53;
instead, we bind to a series of available unprivileged ports.
Previously our integration tests bound to port 53, and, if that failed,
fell back to binding to port 3553.
This commit introduces code to scan for an open port and uses that,
which lays the foundation for potentially parallelizing the integration
tests.
We implement PTR records for IPv6, for example:
2.a.b.b.4.0.2.9.a.e.e.6.e.c.4.1.0.f.9.6.0.0.1.0.6.4.6.0.1.0.6.2.ip6.arpa →
2601-646-100-69f0-14ce-6eea-9204-bba2.sslip.io.
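The mapping above can be sketched as follows: reverse the 32 nibbles,
group them in fours, strip leading zeros from each group, and join the
groups with dashes. The function name `ip6ArpaToHostname` is
hypothetical, and the sketch assumes no `::`-style zero compression;
the server's actual implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// ip6ArpaToHostname (hypothetical) converts a 32-nibble ip6.arpa name
// into a dashed hostname under domain.
func ip6ArpaToHostname(name, domain string) (string, error) {
	name = strings.TrimSuffix(strings.ToLower(name), ".")
	const suffix = ".ip6.arpa"
	if !strings.HasSuffix(name, suffix) {
		return "", fmt.Errorf("%q is not an ip6.arpa name", name)
	}
	nibbles := strings.Split(strings.TrimSuffix(name, suffix), ".")
	if len(nibbles) != 32 {
		return "", fmt.Errorf("expected 32 nibbles, got %d", len(nibbles))
	}
	groups := make([]string, 0, 8)
	for g := 0; g < 8; g++ {
		var b strings.Builder
		for i := 0; i < 4; i++ {
			// The PTR name lists nibbles least-significant first,
			// so read them from the end.
			n := nibbles[31-(g*4+i)]
			if len(n) != 1 || !strings.Contains("0123456789abcdef", n) {
				return "", fmt.Errorf("invalid nibble %q", n)
			}
			b.WriteString(n)
		}
		group := strings.TrimLeft(b.String(), "0")
		if group == "" {
			group = "0" // an all-zero group collapses to a single 0
		}
		groups = append(groups, group)
	}
	return strings.Join(groups, "-") + "." + domain + ".", nil
}

func main() {
	ptr := "2.a.b.b.4.0.2.9.a.e.e.6.e.c.4.1.0.f.9.6.0.0.1.0.6.4.6.0.1.0.6.2.ip6.arpa"
	host, err := ip6ArpaToHostname(ptr, "sslip.io")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(host) // prints 2601-646-100-69f0-14ce-6eea-9204-bba2.sslip.io.
}
```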
We implement PTR records for IPv4.
When a PTR record is not found (e.g. "127.in-addr.arpa"), the server
returns the SOA record, but, unlike other record lookups (e.g. "MX"),
the SOA's mname is locked to "sslip.io" because setting the mname to
"127.in-addr.arpa" doesn't make sense.
To be done:
- Implement IPv6
- Implement Metrics
- Update README
- Deploy new version
Prohibit setting DNS-01 challenge TXT record `_acme-challenge.k-v.io`
Although it may appear the TXT record can be set or deleted, it's
hardcoded to the string, "Please don't try to procure a k-v.io cert via
DNS-01 challenge". Setting a custom value was easier than writing a
special code path.
Special thanks to [Alan Liang](http://symb.olic.link/):
> ... one could easily add (and modify) a TXT record at
_acme-challenge.k-v.io, which I believe is used for verifying domain
ownership at various cert providers, so anyone could in theory obtain
valid SSL certs for k-v.io and *.k-v.io
I've chosen to add the website to GKE, not Hetzner, because I get fewer
strident abuse messages from GKE.
I'm dismayed that when I make a small change to the DNS, I need to go
through the laborious release process for it to take effect. Sigh. Maybe
that's something I'll fix another day.
We don't return the deleted value because doing that would have the
unintended consequence of postponing the deletion: downstream caching
servers would cache the deleted value for up to three more minutes. We'd
rather have the key deleted sooner than later.
Some APIs, e.g. etcd's, return a list of deleted values: those
APIs can afford to do so because they don't need to worry about DNS
propagation.
We also lengthen the timeout of an `etcd` API call from 500 msec to 1928
msec. 500 msec left too little margin: some calls routinely took 480
msec to complete, and we wanted more headroom.
We also no longer do two `etcd` operations when we delete a value.
Previously we would do a GET followed by a DELETE, but since we're not
returning the value deleted, there's no point to the GET. Furthermore,
the GET was never necessary, for the `etcd` DELETE API call returned the
values deleted.
Drive-by:
- README: install ginkgo the proper way, with `go install`
[fixes #17]
Now that we no longer create BOSH releases, we don't need to bury the
`src/` directory under `bosh-release`; we can now place it under the
repo root, and we no longer need to fiddle with symbolic links.
We stopped creating BOSH releases because implementing a key-value
store would have required creating an `etcd` BOSH release, and we
didn't want to invest the time.