Commit Graph

16 Commits

Author SHA1 Message Date
ffb0b66d5b cmd/k8s-operator: advertise VIPServices in ProxyGroup config (#14946)
Now that packets flow for VIPServices, the last piece needed to start
serving them from a ProxyGroup is config to tell the proxy Pods which
services they should advertise.

Updates tailscale/corp#24795

Change-Id: Ic7bbeac8e93c9503558107bc5f6123be02a84c77
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-03-06 14:05:41 +00:00
b406f209c3 cmd/{k8s-operator,containerboot},kube: ensure egress ProxyGroup proxies don't terminate while cluster traffic is still routed to them (#14436)
Some checks are pending
CI / cross (386, linux) (push) Waiting to run
CI / cross (amd64, darwin) (push) Waiting to run
CI / cross (amd64, freebsd) (push) Waiting to run
CI / cross (amd64, openbsd) (push) Waiting to run
CI / cross (amd64, windows) (push) Waiting to run
CI / cross (arm, 5, linux) (push) Waiting to run
CI / cross (arm, 7, linux) (push) Waiting to run
CI / cross (arm64, darwin) (push) Waiting to run
CI / cross (arm64, linux) (push) Waiting to run
CI / cross (arm64, windows) (push) Waiting to run
CI / cross (loong64, linux) (push) Waiting to run
CI / ios (push) Waiting to run
CI / crossmin (amd64, illumos) (push) Waiting to run
CI / crossmin (amd64, plan9) (push) Waiting to run
CI / crossmin (amd64, solaris) (push) Waiting to run
CI / crossmin (ppc64, aix) (push) Waiting to run
CI / android (push) Waiting to run
CI / wasm (push) Waiting to run
CI / tailscale_go (push) Waiting to run
CI / fuzz (push) Waiting to run
CI / depaware (push) Waiting to run
CI / go_generate (push) Waiting to run
CI / go_mod_tidy (push) Waiting to run
CI / licenses (push) Waiting to run
CI / staticcheck (386, windows) (push) Waiting to run
CI / staticcheck (amd64, darwin) (push) Waiting to run
CI / staticcheck (amd64, linux) (push) Waiting to run
CI / staticcheck (amd64, windows) (push) Waiting to run
CI / notify_slack (push) Blocked by required conditions
CI / check_mergeability (push) Blocked by required conditions
cmd/{containerboot,k8s-operator},kube: add preshutdown hook for egress PG proxies

This change is part of work towards minimizing downtime during update
rollouts of egress ProxyGroup replicas.
This change:
- updates the containerboot health check logic to return Pod IP in headers,
if set
- always runs the health check for egress PG proxies
- updates ClusterIP Services created for PG egress endpoints to include
the health check endpoint
- implements preshutdown endpoint in proxies. The preshutdown endpoint
logic waits till, for all currently configured egress services, the ClusterIP
Service health check endpoint is no longer returned by the shutting-down Pod
(by looking at the new Pod IP header).
- ensures that kubelet is configured to call the preshutdown endpoint

This reduces the possibility that, as replicas are terminated during an update,
a replica gets terminated to which cluster traffic is still being routed via
the ClusterIP Service because kube proxy has not yet updated routig rules.
This is not a perfect check as in practice, it only checks that the kube
proxy on the node on which the proxy runs has updated rules. However, overall
this might be good enough.

The preshutdown logic is disabled if users have configured a custom health check
port via TS_LOCAL_ADDR_PORT env var. This change throws a warnign if so and in
future setting of that env var for operator proxies might be disallowed (as users
shouldn't need to configure this for a Pod directly).
This is backwards compatible with earlier proxy versions.

Updates tailscale/tailscale#14326


Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-29 07:35:50 +00:00
817ba1c300 cmd/{k8s-operator,containerboot},kube/kubetypes: parse Ingresses for ingress ProxyGroup (#14583)
cmd/k8s-operator: add logic to parse L7 Ingresses in HA mode

- Wrap the Tailscale API client used by the Kubernetes Operator
into a client that knows how to manage VIPServices.
- Create/Delete VIPServices and update serve config for L7 Ingresses
for ProxyGroup.
- Ensure that ingress ProxyGroup proxies mount serve config from a shared ConfigMap.

Updates tailscale/corp#24795


Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-21 05:21:03 +00:00
48a95c422a cmd/containerboot,cmd/k8s-operator: reload tailscaled config (#14342)
Some checks are pending
CI / cross (386, linux) (push) Waiting to run
CI / cross (amd64, darwin) (push) Waiting to run
CI / cross (amd64, freebsd) (push) Waiting to run
CI / cross (amd64, openbsd) (push) Waiting to run
CI / cross (amd64, windows) (push) Waiting to run
CI / cross (arm, 5, linux) (push) Waiting to run
CI / cross (arm, 7, linux) (push) Waiting to run
CI / cross (arm64, darwin) (push) Waiting to run
CI / cross (arm64, linux) (push) Waiting to run
CI / cross (arm64, windows) (push) Waiting to run
CI / cross (loong64, linux) (push) Waiting to run
CI / ios (push) Waiting to run
CI / crossmin (amd64, illumos) (push) Waiting to run
CI / crossmin (amd64, plan9) (push) Waiting to run
CI / crossmin (amd64, solaris) (push) Waiting to run
CI / crossmin (ppc64, aix) (push) Waiting to run
CI / android (push) Waiting to run
CI / wasm (push) Waiting to run
CI / tailscale_go (push) Waiting to run
CI / fuzz (push) Waiting to run
CI / depaware (push) Waiting to run
CI / go_generate (push) Waiting to run
CI / go_mod_tidy (push) Waiting to run
CI / licenses (push) Waiting to run
CI / staticcheck (386, windows) (push) Waiting to run
CI / staticcheck (amd64, darwin) (push) Waiting to run
CI / staticcheck (amd64, linux) (push) Waiting to run
CI / staticcheck (amd64, windows) (push) Waiting to run
CI / notify_slack (push) Blocked by required conditions
CI / check_mergeability (push) Blocked by required conditions
cmd/{k8s-operator,containerboot}: reload tailscaled configfile when its contents have changed

Instead of restarting the Kubernetes Operator proxies each time
tailscaled config has changed, this dynamically reloads the configfile
using the new reload endpoint.
Older annotation based mechanism will be supported till 1.84
to ensure that proxy versions prior to 1.80 keep working with
operator 1.80 and newer.

Updates tailscale/tailscale#13032
Updates tailscale/corp#24795

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-10 07:29:11 +00:00
8d4ca13cf8 cmd/k8s-operator,k8s-operator: support ingress ProxyGroup type (#14548)
Currently this does not yet do anything apart from creating
the ProxyGroup resources like StatefulSet.

Updates tailscale/corp#24795

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-08 13:43:17 +00:00
cc168d9f6b cmd/k8s-operator: fix ProxyGroup hostname (#14336)
Updates tailscale/tailscale#14325

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-16 06:11:18 +00:00
df94a14870 cmd/k8s-operator: don't error for transient failures (#14073)
Every so often, the ProxyGroup and other controllers lose an optimistic locking race
with other controllers that update the objects they create. Stop treating
this as an error event, and instead just log an info level log line for it.

Fixes #14072

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-05 12:11:22 +00:00
aa43388363 cmd/k8s-operator: fix a bunch of status equality checks (#14270)
Updates tailscale/tailscale#14269

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-04 06:46:51 +00:00
9f9063e624 cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor (#14248)
* cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor

Adds a new spec.metrics.serviceMonitor field to ProxyClass.
If that's set to true (and metrics are enabled), the operator
will create a Prometheus ServiceMonitor for each proxy to which
the ProxyClass applies.
Additionally, create a metrics Service for each proxy that has
metrics enabled.

Updates tailscale/tailscale#11292

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-03 12:35:25 +00:00
d8a3683fdf cmd/k8s-operator: restart ProxyGroup pods less (#14045)
We currently annotate pods with a hash of the tailscaled config so that
we can trigger pod restarts whenever it changes. However, the hash
updates more frequently than is necessary causing more restarts than is
necessary. This commit removes two causes; scaling up/down and removing
the auth key after pods have initially authed to control. However, note
that pods will still restart on scale-up/down because of the updated set
of volumes mounted into each pod. Hopefully we can fix that in a planned
follow-up PR.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-11-12 14:18:19 +00:00
8ba9b558d2 envknob,kube/kubetypes,cmd/k8s-operator: add app type for ProxyGroup (#14029)
Sets a custom hostinfo app type for ProxyGroup replicas, similarly
to how we do it for all other Kubernetes Operator managed components.

Updates tailscale/tailscale#13406,tailscale/corp#22920

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-07 12:42:29 +00:00
f6d4d03355 cmd/k8s-operator: don't error out if ProxyClass for ProxyGroup not found. (#13736)
We don't need to error out and continuously reconcile if ProxyClass
has not (yet) been created, once it gets created the ProxyGroup
reconciler will get triggered.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-09 13:23:00 +01:00
07c157ee9f cmd/k8s-operator: base ProxyGroup StatefulSet on common proxy.yaml definition (#13714)
As discussed in #13684, base the ProxyGroup's proxy definitions on the same
scaffolding as the existing proxies, as defined in proxy.yaml

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 20:05:08 +01:00
36cb2e4e5f cmd/k8s-operator,k8s-operator: use default ProxyClass if set for ProxyGroup (#13720)
The default ProxyClass can be set via helm chart or env var, and applies
to all proxies that do not otherwise have an explicit ProxyClass set.
This ensures proxies created by the new ProxyGroup CRD are consistent
with the behaviour of existing proxies

Nearby but unrelated changes:

* Fix up double error logs (controller runtime logs returned errors)
* Fix a couple of variable names

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 17:34:34 +01:00
7f016baa87 cmd/k8s-operator,k8s-operator: create ConfigMap for egress services + small fixes for egress services (#13715)
cmd/k8s-operator, k8s-operator: create ConfigMap for egress services + small reconciler fixes

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-07 20:12:56 +01:00
e48cddfbb3 cmd/{containerboot,k8s-operator},k8s-operator,kube: add ProxyGroup controller (#13684)
Implements the controller for the new ProxyGroup CRD, designed for
running proxies in a high availability configuration. Each proxy gets
its own config and state Secret, and its own tailscale node ID.

We are currently mounting all of the config secrets into the container,
but will stop mounting them and instead read them directly from the kube
API once #13578 is implemented.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-07 14:58:45 +01:00