Rsync from/to a Kubernetes pod

Use rsync to transfer files to and from a pod

I find the kubectl cp command a bit inconvenient:

  1. I can't just give a target directory. I have to spell out the destination file name.
  2. The transfer is not compressed, so large files move slowly.

These inconveniences are more or less expected behavior. The docs say the cp command is basically:

kubectl exec -n <some-namespace> <some-pod> -- tar cf - /tmp/foo | tar xf - -C /tmp/bar
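To make the limitation concrete, here is a minimal sketch of the same tar pipeline with gzip compression added by hand. The helper name kube_tar_pull and its arguments are made up for illustration, and it assumes the container ships a tar with gzip support. Even with compression it still re-sends every file in full on each run, which rsync can avoid.

```shell
#!/bin/bash
# Hypothetical helper (not part of kubectl): pull a directory out of a pod
# as a gzip-compressed tar stream.
kube_tar_pull() {
  local namespace="$1" pod="$2" src="$3" dst="$4"
  mkdir -p "$dst"
  # z = gzip on both ends of the pipe; -C keeps paths relative to $src
  kubectl exec -n "$namespace" "$pod" -- tar czf - -C "$src" . \
    | tar xzf - -C "$dst"
}

# Example call (all names are placeholders):
#   kube_tar_pull some-namespace some-pod /tmp/foo /tmp/bar
```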

So I started looking for alternatives, and the answer is rsync.

rsync --rsh

For me, rsync is definitely the best tool for transferring files over SSH. I have now learned that it can also be used with Kubernetes. The magic is the --rsh option:

-e, --rsh=COMMAND

This option allows you to choose an alternative remote shell program to use for communication between the local and remote copies of rsync. Typically, rsync is configured to use ssh by default, but you may prefer to use rsh on a local network.

--rsh is most often used to pass extra arguments to ssh. Here we use it to make rsync reach the pod through kubectl exec instead of ssh.
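rsync calls the --rsh program as `<program> [-l USER] HOST COMMAND...`, so a wrapper only has to map those arguments onto kubectl exec. The sketch below prints the resulting command line rather than running it. The function name rsh_to_kubectl, the pod name web-0, and the namespace staging are illustrative, and the rsync server arguments are abbreviated.

```shell
#!/bin/bash
# Sketch: translate the arguments rsync hands to its --rsh program
# into a `kubectl exec` command line (printed, not executed).
rsh_to_kubectl() {
  local pod namespace=""
  if [[ "$1" == "-l" ]]; then
    # "pod@namespace:path" arrives as: -l pod namespace COMMAND...
    pod="$2"
    namespace="$3"
    shift 3
  else
    # "pod:path" arrives as: pod COMMAND...
    pod="$1"
    shift
  fi
  local cmd=(kubectl exec)
  if [[ -n "$namespace" ]]; then
    cmd+=(--namespace="$namespace")
  fi
  # -i keeps stdin open so rsync can talk to its remote counterpart
  cmd+=(-i "$pod" -- "$@")
  printf '%s\n' "${cmd[*]}"
}

rsh_to_kubectl web-0 rsync --server . /data
rsh_to_kubectl -l web-0 staging rsync --server . /data
```

The first call corresponds to `web-0:/data` on the rsync command line, the second to `web-0@staging:/data`.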

Solution

Of course, other people have tried to use rsync on Kubernetes. I found a clear explanation of the usage on Server Fault, and a well-designed kube-rsync script that requires an extra dependency.

I combined their code into this script:

kubectl-rsync
#!/bin/bash
set -eo pipefail

# Phase 1: called by the user. Parse our own options, then run rsync with
# this same script as its remote shell.
if [[ -z "$KUBECTL_RSYNC_RSH" ]]; then
  [[ -n "$KUBE_CONTEXT" ]] && echo >&2 "* Found \$KUBE_CONTEXT = $KUBE_CONTEXT"
  [[ -n "$POD_NAMESPACE" ]] && echo >&2 "* Found \$POD_NAMESPACE = $POD_NAMESPACE"
  [[ -n "$POD_NAME" ]] && echo >&2 "* Found \$POD_NAME = $POD_NAME"

  while [[ $# -gt 0 ]]; do
    case "$1" in
      --context)
        KUBE_CONTEXT="$2"
        shift 2
        ;;
      --context=*)
        KUBE_CONTEXT="${1#*=}"
        shift
        ;;
      -c | --container)
        POD_CONTAINER="$2"
        shift 2
        ;;
      --container=*)
        POD_CONTAINER="${1#*=}"
        shift
        ;;
      -n | --namespace)
        POD_NAMESPACE="$2"
        shift 2
        ;;
      --namespace=*)
        POD_NAMESPACE="${1#*=}"
        shift
        ;;
      -h | --help)
        echo "Rsync file and directories from/to Kubernetes pod"
        echo ""
        echo "IMPORTANT:"
        echo "'rsync' must be installed on both the local machine and the target container for this script to work."
        echo ""
        echo "Usage:"
        echo "  $(basename "$0") [options] [--] [rsync-options] SRC DST"
        echo ""
        echo "Options:"
        echo "  -n, --namespace=''   Namespace of the pod"
        echo "      --context=''     The name of the kubeconfig context to use."
        echo "                       Has precedence over KUBE_CONTEXT variable."
        echo "  -c, --container=''   Container name. If omitted, the first container in the pod will be chosen"
        echo "  -h, --help           Display this help and exit"
        echo ""
        exit
        ;;
      --)
        # Everything after `--` belongs to rsync
        shift
        break
        ;;
      *)
        # First argument we don't recognize: hand the rest to rsync
        break
        ;;
    esac
  done

  # Mark the next invocation of this script as the --rsh phase
  export KUBECTL_RSYNC_RSH=true
  export KUBE_CONTEXT POD_NAMESPACE POD_CONTAINER
  set -x
  exec rsync --blocking-io --rsh="$0" "$@"
fi

# Phase 2: running under --rsh
# If user uses pod@namespace, rsync passes args as `-l pod namespace`
if [[ "$1" == "-l" ]]; then
  POD_NAME="$2"
  POD_NAMESPACE="$3"
  shift 3
else
  POD_NAME="$1"
  shift
fi

export KUBE_CONTEXT POD_NAMESPACE POD_CONTAINER POD_NAME
echo >&2 "* Connect to pod $POD_NAME in ${POD_NAMESPACE:-current namespace}"
set -x
# The ${VAR:+...} expansions are left unquoted on purpose: an empty or unset
# variable expands to nothing instead of an empty argument.
exec kubectl exec \
  ${KUBE_CONTEXT:+--context=${KUBE_CONTEXT}} \
  ${POD_NAMESPACE:+--namespace=${POD_NAMESPACE}} \
  ${POD_CONTAINER:+--container=${POD_CONTAINER}} \
  "${POD_NAME}" -i \
  -- \
  "$@"

When this script is named kubectl-rsync, made executable, and placed in a directory on $PATH, kubectl recognizes it as a plugin.
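One way to check the plugin lookup is sketched below. It uses a throwaway directory and a stub file in place of the real script, so the directory and the stub's output are placeholders. On a real machine you would copy the script into a directory such as ~/.local/bin instead.

```shell
#!/bin/bash
# Sketch: kubectl treats an executable named kubectl-<name> on $PATH as a plugin.
# plugin_dir is a temporary stand-in for a real directory on your PATH.
plugin_dir="$(mktemp -d)"
cat > "$plugin_dir/kubectl-rsync" <<'EOF'
#!/bin/bash
echo "stub standing in for the kubectl-rsync script"
EOF
chmod +x "$plugin_dir/kubectl-rsync"
export PATH="$plugin_dir:$PATH"

# The file must resolve like any other command on $PATH.
command -v kubectl-rsync

# If kubectl is installed, it should report the plugin too.
if command -v kubectl >/dev/null 2>&1; then
  kubectl plugin list || true
fi
```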

The script can now be invoked as kubectl rsync, and it also works through a custom alias:

alias k=kubectl
# copy from remote to local
k rsync pod:target.txt .
# specify namespace
k rsync pod@namespace:source.txt .
k rsync -n namespace pod:source.txt .
# rsync options can be passed after `--`
k rsync -- -hhh --progress source.txt pod:dir/
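The `--` separator matters mostly when an rsync flag collides with one of the wrapper's own options, such as rsync's -n (dry run) versus the wrapper's -n (namespace). The sketch below is a simplified copy of the script's option loop that handles only the namespace option. The function name split_args and the sample arguments are illustrative.

```shell
#!/bin/bash
# Simplified copy of the wrapper's option loop (only -n/--namespace is handled).
split_args() {
  local namespace=""
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -n | --namespace) namespace="$2"; shift 2 ;;
      --namespace=*)    namespace="${1#*=}"; shift ;;
      --)               shift; break ;;
      *)                break ;;
    esac
  done
  echo "namespace=${namespace:-<none>} rsync-args=$*"
}

# After `--`, the second -n reaches rsync as its dry-run flag.
split_args -n dev -- -n -a src/ pod:dst/
# The loop stops at -a, so the following -n also goes to rsync.
split_args -a -n src/ pod:dst/
# Pitfall: a leading -n without `--` swallows "src/" as the namespace.
split_args -n src/ pod:dst/
```

So putting rsync options after `--` is the safe habit, even though many of them would pass through without it.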

Sweet! ✨🍰✨