
Setting up Kubernetes Cluster and Jitsi Deployment on Bare-metal Machines

On Ubuntu 20.04 amd64 architecture.

1. Pre-req Checking

  • Turn off swap (see the note after this list for making it persistent)

    sudo swapoff -a
  • Check disk space

    df -h
  • Others
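
swapoff -a only lasts until the next reboot. To keep swap off permanently, the usual approach is to comment out the swap entry in /etc/fstab; one common way (double-check the file afterwards):

sudo sed -i '/ swap / s/^/#/' /etc/fstab   # comment out swap lines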

2. Install Docker

Note that /etc/apt/keyrings gets created here, so you can ignore the reminder about it in the Kubernetes install docs later. Starting with Ubuntu 22.04 there is no need to create it at all.

sudo apt-get update
sudo apt-get install ca-certificates curl gnupg

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo \
"deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
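
To confirm Docker works before wiring it into Kubernetes, a quick smoke test:

sudo systemctl status docker --no-pager
sudo docker run --rm hello-world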

3. Install cri-dockerd for Container Runtime Interface

Note that installing Docker also installs containerd, which leaves the machine with two container runtimes (two CRI endpoints) that Kubernetes could use; that is why cri-dockerd has to be selected explicitly later.

  • Change shell

The Go install script doesn’t support tcsh, so change to a POSIX shell first.

sudo passwd <username>
chsh   # choose a POSIX shell, e.g. /bin/bash
  • Install Go first

curl -OL https://go.dev/dl/go1.20.4.linux-amd64.tar.gz
sha256sum go1.20.4.linux-amd64.tar.gz   # compare against the checksum listed on go.dev/dl

sudo rm -rf /usr/local/go && sudo tar -C /usr/local -xzf go1.20.4.linux-amd64.tar.gz

Add Go to the PATH for all users.

sudo vi /etc/profile

Add the following line:

export PATH=$PATH:/usr/local/go/bin

Then

. /etc/profile

Do the same for ~/.profile. Run go version to verify the install.

  • Install cri-dockerd
    git clone https://github.com/Mirantis/cri-dockerd.git
    cd cri-dockerd
    mkdir bin
    go build -o bin/cri-dockerd

The following commands need root, so run them with sudo:

mkdir -p /usr/local/bin
install -o root -g root -m 0755 bin/cri-dockerd /usr/local/bin/cri-dockerd
cp -a packaging/systemd/* /etc/systemd/system
sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
systemctl daemon-reload
systemctl enable cri-docker.service
systemctl enable --now cri-docker.socket
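
At this point the CRI socket that kubeadm will use should exist and the service should be active:

systemctl status cri-docker.socket --no-pager
ls -l /var/run/cri-dockerd.sock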

4. Install kubeadm, kubelet, kubectl

Don’t create /etc/apt/keyrings by hand here with mode 744: with those permissions apt distrusts the public key and refuses to update. (The directory was already created with mode 0755 during the Docker install above.)

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl
curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-archive-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
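
A quick check that the tools are installed and pinned:

kubeadm version
kubectl version --client
apt-mark showhold   # should list kubelet, kubeadm, kubectl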

5. Init Control Plane Node (with a flaw)

  • TL;DR
sudo kubeadm init --cri-socket unix:///var/run/cri-dockerd.sock --pod-network-cidr 10.244.0.0/16
  • What happened in reality

Since there are two CRIs on the machine, explicitly select cri-dockerd.

sudo kubeadm init --cri-socket unix:///var/run/cri-dockerd.sock
  • For non-root kubectl usage:
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
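
If the init succeeded, kubectl should now reach the API server (the node will stay NotReady until the pod network add-on is installed in the next step):

kubectl get nodes
kubectl get pods -n kube-system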

6. Add Pod Network Add-on

I am not sure flannel is the best choice; I picked it after skimming a few random threads on the Internet.

  • install flannel

    kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
  • troubleshooting

CoreDNS stays Pending (it should be Running), and the flannel pod is always in CrashLoopBackOff.

Some searching on the web turned up the answer; thanks to this document.

Root cause: flannel requires kubeadm init to be run with --pod-network-cidr set explicitly (flannel defaults to 10.244.0.0/16).

  • Tearing down

    kubectl drain node0.qmcurtis-158673.nyu-netsec-pg0.utah.cloudlab.us --delete-emptydir-data --force --ignore-daemonsets
    sudo kubeadm reset
    sudo sh -c "iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X"
    sudo ipvsadm -C   # not useful here
  • Re-launching

    sudo kubeadm init --cri-socket unix:///var/run/cri-dockerd.sock --pod-network-cidr 10.244.0.0/16
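
After re-initializing (and re-applying the flannel manifest from the step above), a quick sanity check:

kubectl get nodes                                   # the node should become Ready
kubectl get pods -A | grep -E 'coredns|flannel'     # both should reach Running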

7. Join Worker Node

An example command (adapted from the output of kubeadm init; --cri-socket is added for the same two-runtimes reason as before):

kubeadm join <IP:port> --cri-socket unix:///var/run/cri-dockerd.sock --token <token> \
--discovery-token-ca-cert-hash sha256:<sha256sum>
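
If the token from the original init output has expired (tokens are valid for 24 hours by default), generate a fresh join command on the control-plane node:

kubeadm token create --print-join-command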

8. Make Control Plane Node Schedulable (optional)

kubectl taint nodes --all node-role.kubernetes.io/control-plane-
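
To verify, the control-plane taint should no longer be listed:

kubectl describe nodes | grep Taints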

9. Install Helm

curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
sudo apt-get install apt-transport-https --yes
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm
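
Verify Helm is available:

helm version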

10. Deploy Jitsi

  • Add helm repo

    helm repo add jitsi https://jitsi-contrib.github.io/jitsi-helm/
  • Modifications

Via a values file, jitsi_jvb.yaml (the individual values are explained later):

publicURL: <URL>
tz: America/New_York
jvb:
  replicaCount: 2
  useHostPort: true
  publicIPs:
    - <IP>
    - <IP>
prosody:
  persistence:
    storageClassName: jitsi-prosody-fs
  • Deploy
    helm install myjitsi -f jitsi_jvb.yaml jitsi/jitsi-meet --namespace jit --create-namespace

Note: this won’t actually work yet; the web app is reachable, but you cannot really join a meeting.
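
To see what is failing, list the pods in the release namespace:

kubectl get pods -n jit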

11. Jitsi Troubleshooting and Installing a PVC Provisioner

  • Dump Jicofo logs

    kubectl logs <jicofo-pod-name> -n jit

    The logs show that Jicofo cannot establish communication with prosody.

  • Check prosody status

    kubectl describe pod <prosody> -n jit

    The pod is Pending, and its PVC is Pending as well.

Thanks to this issue.

  • Install local-path-provisioner
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.24/deploy/local-path-storage.yaml

Alternatively, adjust it through kustomize; e.g., I need a different storage path since my extra filesystem is mounted at /mydata (see the sketch below).
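
A rough sketch of such a kustomization, assuming the upstream manifest is downloaded next to it; the ConfigMap name, namespace, and config.json layout are as I recall them from the v0.0.24 manifest, so double-check against the downloaded file:

# kustomization.yaml -- patch the provisioner's default path to /mydata
resources:
  - local-path-storage.yaml        # the manifest from the URL above, downloaded locally
patches:
  - patch: |-
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: local-path-config
        namespace: local-path-storage
      data:
        config.json: |-
          {
            "nodePathMap": [
              {
                "node": "DEFAULT_PATH_FOR_NON_LISTED_NODES",
                "paths": ["/mydata"]
              }
            ]
          }

Apply it with kubectl apply -k . from the directory containing both files.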

  • Self-defined Storage Class

Defining our own StorageClass leaves room for further extension later.

Apply this YAML:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: jitsi-prosody-fs
provisioner: rancher.io/local-path
parameters:
  nodePath: /mydata
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete

Some Issues

  • On Jitsi’s side

Must specify a storageClassName in prosody.persistence.

And remember to delete the old PVC if you reconfigure and re-install; the PVC won’t be automatically deleted or updated (see the commands after this list).

  • Provisioner name

Must use rancher.io/local-path; cluster.local/local-path-provisioner will not work and leaves the PVC waiting for provisioning forever.

  • VolumeBindingMode

Don’t use Immediate, use WaitForFirstConsumer; otherwise the PVC gets reported as “node not specified”.
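
For the PVC cleanup mentioned above, it is just a matter of deleting the prosody claim in the release namespace before re-installing (the exact claim name depends on the chart, so list first):

kubectl get pvc -n jit
kubectl delete pvc <prosody-pvc-name> -n jit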

Fault Injection

Investigating kube-monkey, TBA.