When a large number of users around the world need to access a service that runs on AKS instances in different regions, you will want to route each user to the nearest deployment for a better experience.
However, DNS-based routing may not be good enough. Setting aside the accuracy of GeoIP databases, it becomes troublesome when a cluster goes down: DNS caching is a huge problem, because even an ISP may forcibly cache DNS results to reduce its own load. The customer may also use a custom DNS resolver, which can cause an IP from a different region to be returned.
TL;DR
- Use annotation:
service.beta.kubernetes.io/azure-additional-public-ips: ${glb_ip_address}
- Scripting is required to sync rules from AKS (regional) load balancer to global load balancer. AKS does not automatically sync rules.
Complaints before main article
When I was skimming the AKS Roadmap history, I found this issue, which discusses using a “global load balancer with AKS”.
In short: the user wants to rely on an anycast-based service instead of a DNS-based one. A scenario can be found here.
Although the global load balancer has been generally available since July 10, 2023, nine months later I still can’t find any tutorial on configuring a global load balancer with AKS instances in different regions. That is why I wrote this post.
BTW, if you search for the key annotation service.beta.kubernetes.io/azure-additional-public-ips, which this post depends on, a search engine returns only three results. See the web archive here.
Initializing resources
In this article, we will create two AKS instances in Australia East and Italy North, then deploy the Global Load Balancer in West US.
- Prepare the environment variables for deployment
# Resource group
rG_region=westus
rG=joey_aks_glb_17473
# AKS properties
aks1=joey-aks-17473-australiaeast
aks1_region=australiaeast
aks2=joey-aks-17473-italynorth
aks2_region=italynorth
# Global LB properties
glb=joey_glb_17473
glb_region=${rG_region}
glb_ip=joey_glb_ip
NOTE: You can only deploy a Global Load Balancer in one of the available home regions. Be aware that this region does not affect how your traffic is routed. See also: Home regions and participating regions.
- Create resource group
az group create -n ${rG} -l ${rG_region}
- Create base AKS instances
# To speed up the deployment, use `--no-wait` so we don't have to wait for completion
az aks create -n ${aks1} -g ${rG} -l ${aks1_region} \
--node-vm-size Standard_A4_v2 --node-count 1 \
--no-wait --only-show-errors
az aks create -n ${aks2} -g ${rG} -l ${aks2_region} \
--node-vm-size Standard_A4_v2 --node-count 1 \
--no-wait --only-show-errors
# Get the infrastructure resource group name of AKS for further use
infra1_rG=$(az aks show -n ${aks1} -g ${rG} \
--query nodeResourceGroup -o tsv --only-show-errors)
infra2_rG=$(az aks show -n ${aks2} -g ${rG} \
--query nodeResourceGroup -o tsv --only-show-errors)
- Create Global Load Balancer
az network public-ip create -n ${glb_ip} -g ${rG} -l ${glb_region} \
--version IPv4 --tier global --sku Standard -o none --only-show-errors
az network cross-region-lb create -n ${glb} -g ${rG} -l ${glb_region} \
--frontend-ip-name ${glb_ip} --public-ip-address ${glb_ip} \
--backend-pool-name kubernetes_lbs --no-wait
# Get the Global LB IP so it can be added to the AKS services
glb_ip_address=$(az network public-ip show -n ${glb_ip} -g ${rG} \
--query ipAddress -o tsv)
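Because the Global Load Balancer IP is later substituted into the Service manifest, an empty or malformed value would silently produce a broken annotation. A minimal sketch of a sanity check follows; `is_ipv4` is a hypothetical helper written for this post, not part of the Azure CLI.

```shell
# Hypothetical helper: succeed only if $1 looks like a dotted-quad IPv4 address.
is_ipv4() {
  local ip=$1 old_ifs octet
  # Reject empty input, anything other than digits and dots, and misplaced dots.
  case $ip in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
  esac
  # Split the address on dots into the positional parameters.
  old_ifs=$IFS
  IFS=.
  set -- $ip
  IFS=$old_ifs
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    # Each octet is at most three digits and no larger than 255.
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# Example (assumes glb_ip_address was set by the command above):
# is_ipv4 "${glb_ip_address}" || echo "Global LB IP is not ready yet" >&2
```

If the check fails, the public IP was probably not created successfully, so it is worth re-running the `az network public-ip show` command before deploying anything.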
- Deploy the example application on the AKS instances with the Global Load Balancer IP embedded
  Perform the deployment on the first AKS instance:
# At this point, AKS may not have finished provisioning.
# Only deploy the application after AKS has been successfully provisioned.
while [ "Succeeded" != "$(az aks show -n ${aks1} -g ${rG} \
--query provisioningState -o tsv --only-show-errors)" ]; \
do echo "Waiting for the cluster ${aks1} to be provisioned..."; sleep 10; done; \
az aks get-credentials -n ${aks1} -g ${rG} --only-show-errors; \
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
      - name: helloworld
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
        ports:
        # hello-app listens on 8080, matching targetPort in the Service below
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: helloworld
  annotations:
    service.beta.kubernetes.io/azure-additional-public-ips: ${glb_ip_address}
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: helloworld
EOF
- Perform the same process on the second AKS instance:
while [ "Succeeded" != "$(az aks show -n ${aks2} -g ${rG} \
--query provisioningState -o tsv --only-show-errors)" ]; \
do echo "Waiting for the cluster ${aks2} to be provisioned..."; sleep 10; done; \
az aks get-credentials -n ${aks2} -g ${rG} --only-show-errors; \
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
      - name: helloworld
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
        ports:
        # hello-app listens on 8080, matching targetPort in the Service below
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: helloworld
  annotations:
    service.beta.kubernetes.io/azure-additional-public-ips: ${glb_ip_address}
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: helloworld
EOF
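The two wait loops above never stop if a cluster does not reach the Succeeded state. A bounded variant could look like the sketch below; `wait_for_output` is a hypothetical helper, and the attempt count and delay in the example are arbitrary values chosen for illustration.

```shell
# Hypothetical helper: poll a command until it prints the expected value,
# or give up after a fixed number of attempts.
# Usage: wait_for_output <expected> <max_attempts> <delay_seconds> <command...>
wait_for_output() {
  local expected=$1 max_attempts=$2 delay=$3 attempt=1 actual
  shift 3
  while [ "$attempt" -le "$max_attempts" ]; do
    actual=$("$@" 2>/dev/null)
    if [ "$actual" = "$expected" ]; then
      return 0
    fi
    echo "Attempt ${attempt}/${max_attempts}: got '${actual}', waiting for '${expected}'..." >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}

# Example (assumes the variables defined earlier in this post):
# wait_for_output Succeeded 90 10 \
#   az aks show -n "${aks1}" -g "${rG}" --query provisioningState -o tsv --only-show-errors \
#   || { echo "Cluster ${aks1} was not provisioned in time" >&2; exit 1; }
```

The same helper can replace both loops, since only the cluster name changes between them.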
NOTE: The annotation service.beta.kubernetes.io/azure-additional-public-ips only adds another IP to your service, so that kube-proxy adds the relevant iptables rules to redirect traffic arriving for that IP. It does not let AKS manage this IP at all. It can also be useful in other scenarios, for example when you want to switch the IP but your organization does not allow you to add a new service.
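To confirm that the annotation actually reached the cluster, the Service can be read back with `kubectl`. The sketch below is only an illustration: `has_additional_ip` is a hypothetical helper, and it assumes the annotation value is either a single IP or a comma-separated list without spaces.

```shell
# Hypothetical helper: succeed if the Service annotation lists the expected IP.
# Usage: has_additional_ip <service> <namespace> <ip>
has_additional_ip() {
  local svc=$1 ns=$2 ip=$3 value
  # Dots inside the annotation key must be escaped in a kubectl JSONPath expression.
  value=$(kubectl get service "$svc" -n "$ns" \
    -o jsonpath='{.metadata.annotations.service\.beta\.kubernetes\.io/azure-additional-public-ips}')
  # Wrap both sides in commas so that 1.2.3.4 does not match 11.2.3.45.
  case ",${value}," in
    *",${ip},"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (assumes the current kubectl context points at one of the clusters):
# has_additional_ip helloworld default "${glb_ip_address}" && echo "annotation is in place"
```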
Configuring Load Balancer
Disable public Internet access to AKS (regional) Load Balancer (Optional)
At this point, the demo AKS instances and their services have been deployed.
For security reasons, you may want to disable public access to the regional Load Balancers, since they do not need to be exposed directly.
However, a private (internal) load balancer can’t be added to the backend pool of a global load balancer, so we can only restrict access at the Network Security Group (NSG) level. You can skip this section if you believe disabling public Internet access to the regional Load Balancers is not required.
- Get regional Load balancer frontend IP addresses:
function query1 { az resource list -g ${infra1_rG} \
--resource-type Microsoft.Network/publicIPAddresses \
--query '[].{id:id, "k8s-service":tags."k8s-azure-service"}' \
-o json | jq -r '.[] | select(."k8s-service"=="default/helloworld") | .id'; }
function query2 { az resource list -g ${infra2_rG} \
--resource-type Microsoft.Network/publicIPAddresses \
--query '[].{id:id, "k8s-service":tags."k8s-azure-service"}' \
-o json | jq -r '.[] | select(."k8s-service"=="default/helloworld") | .id'; }
while [ -z "$(query1)" ]; \
do echo "The IP is still pending creation; retry after 5s..."; sleep 5; \
done; \
ip1=$(query1); \
ip1Address=$(az network public-ip show --ids "${ip1}" --query ipAddress -o tsv)
while [ -z "$(query2)" ]; \
do echo "The IP is still pending creation; retry after 5s..."; sleep 5; \
done; \
ip2=$(query2); \
ip2Address=$(az network public-ip show --ids "${ip2}" --query ipAddress -o tsv)
- Get NSG name:
nsg1=$(az resource list -g ${infra1_rG} \
--resource-type Microsoft.Network/networkSecurityGroups --query "[0].name" -o tsv)
nsg2=$(az resource list -g ${infra2_rG} \
--resource-type Microsoft.Network/networkSecurityGroups --query "[0].name" -o tsv)
- Apply rules on NSG:
az network nsg rule create --nsg-name ${nsg1} -g ${infra1_rG} \
--name "DenyLB_helloworld" --priority 234 --access Deny \
--protocol "*" --destination-address-prefixes ${ip1Address} \
--direction Inbound --no-wait
az network nsg rule create --nsg-name ${nsg2} -g ${infra2_rG} \
--name "DenyLB_helloworld" --priority 234 --access Deny \
--protocol "*" --destination-address-prefixes ${ip2Address} \
--direction Inbound --no-wait
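Once the deny rules have propagated, direct requests to the regional IPs should time out. A rough way to check this from your own machine is sketched below; `is_unreachable` is a hypothetical helper, and note that it treats any curl failure (not only a timeout) as "unreachable".

```shell
# Hypothetical helper: succeed only if an HTTP request to the address fails
# or gets no response within the timeout (default 5 seconds).
# Usage: is_unreachable <address> [timeout_seconds]
is_unreachable() {
  local address=$1 timeout=${2:-5}
  if curl --silent --output /dev/null --max-time "$timeout" "http://${address}"; then
    return 1 # got a response, so the address is still reachable
  fi
  return 0
}

# Example (assumes ip1Address and ip2Address were set in the step above):
# is_unreachable "${ip1Address}" && echo "regional LB 1 is no longer directly reachable"
# is_unreachable "${ip2Address}" && echo "regional LB 2 is no longer directly reachable"
```

Since the rules were created with `--no-wait`, the check may need to be repeated a few times before it passes.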
Link AKS (regional) Load Balancer to Global Load Balancer
After deploying the demo AKS instances and their applications, we need to link each AKS Load Balancer to the Global Load Balancer to complete the deployment.
IMPORTANT: A frontend IP does not have to be bound to exactly one service. Multiple services can be bound to one frontend IP, and multiple frontend IPs can be bound to one service. Modify the script as needed.
- If you skipped the previous section, you need to get the public IP resource IDs of the regional Load Balancer frontends before continuing:
function query1 { az resource list -g ${infra1_rG} \
--resource-type Microsoft.Network/publicIPAddresses \
--query '[].{id:id, "k8s-service":tags."k8s-azure-service"}' \
-o json | jq -r '.[] | select(."k8s-service"=="default/helloworld") | .id'; }
function query2 { az resource list -g ${infra2_rG} \
--resource-type Microsoft.Network/publicIPAddresses \
--query '[].{id:id, "k8s-service":tags."k8s-azure-service"}' \
-o json | jq -r '.[] | select(."k8s-service"=="default/helloworld") | .id'; }
while [ -z "$(query1)" ]; \
do echo "The IP is still pending creation; retry after 5s..."; sleep 5; \
done; \
ip1=$(query1)
while [ -z "$(query2)" ]; \
do echo "The IP is still pending creation; retry after 5s..."; sleep 5; \
done; \
ip2=$(query2)
- Link the AKS Load Balancers to the Global Load Balancer:
lb1_frontendip=$(az network lb frontend-ip list --lb-name kubernetes -g ${infra1_rG} \
--query "[].{id:id, publicIPAddressID:publicIPAddress.id}" \
-o json | jq --arg var1 "$ip1" -r '.[] | select(.publicIPAddressID==$var1) | .id')
lb2_frontendip=$(az network lb frontend-ip list --lb-name kubernetes -g ${infra2_rG} \
--query "[].{id:id, publicIPAddressID:publicIPAddress.id}" \
-o json | jq --arg var2 "$ip2" -r '.[] | select(.publicIPAddressID==$var2) | .id')
az network cross-region-lb address-pool address add \
--frontend-ip-address ${lb1_frontendip} \
--lb-name ${glb} \
--name "${aks1}_lb" \
--pool-name kubernetes_lbs \
--resource-group ${rG} --no-wait --only-show-errors
az network cross-region-lb address-pool address add \
--frontend-ip-address ${lb2_frontendip} \
--lb-name ${glb} \
--name "${aks2}_lb" \
--pool-name kubernetes_lbs \
--resource-group ${rG} --no-wait --only-show-errors
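Because both `address add` commands return immediately with `--no-wait`, it can help to confirm that the backend pool really contains both regional frontends before moving on. The following is a sketch under that assumption; `backend_count` and `wait_for_backends` are hypothetical helpers built on `az network cross-region-lb address-pool address list`.

```shell
# Hypothetical helper: print how many backend addresses are registered in a pool.
# Usage: backend_count <lb_name> <pool_name> <resource_group>
backend_count() {
  az network cross-region-lb address-pool address list \
    --lb-name "$1" --pool-name "$2" --resource-group "$3" \
    --query "length(@)" -o tsv --only-show-errors
}

# Hypothetical helper: succeed once the pool holds at least <expected> addresses.
# Usage: wait_for_backends <expected> <max_attempts> <delay_seconds> <lb> <pool> <rg>
wait_for_backends() {
  local expected=$1 max_attempts=$2 delay=$3 attempt=1 count
  shift 3
  while [ "$attempt" -le "$max_attempts" ]; do
    count=$(backend_count "$@")
    if [ "${count:-0}" -ge "$expected" ]; then
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}

# Example (assumes the variables defined earlier in this post):
# wait_for_backends 2 30 10 "${glb}" kubernetes_lbs "${rG}" \
#   || echo "Backend pool is still incomplete" >&2
```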
Copy rules from AKS (regional) Load Balancer to Global Load Balancer
As the final step, make sure load balancing rules are created on the Global Load Balancer, so that it can route traffic to the regional Load Balancers correctly.
To do this, simply copy the rules from the regional Load Balancer and apply them to the Global Load Balancer.
rule=$(az network lb rule list --lb-name kubernetes -g ${infra1_rG} \
--query '[].{frontendIPConfigurationID:frontendIPConfiguration.id, "frontendPort":frontendPort, protocol:protocol}' \
-o json | jq --arg var1 "$lb1_frontendip" \
-r '[.[] | select(.frontendIPConfigurationID==$var1) | {frontendPort:.frontendPort, protocol:.protocol}]')
ruleNum=$(echo "$rule" | jq length)
for ((i=0; i<${ruleNum}; i++))
do
frontendPort=$(echo "$rule" | jq -r ".[$i].frontendPort")
protocol=$(echo "$rule" | jq -r ".[$i].protocol | ascii_downcase")
echo "Copying load balancing rule $((i+1)), in total of ${ruleNum}..."
az network cross-region-lb rule create \
--backend-port ${frontendPort} \
--frontend-port ${frontendPort} \
--lb-name ${glb} \
--name "helloworld_$((i+1))" \
--protocol ${protocol} \
--resource-group ${rG} \
--backend-pool-name kubernetes_lbs \
--frontend-ip-name ${glb_ip} \
--enable-floating-ip true --no-wait
done
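Since AKS does not sync rules automatically, the copy step has to be repeated whenever a Service port changes. One way to keep it repeatable is to compute only the rules that are missing on the Global Load Balancer. The sketch below shows the comparison on its own; `missing_rules` is a hypothetical helper, and the `port/protocol` line format is an assumption made for this example.

```shell
# Hypothetical helper: print the "port/protocol" entries that are present in the
# regional list but absent from the global list.
# Usage: missing_rules <regional_list> <global_list>
# Both arguments are newline-separated strings such as "80/tcp".
missing_rules() {
  # Feed the global list first, then a separator line, then the regional list.
  printf '%s\n---\n%s\n' "$2" "$1" | awk '
    $0 == "---" && !sep { sep = 1; next }   # switch from global to regional input
    !sep { have[$0] = 1; next }             # remember every global entry
    $0 != "" && !($0 in have) && !seen[$0]++ { print }
  '
}

# Example with literal lists; in practice they could be built from the output of
# `az network lb rule list` and `az network cross-region-lb rule list`:
# missing_rules "$(printf '80/tcp\n443/tcp')" "80/tcp"
```

Each line printed by the helper corresponds to one `az network cross-region-lb rule create` call like the one in the loop above.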
Check if Global Load Balancer is working
You have now completed the deployment of the demo scenario. Let's check whether it is working.
- Get your Global Load balancer IP
echo ${glb_ip_address}
- Create two test ACI instances in different regions
az container create -n ${aks1} -g ${rG} -l ${aks1_region} \
--image quay.io/curl/curl:latest --command-line "sleep infinity"
az container create -n ${aks2} -g ${rG} -l ${aks2_region} \
--image quay.io/curl/curl:latest --command-line "sleep infinity"
- Check the target backend via its hostname from the first ACI instance
joey [ ~ ]$ az container exec -n ${aks1} -g ${rG} \
--exec-command "curl http://ip-api.com/line/?fields=country"
Australia
joey [ ~ ]$ az container exec -n ${aks1} -g ${rG} \
--exec-command "curl ${glb_ip_address}"
Hello, world!
Version: 1.0.0
Hostname: helloworld-76784bbcf9-mtg4l
- Check the target backend via its hostname from the second ACI instance
joey [ ~ ]$ az container exec -n ${aks2} -g ${rG} \
--exec-command "curl http://ip-api.com/line/?fields=country"
Italy
joey [ ~ ]$ az container exec -n ${aks2} -g ${rG} \
--exec-command "curl ${glb_ip_address}"
Hello, world!
Version: 1.0.0
Hostname: helloworld-79b7655dc-dx4wn
Since the hostnames are different, each client was routed to a different regional backend, which shows that the Global Load Balancer is working.
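The comparison can also be scripted instead of read by eye. The sketch below pulls the pod name out of a hello-app response; `extract_hostname` is a hypothetical helper, and the comparison is only meaningful here because each cluster runs a single replica.

```shell
# Hypothetical helper: read a hello-app response on stdin and print the value
# of its "Hostname:" line.
extract_hostname() {
  # Strip carriage returns first, in case the output comes from a terminal session.
  tr -d '\r' | awk -F': ' '/^Hostname:/ { print $2; exit }'
}

# Example (assumes the ACI instances and variables from the steps above):
# host1=$(az container exec -n "${aks1}" -g "${rG}" \
#   --exec-command "curl -s ${glb_ip_address}" | extract_hostname)
# host2=$(az container exec -n "${aks2}" -g "${rG}" \
#   --exec-command "curl -s ${glb_ip_address}" | extract_hostname)
# [ "$host1" != "$host2" ] && echo "Requests reached different clusters"
```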
Clean up resources
az group delete -n ${rG} --no-wait