<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Nhật Trường | DevOps, SecOps & Platforms]]></title><description><![CDATA[Nhật Trường - DevOps Engineer | Focus on GitOps workflows, DevSecOps methodologies and platform implementations | Stay hydrated, stay deployin]]></description><link>https://blog.nh4ttruong.me</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1767598587675/30687001-b223-4320-80f8-6ccf7899a21f.png</url><title>Nhật Trường | DevOps, SecOps &amp; Platforms</title><link>https://blog.nh4ttruong.me</link></image><generator>RSS for Node</generator><lastBuildDate>Tue, 07 Apr 2026 22:28:08 GMT</lastBuildDate><atom:link href="https://blog.nh4ttruong.me/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[DNSOps - GitOps for DNS management with DNSControl & GitLab CI/CD]]></title><description><![CDATA[This repository provides a GitOps approach to DNS management: your DNS records live in Git, changes are peer-reviewed, and deployments are automated through CI/CD.
When the dashboard is down, your DNS config is still version-controlled and ready to push to a...]]></description><link>https://blog.nh4ttruong.me/dnsops-gitops-for-dns-management-with-dnscontrol-and-gitlab-cicd</link><guid isPermaLink="true">https://blog.nh4ttruong.me/dnsops-gitops-for-dns-management-with-dnscontrol-and-gitlab-cicd</guid><category><![CDATA[dnscontrol]]></category><category><![CDATA[gitops]]></category><category><![CDATA[dns]]></category><category><![CDATA[management]]></category><category><![CDATA[GitLab]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Tue, 13 Jan 2026 11:07:20 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768302672124/c25d462f-bb7d-43e7-bfb4-e429a8d6e614.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This repository provides a GitOps approach to DNS management: your DNS records live in Git, changes are peer-reviewed, and deployments are automated through CI/CD. When the dashboard is down, your DNS config is still version-controlled and ready to push to a backup provider.</p>
<blockquote>
<p><strong>GitOps Pattern</strong>: All DNS changes MUST be made through Git commits. Never edit DNS records directly in the provider's dashboard; changes will be overwritten on the next pipeline run.</p>
</blockquote>
<h2 id="heading-why-gitops-for-dns">Why GitOps for DNS?</h2>
<blockquote>
<p>If your DNS is only managed through a web dashboard, you're one outage away from losing control.</p>
</blockquote>
<p>Managing DNS records manually through web dashboards creates several challenges:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Challenge</td><td>GitOps Solution</td></tr>
</thead>
<tbody>
<tr>
<td><strong>No Audit Trail</strong></td><td>Full Git history - who changed what, when</td></tr>
<tr>
<td><strong>No Granular Access</strong></td><td>Branch protection, MR approvals, CODEOWNERS</td></tr>
<tr>
<td><strong>Provider Outages</strong></td><td>Config is versioned, push to backup provider</td></tr>
<tr>
<td><strong>Manual Errors</strong></td><td>Automated validation before applying</td></tr>
<tr>
<td><strong>Inconsistency</strong></td><td>Shared configs, reusable variables</td></tr>
</tbody>
</table>
</div><h2 id="heading-quick-start">Quick Start</h2>
<p>Check out <a target="_blank" href="https://github.com/nh4ttruong/dnsops">nh4ttruong/dnsops</a> or follow the steps outlined below:</p>
<h3 id="heading-1-clone-and-configure">1. Clone and Configure</h3>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/nh4ttruong/dnsops
<span class="hljs-built_in">cd</span> dnsops
</code></pre>
<h3 id="heading-2-set-cicd-variables">2. Set CI/CD Variables</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Provider</td><td>Variables</td></tr>
</thead>
<tbody>
<tr>
<td>Cloudflare</td><td><code>CF_ACCOUNT_ID_*</code>, <code>CF_API_TOKEN_*</code></td></tr>
<tr>
<td>AWS Route 53</td><td><code>AWS_ACCESS_KEY_ID</code>, <code>AWS_SECRET_ACCESS_KEY</code></td></tr>
<tr>
<td>Google Cloud</td><td><code>GCLOUD_PRIVATE_KEY</code>, <code>GCLOUD_CLIENT_EMAIL</code></td></tr>
</tbody>
</table>
</div><h3 id="heading-3-edit-dns-records">3. Edit DNS Records</h3>
<pre><code class="lang-bash">nano zones/{domain}/dnsconfig.js
</code></pre>
<h3 id="heading-4-commit-and-push">4. Commit and Push</h3>
<pre><code class="lang-bash">git add . &amp;&amp; git commit -m <span class="hljs-string">"Update DNS records"</span>
git push origin main
</code></pre>
<p>The pipeline then runs automatically: <strong>check</strong> → <strong>preview</strong> → <strong>push</strong>.</p>
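<p>These stages come from the shared template referenced in <code>.gitlab-ci.yml</code>. The template itself is not shown in this post; a minimal sketch of what <code>zones/.gitlab-ci-template.yml</code> could look like (the image tag and job bodies here are assumptions, not the repository's actual template):</p>
<pre><code class="lang-yaml"># Hypothetical sketch; adjust image and paths to your setup
stages: [test, deploy]

check_and_review:
  stage: test
  image: stackexchange/dnscontrol:latest
  script:
    - cd zones/${ZONE_DIR}
    - dnscontrol check
    - dnscontrol preview --creds ../../creds.json

push:
  stage: deploy
  image: stackexchange/dnscontrol:latest
  script:
    - cd zones/${ZONE_DIR}
    - dnscontrol push --creds ../../creds.json
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
</code></pre>
<p>Each zone's trigger job only has to set <code>ZONE_DIR</code>; the check/preview/push logic stays in one place.</p>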
<hr />
<h2 id="heading-repository-structure">Repository Structure</h2>
<pre><code class="lang-bash">dns-mgmt/
├── .gitlab-ci.yml              <span class="hljs-comment"># Pipeline triggers per zone</span>
├── creds.json                  <span class="hljs-comment"># Provider credentials (env vars)</span>
└── zones/
    ├── .gitlab-ci-template.yml <span class="hljs-comment"># Shared pipeline template</span>
    ├── common.js               <span class="hljs-comment"># Shared IPs &amp; defaults</span>
    ├── example-com/
    │   └── dnsconfig.js        <span class="hljs-comment"># Zone: example.com</span>
    └── mycompany-io/
        └── dnsconfig.js        <span class="hljs-comment"># Zone: mycompany.io</span>
</code></pre>
<h3 id="heading-shared-configuration">Shared Configuration</h3>
<pre><code class="lang-javascript"><span class="hljs-comment">// zones/common.js</span>
<span class="hljs-keyword">var</span> addr = {
    <span class="hljs-string">"default"</span>: IP(<span class="hljs-string">"10.0.0.10"</span>),
    <span class="hljs-string">"private_ingress_uat"</span>: IP(<span class="hljs-string">"10.0.0.70"</span>),
    <span class="hljs-string">"public_ingress_uat"</span>: IP(<span class="hljs-string">"203.0.113.19"</span>),
    <span class="hljs-string">"public_ingress_production"</span>: IP(<span class="hljs-string">"203.0.113.21"</span>),
}
</code></pre>
<hr />
<h2 id="heading-cicd-pipeline">CI/CD Pipeline</h2>
<pre><code class="lang-mermaid">flowchart LR
    A[Git Push] --&gt; B[Trigger Zone Job]
    B --&gt; C[check]
    C --&gt; D[preview]
    D --&gt; E{Branch?}
    E --&gt;|main| F[push]
    E --&gt;|other| G[Stop]
    F --&gt; H[DNS Updated]
</code></pre>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Stage</td><td>Job</td><td>Description</td></tr>
</thead>
<tbody>
<tr>
<td>test</td><td><code>check_and_review</code></td><td>Validate syntax, preview changes</td></tr>
<tr>
<td>deploy</td><td><code>push</code></td><td>Apply to provider (main branch only)</td></tr>
</tbody>
</table>
</div><hr />
<h2 id="heading-provider-examples">Provider Examples</h2>
<h3 id="heading-cloudflare">Cloudflare</h3>
<pre><code class="lang-javascript"><span class="hljs-comment">// zones/example-com/dnsconfig.js</span>
<span class="hljs-built_in">require</span>(<span class="hljs-string">"../common.js"</span>);
<span class="hljs-keyword">var</span> DP = NewDnsProvider(<span class="hljs-string">"cloudflare_example_com"</span>);

D(<span class="hljs-string">"example.com"</span>, REG_NONE, DnsProvider(DP), {<span class="hljs-attr">no_ns</span>: <span class="hljs-string">'true'</span>},
  A(<span class="hljs-string">"@"</span>, addr.default),
  A(<span class="hljs-string">"www"</span>, addr.public_ingress_production),
  CNAME(<span class="hljs-string">"blog"</span>, <span class="hljs-string">"example.github.io."</span>),
);
</code></pre>
<pre><code class="lang-json"><span class="hljs-string">"cloudflare_example_com"</span>: {
    <span class="hljs-attr">"TYPE"</span>: <span class="hljs-string">"CLOUDFLAREAPI"</span>,
    <span class="hljs-attr">"accountid"</span>: <span class="hljs-string">"$CF_ACCOUNT_ID_EXAMPLE_COM"</span>,
    <span class="hljs-attr">"apitoken"</span>: <span class="hljs-string">"$CF_API_TOKEN_EXAMPLE_COM"</span>
}
</code></pre>
<h3 id="heading-aws-route-53">AWS Route 53</h3>
<pre><code class="lang-javascript"><span class="hljs-comment">// zones/example-org/dnsconfig.js</span>
<span class="hljs-built_in">require</span>(<span class="hljs-string">"../common.js"</span>);
<span class="hljs-keyword">var</span> DP = NewDnsProvider(<span class="hljs-string">"route53_example_org"</span>);

D(<span class="hljs-string">"example.org"</span>, REG_NONE, DnsProvider(DP),
  A(<span class="hljs-string">"@"</span>, addr.default),
  A(<span class="hljs-string">"api"</span>, addr.public_ingress_production),
  CNAME(<span class="hljs-string">"www"</span>, <span class="hljs-string">"example.org."</span>),
);
</code></pre>
<pre><code class="lang-json"><span class="hljs-string">"route53_example_org"</span>: {
    <span class="hljs-attr">"TYPE"</span>: <span class="hljs-string">"ROUTE53"</span>,
    <span class="hljs-attr">"KeyId"</span>: <span class="hljs-string">"$AWS_ACCESS_KEY_ID"</span>,
    <span class="hljs-attr">"SecretKey"</span>: <span class="hljs-string">"$AWS_SECRET_ACCESS_KEY"</span>
}
</code></pre>
<h3 id="heading-google-cloud-dns">Google Cloud DNS</h3>
<pre><code class="lang-javascript"><span class="hljs-comment">// zones/example-io/dnsconfig.js</span>
<span class="hljs-built_in">require</span>(<span class="hljs-string">"../common.js"</span>);
<span class="hljs-keyword">var</span> DP = NewDnsProvider(<span class="hljs-string">"gcloud_example_io"</span>);

D(<span class="hljs-string">"example.io"</span>, REG_NONE, DnsProvider(DP),
  A(<span class="hljs-string">"@"</span>, addr.default),
  AAAA(<span class="hljs-string">"@"</span>, <span class="hljs-string">"2001:db8::1"</span>),
);
</code></pre>
<pre><code class="lang-json"><span class="hljs-string">"gcloud_example_io"</span>: {
    <span class="hljs-attr">"TYPE"</span>: <span class="hljs-string">"GCLOUD"</span>,
    <span class="hljs-attr">"project"</span>: <span class="hljs-string">"my-gcp-project-id"</span>,
    <span class="hljs-attr">"private_key"</span>: <span class="hljs-string">"$GCLOUD_PRIVATE_KEY"</span>,
    <span class="hljs-attr">"client_email"</span>: <span class="hljs-string">"$GCLOUD_CLIENT_EMAIL"</span>
}
</code></pre>
<hr />
<h2 id="heading-adding-a-new-zone">Adding a New Zone</h2>
<ol>
<li><p><strong>Create zone directory</strong></p>
<pre><code class="lang-bash"> mkdir zones/new-domain-com
</code></pre>
</li>
<li><p><strong>Create dnsconfig.js</strong></p>
<pre><code class="lang-javascript"> <span class="hljs-built_in">require</span>(<span class="hljs-string">"../common.js"</span>);
 <span class="hljs-keyword">var</span> DP = NewDnsProvider(<span class="hljs-string">"cloudflare_new_domain_com"</span>);

 D(<span class="hljs-string">"new-domain.com"</span>, REG_NONE, DnsProvider(DP), {<span class="hljs-attr">no_ns</span>: <span class="hljs-string">'true'</span>},
   A(<span class="hljs-string">"@"</span>, addr.default),
 );
</code></pre>
</li>
<li><p><strong>Add credentials to creds.json</strong></p>
<pre><code class="lang-json"> <span class="hljs-string">"cloudflare_new_domain_com"</span>: {
     <span class="hljs-attr">"TYPE"</span>: <span class="hljs-string">"CLOUDFLAREAPI"</span>,
     <span class="hljs-attr">"accountid"</span>: <span class="hljs-string">"$CF_ACCOUNT_ID_NEW_DOMAIN_COM"</span>,
     <span class="hljs-attr">"apitoken"</span>: <span class="hljs-string">"$CF_API_TOKEN_NEW_DOMAIN_COM"</span>
 }
</code></pre>
</li>
<li><p><strong>Add trigger to .gitlab-ci.yml</strong></p>
<pre><code class="lang-yaml"> <span class="hljs-attr">new-domain-com:</span>
   <span class="hljs-attr">extends:</span> <span class="hljs-string">.trigger-base</span>
   <span class="hljs-attr">variables:</span>
     <span class="hljs-attr">ZONE_DIR:</span> <span class="hljs-string">new-domain-com</span>
   <span class="hljs-attr">rules:</span>
     <span class="hljs-bullet">-</span> <span class="hljs-attr">changes:</span> <span class="hljs-string">*common-paths</span>
       <span class="hljs-attr">when:</span> <span class="hljs-string">manual</span>
     <span class="hljs-bullet">-</span> <span class="hljs-attr">changes:</span>
         <span class="hljs-bullet">-</span> <span class="hljs-string">zones/new-domain-com/**/*</span>
       <span class="hljs-attr">when:</span> <span class="hljs-string">always</span>
     <span class="hljs-bullet">-</span> <span class="hljs-attr">when:</span> <span class="hljs-string">never</span>
</code></pre>
</li>
<li><p><strong>Set GitLab CI/CD variables</strong></p>
<ul>
<li><p><code>CF_ACCOUNT_ID_NEW_DOMAIN_COM</code></p>
</li>
<li><p><code>CF_API_TOKEN_NEW_DOMAIN_COM</code></p>
</li>
</ul>
</li>
</ol>
<hr />
<h2 id="heading-local-development">Local Development</h2>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> zones/{domain}

dnscontrol check                              <span class="hljs-comment"># Validate syntax</span>
dnscontrol preview --creds ../../creds.json   <span class="hljs-comment"># Preview changes</span>
dnscontrol push --creds ../../creds.json      <span class="hljs-comment"># Apply changes</span>
</code></pre>
<hr />
<h2 id="heading-supported-providers">Supported Providers</h2>
<p>DNSControl supports <strong>40+ providers</strong>:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Provider</td><td>Type</td><td>Best For</td></tr>
</thead>
<tbody>
<tr>
<td><a target="_blank" href="https://docs.dnscontrol.org/provider/cloudflareapi">Cloudflare</a></td><td>CDN + DNS</td><td>Proxy, WAF, DDoS protection</td></tr>
<tr>
<td><a target="_blank" href="https://docs.dnscontrol.org/provider/route53">AWS Route 53</a></td><td>Cloud DNS</td><td>AWS ecosystem integration</td></tr>
<tr>
<td><a target="_blank" href="https://docs.dnscontrol.org/provider/gcloud">Google Cloud DNS</a></td><td>Cloud DNS</td><td>GCP workloads</td></tr>
<tr>
<td><a target="_blank" href="https://docs.dnscontrol.org/provider/azuredns">Azure DNS</a></td><td>Cloud DNS</td><td>Azure ecosystem</td></tr>
<tr>
<td><a target="_blank" href="https://docs.dnscontrol.org/provider/digitalocean">DigitalOcean</a></td><td>Cloud DNS</td><td>Simple, affordable</td></tr>
<tr>
<td><a target="_blank" href="https://docs.dnscontrol.org/provider/namecheap">Namecheap</a></td><td>Registrar</td><td>Domain + DNS bundled</td></tr>
</tbody>
</table>
</div><blockquote>
<p>Full list: <a target="_blank" href="https://docs.dnscontrol.org/provider">docs.dnscontrol.org/provider</a></p>
</blockquote>
<hr />
<h2 id="heading-multi-provider-strategy">Multi-Provider Strategy</h2>
<p>Manage multiple providers in one repository:</p>
<ul>
<li><p><strong>Migration</strong> - Move zones between providers gradually</p>
</li>
<li><p><strong>Redundancy</strong> - Backup zones on secondary providers</p>
</li>
<li><p><strong>Cost optimization</strong> - Different providers for different needs</p>
</li>
</ul>
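<p>Because <code>creds.json</code> is keyed by provider name, a single file can hold several providers side by side. Combining the earlier fragments, the complete file looks roughly like this:</p>
<pre><code class="lang-json">{
  "cloudflare_example_com": {
    "TYPE": "CLOUDFLAREAPI",
    "accountid": "$CF_ACCOUNT_ID_EXAMPLE_COM",
    "apitoken": "$CF_API_TOKEN_EXAMPLE_COM"
  },
  "route53_example_org": {
    "TYPE": "ROUTE53",
    "KeyId": "$AWS_ACCESS_KEY_ID",
    "SecretKey": "$AWS_SECRET_ACCESS_KEY"
  },
  "gcloud_example_io": {
    "TYPE": "GCLOUD",
    "project": "my-gcp-project-id",
    "private_key": "$GCLOUD_PRIVATE_KEY",
    "client_email": "$GCLOUD_CLIENT_EMAIL"
  }
}
</code></pre>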
<pre><code class="lang-mermaid">flowchart TB
    subgraph repo["dnsops"]
        common[zones/common.js]
        zone1[zones/main-domain/]
        zone2[zones/backup-domain/]
        zone3[zones/legacy-domain/]
    end

    zone1 --&gt; cf[Cloudflare]
    zone2 --&gt; r53[AWS Route 53]
    zone3 --&gt; gcp[Google Cloud DNS]
</code></pre>
<hr />
<h2 id="heading-references">References</h2>
<ul>
<li><p><a target="_blank" href="https://docs.dnscontrol.org/">DNSControl Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://docs.dnscontrol.org/provider">Provider Reference</a></p>
</li>
<li><p><a target="_blank" href="https://docs.dnscontrol.org/language-reference">Language Reference</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Kubernetes API Server & Kubelet Performance Testing]]></title><description><![CDATA[In Kubernetes, the API Server is the brain, while the Kubelet is the muscle that actually runs your workloads. Both need to be stress-tested to ensure deployments succeed under load.
To test this, ]]></description><link>https://blog.nh4ttruong.me/kubernetes-api-server-and-kubelet-performance-testing</link><guid isPermaLink="true">https://blog.nh4ttruong.me/kubernetes-api-server-and-kubelet-performance-testing</guid><category><![CDATA[KubeBurner]]></category><category><![CDATA[control plane]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Benchmark]]></category><category><![CDATA[Performance Testing]]></category><category><![CDATA[kubelet]]></category><category><![CDATA[kube-apiserver]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Fri, 09 Jan 2026 04:16:49 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1767932161025/1ba5b359-b368-4b39-affc-d1459812ceef.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In Kubernetes, the API Server is the brain, while the Kubelet is the muscle that actually runs your workloads. Both need to be stress-tested to ensure deployments succeed under load.</p>
<p>To test this, we use <strong>kube-burner</strong>, a tool designed to stress the control plane by creating, updating, and deleting thousands of objects, while simultaneously measuring how fast Kubelets can pick up and run these workloads.</p>
<h2><strong>Why Test?</strong></h2>
<p>The API Server is the central point for all requests in a Kubernetes cluster. If the API Server cannot handle the load, the entire cluster is affected:</p>
<ul>
<li><p>Pod scheduling becomes slow or fails</p>
</li>
<li><p>Service discovery stops working</p>
</li>
<li><p>kubectl commands time out</p>
</li>
<li><p>Controllers cannot reconcile</p>
</li>
</ul>
<p><strong>Context</strong>: When a cluster has many workloads, the API Server must handle thousands of requests/second from:</p>
<ul>
<li><p>Controllers (deployment, replicaset, daemonset...)</p>
</li>
<li><p>Kubelet (node status, pod status)</p>
</li>
<li><p>Users (kubectl, CI/CD pipelines)</p>
</li>
<li><p>Applications (in-cluster clients)</p>
</li>
</ul>
<h2><strong>Install kube-burner</strong></h2>
<pre><code class="language-shell"># Download and install
curl -L https://github.com/cloud-bulldozer/kube-burner/releases/latest/download/kube-burner-linux-amd64.tar.gz | tar xz
sudo mv kube-burner /usr/local/bin/

# Verify
kube-burner version
</code></pre>
<p>Reference: <a href="https://kube-burner.readthedocs.io/en/latest/installation/">kube-burner installation</a></p>
<h2><strong>Test Scenarios</strong></h2>
<h3><strong>1. Smoke Test</strong></h3>
<p><strong>Purpose</strong>: Basic validation, baseline performance.</p>
<p><strong>Input</strong>:</p>
<ul>
<li><p>10 namespaces</p>
</li>
<li><p>50 objects (secrets, deployments)</p>
</li>
<li><p>QPS: 5, Burst: 5</p>
</li>
</ul>
<p><strong>Run test</strong>:</p>
<pre><code class="language-shell">kube-burner init -c api-server/smoke.yaml
</code></pre>
<p><strong>Expected output</strong>:</p>
<table>
<thead>
<tr>
<th><strong>Metric</strong></th>
<th><strong>Target</strong></th>
</tr>
</thead>
<tbody><tr>
<td>Success rate</td>
<td>&gt; 99%</td>
</tr>
<tr>
<td>P99 latency</td>
<td>&lt; 500ms</td>
</tr>
<tr>
<td>Duration</td>
<td>~3-5 min</td>
</tr>
</tbody></table>
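<p>The exact contents of <code>api-server/smoke.yaml</code> are not reproduced here; a rough sketch of what such a config might contain, mirroring the inputs above (template paths and measurement names are assumptions; see the kube-burner docs for the full schema):</p>
<pre><code class="language-yaml"># Hypothetical smoke test config
global:
  measurements:
    - name: podLatency

jobs:
  - name: smoke-test
    jobIterations: 10        # one iteration per namespace, 10 namespaces total
    namespacedIterations: true
    namespace: smoke-test
    qps: 5
    burst: 5
    waitWhenFinished: true
    objects:
      - objectTemplate: templates/deployment.yaml
        replicas: 3
      - objectTemplate: templates/secret.yaml
        replicas: 2
</code></pre>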
<h3><strong>2. Load Test (API Intensive)</strong></h3>
<p><strong>Purpose</strong>: Evaluate API Server capacity under production load.</p>
<p><strong>Input</strong>:</p>
<ul>
<li><p>30 namespaces</p>
</li>
<li><p>1,800 objects (deployments, configmaps, secrets, services)</p>
</li>
<li><p>QPS: 25, Burst: 30</p>
</li>
<li><p>3 phases: Create → Patch → Delete</p>
</li>
</ul>
<p><strong>Run test</strong>:</p>
<pre><code class="language-shell">kube-burner init -c api-server/api-intensive.yml
</code></pre>
<p><strong>Test phases</strong>:</p>
<table>
<thead>
<tr>
<th><strong>Phase</strong></th>
<th><strong>Action</strong></th>
<th><strong>Duration</strong></th>
</tr>
</thead>
<tbody><tr>
<td>1</td>
<td>Object creation (1,800 objects)</td>
<td>~15 min</td>
</tr>
<tr>
<td>2</td>
<td>Object patching (JSON patch, strategic merge)</td>
<td>~5 min</td>
</tr>
<tr>
<td>3</td>
<td>Cleanup (cascade delete)</td>
<td>~10 min</td>
</tr>
</tbody></table>
<p><strong>Expected output</strong>:</p>
<table>
<thead>
<tr>
<th><strong>Cluster Size</strong></th>
<th><strong>API Server QPS</strong></th>
<th><strong>P99 Latency</strong></th>
</tr>
</thead>
<tbody><tr>
<td>3-5 nodes</td>
<td>500-1,500</td>
<td>&lt; 1s</td>
</tr>
<tr>
<td>5-20 nodes</td>
<td>1,500-5,000</td>
<td>&lt; 500ms</td>
</tr>
<tr>
<td>20+ nodes</td>
<td>5,000-15,000</td>
<td>&lt; 200ms</td>
</tr>
</tbody></table>
<h3><strong>3. Kubelet Density Test</strong></h3>
<p><strong>Purpose</strong>: Evaluate cluster scheduling and pod lifecycle capabilities.</p>
<p><strong>Run test</strong>:</p>
<pre><code class="language-shell"># Web application workload
kube-burner init -c kubelet-density-cni/kubelet-density-cni.yml

# Database workload  
kube-burner init -c kubelet-density-database/kubelet-density-database.yml
</code></pre>
<p><strong>Expected output</strong>:</p>
<table>
<thead>
<tr>
<th><strong>Metric</strong></th>
<th><strong>Target</strong></th>
</tr>
</thead>
<tbody><tr>
<td>Pod startup time</td>
<td>&lt; 30s</td>
</tr>
<tr>
<td>Scheduling latency</td>
<td>&lt; 5s</td>
</tr>
<tr>
<td>Pod churn rate</td>
<td>Stable</td>
</tr>
</tbody></table>
<h2><strong>Metrics to Monitor</strong></h2>
<p>In Grafana (import <code>grafana-dashboard/k8s-system-api-server.json</code>):</p>
<table>
<thead>
<tr>
<th><strong>Metric</strong></th>
<th><strong>PromQL</strong></th>
<th><strong>Meaning</strong></th>
</tr>
</thead>
<tbody><tr>
<td>Request latency</td>
<td><code>histogram_quantile(0.99, apiserver_request_duration_seconds_bucket)</code></td>
<td>P99 response time</td>
</tr>
<tr>
<td>Request rate</td>
<td><code>sum(rate(apiserver_request_total[5m]))</code></td>
<td>QPS</td>
</tr>
<tr>
<td>Error rate</td>
<td><code>sum(rate(apiserver_request_total{code=~"5.."}[5m]))</code></td>
<td>Server errors</td>
</tr>
<tr>
<td>etcd latency</td>
<td><code>histogram_quantile(0.99, etcd_request_duration_seconds_bucket)</code></td>
<td>Backend latency</td>
</tr>
</tbody></table>
<h2><strong>Parameter Tuning</strong></h2>
<p>Adjust in config file based on cluster size:</p>
<pre><code class="language-yaml"># Small cluster (3-5 nodes)
jobs:
  - name: api-test
    qps: 10
    burst: 20
    jobIterations: 10
    replicas: 5

# Large cluster (20+ nodes)
jobs:
  - name: api-test
    qps: 100
    burst: 200
    jobIterations: 50
    replicas: 50
</code></pre>
<h2><strong>Troubleshooting</strong></h2>
<p><strong>Pods stuck Pending</strong>:</p>
<p>→ Reduce <code>replicas</code> or increase cluster resources.</p>
<p><strong>High API latency</strong>:</p>
<pre><code class="language-shell">kubectl top pods -n kube-system
kubectl logs kube-apiserver-&lt;node&gt; -n kube-system | grep -i error
</code></pre>
<p>→ Check etcd performance, reduce <code>qps/burst</code>.</p>
<p><strong>Cleanup failed</strong>:</p>
<pre><code class="language-shell"># Get list of namespaces to delete
kubectl get namespace -l kube-burner-job=&lt;job-name&gt;

# Delete each namespace
kubectl delete namespace &lt;namespace-name&gt;

# Or use xargs to delete in bulk
kubectl get namespace -l kube-burner-job=&lt;job-name&gt; -o name | xargs kubectl delete
</code></pre>
<h2><strong>References</strong></h2>
<ul>
<li><p><a href="https://kube-burner.readthedocs.io/">kube-burner Documentation</a></p>
</li>
<li><p><a href="https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/">Kubernetes API Server</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Benchmarking etcd: The Heartbeat of Kubernetes]]></title><description><![CDATA[Welcome to the "heartbeat" of Kubernetes. How do you benchmark it and test its performance? Today, we focus on etcd.
Etcd is the consistent and highly available key-value store used as Kubernetes' backing store for all cluster data. If etcd is sl...]]></description><link>https://blog.nh4ttruong.me/benchmarking-etcd-the-heartbeat-of-kubernetes</link><guid isPermaLink="true">https://blog.nh4ttruong.me/benchmarking-etcd-the-heartbeat-of-kubernetes</guid><category><![CDATA[etcd]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Databases]]></category><category><![CDATA[performance]]></category><category><![CDATA[Benchmark]]></category><category><![CDATA[Performance Testing]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Fri, 09 Jan 2026 03:59:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1767930669177/b4f4e5c8-a6d2-46bf-b444-dd3af0a4ad2f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome to the "heartbeat" of Kubernetes. How do you benchmark it and test its performance? Today, we focus on <strong>etcd</strong>.</p>
<p>Etcd is the consistent and highly available key-value store used as Kubernetes' backing store for all cluster data. If <code>etcd</code> is slow, your entire cluster feels sluggish. API requests time out, and controllers fail to sync.</p>
<h2 id="heading-about-etcd">🧠 About etcd</h2>
<p>To optimize something, you must first understand how it works. <strong>etcd</strong> is a distributed, consistent key-value store that serves as the single source of truth for your Kubernetes cluster.</p>
<h3 id="heading-how-it-works">How it Works</h3>
<p>Etcd uses the <strong>Raft consensus algorithm</strong> to ensure data consistency across the cluster (typically 3 or 5 nodes).</p>
<ol>
<li><p><strong>Leader Election</strong>: One node is elected leader; all writes must go through it.</p>
</li>
<li><p><strong>Replication</strong>: The leader replicates the log entry to followers.</p>
</li>
<li><p><strong>Persistence (The Bottleneck)</strong>: Before confirming a write, etcd must persist the data to disk (Write-Ahead Log or WAL) using an <code>fsync</code> system call.</p>
</li>
<li><p><strong>Consensus</strong>: Once a majority (quorum) confirms the write, the request succeeds.</p>
</li>
</ol>
<p>Because every state change requires an <code>fsync</code> to disk, <strong>disk latency is the critical path</strong>. If your disk is slow, <code>fsync</code> takes longer, the leader blocks, and the entire Kubernetes control plane slows down. This is why "fast SSDs" are non-negotiable for etcd.</p>
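<p>You can get a feel for this bottleneck without etcd at all by timing synchronous writes on the data disk. A crude check with <code>dd</code> follows (the 2300-byte block size approximates a typical etcd WAL entry; for proper latency percentiles, the etcd docs recommend <code>fio</code> with <code>--fdatasync=1</code>):</p>
<pre><code class="lang-bash"># Write 500 small blocks, forcing a sync to disk after each one,
# similar to what etcd does for every WAL append.
dd if=/dev/zero of=/tmp/dsync-test bs=2300 count=500 oflag=dsync

# Clean up the scratch file
rm -f /tmp/dsync-test
</code></pre>
<p>If throughput is poor here, no amount of etcd tuning will help; move the WAL to a faster disk.</p>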
<h2 id="heading-why-test">Why Test?</h2>
<p>etcd stores the entire state of a Kubernetes cluster:</p>
<ul>
<li><p>Cluster configuration</p>
</li>
<li><p>Pod/Service/Deployment definitions</p>
</li>
<li><p>Secrets and ConfigMaps</p>
</li>
<li><p>Custom Resources</p>
</li>
</ul>
<p>If etcd is slow, the entire cluster is affected:</p>
<ul>
<li><p>API Server response becomes slow</p>
</li>
<li><p>Controllers cannot update state</p>
</li>
<li><p>Pod scheduling is delayed</p>
</li>
<li><p>Watch operations time out</p>
</li>
</ul>
<p><strong>Context</strong>: every control-plane component ultimately depends on etcd, so etcd performance determines the performance of the entire cluster.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<h3 id="heading-access-methods">Access Methods</h3>
<p><strong>Option 1</strong>: SSH into control plane node (recommended for production)</p>
<pre><code class="lang-bash">ssh control-plane-node
</code></pre>
<p><strong>Option 2</strong>: kubectl exec into etcd pod</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it etcd-&lt;node-name&gt; -n kube-system -- sh
</code></pre>
<h3 id="heading-install-tools">Install Tools</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># Download etcd binaries</span>
ETCD_VER=v3.5.4
curl -L https://github.com/etcd-io/etcd/releases/download/<span class="hljs-variable">${ETCD_VER}</span>/etcd-<span class="hljs-variable">${ETCD_VER}</span>-linux-amd64.tar.gz -o /tmp/etcd.tar.gz
tar xzvf /tmp/etcd.tar.gz -C /tmp --strip-components=1

<span class="hljs-comment"># Install tools</span>
sudo mv /tmp/etcdctl /usr/<span class="hljs-built_in">local</span>/bin/
sudo mv /tmp/benchmark /usr/<span class="hljs-built_in">local</span>/bin/

<span class="hljs-comment"># Verify</span>
etcdctl version
</code></pre>
<h3 id="heading-etcd-endpoints-and-certificates">etcd Endpoints and Certificates</h3>
<p>Get information from kubeadm cluster:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Endpoints</span>
kubectl get endpoints -n kube-system etcd -o jsonpath=<span class="hljs-string">'{.subsets[*].addresses[*].ip}'</span>

<span class="hljs-comment"># Certificates (usually located here)</span>
/etc/kubernetes/pki/etcd/
├── ca.crt
├── server.crt
└── server.key
</code></pre>
<p>Reference: <a target="_blank" href="https://etcd.io/docs/v3.5/op-guide/security/">etcd security</a></p>
<h2 id="heading-test-scenarios">Test Scenarios</h2>
<h3 id="heading-1-smoke-test">1. Smoke Test</h3>
<p><strong>Purpose</strong>: Verify connectivity and basic operations.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Set environment variables</span>
<span class="hljs-built_in">export</span> ETCDCTL_API=3
<span class="hljs-built_in">export</span> ETCDCTL_ENDPOINTS=<span class="hljs-string">"https://10.10.10.1:2379,https://10.10.10.2:2379,https://10.10.10.3:2379"</span>
<span class="hljs-built_in">export</span> ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
<span class="hljs-built_in">export</span> ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
<span class="hljs-built_in">export</span> ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

<span class="hljs-comment"># Run smoke test</span>
./scripts/etcdctl/smoke.sh
</code></pre>
<p><strong>Script performs</strong>:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Member list</span>
etcdctl member list

<span class="hljs-comment"># Endpoint health</span>
etcdctl endpoint health

<span class="hljs-comment"># Endpoint status</span>
etcdctl endpoint status

<span class="hljs-comment"># Basic put/get/delete</span>
etcdctl put /perf-test/key <span class="hljs-string">"value"</span>
etcdctl get /perf-test/key
etcdctl del /perf-test/key
</code></pre>
<p><strong>Expected output</strong>:</p>
<ul>
<li><p>All members healthy</p>
</li>
<li><p>1 leader present</p>
</li>
<li><p>Put/get/delete successful</p>
</li>
</ul>
<h3 id="heading-2-etcdctl-benchmark">2. etcdctl Benchmark</h3>
<p><strong>Purpose</strong>: Measure client-side performance with realistic load patterns.</p>
<pre><code class="lang-bash">./scripts/etcdctl/etcdctl-tool.sh
</code></pre>
<p><strong>Test phases</strong>:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Phase</td><td>Rate</td><td>Clients</td><td>Duration</td></tr>
</thead>
<tbody>
<tr>
<td>Medium write</td><td>1,000 req/s</td><td>200</td><td>60s</td></tr>
<tr>
<td>Heavy write</td><td>8,000 req/s</td><td>500</td><td>60s</td></tr>
<tr>
<td>Heavy read</td><td>15,000 req/s</td><td>1,000</td><td>60s</td></tr>
</tbody>
</table>
</div><p><strong>Expected output</strong>:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Metric</td><td>3-node cluster</td><td>5+ node cluster</td></tr>
</thead>
<tbody>
<tr>
<td>Write latency P99</td><td>&lt; 50ms</td><td>&lt; 25ms</td></tr>
<tr>
<td>Read latency P99</td><td>&lt; 15ms</td><td>&lt; 5ms</td></tr>
<tr>
<td>Write throughput</td><td>5k-10k req/s</td><td>15k-30k req/s</td></tr>
<tr>
<td>Read throughput</td><td>20k-50k req/s</td><td>80k-150k req/s</td></tr>
</tbody>
</table>
</div><h3 id="heading-3-benchmark-tool-test">3. Benchmark Tool Test</h3>
<p><strong>Purpose</strong>: Measure raw database performance (server-side).</p>
<pre><code class="lang-bash">./scripts/benchmark/benchmark-tool.sh
</code></pre>
<p><strong>Test types</strong>:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Test</td><td>Description</td><td>Command</td></tr>
</thead>
<tbody>
<tr>
<td>Sequential write</td><td>Single client writes</td><td><code>benchmark --conns=1 --clients=1</code></td></tr>
<tr>
<td>Concurrent write</td><td>Multi-client writes</td><td><code>benchmark --conns=100 --clients=1000</code></td></tr>
<tr>
<td>Read (linearizable)</td><td>Strong consistency reads</td><td><code>benchmark --consistency=l</code></td></tr>
<tr>
<td>Read (serializable)</td><td>Weak consistency reads</td><td><code>benchmark --consistency=s</code></td></tr>
</tbody>
</table>
</div><h3 id="heading-cleanup">Cleanup</h3>
<p>After testing, delete test data:</p>
<pre><code class="lang-bash">./scripts/benchmark/clean-and-recovery.sh
</code></pre>
<h2 id="heading-metrics-to-monitor">Metrics to Monitor</h2>
<p>In Grafana (import <code>grafana-dashboard/k8s-system-etcd.json</code>):</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Metric</td><td>PromQL</td><td>Meaning</td></tr>
</thead>
<tbody>
<tr>
<td>WAL fsync duration</td><td><code>histogram_quantile(0.99, etcd_disk_wal_fsync_duration_seconds_bucket)</code></td><td>Disk write performance</td></tr>
<tr>
<td>Backend commit</td><td><code>histogram_quantile(0.99, etcd_disk_backend_commit_duration_seconds_bucket)</code></td><td>Database commit time</td></tr>
<tr>
<td>Leader elections</td><td><code>increase(etcd_server_leader_changes_seen_total[1h])</code></td><td>Cluster stability</td></tr>
<tr>
<td>DB size</td><td><code>etcd_mvcc_db_total_size_in_bytes</code></td><td>Database size</td></tr>
</tbody>
</table>
</div><p><strong>Thresholds</strong>:</p>
<ul>
<li><p>WAL fsync P99 &gt; 10ms → Disk too slow</p>
</li>
<li><p>Leader changes &gt; 0/hour → Network or disk issues</p>
</li>
<li><p>DB size &gt; 6GB → Need compact/defrag</p>
</li>
</ul>
<p>Reference: <a target="_blank" href="https://etcd.io/docs/v3.5/metrics/">etcd metrics</a></p>
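<p>To make these thresholds actionable in a script, the check boils down to comparing a scraped P99 value against its limit. A minimal sketch (the function name and sample values are hypothetical; feed it values from your Prometheus queries):</p>
<pre><code class="lang-bash"># Hypothetical helper: compare a measured P99 (whole milliseconds) to a limit
check_threshold() {
  local name="$1" value_ms="$2" limit_ms="$3"
  if [ "$value_ms" -gt "$limit_ms" ]; then
    echo "ALERT: $name P99 ${value_ms}ms exceeds ${limit_ms}ms"
  else
    echo "OK: $name P99 ${value_ms}ms within ${limit_ms}ms"
  fi
}

check_threshold "wal_fsync" 4 10    # sample value: healthy disk
check_threshold "wal_fsync" 13 10   # sample value: disk too slow
</code></pre>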
<h2 id="heading-troubleshooting">Troubleshooting</h2>
<p><strong>Connection refused</strong>:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Check endpoints</span>
etcdctl endpoint health
<span class="hljs-comment"># Verify firewall, certificates</span>
</code></pre>
<p><strong>High latency</strong>:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Check disk I/O</span>
iostat -x 1

<span class="hljs-comment"># Check etcd logs</span>
kubectl logs etcd-&lt;node&gt; -n kube-system | grep -i slow
</code></pre>
<p><strong>Database too large</strong>:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Get the current revision, then compact and defrag</span>
rev=$(etcdctl endpoint status --write-out=<span class="hljs-string">"json"</span> | egrep -o <span class="hljs-string">'"revision":[0-9]*'</span> | egrep -o <span class="hljs-string">'[0-9].*'</span>)
etcdctl compact <span class="hljs-string">"<span class="hljs-variable">$rev</span>"</span>
etcdctl defrag
</code></pre>
<h2 id="heading-configuration">Configuration</h2>
<p>Update the scripts with your cluster information:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># scripts/etcdctl/etcdctl-tool.sh</span>
ETCD_HOSTS=<span class="hljs-string">"10.10.10.1:2379,10.10.10.2:2379,10.10.10.3:2379"</span>
OPTIONS=<span class="hljs-string">"--cacert=/path/to/ca.crt --cert=/path/to/client.crt --key=/path/to/client.key"</span>
</code></pre>
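<p>Inside a script, the comma-separated <code>ETCD_HOSTS</code> value can then be split into per-member commands, along these lines (a sketch of the idea, not the script's exact code):</p>
<pre><code class="lang-bash">ETCD_HOSTS="10.10.10.1:2379,10.10.10.2:2379,10.10.10.3:2379"

# Split on commas and print the per-member health-check command
for member in $(echo "$ETCD_HOSTS" | tr ',' ' '); do
  echo "etcdctl --endpoints=https://$member endpoint health"
done
</code></pre>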
<h2 id="heading-references">References</h2>
<ul>
<li><p><a target="_blank" href="https://etcd.io/docs/">etcd Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/etcd-io/etcd/tree/main/tools/benchmark">etcd Benchmark Tool</a></p>
</li>
<li><p><a target="_blank" href="https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/">Kubernetes etcd Best Practices</a></p>
</li>
<li><p><a target="_blank" href="https://etcd.io/docs/v3.5/op-guide/performance/">etcd Performance</a></p>
</li>
</ul>
<hr />
<div class="hn-embed-widget" id="nh4ttruong"></div>]]></content:encoded></item><item><title><![CDATA[Building a Production-Ready Kubernetes Performance Testing Framework]]></title><description><![CDATA[Building a Kubernetes cluster is easy; proving it's production-ready is hard. How do you know if your control plane can scale? Is your storage actually delivering the IOPS promised by the vendor?
To answer these questions and ensure the cluster is re...]]></description><link>https://blog.nh4ttruong.me/building-a-production-ready-kubernetes-performance-testing-framework</link><guid isPermaLink="true">https://blog.nh4ttruong.me/building-a-production-ready-kubernetes-performance-testing-framework</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Performance Testing]]></category><category><![CDATA[SRE]]></category><category><![CDATA[k8s]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Fri, 09 Jan 2026 03:48:27 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1767929670794/4ca5d54b-d846-4fd2-9a7e-ff958443c73d.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Building a Kubernetes cluster is easy; proving it's production-ready is hard. How do you know if your control plane can scale? Is your storage actually delivering the IOPS promised by the vendor?</p>
<p>To answer these questions and ensure the cluster is ready for production, I researched the relevant tooling and best practices, then built a performance testing framework around them.</p>
<p>In this post, I'll walk you through the <strong>Objective</strong>, the <strong>Test Flow</strong>, and the <strong>Requirements</strong> needed to set up this framework.</p>
<blockquote>
<p>TL;DR 👉 <a target="_blank" href="https://github.com/nh4ttruong/k8s-perf-tests">github.com/nh4ttruong/k8s-perf-tests</a></p>
</blockquote>
<h2 id="heading-the-objective">🎯 The Objective</h2>
<p>The goal is to ensure the Kubernetes cluster meets specific performance requirements across all layers:</p>
<ul>
<li><p><strong>Control Plane</strong>: API Server handles expected request volume.</p>
</li>
<li><p><strong>etcd</strong>: Database meets throughput and latency requirements.</p>
</li>
<li><p><strong>Network</strong>: Pod-to-pod communication achieves expected bandwidth and latency.</p>
</li>
<li><p><strong>DNS</strong>: CoreDNS handles query rate from services.</p>
</li>
<li><p><strong>Storage</strong>: Persistent Volume meets IOPS and throughput requirements.</p>
</li>
<li><p><strong>Ingress</strong>: Load balancer handles external traffic.</p>
</li>
</ul>
<h2 id="heading-the-test-flow">🔄 The Test Flow</h2>
<p>We don't test randomly. We execute in a specific order, moving from the infrastructure layer up to the application layer. If the foundation (<code>etcd</code>) is shaky, testing the Ingress is pointless.</p>
<ol>
<li><p><strong>etcd</strong> → Database performance (foundation for everything)</p>
</li>
<li><p><strong>API Server</strong> → Control plane capacity</p>
</li>
<li><p><strong>Network (CNI)</strong> → Pod networking performance</p>
</li>
<li><p><strong>CoreDNS</strong> → Service discovery latency</p>
</li>
<li><p><strong>Storage</strong> → Persistent volume I/O</p>
</li>
<li><p><strong>Ingress</strong> → External traffic handling</p>
</li>
</ol>
<h2 id="heading-components-amp-tools">🛠 Components &amp; Tools</h2>
<p>Here is the stack we use to validate each component. Click on the component name to view the source code and detailed documentation:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>Tool</td><td>Test Focus</td></tr>
</thead>
<tbody>
<tr>
<td><strong>API Server &amp; Kubelet</strong></td><td><code>kube-burner</code></td><td>Object CRUD, pod scheduling</td></tr>
<tr>
<td><strong>etcd</strong></td><td><code>etcdctl</code>, <code>benchmark</code></td><td>Read/write latency, throughput</td></tr>
<tr>
<td><strong>CoreDNS</strong></td><td><code>dnsperf</code></td><td>Query throughput, latency</td></tr>
<tr>
<td><strong>Network</strong></td><td><code>k8s-netperf</code></td><td>Pod-to-pod, service latency</td></tr>
<tr>
<td><strong>Storage</strong></td><td><code>fio</code>, <code>kbench</code></td><td>IOPS, throughput, latency</td></tr>
<tr>
<td><strong>Ingress</strong></td><td><code>wrk</code></td><td>HTTP RPS, response time</td></tr>
<tr>
<td><strong>Monitoring</strong></td><td><code>Grafana</code></td><td>Real-time metrics</td></tr>
</tbody>
</table>
</div><h2 id="heading-test-types">Test types</h2>
<p>For each component in the following posts, we will look at three types of tests:</p>
<ol>
<li><p><strong>Smoke</strong>: Validate configuration (1-5 min).</p>
</li>
<li><p><strong>Load</strong>: Measure performance at expected load (15-60 min, 70-100% capacity).</p>
</li>
<li><p><strong>Stress</strong>: Find the breaking point (10-30 min, 150-200% capacity).</p>
</li>
</ol>
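<p>The capacity percentages translate into concrete target rates once you know the expected production load. For example (the 1,000 req/s figure is an arbitrary placeholder):</p>
<pre><code class="lang-bash">expected_rps=1000   # placeholder: your expected production load

load_low=$(( expected_rps * 70 / 100 ))      # load test floor (70%)
load_high=$(( expected_rps * 100 / 100 ))    # load test ceiling (100%)
stress_high=$(( expected_rps * 200 / 100 ))  # stress test ceiling (200%)

echo "load: ${load_low}-${load_high} req/s, stress: up to ${stress_high} req/s"
</code></pre>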
<h2 id="heading-quick-start">Quick Start</h2>
<pre><code class="lang-bash"><span class="hljs-comment"># Clone repository</span>
git <span class="hljs-built_in">clone</span> https://github.com/nh4ttruong/k8s-perf-test.git
<span class="hljs-built_in">cd</span> k8s-perf-test

<span class="hljs-comment"># 1. etcd smoke test (if you have access)</span>
./etcd/scripts/etcdctl/smoke.sh

<span class="hljs-comment"># 2. API Server smoke test</span>
kube-burner init -c kube-burner/api-server/smoke.yaml

<span class="hljs-comment"># 3. Network smoke test</span>
k8s-netperf --config network/smoke_1.yaml --<span class="hljs-built_in">local</span>

<span class="hljs-comment"># 4. DNS smoke test</span>
kubectl apply -f coredns/
kubectl logs -f -l app=dnsperf -n coredns-perf-test

<span class="hljs-comment"># 5. Storage smoke test</span>
kubectl create namespace storage-perf-test
kubectl apply -f storage/smoke.yaml -n storage-perf-test
</code></pre>
<h2 id="heading-evaluation-criteria">Evaluation Criteria</h2>
<h3 id="heading-kubernetes-slisslos">Kubernetes SLIs/SLOs</h3>
<p>The Kubernetes project defines Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for a properly functioning cluster. These criteria are used for performance evaluation.</p>
<p>Reference: <a target="_blank" href="https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md">Kubernetes Scalability SLIs/SLOs</a></p>
<h4 id="heading-api-server-slos">API Server SLOs</h4>
<div class="hn-table">
<table>
<thead>
<tr>
<td>SLI</td><td>SLO</td><td>Description</td></tr>
</thead>
<tbody>
<tr>
<td>Mutating API latency (P99)</td><td>≤ 1s</td><td>Time to process CREATE, UPDATE, DELETE</td></tr>
<tr>
<td>Non-mutating API latency (P99)</td><td>≤ 1s (single object)</td><td>Time to process GET single resource</td></tr>
<tr>
<td>Non-mutating API latency (P99)</td><td>≤ 30s (list objects)</td><td>Time to process LIST resources</td></tr>
</tbody>
</table>
</div><h4 id="heading-pod-startup-slos">Pod Startup SLOs</h4>
<div class="hn-table">
<table>
<thead>
<tr>
<td>SLI</td><td>SLO</td><td>Condition</td></tr>
</thead>
<tbody>
<tr>
<td>Pod startup latency (P99)</td><td>≤ 5s</td><td>Stateless pods, image already present</td></tr>
<tr>
<td>Pod startup latency (P99)</td><td>≤ 20s</td><td>Stateless pods, image pull required</td></tr>
</tbody>
</table>
</div><h3 id="heading-target-metrics-by-component">Target Metrics by Component</h3>
<p>Specific evaluation criteria for each component:</p>
<h4 id="heading-control-plane">Control Plane</h4>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>Metric</td><td>Target</td><td>Critical</td><td>Reference</td></tr>
</thead>
<tbody>
<tr>
<td>API Server</td><td>Mutating P99</td><td>&lt; 500ms</td><td>&lt; 1s</td><td><a target="_blank" href="https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md">K8s SLOs</a></td></tr>
<tr>
<td>API Server</td><td>Non-mutating P99</td><td>&lt; 200ms</td><td>&lt; 1s</td><td><a target="_blank" href="https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md">K8s SLOs</a></td></tr>
<tr>
<td>API Server</td><td>QPS sustained</td><td>&gt; 1000</td><td>&gt; 500</td><td>Depends on cluster size</td></tr>
<tr>
<td>API Server</td><td>Error rate</td><td>&lt; 0.1%</td><td>&lt; 1%</td><td></td></tr>
<tr>
<td>etcd</td><td>Write latency P99</td><td>&lt; 25ms</td><td>&lt; 50ms</td><td><a target="_blank" href="https://etcd.io/docs/v3.5/tuning/">etcd tuning</a></td></tr>
<tr>
<td>etcd</td><td>Read latency P99</td><td>&lt; 10ms</td><td>&lt; 25ms</td><td><a target="_blank" href="https://etcd.io/docs/v3.5/tuning/">etcd tuning</a></td></tr>
<tr>
<td>etcd</td><td>fsync duration P99</td><td>&lt; 10ms</td><td>&lt; 25ms</td><td><a target="_blank" href="https://etcd.io/docs/v3.5/op-guide/hardware/">etcd hardware</a></td></tr>
</tbody>
</table>
</div><h4 id="heading-data-plane">Data Plane</h4>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>Metric</td><td>Target</td><td>Critical</td><td>Reference</td></tr>
</thead>
<tbody>
<tr>
<td>Pod-to-pod</td><td>Throughput</td><td>&gt; 5 Gbps</td><td>&gt; 1 Gbps</td><td>Depends on physical network</td></tr>
<tr>
<td>Pod-to-pod</td><td>Latency</td><td>&lt; 1ms</td><td>&lt; 5ms</td><td>Same zone</td></tr>
<tr>
<td>Pod-to-service</td><td>Latency</td><td>&lt; 2ms</td><td>&lt; 10ms</td><td>Via kube-proxy</td></tr>
<tr>
<td>CoreDNS</td><td>Query rate</td><td>&gt; 10k QPS</td><td>&gt; 5k QPS</td><td><a target="_blank" href="https://coredns.io/plugins/cache/">CoreDNS plugins</a></td></tr>
<tr>
<td>CoreDNS</td><td>P99 latency</td><td>&lt; 10ms</td><td>&lt; 50ms</td><td>With cache</td></tr>
<tr>
<td>CoreDNS</td><td>Cache hit ratio</td><td>&gt; 90%</td><td>&gt; 80%</td><td></td></tr>
</tbody>
</table>
</div><h4 id="heading-storage">Storage</h4>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Workload Type</td><td>Metric</td><td>SSD Target</td><td>NVMe Target</td><td>Reference</td></tr>
</thead>
<tbody>
<tr>
<td>Database (OLTP)</td><td>Random 4K read IOPS</td><td>&gt; 10k</td><td>&gt; 50k</td><td><a target="_blank" href="https://fio.readthedocs.io/">fio profiles</a></td></tr>
<tr>
<td>Database (OLTP)</td><td>Random 4K write IOPS</td><td>&gt; 5k</td><td>&gt; 20k</td><td></td></tr>
<tr>
<td>Database (OLTP)</td><td>P99 latency</td><td>&lt; 5ms</td><td>&lt; 1ms</td><td></td></tr>
<tr>
<td>Logging/Streaming</td><td>Sequential write MB/s</td><td>&gt; 200</td><td>&gt; 1000</td><td></td></tr>
<tr>
<td>Analytics (OLAP)</td><td>Sequential read MB/s</td><td>&gt; 300</td><td>&gt; 2000</td><td></td></tr>
</tbody>
</table>
</div><h4 id="heading-ingress">Ingress</h4>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Metric</td><td>Small cluster</td><td>Large cluster</td><td>Reference</td></tr>
</thead>
<tbody>
<tr>
<td>Requests/sec</td><td>&gt; 10k</td><td>&gt; 50k</td><td><a target="_blank" href="https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/">NGINX tuning</a></td></tr>
<tr>
<td>P99 latency</td><td>&lt; 100ms</td><td>&lt; 20ms</td><td></td></tr>
<tr>
<td>Error rate (5xx)</td><td>&lt; 0.1%</td><td>&lt; 0.01%</td><td></td></tr>
<tr>
<td>Connection rate</td><td>&gt; 5k/s</td><td>&gt; 20k/s</td><td></td></tr>
</tbody>
</table>
</div><h3 id="heading-result-evaluation">Result Evaluation</h3>
<h4 id="heading-passfail-criteria">Pass/Fail Criteria</h4>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Result</td><td>Condition</td></tr>
</thead>
<tbody>
<tr>
<td><strong>PASS</strong></td><td>All metrics meet Target</td></tr>
<tr>
<td><strong>CONDITIONAL PASS</strong></td><td>All metrics within Critical range, some not meeting Target</td></tr>
<tr>
<td><strong>FAIL</strong></td><td>Any metric exceeds Critical threshold</td></tr>
</tbody>
</table>
</div><h4 id="heading-pre-production-checklist">Pre-Production Checklist</h4>
<ul>
<li><p>API Server P99 latency &lt; 500ms under expected load</p>
</li>
<li><p>etcd write latency P99 &lt; 25ms</p>
</li>
<li><p>No etcd leader election during 24h test</p>
</li>
<li><p>Pod startup time P99 &lt; 5s (image cached)</p>
</li>
<li><p>DNS query latency P99 &lt; 10ms</p>
</li>
<li><p>Storage IOPS meets workload requirements</p>
</li>
<li><p>Network throughput meets inter-zone requirements</p>
</li>
<li><p>Ingress handles expected peak traffic</p>
</li>
</ul>
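<p>The Pass/Fail rule from the criteria table above folds naturally into a tiny helper, where each argument is one metric's status (<code>target</code>, <code>critical</code>, or <code>fail</code>) — an illustrative sketch only:</p>
<pre><code class="lang-bash"># Illustrative: fold per-metric statuses into an overall verdict
verdict() {
  local overall="PASS"
  for status in "$@"; do
    case "$status" in
      fail)     echo "FAIL"; return ;;           # any metric past Critical fails
      critical) overall="CONDITIONAL PASS" ;;    # within Critical, missed Target
    esac
  done
  echo "$overall"
}

verdict target target      # all metrics meet Target
verdict target critical    # one metric only within Critical range
verdict critical fail      # one metric exceeds Critical
</code></pre>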
<h3 id="heading-references">References</h3>
<ul>
<li><p><a target="_blank" href="https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md">Kubernetes Scalability SLIs/SLOs</a> - Official SLO definitions</p>
</li>
<li><p><a target="_blank" href="https://github.com/kubernetes/community/blob/master/sig-scalability/configs-and-limits/thresholds.md">Kubernetes Scalability Thresholds</a> - Cluster size limits</p>
</li>
<li><p><a target="_blank" href="https://github.com/kubernetes/community/tree/master/sig-scalability">SIG Scalability</a> - Scalability working group</p>
</li>
<li><p><a target="_blank" href="https://kube-burner.readthedocs.io/">kube-burner Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://etcd.io/docs/v3.5/op-guide/performance/">etcd Performance</a></p>
</li>
<li><p><a target="_blank" href="https://etcd.io/docs/v3.5/tuning/">etcd Tuning</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/cloud-bulldozer/k8s-netperf">k8s-netperf</a></p>
</li>
<li><p><a target="_blank" href="https://fio.readthedocs.io/">fio Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://coredns.io/manual/toc/">CoreDNS</a></p>
</li>
<li><p><a target="_blank" href="https://kubernetes.github.io/ingress-nginx/">NGINX Ingress Controller</a></p>
</li>
</ul>
<hr />
<p>In the next post, we start with the heart of the cluster: <code>etcd</code> component → <a target="_blank" href="https://blog.nh4ttruong.me/benchmarking-etcd-the-heartbeat-of-kubernetes">Kubernetes etcd Performance Benchmarks</a></p>
<div class="hn-embed-widget" id="nh4ttruong"></div>]]></content:encoded></item><item><title><![CDATA[Setting Up SAML Single Sign-On in Jira with Keycloak IDP]]></title><description><![CDATA[In the latest Jira products, including Jira Software and Jira Service Management, users can configure their own SAML/OAuth2 Identity Provider (IDP) without needing any plugins or extensions. This guide will help you configure SAML in your Jira applic...]]></description><link>https://blog.nh4ttruong.me/setting-up-saml-single-sign-on-in-jira-with-keycloak-idp</link><guid isPermaLink="true">https://blog.nh4ttruong.me/setting-up-saml-single-sign-on-in-jira-with-keycloak-idp</guid><category><![CDATA[JIRA]]></category><category><![CDATA[keycloak]]></category><category><![CDATA[SSO]]></category><category><![CDATA[SAML]]></category><category><![CDATA[guide]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Wed, 15 Oct 2025 04:47:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1760503469180/07fce432-76c2-4680-9937-750462cd0973.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the latest Jira products, including Jira Software and Jira Service Management, users can configure their own SAML/OAuth2 Identity Provider (IDP) without needing any plugins or extensions. This guide will help you configure SAML in your Jira application using Keycloak as the SAML IDP.</p>
<h2 id="heading-1-create-keycloak-saml-client-as-identity-provider">1. Create Keycloak SAML client as Identity Provider</h2>
<ul>
<li><p>Log in to Keycloak and select the realm in which to create the client for Jira authentication</p>
</li>
<li><p>Navigate to <strong>Clients</strong> → <strong>Create client</strong></p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760439553017/b5f09bcb-27cc-495b-859e-9c8ed3e5d52c.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Configure the client with the following required values:</p>
</li>
</ul>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Key</strong></td><td><strong>Value</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Client ID</td><td><code>https://{jira_host}</code></td></tr>
<tr>
<td>Root URL</td><td><code>https://{jira_host}/</code></td></tr>
<tr>
<td>Home URL</td><td><code>https://{jira_host}/</code></td></tr>
<tr>
<td>Valid redirect URIs</td><td><code>https://{jira_host}/*</code></td></tr>
<tr>
<td>IDP-Initiated SSO URL name</td><td><code>https://{keycloak_host}/realms/master/protocol/saml</code></td></tr>
<tr>
<td>Master SAML Processing URL</td><td><code>https://{jira_host}/plugins/servlet/samlconsumer</code></td></tr>
<tr>
<td>Name ID format</td><td>email</td></tr>
<tr>
<td>Force name ID format</td><td>On</td></tr>
<tr>
<td>Force POST binding</td><td>On</td></tr>
<tr>
<td>Include AuthnStatement</td><td>On</td></tr>
<tr>
<td>Sign documents</td><td>On</td></tr>
<tr>
<td>Sign assertions</td><td>On</td></tr>
</tbody>
</table>
</div><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760440065269/1e4dac55-d0b9-4614-901a-dcb3e43b86ca.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>Switch to the <strong>Keys</strong> tab and turn off <strong>Signing keys config</strong>:</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760440093157/97869bdb-065d-4499-b623-ea24e53ba1ec.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Switch to <strong>Client scopes</strong> tab:</p>
<ul>
<li><p>Change <strong>role_list</strong> to <strong>Optional</strong> (if your client has it). This prevents the <a target="_blank" href="https://support.atlassian.com/jira/kb/found-an-attribute-element-with-duplicated-name-error-while-users-tries-to-login-using-sso/"><strong>Attribute element with duplicated Name error</strong></a></p>
</li>
<li><p>Choose the dedicated <strong>Assigned client scope</strong> to add new mappers:</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760440507009/34507dce-bf7e-4b21-b70a-56ace2ef3674.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
</li>
<li><p>Next, configure the <strong>Group list</strong> and <strong>User Property</strong> mappers:</p>
<ul>
<li><p>Configure a new mapper:</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760440466981/f8d6e1b5-ac9f-43d4-8f32-65693bbca481.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>Add <code>memberOf</code> as <strong>Group list</strong> to allow Jira to get your member groups:</p>
</li>
<li><p>Add <code>firstName</code>, <code>lastName</code>, and <code>email</code> as <strong>User Property</strong> mappers to allow Jira to retrieve user information</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>        <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760440586319/36a33ec9-37f4-4bc8-8676-05f22fc3d98f.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>Switch to the <strong>Advanced</strong> tab and set <code>https://{jira_host}/plugins/servlet/samlconsumer</code> as both the <strong>Assertion Consumer Service POST Binding URL</strong> and the <strong>Logout Service POST Binding URL</strong>:</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760440661863/2371d585-7ea3-452b-a415-f3db525226dd.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Navigate to <strong>Realm Settings</strong>, open the <strong>Keys</strong> tab, find the <strong>RS256</strong> key, and copy its <strong>Certificate</strong> for the next step:</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760440776056/3ad58310-844b-460f-909b-17d50d398daa.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
<hr />
<h1 id="heading-2-config-jira-authentication-method">2. Configure the Jira authentication method</h1>
<p>The latest Jira products ship with built-in authentication methods for SAML/OAuth2:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760499579218/e61a4afd-11a8-4e9f-ae56-aed5507197f2.png" alt class="image--center mx-auto" /></p>
<p>Now, let's configure SAML as Single Sign-On in Jira. I will use Jira Service Management as an example:</p>
<ol>
<li><p>Access <strong>https://{jira_host}/plugins/servlet/authentication-config</strong> (replace with your Jira host), or click the top-right <strong>gear icon</strong>, choose <strong>System</strong>, navigate to <strong>Authentication methods</strong> in the left navbar, and choose <strong>Add configuration</strong>. Fill in the name and choose <strong>SAML</strong> as the <strong>Authentication method</strong></p>
</li>
<li><p>Next, fill in the <strong>Name</strong> and the required options for <strong>SAML SSO settings</strong> as in the table below:</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760499956130/2ffc4891-56f6-42ce-8c34-dd24f41eaa66.png" alt class="image--center mx-auto" /></p>
</li>
</ol>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Key</strong></td><td><strong>Value</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Single sign-on issuer</td><td>https://{keycloak_host}/realms/{realms}</td></tr>
<tr>
<td>Identity provider single sign-on URL</td><td>https://{keycloak_host}/realms/{realms}/protocol/saml</td></tr>
<tr>
<td>X.509 Certificate</td><td>Paste the certificate obtained in the previous step</td></tr>
<tr>
<td>Username mapping</td><td>${NameID}</td></tr>
<tr>
<td>Name ID Policy</td><td>Email Address</td></tr>
<tr>
<td>Sign requests</td><td>Off/Uncheck</td></tr>
</tbody>
</table>
</div><ol start="3">
<li>If you want <a target="_blank" href="https://confluence.atlassian.com/enterprise/just-in-time-user-provisioning-1005342571.html">JIT provisioning</a>, which creates and updates users automatically when they log in through SSO to Atlassian Data Center applications, configure it as below:</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760500359416/324d199a-645b-4a0d-82c1-7f403ae94264.png" alt class="image--center mx-auto" /></p>
<ol start="4">
<li><p>Configure the remaining options and click <strong>Save configuration</strong> to finish:</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760500431631/666dd1f0-fede-4f7d-9e6a-28bb6866fdb3.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Back in <strong>Authentication methods</strong>, the SAML configuration now appears:</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760500531650/fd758494-3119-4a0c-8ffd-18c32146b3bf.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Click <strong>Action</strong>, then <strong>Test sign-in</strong>:</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760500529328/683ef1b1-8d35-4aaf-ac71-4168dd33af9a.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Log in with your SAML account, and you will be signed in to Jira:</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1760500634420/c2962fc0-c11d-4379-a046-8f11ce327380.png" alt class="image--center mx-auto" /></p>
</li>
</ol>
<hr />
<p>You have successfully configured SAML authentication for Jira Service Management using Keycloak as the Identity Provider. This setup will streamline the login process for your Jira applications, making it easier for users to access them.</p>
]]></content:encoded></item><item><title><![CDATA[Guide to Access S3 Storage as Local Filesystem]]></title><description><![CDATA[S3 object storage offers scalable and cost-effective storage solutions but working with it directly can be challenging when your applications expect traditional filesystem access. This guide explores two powerful tools - rclone and s3fs - that bridge...]]></description><link>https://blog.nh4ttruong.me/guide-to-access-s3-storage-as-local-filesystem</link><guid isPermaLink="true">https://blog.nh4ttruong.me/guide-to-access-s3-storage-as-local-filesystem</guid><category><![CDATA[S3]]></category><category><![CDATA[filesystem]]></category><category><![CDATA[clone]]></category><category><![CDATA[s3fs]]></category><category><![CDATA[Devops]]></category><category><![CDATA[system]]></category><category><![CDATA[fuse]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Thu, 10 Jul 2025 17:00:50 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1752116445754/42b78cd9-c094-4d5d-ba1e-5bad0fb5c5e4.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>S3 object storage offers scalable and cost-effective storage solutions but working with it directly can be challenging when your applications expect traditional filesystem access. This guide explores two powerful tools - <code>rclone</code> and <code>s3fs</code> - that bridge this gap by mounting S3 buckets as local filesystems.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>Before getting started, ensure you have installed the required third-party software:</p>
<ul>
<li><p><code>rclone</code>: A versatile command-line tool for managing files on cloud storage</p>
<ul>
<li><p>Installation: <a target="_blank" href="https://rclone.org/install/">https://rclone.org/install/</a></p>
</li>
<li><p>Supports numerous storage providers beyond S3</p>
</li>
</ul>
</li>
<li><p><code>s3fs</code>: A FUSE-based filesystem specifically designed for S3</p>
<ul>
<li><p>Installation: <a target="_blank" href="https://github.com/s3fs-fuse/s3fs-fuse">https://github.com/s3fs-fuse/s3fs-fuse</a></p>
</li>
<li><p>Available in most package managers: <code>apt install s3fs</code> or <code>yum install s3fs-fuse</code></p>
</li>
</ul>
</li>
</ul>
<h2 id="heading-configuring-your-s3-mount-tools">Configuring Your S3 Mount Tools</h2>
<h3 id="heading-step-1-setting-up-configuration-files">Step 1: Setting Up Configuration Files</h3>
<p>Each tool requires specific configuration to connect to your S3 bucket:</p>
<h4 id="heading-rclone-configuration">rclone Configuration</h4>
<p>Create a configuration file at <code>/etc/rclone.conf</code>:</p>
<pre><code class="lang-ini"><span class="hljs-section">[s3-mount]</span>
<span class="hljs-attr">type</span> = s3
<span class="hljs-attr">provider</span> = AWS
<span class="hljs-attr">env_auth</span> = <span class="hljs-literal">false</span>
<span class="hljs-attr">access_key_id</span> = YOUR_ACCESS_KEY
<span class="hljs-attr">secret_access_key</span> = YOUR_SECRET_KEY
<span class="hljs-attr">endpoint</span> = YOUR_ENDPOINT_URL
<span class="hljs-attr">acl</span> = private
</code></pre>
<p>See <a target="_blank" href="rclone/rclone.config.example">rclone.config.example</a> for a complete template.</p>
<h4 id="heading-s3fs-configuration">s3fs Configuration</h4>
<p>Create a credentials file at <code>/etc/passwd-s3fs</code> with the following format:</p>
<pre><code>ACCESS_KEY_ID:SECRET_ACCESS_KEY
</code></pre>
<p>Set appropriate permissions:</p>
<pre><code class="lang-bash">chmod 600 /etc/passwd-s3fs
</code></pre>
<p>See <a target="_blank" href="s3fs/s3fs-passwd.example">s3fs-passwd.example</a> for reference.</p>
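<p>Before mounting, it is worth sanity-checking the credentials file: s3fs expects a single <code>ACCESS_KEY:SECRET_KEY</code> line and refuses files that other users can read. A small check along these lines (the key material below is fake, and a temp file stands in for <code>/etc/passwd-s3fs</code>):</p>
<pre><code class="lang-bash">passwd_file=$(mktemp)                      # stand-in for /etc/passwd-s3fs
echo "AKIAEXAMPLE:wJalrEXAMPLESECRET" > "$passwd_file"
chmod 600 "$passwd_file"

# One ACCESS_KEY:SECRET_KEY pair, nothing else
if grep -Eq '^[^:]+:[^:]+$' "$passwd_file"; then echo "format ok"; fi

# Must not be group/world readable
if [ "$(stat -c %a "$passwd_file")" = "600" ]; then echo "permissions ok"; fi
</code></pre>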
<h3 id="heading-step-2-creating-mount-scripts">Step 2: Creating Mount Scripts</h3>
<p>Create shell scripts to manage the mounting process with proper parameters:</p>
<h4 id="heading-rclone-mount-script">rclone Mount Script</h4>
<p>Create <code>/usr/local/bin/rclone-mount.sh</code>:</p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>

<span class="hljs-comment"># Configuration variables</span>
bucket=<span class="hljs-string">"your-bucket-name"</span>
url=<span class="hljs-string">"https://your-endpoint.com"</span>
mount_point=<span class="hljs-string">"/mnt/s3-bucket"</span>
config_file=<span class="hljs-string">"/etc/rclone.conf"</span>
log_file=<span class="hljs-string">"/var/log/rclone-mount.log"</span>
log_level=<span class="hljs-string">"DEBUG"</span>
provider=<span class="hljs-string">"s3-mount"</span>  <span class="hljs-comment"># Remote name as defined in /etc/rclone.conf, e.g. [s3-mount]</span>

<span class="hljs-comment"># Create mount point if it doesn't exist</span>
mkdir -p <span class="hljs-string">"<span class="hljs-variable">${mount_point}</span>"</span>

<span class="hljs-comment"># Mount the bucket</span>
rclone mount \
  --config <span class="hljs-string">"<span class="hljs-variable">${config_file}</span>"</span> \
  --log-file <span class="hljs-string">"<span class="hljs-variable">${log_file}</span>"</span> \
  --log-level <span class="hljs-string">"<span class="hljs-variable">${log_level}</span>"</span> \
  --allow-other \
  --file-perms 0644 \
  --dir-perms 0755 \
  --vfs-cache-mode full \
  --vfs-cache-max-size 1G \
  --vfs-read-chunk-size 10M \
  --daemon \
  <span class="hljs-string">"<span class="hljs-variable">${provider}</span>:<span class="hljs-variable">${bucket}</span>"</span> <span class="hljs-string">"<span class="hljs-variable">${mount_point}</span>"</span>

<span class="hljs-built_in">exit</span> 0
</code></pre>
<p>Make the script executable:</p>
<pre><code class="lang-bash">chmod +x /usr/<span class="hljs-built_in">local</span>/bin/rclone-mount.sh
</code></pre>
<h4 id="heading-s3fs-mount-script">s3fs Mount Script</h4>
<p>Create <code>/usr/local/bin/s3fs-mount.sh</code>:</p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>

<span class="hljs-comment"># Configuration variables</span>
bucket=<span class="hljs-string">"your-bucket-name"</span>
url=<span class="hljs-string">"https://your-endpoint.com"</span>
mount_point=<span class="hljs-string">"/mnt/s3-bucket"</span>
passwd_file=<span class="hljs-string">"/etc/passwd-s3fs"</span>
log_file=<span class="hljs-string">"/var/log/s3fs-mount.log"</span>
log_level=<span class="hljs-string">"debug"</span>
region=<span class="hljs-string">"HCM03"</span>  <span class="hljs-comment"># Your specific region</span>

<span class="hljs-comment"># Create mount point if it doesn't exist</span>
mkdir -p <span class="hljs-string">"<span class="hljs-variable">${mount_point}</span>"</span>

<span class="hljs-comment"># Mount the bucket</span>
s3fs <span class="hljs-string">"<span class="hljs-variable">${bucket}</span>"</span> <span class="hljs-string">"<span class="hljs-variable">${mount_point}</span>"</span> \
  -o passwd_file=<span class="hljs-string">"<span class="hljs-variable">${passwd_file}</span>"</span> \
  -o url=<span class="hljs-string">"<span class="hljs-variable">${url}</span>"</span> \
  -o use_path_request_style \
  -o allow_other \
  -o <span class="hljs-built_in">umask</span>=0022 \
  -o dbglevel=<span class="hljs-string">"<span class="hljs-variable">${log_level}</span>"</span> \
  -o curldbg \
  -o endpoint=<span class="hljs-string">"<span class="hljs-variable">${region}</span>"</span> \
  &gt; <span class="hljs-string">"<span class="hljs-variable">${log_file}</span>"</span> 2&gt;&amp;1

<span class="hljs-built_in">exit</span> 0
</code></pre>
<p>Make the script executable:</p>
<pre><code class="lang-bash">chmod +x /usr/<span class="hljs-built_in">local</span>/bin/s3fs-mount.sh
</code></pre>
<h3 id="heading-step-3-creating-systemd-service-units">Step 3: Creating Systemd Service Units</h3>
<p>To ensure your S3 bucket mounts automatically at boot and is properly managed by systemd:</p>
<h4 id="heading-rclone-systemd-service">rclone Systemd Service</h4>
<p>Create <code>/lib/systemd/system/rclone-mount.service</code>:</p>
<pre><code class="lang-ini"><span class="hljs-section">[Unit]</span>
<span class="hljs-attr">Description</span>=Mount S3 Bucket using rclone
<span class="hljs-attr">After</span>=network-<span class="hljs-literal">on</span>line.target
<span class="hljs-attr">Wants</span>=network-<span class="hljs-literal">on</span>line.target

<span class="hljs-section">[Service]</span>
<span class="hljs-attr">Type</span>=forking
<span class="hljs-attr">ExecStart</span>=/usr/local/bin/rclone-mount.sh
<span class="hljs-attr">Restart</span>=<span class="hljs-literal">on</span>-failure
<span class="hljs-attr">RestartSec</span>=<span class="hljs-number">10</span>

<span class="hljs-section">[Install]</span>
<span class="hljs-attr">WantedBy</span>=multi-user.target
</code></pre>
<h4 id="heading-s3fs-systemd-service">s3fs Systemd Service</h4>
<p>Create <code>/lib/systemd/system/s3fs-mount.service</code>:</p>
<pre><code class="lang-ini"><span class="hljs-section">[Unit]</span>
<span class="hljs-attr">Description</span>=Mount S3 Bucket using s3fs
<span class="hljs-attr">After</span>=network-<span class="hljs-literal">on</span>line.target
<span class="hljs-attr">Wants</span>=network-<span class="hljs-literal">on</span>line.target

<span class="hljs-section">[Service]</span>
<span class="hljs-attr">Type</span>=<span class="hljs-literal">on</span>eshot
<span class="hljs-attr">ExecStart</span>=/usr/local/bin/s3fs-mount.sh
<span class="hljs-attr">RemainAfterExit</span>=<span class="hljs-literal">yes</span>
<span class="hljs-attr">ExecStop</span>=/bin/fusermount -u /mnt/s3-bucket

<span class="hljs-section">[Install]</span>
<span class="hljs-attr">WantedBy</span>=multi-user.target
</code></pre>
<h3 id="heading-step-4-enable-and-start-the-service">Step 4: Enable and Start the Service</h3>
<p>Choose which tool you prefer (rclone or s3fs) and enable its service:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># For rclone</span>
sudo systemctl daemon-reload
sudo systemctl <span class="hljs-built_in">enable</span> rclone-mount.service --now
sudo systemctl status rclone-mount.service

<span class="hljs-comment"># For s3fs</span>
sudo systemctl daemon-reload
sudo systemctl <span class="hljs-built_in">enable</span> s3fs-mount.service --now
sudo systemctl status s3fs-mount.service
</code></pre>
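<p>Once the service is active, it is worth confirming that the mount point is really backed by a FUSE filesystem (rclone mounts typically show up as <code>fuse.rclone</code>, s3fs as <code>fuse.s3fs</code>). A minimal sketch — the helper name is ours, not part of either tool — that inspects the <code>/proc/mounts</code> table:</p>

```python
def mount_fstype(mounts_text: str, mount_point: str):
    """Return the filesystem type mounted at mount_point, or None if absent.

    mounts_text is the content of /proc/mounts
    (whitespace-separated fields: device, mountpoint, fstype, options, ...).
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            return fields[2]
    return None

# On a real host you would read the live table:
#   fstype = mount_fstype(open("/proc/mounts").read(), "/mnt/s3-bucket")
sample = "s3fs /mnt/s3-bucket fuse.s3fs rw,nosuid,nodev,allow_other 0 0"
print(mount_fstype(sample, "/mnt/s3-bucket"))  # fuse.s3fs
```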
<h2 id="heading-performance-considerations">Performance Considerations</h2>
<ul>
<li><p><strong>rclone</strong>:</p>
<ul>
<li><p>Offers better performance for large files</p>
</li>
<li><p>More feature-rich with built-in caching</p>
</li>
<li><p>Uses more memory but provides better throughput</p>
</li>
<li><p>Excellent for backup/sync operations</p>
</li>
</ul>
</li>
<li><p><strong>s3fs</strong>:</p>
<ul>
<li><p>Simpler, lighter resource footprint</p>
</li>
<li><p>Better for direct file access patterns</p>
</li>
<li><p>More POSIX-compliant but slower for metadata operations</p>
</li>
<li><p>Good for applications that need basic file access</p>
</li>
</ul>
</li>
</ul>
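<p>These trade-offs depend heavily on your workload and network, so measure on your own mount rather than trusting rules of thumb. A crude write-throughput probe (a sketch; point <code>path</code> at a file inside the mounted bucket, e.g. <code>/mnt/s3-bucket/bench.bin</code>):</p>

```python
import os
import time

def write_throughput_mb_s(path: str, size_mb: int = 16) -> float:
    """Write size_mb of zeros to path and return apparent throughput in MB/s."""
    chunk = b"\0" * (1024 * 1024)
    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # force data out of the page cache
    elapsed = time.monotonic() - start
    os.remove(path)  # clean up the benchmark file
    return size_mb / elapsed

# Example run against local disk; substitute a path on the mount to test it:
print(round(write_throughput_mb_s("/tmp/bench.bin", 4), 1), "MB/s")
```

<p>Note that with rclone's <code>--vfs-cache-mode full</code>, writes land in the local cache first and upload asynchronously, so this probe measures perceived rather than end-to-end throughput.</p>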
<h2 id="heading-troubleshooting-common-issues">Troubleshooting Common Issues</h2>
<h3 id="heading-mount-failure">Mount Failure</h3>
<p>If your mount fails to initialize:</p>
<ol>
<li><p><strong>Check credentials</strong>: Verify your access keys are correct in the configuration files</p>
<pre><code class="lang-bash"> cat /var/<span class="hljs-built_in">log</span>/rclone-mount.log | grep <span class="hljs-string">"auth"</span>
 <span class="hljs-comment"># or</span>
 cat /var/<span class="hljs-built_in">log</span>/s3fs-mount.log | grep <span class="hljs-string">"auth"</span>
</code></pre>
</li>
<li><p><strong>Test connectivity</strong>: Confirm network access to your S3 endpoint</p>
<pre><code class="lang-bash"> curl -I https://your-endpoint.com
</code></pre>
</li>
<li><p><strong>Permissions</strong>: Ensure your mount scripts are executable</p>
<pre><code class="lang-bash"> ls -la /usr/<span class="hljs-built_in">local</span>/bin/rclone-mount.sh
 ls -la /usr/<span class="hljs-built_in">local</span>/bin/s3fs-mount.sh
</code></pre>
</li>
<li><p><strong>Bucket existence</strong>: Verify the bucket name is spelled correctly and exists</p>
<pre><code class="lang-bash"> <span class="hljs-comment"># For AWS S3</span>
 aws s3 ls s3://your-bucket-name

 <span class="hljs-comment"># For other S3 providers, use their CLI tools</span>
</code></pre>
</li>
</ol>
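<p>The <code>grep</code> checks above can be generalized. A small helper (hypothetical, with an illustrative keyword list — extend it with your provider's error strings) that pulls likely auth-related lines out of either log:</p>

```python
def auth_error_lines(log_text: str):
    """Return log lines that look related to authentication failures.

    The keyword list is illustrative, not exhaustive.
    """
    keywords = ("auth", "403", "accessdenied", "signaturedoesnotmatch")
    return [line for line in log_text.splitlines()
            if any(k in line.lower() for k in keywords)]

sample_log = (
    "2025/01/01 10:00:00 DEBUG : starting mount\n"
    "2025/01/01 10:00:01 ERROR : HTTP 403: AccessDenied\n"
    "2025/01/01 10:00:02 DEBUG : retrying\n"
)
print(auth_error_lines(sample_log))  # ['2025/01/01 10:00:01 ERROR : HTTP 403: AccessDenied']
```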
<h3 id="heading-performance-issues">Performance Issues</h3>
<p>If you experience slow access:</p>
<ol>
<li><p><strong>Increase cache size</strong>: For rclone, modify the <code>--vfs-cache-max-size</code> parameter</p>
</li>
<li><p><strong>Adjust chunk size</strong>: Modify <code>--vfs-read-chunk-size</code> for your workload</p>
</li>
<li><p><strong>Check network latency</strong>: High latency to your S3 endpoint will impact performance</p>
</li>
<li><p><strong>Consider local caching</strong>: For frequently accessed files</p>
</li>
</ol>
<h2 id="heading-references">References</h2>
<ul>
<li><p><a target="_blank" href="https://rclone.org/docs/">rclone Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://rclone.org/s3/">rclone S3 Configuration</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/s3fs-fuse/s3fs-fuse">s3fs Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://www.kernel.org/doc/html/latest/filesystems/fuse.html">FUSE Filesystem Overview</a></p>
</li>
<li><p><a target="_blank" href="https://docs.aws.amazon.com/s3/">AWS S3 Documentation</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Root Me Solutions & Write-ups]]></title><description><![CDATA[This repository offers write-ups and solutions for Root Me CTF challenges, aimed at educational and ethical hacking practice. It provides step-by-step guides for various web security, application security, and digital forensics challenges.

TL;DR


⚠...]]></description><link>https://blog.nh4ttruong.me/root-me-solutions-and-write-ups</link><guid isPermaLink="true">https://blog.nh4ttruong.me/root-me-solutions-and-write-ups</guid><category><![CDATA[Write Up]]></category><category><![CDATA[root-me]]></category><category><![CDATA[Solution]]></category><category><![CDATA[CTF]]></category><category><![CDATA[CTF Writeup]]></category><category><![CDATA[rootme]]></category><category><![CDATA[Writeup]]></category><category><![CDATA[#capturetheflag]]></category><category><![CDATA[education]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Wed, 09 Jul 2025 17:00:22 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1752045550879/1968d1e5-bd1c-4b73-9855-5d0d44e765fb.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This repository offers write-ups and solutions for <a target="_blank" href="https://nh4ttruong.github.io/emtoor">Root Me CTF challenges</a>, aimed at educational and ethical hacking practice. It provides step-by-step guides for various web security, application security, and digital forensics challenges.</p>
<blockquote>
<p><a target="_blank" href="https://nh4ttruong.github.io/emtoor/"><strong>TL;DR</strong></a></p>
</blockquote>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">Disclaimer: Please read the <a target="_self" href="https://nh4ttruong.github.io/emtoor/disclaimer.html">Disclaimer</a> before diving into the challenges.</div>
</div>

<hr />
<h2 id="heading-categories">Categories</h2>
<p>This repository is organized into several categories, each focusing on different aspects of cybersecurity challenges:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Category</strong></td><td><strong>Description</strong></td><td><strong>Quick Access</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Cross-Site Scripting</td><td>XSS attack techniques &amp; solutions</td><td><a target="_blank" href="https://nh4ttruong.github.io/emtoor/Cross-Site-Scripting/index.html">XSS challenges</a></td></tr>
<tr>
<td>CSRF</td><td>Cross-Site Request Forgery</td><td><a target="_blank" href="https://nh4ttruong.github.io/emtoor/CSRF/index.html">CSRF challenges</a></td></tr>
<tr>
<td>PHP Vulnerabilities</td><td>File inclusion, upload, etc.</td><td><a target="_blank" href="https://nh4ttruong.github.io/emtoor/PHP/index.html">PHP challenges</a></td></tr>
<tr>
<td>SQL Injection</td><td>SQLi types and bypasses</td><td><a target="_blank" href="https://nh4ttruong.github.io/emtoor/SQL-Injection/index.html">SQL Injection challenges</a></td></tr>
<tr>
<td>Steganography</td><td>Hidden data in files/images</td><td><a target="_blank" href="https://nh4ttruong.github.io/emtoor/Steganography/index.html">Steganography challenges</a></td></tr>
<tr>
<td>Forensics</td><td>Digital forensics challenges</td><td><a target="_blank" href="https://nh4ttruong.github.io/emtoor/Forensics/index.html">Forensics challenges</a></td></tr>
</tbody>
</table>
</div><div data-node-type="callout">
<div data-node-type="callout-emoji">🗒</div>
<div data-node-type="callout-text">Note: These are my solution approaches; they may no longer work by the time you attempt a challenge.</div>
</div>

<hr />
<h2 id="heading-contributing">Contributing</h2>
<p>Contributions, corrections, and new write-ups are welcome! Please open an issue or pull request.</p>
]]></content:encoded></item><item><title><![CDATA[Handling Zalo OA API with Python wrapper]]></title><description><![CDATA[This script is a straightforward API wrapper for the Zalo Official Account (OA), offering a user-friendly interface to efficiently manage users, access detailed user information, and facilitate message exchanges. It's perfect f...]]></description><link>https://blog.nh4ttruong.me/handling-zalo-oa-api-with-python-wrapper</link><guid isPermaLink="true">https://blog.nh4ttruong.me/handling-zalo-oa-api-with-python-wrapper</guid><category><![CDATA[zalo-oa]]></category><category><![CDATA[zalo]]></category><category><![CDATA[Python]]></category><category><![CDATA[APIs]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Wed, 09 Jul 2025 02:53:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1752029514248/e90aefad-dec0-4e27-a05f-c82e16c94df1.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This script is a straightforward API wrapper for the <a target="_blank" href="https://oa.zalo.me/"><strong>Zalo Official Account (OA)</strong></a>, offering a user-friendly interface to efficiently manage users, access detailed user information, and facilitate message exchanges. It's perfect for automating your <strong>Zalo OA</strong> interactions and enhancing customer engagement.</p>
<h2 id="heading-key-features">Key Features</h2>
<ul>
<li><p><strong>✅ User Management</strong> — Retrieve the full list of users who’ve interacted with your OA.</p>
</li>
<li><p><strong>📇 User Info</strong> — Fetch comprehensive details about individual users.</p>
</li>
<li><p><strong>✉️ Messaging</strong> — Support for sending both text and image messages.</p>
</li>
<li><p><strong>📬 Message Retrieval</strong> — Pull inbound messages from your OA.</p>
</li>
</ul>
<h2 id="heading-setup-amp-installation">Setup &amp; Installation</h2>
<ol>
<li>Clone the repository:</li>
</ol>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/nh4ttruong/zalo-oa-api-wrapper.git
<span class="hljs-built_in">cd</span> zalo-oa-api-wrapper
</code></pre>
<ol start="2">
<li>Install Dependencies: Ensure you have Python 3.7+ and install required packages</li>
</ol>
<pre><code class="lang-bash">pip install -r requirements.txt
</code></pre>
<h2 id="heading-configuration">Configuration</h2>
<ol>
<li>Obtain your <strong>Zalo OA Access Token</strong></li>
</ol>
<ul>
<li><p>Head to the <a target="_blank" href="https://developers.zalo.me/tools/explorer">Zalo API Explorer</a></p>
</li>
<li><p>Choose <strong>OA Access Token</strong> and click <strong>Get Access Token</strong></p>
</li>
<li><p>Tick to accept the Terms of Use and copy the generated <strong>Access Token</strong></p>
</li>
</ul>
<ol start="2">
<li>Create your <code>.env</code> file</li>
</ol>
<pre><code class="lang-bash">ZALO_OA_ACCESS_TOKEN=your_token_here
</code></pre>
<ol start="3">
<li><p>Configure messaging behavior. In <code>.env</code>, set flags:</p>
<ul>
<li><p><code>SEND_MESSAGE_TEXT</code>, <code>SEND_MESSAGE_WITH_IMAGE</code></p>
</li>
<li><p><code>SEND_ALL_USERS</code>, <code>SEND_USER_LIST</code></p>
</li>
<li><p><code>IMAGE_FILE_PATH</code>, <code>MESSAGE_CONTENT</code>, or <code>MESSAGE_FILE_PATH</code></p>
</li>
</ul>
</li>
</ol>
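<p>These flags are compared against the literal string <code>'True'</code> in the example scripts, so values in <code>.env</code> are case-sensitive. A small sketch (the helper name is ours, not part of the wrapper) of reading them consistently:</p>

```python
import os

def env_flag(name: str, default: str = "False") -> bool:
    """Read a .env-style flag the same way the wrapper does (== 'True')."""
    return os.getenv(name, default) == "True"

os.environ["SEND_MESSAGE_TEXT"] = "True"
os.environ["SEND_ALL_USERS"] = "false"   # wrong case -> treated as off
print(env_flag("SEND_MESSAGE_TEXT"), env_flag("SEND_ALL_USERS"))  # True False
```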
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">See the <a target="_self" href="https://developers.zalo.me/docs/official-account/bat-dau/kham-pha">API Reference</a> and Examples for details.</div>
</div>

<h2 id="heading-examples">Examples</h2>
<h3 id="heading-sending-a-text-message-to-a-specific-user">Sending a text message to a specific user</h3>
<p>To send a message to a specific user (e.g., <code>user_id = "7186086631826132217"</code>):</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> dependencies.messages <span class="hljs-keyword">import</span> *

<span class="hljs-comment"># User ID of the recipient</span>
user_id = <span class="hljs-string">"7186086631826132217"</span>
message_text = <span class="hljs-string">"Hello, this is a test message"</span>

<span class="hljs-comment"># Send a text message to the specified user</span>
send_text_message(ZALO_OA_ACCESS_TOKEN, user_id, message_text)
</code></pre>
<h3 id="heading-sending-an-image-message">Sending an image message</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> dependencies.messages <span class="hljs-keyword">import</span> *
<span class="hljs-keyword">from</span> dependencies.upload <span class="hljs-keyword">import</span> *

<span class="hljs-comment"># Specify the user ID(s) and message content</span>
user_id = <span class="hljs-string">"7186086631826132217"</span>
users = [{<span class="hljs-string">"user_id"</span>: user} <span class="hljs-keyword">for</span> user <span class="hljs-keyword">in</span> user_id.split(<span class="hljs-string">","</span>)]
message_text = <span class="hljs-string">"Hello, here’s an image for you!"</span>

<span class="hljs-comment"># Upload the image and retrieve the attachment ID</span>
attachment_id = upload_media(ZALO_OA_ACCESS_TOKEN, file_path=IMAGE_FILE_PATH, type=<span class="hljs-string">"image"</span>)

<span class="hljs-comment"># Send the message along with the image</span>
send_message_to_users(ZALO_OA_ACCESS_TOKEN, users, message_text=message_text, image_file=IMAGE_FILE_PATH)
</code></pre>
<h3 id="heading-broadcast-messages-to-all-users">Broadcast messages to all users</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> dependencies.messages <span class="hljs-keyword">import</span> *
<span class="hljs-keyword">from</span> dependencies.upload <span class="hljs-keyword">import</span> *
<span class="hljs-keyword">from</span> dependencies.users <span class="hljs-keyword">import</span> *

<span class="hljs-comment"># Check if all users should receive the message</span>
<span class="hljs-keyword">if</span> SEND_ALL_USERS == <span class="hljs-string">'True'</span>:
    users = get_all_users(ZALO_OA_ACCESS_TOKEN)
<span class="hljs-keyword">else</span>:
    <span class="hljs-comment"># Use a comma-separated list of user IDs from SEND_USER_LIST in .env file</span>
    users = [{<span class="hljs-string">"user_id"</span>: user.strip()} <span class="hljs-keyword">for</span> user <span class="hljs-keyword">in</span> SEND_USER_LIST.split(<span class="hljs-string">","</span>)]

<span class="hljs-comment"># Send text or image message based on configuration</span>
<span class="hljs-keyword">if</span> SEND_MESSAGE_TEXT == <span class="hljs-string">'True'</span>:
    <span class="hljs-keyword">if</span> SEND_MESSAGE_WITH_IMAGE == <span class="hljs-string">'True'</span>:
        send_message_to_users(ZALO_OA_ACCESS_TOKEN, users, message_text=MESSAGE_CONTENT, image_file=IMAGE_FILE_PATH)
    <span class="hljs-keyword">else</span>:
        send_message_to_users(ZALO_OA_ACCESS_TOKEN, users, message_text=MESSAGE_CONTENT)
<span class="hljs-keyword">elif</span> SEND_MESSAGE_WITH_IMAGE == <span class="hljs-string">'True'</span>:
    send_message_to_users(ZALO_OA_ACCESS_TOKEN, users, message_text=MESSAGE_CONTENT, image_file=IMAGE_FILE_PATH)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">"No message to send"</span>)
</code></pre>
<hr />
<h2 id="heading-summary">Summary</h2>
<p>This lightweight Python wrapper simplifies interaction with the Zalo OA API—handling user retrieval, message sending (text/image), and message intake with ease. Ideal for building engagement pipelines, bots, or customer support tools.</p>
<div class="hn-embed-widget" id="nh4ttruong"></div>]]></content:encoded></item><item><title><![CDATA[How To Clean ETCD Benchmark Efficiently?]]></title><description><![CDATA[A lightweight CLI tool to scan, detect, and optionally remove benchmark or non-UTF8 keys from your etcd key-value store.
This tool was created as an extension because the official etcd/tools/benchmark does not include a built-in clean command or the ...]]></description><link>https://blog.nh4ttruong.me/how-to-clean-etcd-benchmark-efficiently</link><guid isPermaLink="true">https://blog.nh4ttruong.me/how-to-clean-etcd-benchmark-efficiently</guid><category><![CDATA[etcd]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Benchmark]]></category><category><![CDATA[golang]]></category><category><![CDATA[#howtos]]></category><category><![CDATA[k8s]]></category><category><![CDATA[Performance Testing]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Wed, 02 Jul 2025 06:32:06 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1751437906764/d5d94269-7c5d-4fea-b470-9fc5913f990d.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A lightweight CLI tool to <strong>scan</strong>, <strong>detect</strong>, and optionally <strong>remove benchmark or non-UTF8 keys</strong> from your etcd key-value store.</p>
<p>This tool was created as an extension because the official <a target="_blank" href="https://github.com/etcd-io/etcd/blob/main/tools/benchmark/README.md">etcd/tools/benchmark</a> does <strong>not include a</strong> built-in <code>clean</code> command or the ability to directly manage invalid or benchmark keys. By default, the <code>etcd benchmark</code> tool creates a large binary keyspace for testing etcd. Therefore, <a target="_blank" href="https://github.com/nh4ttruong/etcd-benchmark-cleaner"><code>etcd-benchmark-cleaner</code></a> helps retrieve and remove unnecessary binary keys in <code>etcd</code>, reducing its size.</p>
<blockquote>
<p>Be cautious and do not run the tool if you are not sure what it does.</p>
</blockquote>
<hr />
<h2 id="heading-features">Features</h2>
<ul>
<li><p>Scan keys under a specified <strong>hex-encoded prefix</strong></p>
</li>
<li><p>Detect benchmark or invalid UTF-8 keys</p>
</li>
<li><p>Supports <strong>dry-run mode</strong> for safe validation</p>
</li>
<li><p>Secure connection via TLS</p>
</li>
<li><p>Clear, color-coded terminal output for easy inspection</p>
</li>
</ul>
<hr />
<h2 id="heading-installation">🔧 Installation</h2>
<p>Install package:</p>
<pre><code class="lang-bash">go install github.com/nh4ttruong/etcd-benchmark-cleaner@latest
<span class="hljs-built_in">export</span> PATH=<span class="hljs-variable">${PATH}</span>:$(go env GOPATH)/bin &amp;&amp; <span class="hljs-built_in">which</span> etcd-benchmark-cleaner
</code></pre>
<p>Or build manually from source:</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/nh4ttruong/etcd-benchmark-cleaner.git
<span class="hljs-built_in">cd</span> etcd-benchmark-cleaner
go build -o etcd-benchmark-cleaner
<span class="hljs-comment"># Or run directly with `go run clean.go [flags]`</span>
</code></pre>
<hr />
<h2 id="heading-usage">Usage</h2>
<pre><code class="lang-bash">./etcd-benchmark-cleaner [flags]
    Usage of etcd-benchmark-cleaner:
        --cacert string
                Path to trusted CA file (default <span class="hljs-variable">$ETCDCTL_CACERT</span>)
        --cert string
                Path to client certificate (default <span class="hljs-variable">$ETCDCTL_CERT</span>)
        --debug
                Print UTF-8 keys and values
        --dry
                Dry-run mode (simulates deletion)
        --endpoints string
                Comma-separated list of etcd endpoints (default <span class="hljs-variable">$ETCDCTL_ENDPOINTS</span>)
        --key string
                Path to client private key (default <span class="hljs-variable">$ETCDCTL_KEY</span>)
        --prefix string
                Hexadecimal prefix of keys to scan
        --remove
                Delete binary keys
        --timeout duration
                Request timeout (default 5s)
</code></pre>
<p>Flags to run <code>etcd-benchmark-cleaner</code>:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Flag</strong></td><td><strong>Default</strong></td><td><strong>Description</strong></td></tr>
</thead>
<tbody>
<tr>
<td><code>--endpoints</code></td><td><a target="_blank" href="http://localhost:2379">localhost:2379</a></td><td>Comma-separated list of etcd endpoints <strong>(required)</strong></td></tr>
<tr>
<td><code>--prefix</code></td><td>"" (all)</td><td><strong>Hex-encoded</strong> prefix of keys to scan (e.g., <code>02</code>, <code>74657374</code>)</td></tr>
<tr>
<td><code>--cacert</code></td><td><code>$ETCDCTL_CACERT</code></td><td>Path to CA file (or set <code>$ETCDCTL_CACERT</code>)</td></tr>
<tr>
<td><code>--cert</code></td><td><code>$ETCDCTL_CERT</code></td><td>Path to client cert (or set <code>$ETCDCTL_CERT</code>)</td></tr>
<tr>
<td><code>--key</code></td><td><code>$ETCDCTL_KEY</code></td><td>Path to client key (or set <code>$ETCDCTL_KEY</code>)</td></tr>
<tr>
<td><code>--debug</code></td><td><em>N/A</em></td><td>Print raw UTF-8 keys and values</td></tr>
<tr>
<td><code>--dry</code></td><td><em>N/A</em></td><td>Simulate deletion without making changes</td></tr>
<tr>
<td><code>--remove</code></td><td><em>N/A</em></td><td>Remove binary benchmark keys (caution)</td></tr>
<tr>
<td><code>--timeout</code></td><td>5s</td><td>Request timeout (default: <code>5s</code>)</td></tr>
</tbody>
</table>
</div><h2 id="heading-examples">Examples</h2>
<h3 id="heading-scan-all-keys-for-benchmark-entries">Scan all keys for benchmark entries</h3>
<pre><code class="lang-bash">./etcd-benchmark-cleaner --endpoints=https://127.0.0.1:2379
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751436258788/e40ea6ec-9166-44d0-a841-107d04fbb9dd.png" alt /></p>
<h3 id="heading-scan-keys-with-a-benchmark-prefix-0x00">Scan keys with a benchmark prefix (<code>0x00</code>)</h3>
<pre><code class="lang-bash">./etcd-benchmark-cleaner --endpoints=https://127.0.0.1:2379 --prefix 00
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751436280016/7c404448-8290-4ce0-9c3e-82b48f1bd2f7.png" alt /></p>
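<p>Note that <code>--prefix</code> expects the hex encoding of the raw key bytes, so <code>74657374</code> matches keys beginning with <code>test</code>. A quick sketch for deriving it from an ASCII prefix:</p>

```python
def to_hex_prefix(ascii_prefix: str) -> str:
    """Hex-encode an ASCII key prefix for the --prefix flag."""
    return ascii_prefix.encode("utf-8").hex()

print(to_hex_prefix("test"))       # 74657374
print(to_hex_prefix("/registry"))  # 2f7265676973747279
```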
<h3 id="heading-dry-run-deletion-of-benchmark-keys-no-changes-made">Dry-run deletion of benchmark keys (no changes made)</h3>
<pre><code class="lang-bash">./etcd-benchmark-cleaner --endpoints=https://127.0.0.1:2379 --prefix 02 --dry
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751436287721/2bcc9937-68af-45cf-a46b-9bba8c44a68a.png" alt /></p>
<h3 id="heading-remove-binary-benchmark-keys-irreversible">Remove binary benchmark keys (irreversible)</h3>
<pre><code class="lang-bash">./etcd-benchmark-cleaner --endpoints=https://127.0.0.1:2379 --remove
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751436298308/61b0d144-eaaf-4537-8868-16451afcf38c.png" alt /></p>
<hr />
<h2 id="heading-tls-support">🔐 TLS Support</h2>
<p>If your etcd cluster uses TLS, provide the following flags:</p>
<pre><code class="lang-bash">--cacert path/to/ca.crt
--cert   path/to/client.crt
--key    path/to/client.key
</code></pre>
<p>Or set them as environment variables:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">export</span> ETCDCTL_CACERT=...
<span class="hljs-built_in">export</span> ETCDCTL_CERT=...
<span class="hljs-built_in">export</span> ETCDCTL_KEY=...
</code></pre>
<hr />
<h2 id="heading-best-practice">Best Practice</h2>
<p>After running <code>etcd-benchmark-cleaner</code>, you should obtain the safe revision of the etcd state, then compact at that revision and defrag each etcd node.</p>
<blockquote>
<p>Please check the further information about <code>compact</code> and <code>defrag</code> at <a target="_blank" href="https://etcd.io/docs/v3.6/op-guide/maintenance/">ETCD | Maintenance guide</a></p>
</blockquote>
<pre><code class="lang-bash"><span class="hljs-comment"># Get safe revision</span>
etcdctl endpoint status --write-out=json | jq <span class="hljs-string">'[.[] | .Status.header.revision]'</span>

<span class="hljs-comment"># Compact using the previous safe revision. Perform this on one of the three etcd nodes.</span>
etcdctl --endpoints=<span class="hljs-string">"<span class="hljs-variable">$ETCD_NODE_1</span>"</span> compact &lt;safe_revision_id&gt;

<span class="hljs-comment"># Defrag nodes in the following order</span>
etcdctl --endpoints=<span class="hljs-string">"<span class="hljs-variable">$ETCD_NODE_1</span>"</span> defrag &amp;&amp; sleep 10
etcdctl --endpoints=<span class="hljs-string">"<span class="hljs-variable">$ETCD_NODE_2</span>"</span> defrag &amp;&amp; sleep 10
etcdctl --endpoints=<span class="hljs-string">"<span class="hljs-variable">$ETCD_NODE_3</span>"</span> defrag

<span class="hljs-comment"># Watch change in etcd DB size</span>
etcdctl endpoint status --write-out=json
</code></pre>
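<p>The <code>jq</code> query above prints one revision per endpoint. A conservative pick for <code>&lt;safe_revision_id&gt;</code> is the smallest of those values, so no member is compacted past data it has not yet applied. A sketch that parses the same <code>etcdctl ... --write-out=json</code> shape:</p>

```python
import json

def safe_revision(status_json: str) -> int:
    """Return the smallest header revision across all reported endpoints."""
    return min(s["Status"]["header"]["revision"] for s in json.loads(status_json))

# Sample output shape of `etcdctl endpoint status --write-out=json` (trimmed):
sample = json.dumps([
    {"Endpoint": "https://10.0.0.1:2379", "Status": {"header": {"revision": 120045}}},
    {"Endpoint": "https://10.0.0.2:2379", "Status": {"header": {"revision": 120043}}},
    {"Endpoint": "https://10.0.0.3:2379", "Status": {"header": {"revision": 120044}}},
])
print(safe_revision(sample))  # 120043
```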
<hr />
<h2 id="heading-note">Note</h2>
<ul>
<li><p>Always run with <code>--dry</code> first before using <code>--remove</code></p>
</li>
<li><p>Ensure your prefix is correct and <strong>hex-encoded</strong></p>
</li>
<li><p>Backup etcd or test against a dev cluster before destructive operations</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[A Guide to Managing Kubernetes Secrets with AWS Secrets Manager and External Secrets Operator]]></title><description><![CDATA[Managing secrets in Kubernetes is notoriously tricky. Hardcoding them? Yikes. Storing them in plaintext? Dangerous. In this post, I’ll show you how to securely integrate AWS Secrets Manager into your K8s workflow using External Secrets Operator (ESO)...]]></description><link>https://blog.nh4ttruong.me/a-guide-to-managing-kubernetes-secrets-with-aws-secrets-manager-and-external-secrets-operator</link><guid isPermaLink="true">https://blog.nh4ttruong.me/a-guide-to-managing-kubernetes-secrets-with-aws-secrets-manager-and-external-secrets-operator</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[Devops]]></category><category><![CDATA[AWS]]></category><category><![CDATA[external-secrets]]></category><category><![CDATA[secrets management]]></category><category><![CDATA[cloud security]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Wed, 14 May 2025 07:33:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747207970631/51e44f0f-0ae6-4ce4-97c8-d28f42f1a950.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Managing secrets in Kubernetes is notoriously tricky. Hardcoding them? Yikes. Storing them in plaintext? Dangerous. In this post, I’ll show you how to securely integrate AWS Secrets Manager into your K8s workflow using External Secrets Operator (ESO) - so you can automate secret syncing and sleep better at night.</p>
<p>This post walks you through a clean, secure approach: syncing secrets from <strong>AWS Secrets Manager (SM)</strong> into Kubernetes using the <strong>External Secrets Operator (ESO)</strong>. You'll learn how to set it up with Helm, configure access policies, and sync secrets in different formats - using real examples from the trenches.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:700/1*I_Txf2Bm3H8EJ6mBdHKSow.png" alt="ExternalSecrets architecture - Ali Nadir" class="image--center mx-auto" /></p>
<h1 id="heading-overview">Overview</h1>
<p>ExternalSecrets is an open-source Kubernetes operator that serves two main functions: it injects secrets from supported external providers into your application cluster and keeps those injected secrets synchronized with their remote counterparts.</p>
<p>In the ExternalSecrets architecture, two key resources play crucial roles:</p>
<ol>
<li><p><strong>SecretStore</strong>: This resource manages authentication, enabling your Kubernetes cluster to access AWS resources, specifically secrets. It acts as a bridge, ensuring secure and authorized access to the secrets stored in AWS.</p>
</li>
<li><p><strong>ExternalSecret</strong>: This resource is responsible for defining and creating secrets. It utilizes the SecretStore to retrieve specific secrets and provides a template for Kubernetes controllers to generate local secrets within the cluster.</p>
</li>
</ol>
<h1 id="heading-step-by-step">Step-By-Step</h1>
<h2 id="heading-install-external-secrets-operator-via-helm">🛠️ Install External Secrets Operator via Helm</h2>
<p>First, install the <a target="_blank" href="https://artifacthub.io/packages/helm/external-secrets-operator/external-secrets">ESO Helm chart</a> into its own namespace:</p>
<pre><code class="lang-bash">kubectl create ns eso
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets -n eso
</code></pre>
<blockquote>
<p>🚫 Optional: If you manage CRDs manually, add <code>--set installCRDs=false</code>.</p>
</blockquote>
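<p>Before moving on, it's worth a quick sanity check — assuming the install above, the operator pods and CRDs should be in place:</p>
<pre><code class="lang-bash">kubectl get pods -n eso
kubectl get crds | grep external-secrets.io
</code></pre>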
<h2 id="heading-aws-iam-setup-for-external-secrets">🔐 AWS IAM Setup for External Secrets</h2>
<p>We need to create an IAM user or role with access to read specific secrets from AWS Secrets Manager.</p>
<p>Example IAM Policy:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"Version"</span>: <span class="hljs-string">"2012-10-17"</span>,
  <span class="hljs-attr">"Statement"</span>: [
    {
      <span class="hljs-attr">"Effect"</span>: <span class="hljs-string">"Allow"</span>,
      <span class="hljs-attr">"Action"</span>: [
        <span class="hljs-string">"secretsmanager:ListSecrets"</span>,
        <span class="hljs-string">"secretsmanager:GetSecretValue"</span>,
        <span class="hljs-string">"secretsmanager:ListSecretVersionIds"</span>
      ],
      <span class="hljs-attr">"Resource"</span>: [
        <span class="hljs-string">"arn:aws:secretsmanager:ap-southeast-1:071123451249:secret:demo*"</span>
      ],
      <span class="hljs-attr">"Condition"</span>: {
        <span class="hljs-attr">"StringLike"</span>: {
          <span class="hljs-attr">"secretsmanager:SecretId"</span>: [
            <span class="hljs-string">"arn:aws:secretsmanager:ap-southeast-1:071123451249:secret:prod/demo"</span>
          ]
        },
        <span class="hljs-attr">"StringEquals"</span>: {
          <span class="hljs-attr">"aws:username"</span>: [<span class="hljs-string">"secret-eso"</span>]
        }
      }
    }
  ]
}
</code></pre>
<h3 id="heading-why-this-policy-setup">Why this policy setup?</h3>
<ul>
<li><p><strong>Fine-grained access</strong>: Principle of least privilege.</p>
</li>
<li><p><strong>Conditionals</strong>: Limits access to just the needed secrets + specific IAM username.</p>
</li>
</ul>
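<p>If you're provisioning this by hand, the policy above can be attached to a dedicated IAM user with the AWS CLI. A rough sketch — the user name matches the <code>aws:username</code> condition, and <code>eso-policy.json</code> is just an assumed filename for the JSON above:</p>
<pre><code class="lang-bash"># Create the IAM user referenced by the policy condition
aws iam create-user --user-name secret-eso

# Attach the policy above as an inline policy (saved locally as eso-policy.json)
aws iam put-user-policy --user-name secret-eso \
  --policy-name eso-secrets-read \
  --policy-document file://eso-policy.json

# Generate the access key pair used in the next step
aws iam create-access-key --user-name secret-eso
</code></pre>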
<hr />
<h2 id="heading-kubernetes-secret-for-aws-credentials">🔑 Kubernetes Secret for AWS Credentials</h2>
<p>We create a K8s secret that holds AWS access keys.</p>
<pre><code class="lang-bash"><span class="hljs-built_in">echo</span> -n <span class="hljs-string">'KEYID'</span> &gt; ./access-key
<span class="hljs-built_in">echo</span> -n <span class="hljs-string">'SECRETKEY'</span> &gt; ./secret-access-key

kubectl create secret generic demo-awssm-secret \
  --from-file=./access-key \
  --from-file=./secret-access-key

rm -f ./access-key ./secret-access-key
</code></pre>
<blockquote>
<p>⚠️ Pro tip: Store these secrets in a GitOps-friendly secret manager like SealedSecrets, SOPS, or External Secrets from your Git repo—not directly in plain YAML files.</p>
</blockquote>
<hr />
<h2 id="heading-configuring-the-secretstore">🏗️ Configuring the SecretStore</h2>
<p>This tells ESO how to talk to AWS Secrets Manager:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">external-secrets.io/v1beta1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">SecretStore</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">demo-secretstore</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">provider:</span>
    <span class="hljs-attr">aws:</span>
      <span class="hljs-attr">service:</span> <span class="hljs-string">SecretsManager</span>
      <span class="hljs-attr">region:</span> <span class="hljs-string">ap-southeast-1</span>
      <span class="hljs-attr">auth:</span>
        <span class="hljs-attr">secretRef:</span>
          <span class="hljs-attr">accessKeyIDSecretRef:</span>
            <span class="hljs-attr">name:</span> <span class="hljs-string">demo-awssm-secret</span>
            <span class="hljs-attr">key:</span> <span class="hljs-string">access-key</span>
          <span class="hljs-attr">secretAccessKeySecretRef:</span>
            <span class="hljs-attr">name:</span> <span class="hljs-string">demo-awssm-secret</span>
            <span class="hljs-attr">key:</span> <span class="hljs-string">secret-access-key</span>
</code></pre>
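<p>Apply it and confirm ESO can authenticate — the store's status should report <code>Valid</code> once the credentials check out (the filename and namespace here are illustrative; the store must live in the same namespace as the <code>ExternalSecret</code> resources that reference it):</p>
<pre><code class="lang-bash">kubectl apply -n demo -f secretstore.yaml
kubectl get secretstore demo-secretstore -n demo
kubectl describe secretstore demo-secretstore -n demo   # look for a Valid condition
</code></pre>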
<hr />
<h2 id="heading-syncing-secrets-in-kubernetes">💾 Syncing Secrets in Kubernetes</h2>
<p>You have three common use cases. Here's what each one looks like:</p>
<h3 id="heading-1-configmap-style-secret-plaintext">1. ConfigMap-style Secret (plaintext)</h3>
<p>Best for multi-line config files.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">external-secrets.io/v1beta1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ExternalSecret</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">demo-secret-as-configmap-template</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">refreshInterval:</span> <span class="hljs-string">5m</span>
  <span class="hljs-attr">secretStoreRef:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">demo-secretstore</span>
    <span class="hljs-attr">kind:</span> <span class="hljs-string">SecretStore</span>
  <span class="hljs-attr">target:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">demo-config</span>
    <span class="hljs-attr">template:</span>
      <span class="hljs-attr">engineVersion:</span> <span class="hljs-string">v2</span>
      <span class="hljs-attr">data:</span>
        <span class="hljs-attr">core-dev.php:</span> <span class="hljs-string">"<span class="hljs-template-variable">{{ .coredev | toString }}</span>"</span>
        <span class="hljs-attr">custom.ini:</span> <span class="hljs-string">"<span class="hljs-template-variable">{{ .customini | toString }}</span>"</span>
        <span class="hljs-attr">demo-config.conf:</span> <span class="hljs-string">"<span class="hljs-template-variable">{{ .conf| toString }}</span>"</span>
        <span class="hljs-attr">service-url.php:</span> <span class="hljs-string">"<span class="hljs-template-variable">{{ .serviceurl | toString }}</span>"</span>
  <span class="hljs-attr">data:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">secretKey:</span> <span class="hljs-string">coredev</span>
      <span class="hljs-attr">remoteRef:</span>
        <span class="hljs-attr">key:</span> <span class="hljs-string">prod/demo/core-dev.php</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">secretKey:</span> <span class="hljs-string">customini</span>
      <span class="hljs-attr">remoteRef:</span>
        <span class="hljs-attr">key:</span> <span class="hljs-string">prod/demo/custom.ini</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">secretKey:</span> <span class="hljs-string">conf</span>
      <span class="hljs-attr">remoteRef:</span>
        <span class="hljs-attr">key:</span> <span class="hljs-string">prod/demo/cnf.conf</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">secretKey:</span> <span class="hljs-string">serviceurl</span>
      <span class="hljs-attr">remoteRef:</span>
        <span class="hljs-attr">key:</span> <span class="hljs-string">prod/demo/service-url.php</span>
</code></pre>
<blockquote>
<p>🧠 Output: A <code>Secret</code> with multiple keys mimicking a <code>ConfigMap</code>, storing plaintext files.</p>
</blockquote>
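<p>Since the rendered <code>Secret</code> behaves like a ConfigMap full of files, the natural way to consume it is as a mounted volume. A minimal pod sketch (image and mount path are assumptions):</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Pod
metadata:
  name: demo-app
  namespace: demo
spec:
  containers:
    - name: app
      image: php:8-apache          # example image
      volumeMounts:
        - name: app-config
          mountPath: /etc/demo     # example path; files appear as core-dev.php, custom.ini, ...
          readOnly: true
  volumes:
    - name: app-config
      secret:
        secretName: demo-config
</code></pre>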
<hr />
<h3 id="heading-2-raw-secret-from-json-keyvalue">2. Raw Secret from JSON (key/value)</h3>
<p>Ideal for app credentials or API keys.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">external-secrets.io/v1beta1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ExternalSecret</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">demo-secret</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">refreshInterval:</span> <span class="hljs-string">2m</span>
  <span class="hljs-attr">secretStoreRef:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">demo-secretstore</span>
    <span class="hljs-attr">kind:</span> <span class="hljs-string">SecretStore</span>
  <span class="hljs-attr">target:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">demo-secret</span>
    <span class="hljs-attr">creationPolicy:</span> <span class="hljs-string">Owner</span>
  <span class="hljs-attr">dataFrom:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">extract:</span>
        <span class="hljs-attr">key:</span> <span class="hljs-string">prod/demo/secret</span>
</code></pre>
<blockquote>
<p>🎯 Output: A Kubernetes <code>Secret</code> with key/value pairs extracted from a JSON blob in AWS SM.</p>
</blockquote>
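<p>To confirm the sync worked, check the <code>ExternalSecret</code> status and decode one of the resulting keys (<code>username</code> here is just an example key from the JSON blob):</p>
<pre><code class="lang-bash">kubectl get externalsecret demo-secret -n demo          # READY should be True
kubectl get secret demo-secret -n demo -o jsonpath='{.data.username}' | base64 -d
</code></pre>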
<hr />
<h3 id="heading-3-templated-configmap-redis-config">3. Templated ConfigMap (Redis Config)</h3>
<p>When you want ESO to inject secrets into a full config file template.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">external-secrets.io/v1beta1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ExternalSecret</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">demo-secret-in-configmap</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">refreshInterval:</span> <span class="hljs-string">2m</span>
  <span class="hljs-attr">secretStoreRef:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">demo-secretstore</span>
    <span class="hljs-attr">kind:</span> <span class="hljs-string">SecretStore</span>
  <span class="hljs-attr">target:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">demo-secret-redis-config</span>
    <span class="hljs-attr">template:</span>
      <span class="hljs-attr">data:</span>
        <span class="hljs-attr">redis.conf:</span> <span class="hljs-string">|
          bind 0.0.0.0
          port 6379
          requirepass "{{ .redisPassword | toString }}"
          protected-mode no
          appendonly no
          supervised no
          save 3600 1 300 10 30 20
          dir /opt/redis/data
          loglevel notice
          logfile "/opt/redis/data/redis.log"
          databases 6
</span>  <span class="hljs-attr">data:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">secretKey:</span> <span class="hljs-string">redisPassword</span>
      <span class="hljs-attr">remoteRef:</span>
        <span class="hljs-attr">key:</span> <span class="hljs-string">prod/demo/redis-credential</span>
</code></pre>
<blockquote>
<p>⚙️ Output: A fully rendered Redis config with live secrets baked in.</p>
</blockquote>
<hr />
<h1 id="heading-more-info">More Info</h1>
<ul>
<li><p><a target="_blank" href="https://external-secrets.io/v0.16.2/">External Secrets</a></p>
</li>
<li><p><a target="_blank" href="https://artifacthub.io/packages/helm/external-secrets-operator/external-secrets">External Secrest Helm Chart</a></p>
</li>
<li><p><a target="_blank" href="https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html">AWS Secrets Manager</a></p>
</li>
<li><p><a target="_blank" href="https://blog.devops.dev/injecting-external-secrets-in-a-kubernetes-cluster-1e9bbe0f0d5b">Injecting AWS Secrets in a Kubernetes Cluster with External Secrets Operator | by Ali Nadir</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[🔐 Managing SharePoint with an Azure App using Sites.Selected Permissions]]></title><description><![CDATA[Use Cases
Why even go through this setup? Here's where this setup totally makes sense:

You want to manage SharePoint without using personal credentials (aka ditch the annoying login prompts).

You need to upload, download, or tweak files in SharePoi...]]></description><link>https://blog.nh4ttruong.me/managing-sharepoint-with-an-azure-app-using-sitesselected-permissions</link><guid isPermaLink="true">https://blog.nh4ttruong.me/managing-sharepoint-with-an-azure-app-using-sitesselected-permissions</guid><category><![CDATA[sysops]]></category><category><![CDATA[Azure]]></category><category><![CDATA[SharePoint]]></category><category><![CDATA[Powershell]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Tue, 13 May 2025 06:25:24 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747117620264/f9bbf0e6-9519-4874-9fd2-576f0f223091.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-use-cases">Use Cases</h2>
<p>Why even go through this? Here's where this setup totally makes sense:</p>
<ul>
<li><p>You want to <strong>manage SharePoint without using personal credentials</strong> (aka ditch the annoying login prompts).</p>
</li>
<li><p>You need to <strong>upload, download, or tweak files</strong> in SharePoint from scripts or automated jobs.</p>
</li>
<li><p>You want to build <strong>automation workflows</strong> that don’t ask you to "Sign in with Microsoft" every five minutes.</p>
</li>
</ul>
<blockquote>
<p>🔍 <em>"Manage" means Read, Write, and even changing permissions</em></p>
</blockquote>
<hr />
<h2 id="heading-how-azure-permissions-work">How Azure Permissions Work</h2>
<p>Before we dive in, let’s get our heads around two key types of Azure API permissions:</p>
<ul>
<li><p><strong>Delegated permissions</strong>: These require a signed-in user. Think <strong>"act on behalf of a user"</strong> — good for interactive apps.</p>
</li>
<li><p><strong>Application permissions</strong>: These don’t need a user at all. Perfect for automation and backend stuff. Full freedom with the right consents.</p>
</li>
</ul>
<p>The key player here? <code>Sites.Selected</code> permission. It’s <strong>available in both modes</strong>, but we’re going with <strong>Application</strong> permission because we’re all about that sweet non-interactive automation.</p>
<hr />
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>Here’s what you need to follow along:</p>
<ul>
<li><p>PowerShell with the <a target="_blank" href="https://github.com/pnp/powershell">PnP module</a></p>
</li>
<li><p>An Azure AD Application (we’ll set it up in a sec)</p>
</li>
<li><p>Admin consent for Graph API permissions</p>
</li>
<li><p>Access to the SharePoint site you wanna manage</p>
</li>
</ul>
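<p>If you don't have the PnP module yet, it installs straight from the PowerShell Gallery:</p>
<pre><code class="lang-powershell">Install-Module PnP.PowerShell -Scope CurrentUser
</code></pre>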
<hr />
<h2 id="heading-step-by-step">Step-by-Step</h2>
<h3 id="heading-1-register-a-new-azure-app">1. Register a New Azure App</h3>
<p>Go to <strong>Azure Portal</strong> → <strong>App registrations</strong> → <strong>New registration</strong>.</p>
<p><img src="https://vng-docs.s3-accelerate.amazonaws.com/uploads/07dc8c6d-b12e-445e-a77e-fb5e897a46a9/096b2149-fd53-42d4-b40f-1c1581c4478c/image.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&amp;X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&amp;X-Amz-Credential=AKIARBLY6MQN745PDDHH%2F20250513%2Fap-southeast-1%2Fs3%2Faws4_request&amp;X-Amz-Date=20250513T061007Z&amp;X-Amz-Expires=60&amp;X-Amz-Signature=3384465154a15908f8e79e5dbc9f325ad478d283e80681bacc1c699ef226afb5&amp;X-Amz-SignedHeaders=host&amp;x-amz-checksum-mode=ENABLED&amp;x-id=GetObject" alt /></p>
<hr />
<h3 id="heading-2-add-permissions">2. Add Permissions</h3>
<ul>
<li><p>Go to your app → <strong>API permissions</strong> → Add:</p>
<ul>
<li><p><code>Sites.FullControl.All</code> (used only temporarily)</p>
</li>
<li><p><code>Sites.Selected</code></p>
</li>
</ul>
</li>
</ul>
<p>🛑 Heads-up: You <em>must</em> request admin consent for these permissions. Microsoft requires <code>Sites.FullControl.All</code> to grant site-specific permissions via <code>Sites.Selected</code>.</p>
<p><img src="https://vng-docs.s3-accelerate.amazonaws.com/uploads/07dc8c6d-b12e-445e-a77e-fb5e897a46a9/52df343a-6fcb-4e4d-8c9d-31617c44a484/image.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&amp;X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&amp;X-Amz-Credential=AKIARBLY6MQN745PDDHH%2F20250513%2Fap-southeast-1%2Fs3%2Faws4_request&amp;X-Amz-Date=20250513T061007Z&amp;X-Amz-Expires=60&amp;X-Amz-Signature=5974e4be610275335968c379e280067d54d6b43b5815ea126d75a0a267984f01&amp;X-Amz-SignedHeaders=host&amp;x-amz-checksum-mode=ENABLED&amp;x-id=GetObject" alt /></p>
<hr />
<h3 id="heading-3-create-a-certificate">3. Create a Certificate</h3>
<p>Use PowerShell to generate a cert:</p>
<pre><code class="lang-powershell"><span class="hljs-built_in">New-PnPAzureCertificate</span> <span class="hljs-literal">-OutPfx</span> pnp.pfx <span class="hljs-literal">-OutCert</span> pnp.cer <span class="hljs-literal">-CommonName</span> &lt;tenant&gt;.sharepoint.com
</code></pre>
<p>Now you’ve got <code>pnp.pfx</code> and <code>pnp.cer</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747116693626/83b0376b-c792-4b17-b16b-aaef3c2665c6.png" alt class="image--center mx-auto" /></p>
<hr />
<h3 id="heading-4-import-certificates">4. Import Certificates</h3>
<p>Browse to the folder, then import both files to <strong>Current User</strong> → just <strong>Next</strong> → <strong>Finish</strong>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747116715746/76b337fc-ef03-41ef-a750-a99bd6fce589.png" alt class="image--center mx-auto" /></p>
<hr />
<h3 id="heading-5-upload-certificate-to-azure">5. Upload Certificate to Azure</h3>
<p>In the Azure Portal, go back to your app → <strong>Certificates &amp; secrets</strong> → <strong>Certificates tab</strong> → Upload <code>pnp.cer</code></p>
<p>💡 Save the Thumbprint — you'll need it soon.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747116731332/77d79fe3-bea0-4a24-b991-f7282373100e.png" alt class="image--center mx-auto" /></p>
<hr />
<h3 id="heading-6-connect-to-sharepoint-grant-permissions">6. Connect to SharePoint + Grant Permissions</h3>
<p>Let’s put this all together in PowerShell:</p>
<pre><code class="lang-powershell"><span class="hljs-variable">$siteUrl</span> = <span class="hljs-string">"https://&lt;tenant&gt;.sharepoint.com/sites/&lt;site&gt;"</span>
<span class="hljs-variable">$tenant</span> = <span class="hljs-string">"&lt;tenant&gt;.sharepoint.com"</span>
<span class="hljs-variable">$clientId</span> = <span class="hljs-string">"&lt;Azure Application ID (clientId)&gt;"</span>
<span class="hljs-variable">$certThumbprint</span> = <span class="hljs-string">"&lt;obtain above&gt;"</span>

<span class="hljs-built_in">Connect-PnPOnline</span> <span class="hljs-literal">-Url</span>:<span class="hljs-variable">$siteUrl</span> <span class="hljs-literal">-ClientId</span>:<span class="hljs-variable">$clientId</span> <span class="hljs-literal">-Thumbprint</span>:<span class="hljs-variable">$certThumbprint</span> <span class="hljs-literal">-Tenant</span>:<span class="hljs-variable">$tenant</span>
<span class="hljs-comment"># FullContol/Read/Write/Manage permission https://pnp.github.io/powershell/cmdlets/Grant-PnPAzureADAppSitePermission.html</span>
<span class="hljs-built_in">Grant-PnPAzureADAppSitePermission</span> <span class="hljs-literal">-S</span>ưite <span class="hljs-variable">$siteUrl</span> <span class="hljs-literal">-AppId</span> <span class="hljs-variable">$clientId</span> <span class="hljs-literal">-DisplayName</span> <span class="hljs-string">"SharePoint Permission"</span> <span class="hljs-literal">-Permissions</span> <span class="hljs-string">"Write"</span>

<span class="hljs-comment"># Check</span>
<span class="hljs-built_in">Get-PnPAzureADAppSitePermission</span> <span class="hljs-literal">-Site</span> <span class="hljs-variable">$siteUrl</span> <span class="hljs-literal">-AppIdentity</span> <span class="hljs-variable">$clientId</span>
</code></pre>
<p>Now your app officially has access to that SharePoint site with <code>Sites.Selected</code>.</p>
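<p>With <code>Write</code> granted, the same certificate-based connection can move files around. A quick sketch using the PnP cmdlets (file and folder names are examples):</p>
<pre><code class="lang-powershell"># Upload a local file into a document library
Add-PnPFile -Path ./report.pdf -Folder "Shared Documents"

# Download it back
Get-PnPFile -Url "Shared Documents/report.pdf" -Path . -FileName report.pdf -AsFile
</code></pre>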
<hr />
<h3 id="heading-7-clean-up">7. Clean Up</h3>
<p>Once everything works, you can go back and remove <code>Sites.FullControl.All</code> permission. You don’t need it anymore — your app’s now living on like a pro.</p>
<hr />
<h2 id="heading-final-thoughts">Final Thoughts</h2>
<p>Using <code>Sites.Selected</code> with application permissions is low-key one of the best ways to build secure, automated SharePoint workflows — especially when you wanna keep things headless and passwordless.</p>
<p>You get fine-grained control, and you can automate all the boring stuff without needing to run it interactively. SysOps, Power Users, and Scripters — this one's for you.</p>
<hr />
<p>Let me know if you want a follow-up post on automating uploads/downloads using this setup. Happy scripting! 🚀💻</p>
<h2 id="heading-references">References</h2>
<ul>
<li><p><a target="_blank" href="https://learn.microsoft.com/en-us/graph/permissions-selected-overview?tabs=http#what-permissions-do-i-need-to-manage-permissions">Overview of Selected Permissions in OneDrive and SharePoi</a>nt</p>
</li>
<li><p><a target="_blank" href="https://learn.microsoft.com/en-us/graph/permissions-selected-overview?tabs=http#what-permissions-do-i-need-to-manage-permissions">New-PnPAzureCertificate</a></p>
</li>
<li><p><a target="_blank" href="https://learn.microsoft.com/en-us/graph/permissions-selected-overview?tabs=http#what-permissions-do-i-need-to-manage-permissions">Grant-PnPAzureADAppSitePermission</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Mastering Disk Usage Analysis With ncdu]]></title><description><![CDATA[As a systems engineer, understanding disk usage is not just beneficial — it's essential for maintaining the health and performance of your servers. In this guide, we’ll explore how to efficiently analyze your filesystem using ncdu along with a few ot...]]></description><link>https://blog.nh4ttruong.me/mastering-disk-usage-analysis-with-ncdu-tool</link><guid isPermaLink="true">https://blog.nh4ttruong.me/mastering-disk-usage-analysis-with-ncdu-tool</guid><category><![CDATA[system]]></category><category><![CDATA[ncdu]]></category><category><![CDATA[disk management]]></category><category><![CDATA[guide]]></category><category><![CDATA[filesystem]]></category><category><![CDATA[sysops]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Mon, 28 Apr 2025 13:59:19 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1745852110826/c23136d8-712f-4f98-b055-75c592fe0cc7.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As a systems engineer, understanding disk usage is not just beneficial — it's essential for maintaining the health and performance of your servers. In this guide, we’ll explore how to efficiently analyze your filesystem using <code>ncdu</code> along with a few other reliable command-line tools.</p>
<h2 id="heading-what-is-ncdu">What is <code>ncdu</code>?</h2>
<p><code>ncdu</code> (NCurses Disk Usage) is a lightweight, interactive disk usage analyzer.<br />It gives you a fast and easy way to identify which directories and files are hogging your disk space — all through a simple terminal interface.</p>
<h2 id="heading-installing-ncdu">Installing <code>ncdu</code></h2>
<p>Getting <code>ncdu</code> up and running is quick:</p>
<p><strong>On Debian/Ubuntu-based systems:</strong></p>
<pre><code class="lang-bash">sudo apt install ncdu
</code></pre>
<p><strong>On Red Hat/CentOS/Fedora systems:</strong></p>
<pre><code class="lang-bash">sudo yum install ncdu
</code></pre>
<h2 id="heading-using-ncdu-to-analyze-filesystem-usage">Using <code>ncdu</code> to Analyze Filesystem Usage</h2>
<p>Running <code>ncdu</code> is super straightforward — just point it at a directory:</p>
<pre><code class="lang-bash">ncdu /path/to/directory
</code></pre>
<h3 id="heading-restricting-to-a-single-filesystem">Restricting to a Single Filesystem</h3>
<p>One of <code>ncdu</code>'s killer features is the <code>-x</code> option, which restricts the scan to a single filesystem:</p>
<pre><code class="lang-bash">ncdu -x /var/<span class="hljs-built_in">log</span>
</code></pre>
<p>This is really handy when you want to avoid crossing into mounted volumes or other partitions accidentally.</p>
<h3 id="heading-navigating-the-ncdu-interface">Navigating the <code>ncdu</code> Interface</h3>
<p>When you launch <code>ncdu</code>, you can interact with it using your keyboard:</p>
<ul>
<li><p>Arrow keys: Navigate through the directories</p>
</li>
<li><p>Enter: Dive into a directory</p>
</li>
<li><p><code>d</code>: Delete a file or directory (⚠️ use carefully)</p>
</li>
<li><p><code>q</code>: Quit the program</p>
</li>
</ul>
<p>It’s intuitive enough that you’ll be flying through your filesystem in minutes.</p>
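<p>One more trick worth knowing: <code>ncdu</code> can export a scan to a file and browse it later, which keeps the expensive disk walk off your busiest hours:</p>
<pre><code class="lang-bash">ncdu -x -o /tmp/var-scan.json /var   # scan once, save the results
ncdu -f /tmp/var-scan.json           # browse the saved scan anytime
</code></pre>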
<h2 id="heading-alternative-tools-for-disk-usage-analysis">Alternative Tools for Disk Usage Analysis</h2>
<p>While <code>ncdu</code> is amazing for interactive exploration, the good old command line has some other tricks up its sleeve.</p>
<h3 id="heading-1-df-display-free-disk-space">1. <code>df</code> — Display Free Disk Space</h3>
<p>The <code>df</code> command shows you disk space usage and free space across all mounted filesystems:</p>
<pre><code class="lang-bash">df -h
</code></pre>
<p>The <code>-h</code> flag makes it human-readable (i.e., shows sizes in KB, MB, GB).</p>
<p>To focus on a specific directory:</p>
<pre><code class="lang-bash">df -h /var
</code></pre>
<h3 id="heading-2-du-estimate-file-and-directory-sizes">2. <code>du</code> — Estimate File and Directory Sizes</h3>
<p>The <code>du</code> command is perfect for quick checks:</p>
<pre><code class="lang-bash">du -sh /var/<span class="hljs-built_in">log</span>
</code></pre>
<ul>
<li><p><code>-s</code>: Display only the total size</p>
</li>
<li><p><code>-h</code>: Human-readable format</p>
</li>
</ul>
<p>Want a breakdown of subdirectories?</p>
<pre><code class="lang-bash">du -h --max-depth=1 /var/<span class="hljs-built_in">log</span> | sort -hr
</code></pre>
<p>This will show you the size of each immediate subdirectory, sorted from largest to smallest.</p>
<h3 id="heading-3-find-locate-large-files">3. <code>find</code> — Locate Large Files</h3>
<p>Sometimes you just need to hunt down the big files:</p>
<pre><code class="lang-bash">find /var -<span class="hljs-built_in">type</span> f -size +100M -<span class="hljs-built_in">exec</span> ls -lh {} \; | sort -k5 -hr
</code></pre>
<p>This finds files larger than 100MB and lists them, sorted by size.</p>
<h2 id="heading-when-to-use-each-tool">When to Use Each Tool</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Tool</strong></td><td><strong>Best Use Case</strong></td></tr>
</thead>
<tbody>
<tr>
<td><code>ncdu</code></td><td>Interactive exploration and cleanup</td></tr>
<tr>
<td><code>df</code></td><td>Quick snapshot of disk space across filesystems</td></tr>
<tr>
<td><code>du</code></td><td>Detailed file/directory size summaries (great for scripting)</td></tr>
<tr>
<td><code>find</code></td><td>Laser-targeted search for huge files</td></tr>
</tbody>
</table>
</div><h2 id="heading-conclusion">Conclusion</h2>
<p>Keeping an eye on your disk usage is essential for any serious server management strategy.<br /><code>ncdu</code> is an awesome tool for interactive, detailed analysis — especially when paired with options like <code>-x</code> to stay within boundaries.<br />But don't sleep on the classics either: <code>df</code>, <code>du</code>, and <code>find</code> each bring their own strengths to your sysadmin toolkit.</p>
<p>By combining these tools, you’ll stay ahead of disk space problems before they turn into full-blown outages. 🚀</p>
]]></content:encoded></item><item><title><![CDATA[Infrastructure Avengers: What Role Are You Playing in the Tech Universe?]]></title><description><![CDATA[Sometimes, the only way to survive tech culture is to laugh at it.

Recently, I stumbled upon a meme that hit way too close to home. Like, if you've ever touched a bash script, configured a VPC, or even just nodded in agreement during a “high availab...]]></description><link>https://blog.nh4ttruong.me/infrastructure-avengers-what-role-are-you-playing-in-the-tech-universe</link><guid isPermaLink="true">https://blog.nh4ttruong.me/infrastructure-avengers-what-role-are-you-playing-in-the-tech-universe</guid><category><![CDATA[memes]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Site Reliability Engineering]]></category><category><![CDATA[Cloud Engineering ]]></category><category><![CDATA[System Engineering]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Sun, 27 Apr 2025 18:07:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1745777058442/353a343b-7d83-43aa-aa13-5a2fc5cdfb64.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>Sometimes, the only way to survive tech culture is to laugh at it.</p>
</blockquote>
<p>Recently, I stumbled upon a meme that hit <em>way</em> too close to home. Like, if you've ever touched a bash script, configured a VPC, or even just nodded in agreement during a “high availability” meeting — you’re gonna feel this deep in your soul.</p>
<p>The meme features four Marvel heroes representing four tech roles, and honestly, it’s <em>too</em> accurate.</p>
<h2 id="heading-devops-engineer">DevOps Engineer</h2>
<p>First up, we got DevOps, repped by none other than Thor himself — big muscles, big hammer, <em>even bigger</em> solo energy.<br />DevOps engineers walk into every project like,</p>
<blockquote>
<p><em>"Oh you need CI/CD? Oh you need k8s clusters? Oh you need monitoring? Bet. I got this. Alone."</em></p>
</blockquote>
<p>They’re building pipelines at 2AM, scripting terraform modules <em>mid-flight</em>, and convincing themselves that "it's just a small hotfix" when they’re really rebuilding half the infra.<br />Thor thinks he doesn’t need anyone else — just like every DevOps bro hammering away at Jenkinsfiles thinking they'll achieve world peace.</p>
<p>💬 Reality check? <em>You can't do it all, King. But we admire the hustle.</em></p>
<hr />
<h2 id="heading-cloud-engineer">Cloud Engineer</h2>
<p>Next is our boy, Ant-Man, playing the Cloud Engineer — which is just <em>chef’s kiss</em> perfect.<br />Cloud Engineers literally live for <em>scaling</em>.<br />Auto-scaling groups? Scaling databases vertically and horizontally? Burstable instances? Spot pricing?<br /><strong>If it doesn’t scale, it’s dead to them.</strong></p>
<p>They treat Terraform outputs like sacred scrolls and think "high availability" is a personality trait.</p>
<blockquote>
<p><em>"Bro why would you need a database if you can just scale your read replicas, duh."</em></p>
</blockquote>
<p>Meanwhile, the app is still crashing because someone hardcoded <a target="_blank" href="http://localhost">localhost</a> in production. 🧍‍♂️</p>
<hr />
<h2 id="heading-systems-engineer">Systems Engineer</h2>
<p>Enter: Systems Engineers — the Tony Starks of the tech world.<br />These guys are out here <em>building insane sh*t</em>, making servers dance, automating OS-level magic, and customizing kernels because “it’s more efficient this way.” (??)</p>
<p>Ask a Systems Engineer what they’re working on and they'll hit you with:</p>
<blockquote>
<p><em>"Oh just a tiny side project — a quantum-optimized scheduler that reduces boot time by 0.000001%."</em></p>
</blockquote>
<p>Meanwhile, no one asked, but we’re still lowkey impressed.</p>
<p>They've got racks in their garage, think "uptime" is sexier than abs, and measure worth in uptime percentages.</p>
<hr />
<h2 id="heading-site-reliability-engineer-sre">Site Reliability Engineer (SRE)</h2>
<p>Finally, the SREs. Same Thor energy... but now battle-worn and with an eye patch — because <em>experience</em>.<br />SREs love to <em>pretend</em> they’re not DevOps.</p>
<blockquote>
<p><em>"No no no, I'm an SRE. It's totally different. We use SLIs, SLOs, error budgets... we have standards, man."</em></p>
</blockquote>
<p>Bro, you’re still writing the same bash scripts and fixing the same janky Kubernetes clusters.<br />Just now you have fancier words to justify why the site went down at 3AM.</p>
<p>SREs are like that friend who <em>definitely</em> was a hipster before it was cool, but now insists he’s just “alternative.”<br /><em>You’re still in the same boat as the rest of us, mate.</em></p>
<hr />
<h1 id="heading-conclusion">Conclusion</h1>
<p>At the end of the day, whether you're carrying a hammer, a shrinking suit, a powered exoskeleton, or just a pager... we're all in the same chaotic, beautiful mess called tech.</p>
<p>You’re gonna cry, you’re gonna laugh, you’re gonna scream internally when the CI pipeline fails <em>again</em> for reasons that make no sense.</p>
<p>But hey — at least memes like this one remind us we’re not alone. 🫶</p>
]]></content:encoded></item><item><title><![CDATA[Learn Web Security With PortSwigger]]></title><description><![CDATA[Hey guys ✌🏻, I share my self-study journey in Web Security here, hoping these notes provide something useful for both newcomers and experienced folks.
Summary
In my third year studying Information Security, I realized that what we learn in class is ...]]></description><link>https://blog.nh4ttruong.me/learn-web-security-with-portswigger</link><guid isPermaLink="true">https://blog.nh4ttruong.me/learn-web-security-with-portswigger</guid><category><![CDATA[portswigger]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Sun, 27 Apr 2025 17:46:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1745775987620/76efbbc1-d379-4d72-819c-07d50204d928.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey guys ✌🏻, I share my self-study journey in Web Security here, hoping these notes provide something useful for both newcomers and experienced folks.</p>
<h2 id="heading-summary">Summary</h2>
<p>In my third year studying Information Security, I realized that what we learn in class is just the basics. To become a real <code>hacker</code> 👌🏻, self-study and practice are essential.</p>
<p>Feeling unsure and confused, I decided to dive into PortSwigger Labs—the only platform I knew at the time for learning Web Security properly.</p>
<blockquote>
<p><strong>PortSwigger</strong> is a very comprehensive platform for Web Security, with the highlight being <em>PortSwigger Academy</em>—a huge collection of labs where you can practice from basic to advanced levels. I spent over a month working through these labs, which was tough but very rewarding.</p>
</blockquote>
<h2 id="heading-labs-i-completed">Labs I Completed</h2>
<ul>
<li><p><a target="_blank" href="https://www.nh4ttruong.me/post/portswigger-SQL-Injection">SQL Injection Attack</a></p>
</li>
<li><p><a target="_blank" href="https://www.nh4ttruong.me/post/portswigger-file-upload">File Upload Vulnerability Exploit</a></p>
</li>
<li><p><a target="_blank" href="https://www.nh4ttruong.me/post/portswigger-oauth2-vuln">OAuth 2.0 Authentication Attack</a></p>
</li>
<li><p><a target="_blank" href="https://www.nh4ttruong.me/post/portswigger-XSS">XSS Attack</a></p>
</li>
</ul>
<h2 id="heading-hmmm">Hmmm...</h2>
<p>Honestly, I'm <em>pretty lazy</em>. Writing these notes helps me remember and serves as a little diary of my self-study journey in Web Security. If you're also searching for direction, don't hesitate to start with the small things. Let’s Go!</p>
]]></content:encoded></item><item><title><![CDATA[Struggling with deprecated HorizontalPodAutoscaler API version in Kubernetes +1.26? How to Fix It]]></title><description><![CDATA[In Kubernetes version 1.26 or later, you might have encountered this frustrating error no matches for kind 'HorizontalPodAutoscaler' in version 'autoscaling/v2beta2'. This article will help you automatically fix the API version in your Helm release s...]]></description><link>https://blog.nh4ttruong.me/struggling-with-deprecated-horizontalpodautoscaler-api-version-in-kubernetes-126-how-to-fix-it</link><guid isPermaLink="true">https://blog.nh4ttruong.me/struggling-with-deprecated-horizontalpodautoscaler-api-version-in-kubernetes-126-how-to-fix-it</guid><category><![CDATA[Devops]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Helm]]></category><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Sun, 27 Apr 2025 17:14:16 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1745774996693/40b92110-8724-49a7-938f-9bcba6483a53.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In Kubernetes version 1.26 or later, you might have encountered this frustrating error <code>no matches for kind 'HorizontalPodAutoscaler' in version 'autoscaling/v2beta2'</code>. This article will help you automatically fix the API version in your Helm release secrets.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1745773903796/f1010521-449d-4314-8937-1c0b73a1ade8.png" alt class="image--center mx-auto" /></p>
<h1 id="heading-tldr">TL;DR</h1>
<p>Suddenly seeing <code>"no matches for kind 'HorizontalPodAutoscaler' in version 'autoscaling/v2beta2'"</code> or <code>"no matches for kind 'HorizontalPodAutoscaler' in version 'autoscaling/v2beta1'"</code> when running Helm upgrades? This happens because these long-deprecated HPA API versions have been removed in Kubernetes 1.26+. Use the command below to automatically update the API version in your Helm release secrets:</p>
<pre><code class="lang-bash">curl -sL https://gist.githubusercontent.com/nh4ttruong/172936eed756ca11e60efacf42117d2c/raw/31637ec5fed7a41ae202711304671df2de0bc1cd/fix-api-version.sh | bash -s
</code></pre>
<blockquote>
<p>Always review scripts before piping them directly to bash. The script above updates Helm release secrets to use <code>autoscaling/v2</code> instead of deprecated API versions.</p>
</blockquote>
<h1 id="heading-problem">Problem</h1>
<p>If you've recently upgraded your Kubernetes cluster to version 1.26 or later, you might have encountered this frustrating error: <code>"no matches for kind 'HorizontalPodAutoscaler' in version 'autoscaling/v2beta2'"</code>.</p>
<p>This error occurs because Kubernetes 1.26 officially removed support for the <code>autoscaling/v2beta2</code> API version, which was previously marked as deprecated. The current recommended version is <code>autoscaling/v2</code>.</p>
<p>This issue is particularly problematic when dealing with Helm charts for several reasons:</p>
<ul>
<li><p><strong>Embedded API Versions:</strong> Helm charts often have the API version hard-coded in their templates. When these charts were created with older API versions, they won't automatically adapt to newer Kubernetes versions.</p>
</li>
<li><p><strong>Release History Storage:</strong> Helm stores the deployed resources' state (including API versions) in secrets within your Kubernetes cluster. Even if you update your chart's templates, the previously deployed release information still contains references to the deprecated API.</p>
</li>
<li><p><strong>Update Mechanism Issues</strong>: When you attempt to upgrade a release with <code>helm upgrade</code>, Helm compares the new manifest against the stored release data. If it encounters incompatible API versions, the upgrade fails.</p>
</li>
<li><p><strong>No Automatic Migration</strong>: Helm doesn't automatically migrate resources from deprecated APIs to their replacements, requiring manual intervention.</p>
</li>
</ul>
<p>The result is that even after updating your Helm chart's templates to use <code>autoscaling/v2</code>, you might still face errors because the stored release data references the now-removed <code>autoscaling/v2beta2</code> API.</p>
<h1 id="heading-how-to-fix">How to fix?</h1>
<p>Here's a comprehensive guide to fixing this issue within your Helm releases:</p>
<h2 id="heading-1-identify-the-affected-helm-release">1. Identify the Affected Helm Release</h2>
<p>First, locate the secret that stores your Helm release configuration:</p>
<pre><code class="lang-sh">kubectl get secret -l owner=helm,status=deployed --all-namespaces
</code></pre>
<p>From the output, identify the secret related to your specific Helm release. It will typically follow a naming pattern that includes your release name.</p>
<h2 id="heading-2-extract-and-decode-the-release-configuration">2. Extract and Decode the Release Configuration</h2>
<p>Export the secret to a file:</p>
<pre><code class="lang-sh">kubectl get secret &lt;helm_release_secret&gt; -n &lt;namespace&gt; -o yaml &gt; release.yaml
</code></pre>
<p>Then decode the stored Helm release data:</p>
<pre><code class="lang-sh">cat release.yaml | grep -oP <span class="hljs-string">'(?&lt;=release: ).*'</span> | base64 -d | base64 -d | gzip -d &gt; release.data.decoded
</code></pre>
<p>This command extracts the release data, decodes it from base64 twice (one layer is the Kubernetes Secret encoding, the other is Helm's own), and decompresses the gzipped payload.</p>
<h2 id="heading-3-modify-the-horizontalpodautoscaler-api-version">3. Modify the HorizontalPodAutoscaler API Version</h2>
<p>Open the decoded file in your preferred text editor:</p>
<pre><code class="lang-sh">nano release.data.decoded  <span class="hljs-comment"># or vim, or any other editor</span>
</code></pre>
<p>Search for all occurrences of <code>autoscaling/v2beta2</code> and replace them with <code>autoscaling/v2</code>.</p>
<p>Be sure to verify that the structure of any HorizontalPodAutoscaler resources is compatible with the v2 API. In particular, note that the metrics specification might need adjustments.</p>
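<p>If you'd rather not hand-edit the blob, steps 2–4 can also be scripted. Below is a minimal sketch in Python (the function name <code>fix_release_blob</code> is my own, purely illustrative) that performs the same decode, replace, and re-encode round trip:</p>

```python
import base64
import gzip

def fix_release_blob(blob_b64: bytes) -> bytes:
    """Decode a Helm release blob (base64 twice, then gzip), swap
    deprecated HPA API versions for autoscaling/v2, and re-encode."""
    raw = gzip.decompress(base64.b64decode(base64.b64decode(blob_b64)))
    for old in (b"autoscaling/v2beta2", b"autoscaling/v2beta1"):
        raw = raw.replace(old, b"autoscaling/v2")
    return base64.b64encode(base64.b64encode(gzip.compress(raw)))

# Round-trip check on a tiny sample manifest:
sample = b'{"apiVersion": "autoscaling/v2beta2"}'
blob = base64.b64encode(base64.b64encode(gzip.compress(sample)))
fixed = fix_release_blob(blob)
print(gzip.decompress(base64.b64decode(base64.b64decode(fixed))))
# b'{"apiVersion": "autoscaling/v2"}'
```

<p>Feed it the value of the <code>release</code> key from the secret, then write the result back with <code>kubectl patch</code> as in step 4.</p>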
<h2 id="heading-4-re-encode-and-update-the-helm-secret">4. Re-encode and Update the Helm Secret</h2>
<p>After making your changes, you need to re-encode the file:</p>
<pre><code class="lang-sh">cat release.data.decoded | gzip | base64 | base64 | tr -d <span class="hljs-string">'\n'</span> &gt; encoded_release.txt
</code></pre>
<p>Create a patch file named <code>patch.json</code>:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"data"</span>: {
    <span class="hljs-attr">"release"</span>: <span class="hljs-string">"&lt;NEW_ENCODED_RELEASE_DATA&gt;"</span>
  }
}
</code></pre>
<p>Replace <code>&lt;NEW_ENCODED_RELEASE_DATA&gt;</code> with the content of your <code>encoded_release.txt</code> file.</p>
<p>Apply the patch to update the Helm release secret:</p>
<pre><code class="lang-sh">kubectl patch secret &lt;helm_release_secret&gt; -n &lt;namespace&gt; --patch-file=patch.json
</code></pre>
<h2 id="heading-5-verify-and-redeploy">5. Verify and Redeploy</h2>
<p>Finally, run a Helm upgrade to ensure the fix is applied:</p>
<pre><code class="lang-sh">helm upgrade &lt;release_name&gt; &lt;chart_name&gt; -n &lt;namespace&gt;
</code></pre>
<p>To verify that the HorizontalPodAutoscaler is now using the correct API version:</p>
<pre><code class="lang-sh">kubectl get hpa -n &lt;namespace&gt; -o yaml | grep apiVersion
</code></pre>
<p>This should show <code>apiVersion: autoscaling/v2</code> for your HorizontalPodAutoscaler resources.</p>
<h1 id="heading-prevention-measures">Prevention Measures</h1>
<p>To avoid similar issues in the future:</p>
<ol>
<li><p><strong>Stay Informed About API Deprecations</strong>: Before upgrading Kubernetes, always check the release notes for any APIs that are deprecated and will be removed.</p>
</li>
<li><p><strong>Update Your Helm Charts</strong>: Make sure your charts are regularly updated to support the latest Kubernetes API versions.</p>
</li>
<li><p><strong>Use Automated Tools</strong>: Consider using tools like <a target="_blank" href="https://github.com/FairwindsOps/pluto">pluto</a> to find deprecated API usage in your cluster before upgrading.</p>
</li>
<li><p><strong>Test in Non-Production</strong>: Always test Kubernetes upgrades in a staging environment first to identify these kinds of issues.</p>
</li>
<li><p><strong>Helm Chart Maintenance</strong>: For teams maintaining their own charts, set up a process to regularly review and update API versions.</p>
</li>
</ol>
<h1 id="heading-conclusion">Conclusion</h1>
<p>API deprecations are a common challenge when upgrading Kubernetes, and Helm's release storage can make these issues especially difficult to solve. By learning how to diagnose and fix these issues within Helm's stored release data, you can ensure smoother upgrades and reduce disruptions to your services.</p>
<p>This specific fix for the HorizontalPodAutoscaler API version issue shows how important it is to stay updated with Kubernetes changes while keeping your deployment tools like Helm compatible.</p>
<p>Have you faced other interesting challenges when upgrading Kubernetes or working with Helm charts? Let me know.</p>
]]></content:encoded></item><item><title><![CDATA[S-SDLC -  A Part Of DevSecOps Journey]]></title><description><![CDATA["Security is not a checkbox. It's a mindset."

In today’s world of high-speed software development, traditional security models just don’t cut it anymore.That’s where S-SDLC (Secure Software Development Life Cycle) steps in — it's not just a buzzword...]]></description><link>https://blog.nh4ttruong.me/s-sdlc-a-part-of-devsecops-journey</link><guid isPermaLink="true">https://blog.nh4ttruong.me/s-sdlc-a-part-of-devsecops-journey</guid><dc:creator><![CDATA[Nhật Trường]]></dc:creator><pubDate>Sun, 27 Apr 2025 16:05:34 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1745769905445/f787435f-54b8-4d9e-a274-ef3174592a4f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>"Security is not a checkbox. It's a mindset."</p>
</blockquote>
<p>In today’s world of <strong>high-speed software development</strong>, traditional security models just don’t cut it anymore.<br />That’s where <strong>S-SDLC</strong> (Secure Software Development Life Cycle) steps in — it's not just a buzzword, it’s an absolute <strong>must-have</strong> if you're serious about DevSecOps.</p>
<h2 id="heading-what-the-heck-is-s-sdlc-anyway">What the Heck is S-SDLC Anyway?</h2>
<p>At its core, <strong>S-SDLC</strong> is about <strong>integrating security at every single stage</strong> of your software development process — from requirements gathering all the way to production monitoring.</p>
<p>It flips the script from <strong>"secure it later"</strong> to <strong>"build it secure from the start"</strong>. No more slapping on firewalls at the end and praying hackers don't show up.</p>
<p>Here are the typical stages:</p>
<ol>
<li><p><strong>Requirements Gathering</strong> (define security needs)</p>
</li>
<li><p><strong>Design</strong> (identify threats, create secure architectures)</p>
</li>
<li><p><strong>Implementation</strong> (secure coding practices)</p>
</li>
<li><p><strong>Testing</strong> (automated security testing, SAST/DAST tools)</p>
</li>
<li><p><strong>Deployment</strong> (secure configs, container hardening)</p>
</li>
<li><p><strong>Maintenance</strong> (continuous monitoring, patching)</p>
</li>
</ol>
<p>S-SDLC isn’t a replacement for DevOps or DevSecOps — it’s a <strong>core part</strong> of how they evolve.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1745769349657/0058523c-833f-4085-b4b7-9c90fda11a71.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-why-should-you-even-care">Why Should You Even Care?</h2>
<p>Because the old way was broken. Waiting until production to think about security is like building a car and checking if it has brakes <em>after</em> you hit the highway. 🚗💨</p>
<p>S-SDLC helps you:</p>
<ul>
<li><p>Catch security flaws early (when they're cheap to fix)</p>
</li>
<li><p>Build trust with customers and stakeholders</p>
</li>
<li><p>Comply with standards (ISO, GDPR, HIPAA, you name it)</p>
</li>
<li><p>Sleep better at night knowing your apps aren't a hacker’s playground</p>
</li>
</ul>
<hr />
<h2 id="heading-real-life-examples-of-s-sdlc-in-action">Real-Life Examples of S-SDLC in Action</h2>
<p>Here’s where the rubber meets the road. Let’s break down some <strong>real-world S-SDLC practices</strong>:</p>
<h3 id="heading-shifting-security-left">🛡️ Shifting Security Left</h3>
<p>During the <strong>Requirements Phase</strong>, teams set clear security goals like:</p>
<blockquote>
<p>"All customer data must be encrypted at rest using AES-256."</p>
</blockquote>
<p>No more vague "we’ll think about security later" BS. Specific, measurable, enforceable requirements from Day 1.</p>
<h3 id="heading-threat-modeling-before-coding">🧠 Threat Modeling Before Coding</h3>
<p>Before a single line of code drops, teams run <strong>Threat Modeling</strong> sessions.</p>
<p>Example:</p>
<ul>
<li><p>Identify spoofing risks in login flows.</p>
</li>
<li><p>Spot data tampering possibilities in APIs.</p>
</li>
</ul>
<p>Use tools like <strong>OWASP Threat Dragon</strong> or just good ol' whiteboard sessions.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1745769397844/dc4e607f-bbb8-4e5f-b410-ed0b002e5652.png" alt class="image--center mx-auto" /></p>
<hr />
<h3 id="heading-secure-coding-standards">✍️ Secure Coding Standards</h3>
<p>During development, engineers follow secure coding guidelines, such as:</p>
<ul>
<li><p>Parameterizing SQL queries (to avoid injections)</p>
</li>
<li><p>Validating all user inputs</p>
</li>
<li><p>Escaping outputs properly in web apps</p>
</li>
<li><p>…</p>
</li>
</ul>
<p>Think <a target="_blank" href="https://owasp.org/www-project-secure-coding-practices-quick-reference-guide/">OWASP Secure Coding Practices</a> — not "cowboy coding."</p>
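<p>As a tiny illustration of the first two bullets, here's a sketch using Python's built-in <code>sqlite3</code> (the placeholder style varies by driver, but the idea is identical everywhere): the user's input is passed as a bound parameter, never concatenated into the SQL string, so an injection payload stays an inert literal.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "admin"))

payload = "alice' OR '1'='1"  # classic injection attempt
rows = conn.execute(
    "SELECT role FROM users WHERE name = ?", (payload,)
).fetchall()
print(rows)  # [] -- the payload matched nothing; it was data, not SQL
```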
<hr />
<h3 id="heading-security-testing-in-cicd">🚀 Security Testing in CI/CD</h3>
<p>Security isn't some final boss fight at the end — it’s baked right into your pipelines:</p>
<ul>
<li><p>Static Application Security Testing (<strong>SAST</strong>) tools like SonarQube</p>
</li>
<li><p>Dynamic Application Security Testing (<strong>DAST</strong>) like OWASP ZAP</p>
</li>
<li><p>Dependency vulnerability scans with Snyk or Dependabot</p>
</li>
</ul>
<p>If your pipeline ain't yelling about vulnerabilities, you’re doing it wrong.</p>
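<p>One concrete (and hedged) sketch: in GitLab CI, the maintained security templates can be pulled straight into a pipeline, so every merge request gets SAST and dependency scanning by default. The template paths below come from GitLab's documented template library; check the docs for your GitLab version before copying.</p>

```yaml
# .gitlab-ci.yml (sketch) -- include GitLab's maintained security templates
include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml
```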
<hr />
<h3 id="heading-managing-third-party-dependencies">🛠️ Managing Third-Party Dependencies</h3>
<p>Third-party libraries can be sneaky — one bad package update, and boom 💥. S-SDLC enforces:</p>
<ul>
<li><p>Continuous monitoring of dependencies</p>
</li>
<li><p>Blocking known vulnerable libraries from builds</p>
</li>
<li><p>Automated patching where possible</p>
</li>
</ul>
<hr />
<h3 id="heading-continuous-monitoring-after-deployment">👀 Continuous Monitoring After Deployment</h3>
<p>Even after it's live, security doesn't stop. Use runtime security tools to collect and analyze logs from your hosts and your deployments (e.g., Elasticsearch with Auditbeat and Winlogbeat, or Tetragon).</p>
<p>It’s not just "deploy and hope" anymore — it’s "deploy and watch like a hawk."</p>
<hr />
<h2 id="heading-s-sdlc-and-devsecops-the-power-couple">S-SDLC and DevSecOps — The Power Couple 💍</h2>
<p>S-SDLC is a huge chunk of the <strong>DevSecOps mindset</strong>. You can’t automate what you don’t plan for. And you can’t "shift left" without a secure foundation.</p>
<p><strong>DevSecOps</strong> = DevOps + Security Everywhere.<br /><strong>S-SDLC</strong> = The game plan to actually make that happen.</p>
<p>One’s the vision, the other’s the execution.</p>
<hr />
<h2 id="heading-final-thoughts">Final Thoughts</h2>
<p>If you want to truly live that <strong>DevSecOps life</strong>, you can’t treat security like a side quest. You have to <strong>build it in</strong> — everywhere, always.</p>
<p><strong>S-SDLC</strong> makes that happen, making sure security is not just a "thing" you tack on, but the way you build, test, and run your software.</p>
<p>No excuses. No shortcuts. Just solid, secure apps from start to finish.</p>
]]></content:encoded></item></channel></rss>