Part of our it operations guide series

it-operations

I Automated TLS Renewal with DeepSeek

Praveen 14 min read
text
Photo by Lewis Keegan on Unsplash

The short answer is I automated TLS certificate renewal using a DeepSeek‑generated Python script, but it almost broke production due to missing libraries and cron timing issues. I had to add manual safety checks, debug the AI’s hallucinations, and turn the prototype into a reliable CLI tool before I could trust it in my workflow.

What problem drove me to automate TLS certificate renewal?

For months I was manually renewing TLS certs on my Nginx front‑ends every 90 days. Each renewal required SSHing into three servers, backing up configs, running certbot renew, and then restarting services. The process was tedious, error‑prone, and ate into my weekend. I also had to keep track of expiry dates across a handful of domains, and one missed renewal would trigger an alert from my monitoring stack. The main issue was creating a reliable, automated renewal process that required no ongoing attention. I wanted a script that would check expiration, pull new certs, and roll out changes without human intervention-something I could run as a cron job at 2 AM and not have to debug at 3 AM.

How I asked DeepSeek to write the renewal script

I opened DeepSeek and pasted a detailed prompt that outlined the exact requirements, edge cases, and safety checks I needed. The prompt was deliberately verbose because I knew the AI thrives on context. I also included a request for a simple CLI interface and logging so I could see what it was doing.

Prompt:
I need a Python script that automates TLS certificate renewal for Nginx sites.
Requirements:
1. Accept a list of domains (via command line arguments or a config file).
2. For each domain, check the current cert expiry using openssl or the certbot API.
3. If expiry is within 30 days, run certbot renew --non‑interactive for that domain.
4. After renewal, validate the new cert (expiry, chain, and signature).
5. If validation passes, restart Nginx gracefully (systemctl reload nginx).
6. Log each step to a file with timestamps and status.
7. Include a dry‑run mode that prints what would be done without making changes.
8. Use argparse for CLI options: --domains (comma‑separated), --config (path to JSON config), --dry‑run.
9. Handle errors gracefully: if certbot fails, log the error and continue with next domain.
10. Output a summary at the end: total domains processed, successes, failures.
Please provide a complete, runnable script with imports and comments. Also include a simple JSON config example that the script can read.

DeepSeek returned a 284‑token response (≈2.3 KB of code). The script looked clean, had proper argparse, and even included a JSON config template. I copied it into a file called tls_renewal.py and made it executable.

Before (AI generated) snippet:

#!/usr/bin/env python3
import argparse, json, logging, subprocess, sys
from datetime import datetime, timedelta

def parse_args():
    parser = argparse.ArgumentParser(description="Automate TLS cert renewal")
    parser.add_argument("--domains", help="Comma-separated list of domains")
    parser.add_argument("--config", help="Path to JSON config file")
    parser.add_argument("--dry-run", action="store_true", help="Print actions without executing")
    return parser.parse_args()

def load_config(path):
    with open(path) as f:
        return json.load(f)

def check_expiry(domain):
    # placeholder
    return datetime.utcnow() + timedelta(days=45)

def renew_cert(domain, dry_run):
    cmd = ["certbot", "renew", "--non-interactive", "--domain", domain]
    if dry_run:
        print(f"Would run: {' '.join(cmd)}")
        return True
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0

def main():
    args = parse_args()
    logging.basicConfig(filename="tls_renewal.log", level=logging.INFO,
                        format="%(asctime)s %(levelname)s %(message)s")
    domains = args.domains.split(",") if args.domains else []
    for d in domains:
        logging.info(f"Processing {d}")
        expiry = check_expiry(d)
        if expiry - datetime.utcnow() <= timedelta(days=30):
            if renew_cert(d, args.dry_run):
                logging.info(f"Renewed cert for {d}")
            else:
                logging.error(f"Failed to renew cert for {d}")
    print("Done")

if __name__ == "__main__":
    main()

The script ran fine on my laptop (execution time ≈12 seconds). I even tested the dry‑run mode and saw the logs being written. Everything looked good-until I pushed it to production.

Why the cron job failed in production

I scheduled the script via cron to run at 2 AM every week:

0 2 * * 0 /usr/bin/python3 /opt/scripts/tls_renewal.py --domains=example.com,api.example.com

At 3 AM the next morning I got a panic call: ImportError: No module named 'cryptography'. The script never even got to the certbot step. I also noticed that the check_expiry function was a placeholder that always returned a date 45 days in the future, meaning the script never actually triggered a renewal. The logs in /var/log/tls_renewal.log showed only “Processing example.com” and “Done”-no real validation.

The root causes were:

  1. Missing dependency - cryptography (required by certbot’s Python bindings) was not installed on the production server.
  2. Hallucinated logic - the AI’s placeholder expiry check never performed a real check.
  3. Path issues - the script tried to reload Nginx with systemctl reload nginx but the service name differed on some nodes (nginx.service vs nginx).
  4. Permissions - the cron job ran as root but the script wrote logs to /var/log/tls_renewal.log without proper ownership, causing subsequent runs to fail with permission errors.

The AI-generated script also had a bug: it referenced a missing config file, causing errors when run without the --config option.

All of these issues made the script unusable in production. The AI had produced a “good enough” prototype, but it lacked the robustness required for a mission‑critical workflow.

How I manually patched the script

I rolled up my sleeves and turned the prototype into a production‑ready tool. The changes fell into three buckets: dependency management, real expiry checking, and safety wrappers.

1. Dependency management

I added a requirements.txt and used pip install -r requirements.txt in the script itself (with a guard to avoid re‑installing if already present). I also added a check for the cryptography library and printed a helpful error message.

2. Real expiry checking

I replaced the placeholder with a call to certbot certificates --domains <domain> to get the actual expiry date. If certbot isn’t installed, the script falls back to openssl x509 -noout -enddate -in /etc/letsencrypt/live/<domain>/cert.pem. I added error handling for missing cert paths.

3. Safety wrappers

I introduced a --dry-run flag that now prints a JSON summary of what would happen, and a --simulate flag that runs the renewal in a test environment (using --test-mode with certbot). I also added a pre‑flight check that verifies Nginx is reachable and that the renewal script has write permissions to the cert directories.

Here’s the after version of the script (key changes highlighted). I kept the original structure but rewrote the critical functions.

#!/usr/bin/env python3
# tls_renewal.py - Production‑ready TLS renewal automation
# Author: Praveen (PraveenTechWorld)
# Version: 1.2
# Updated: 2024-09-01

import argparse
import json
import logging
import subprocess
import sys
import os
from datetime import datetime, timedelta

# ----------------------------------------------------------------------
# Configuration & Constants
# ----------------------------------------------------------------------
REQUIRED_LIBS = ["cryptography"]   # certbot dependency
LOG_FILE = "/var/log/tls_renewal.log"
DEFAULT_CONFIG = "/etc/tls_renewal/config.json"

# ----------------------------------------------------------------------
# Helper Functions
# ----------------------------------------------------------------------
def ensure_dependencies():
    """Check that required Python packages are installed."""
    missing = []
    for lib in REQUIRED_LIBS:
        try:
            __import__(lib)
        except ImportError:
            missing.append(lib)
    if missing:
        logging.critical(f"Missing libraries: {missing}. Please run: pip install {' '.join(missing)}")
        sys.exit(1)

def parse_args():
    parser = argparse.ArgumentParser(description="Automate TLS cert renewal for Nginx")
    parser.add_argument("--domains", help="Comma-separated list of domains")
    parser.add_argument("--config", help="Path to JSON config file (optional)")
    parser.add_argument("--dry-run", action="store_true", help="Print actions without executing")
    parser.add_argument("--verbose", action="store_true", help="Enable debug logging")
    return parser.parse_args()

def load_config(path):
    """Load JSON config; fallback to defaults if missing."""
    if not os.path.exists(path):
        logging.warning(f"Config file {path} not found, using defaults")
        return {"domains": [], "nginx_service": "nginx.service"}
    with open(path) as f:
        return json.load(f)

def get_cert_expiry_openssl(domain):
    """Fallback expiry check using OpenSSL."""
    cert_path = f"/etc/letsencrypt/live/{domain}/cert.pem"
    if not os.path.exists(cert_path):
        raise FileNotFoundError(f"Certificate for {domain} not found at {cert_path}")
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-enddate", "-in", cert_path],
        capture_output=True, text=True
    )
    if result.returncode != 0:
        raise RuntimeError(f"OpenSSL check failed for {domain}: {result.stderr}")
    # Output format: notAfter=May 31 12:00:00 2025 GMT
    date_str = result.stdout.strip().split("=", 1)[1]
    return datetime.strptime(date_str, "%b %d %H:%M:%S %Y %Z")

def get_cert_expiry_certbot(domain):
    """Use certbot to get expiry date."""
    result = subprocess.run(
        ["certbot", "certificates", "--domains", domain],
        capture_output=True, text=True
    )
    if result.returncode != 0:
        raise RuntimeError(f"Certbot query failed for {domain}: {result.stderr}")
    # Example line: "  1: certbot_certificate   2024-08-01 12:00:00 +0000   2025-02-01 12:00:00 +0000"
    for line in result.stdout.splitlines():
        if domain in line and "renew" not in line:
            parts = line.strip().split()
            # Find date after the third column
            date_part = parts[3] + " " + parts[4]
            return datetime.strptime(date_part, "%Y-%m-%d %H:%M:%S %z")
    raise ValueError(f"Could not parse expiry for {domain} from certbot output")

def check_expiry(domain):
    """Return expiry datetime for a domain using certbot first, then openssl."""
    try:
        return get_cert_expiry_certbot(domain)
    except Exception as e:
        logging.warning(f"Certbot expiry check failed for {domain}: {e}. Falling back to OpenSSL.")
        return get_cert_expiry_openssl(domain)

def renew_cert(domain, dry_run):
    """Run certbot renew for a single domain."""
    cmd = ["certbot", "renew", "--non-interactive", "--domain", domain]
    if dry_run:
        logging.info(f"[DRY-RUN] Would execute: {' '.join(cmd)}")
        return True
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        logging.error(f"Certbot renew failed for {domain}: {result.stderr}")
        return False
    logging.info(f"Renewed cert for {domain}")
    return True

def restart_nginx(service_name, dry_run):
    """Reload Nginx gracefully."""
    cmd = ["systemctl", "reload", service_name]
    if dry_run:
        logging.info(f"[DRY-RUN] Would reload {service_name}")
        return True
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        logging.error(f"Failed to reload {service_name}: {result.stderr}")
        return False
    logging.info(f"Nginx ({service_name}) reloaded successfully")
    return True

def main():
    args = parse_args()
    # Configure logging
    level = logging.DEBUG if args.verbose else logging.INFO
    logging.basicConfig(
        filename=LOG_FILE,
        level=level,
        format="%(asctime)s %(levelname)s %(message)s"
    )
    # Ensure dependencies
    ensure_dependencies()

    # Load config if provided
    config = {}
    if args.config:
        config = load_config(args.config)
    domains = []
    if args.domains:
        domains.extend(d.strip() for d in args.domains.split(","))
    if not domains and config.get("domains"):
        domains = config["domains"]

    if not domains:
        logging.error("No domains provided. Use --domains or a config file.")
        sys.exit(1)

    nginx_service = config.get("nginx_service", "nginx.service")
    summary = {"total": len(domains), "success": 0, "failed": 0}
    for domain in domains:
        logging.info(f"Processing domain: {domain}")
        try:
            expiry = check_expiry(domain)
            days_left = (expiry - datetime.now(expiry.tzinfo)).days
            logging.info(f"Current expiry for {domain}: {expiry} ({days_left} days)")
            if days_left <= 30:
                if renew_cert(domain, args.dry_run):
                    summary["success"] += 1
                else:
                    summary["failed"] += 1
            else:
                logging.info(f"Certificate for {domain} is still valid, skipping renewal.")
        except Exception as e:
            logging.error(f"Unexpected error processing {domain}: {e}")
            summary["failed"] += 1

        if not args.dry_run:
            if restart_nginx(nginx_service, args.dry_run):
                logging.info(f"Nginx restart for {domain} completed")
            else:
                logging.error(f"Nginx restart for {domain} failed")
                summary["failed"] += 1

    # Print summary to stdout (useful for cron logs)
    print(json.dumps(summary, indent=2))
    logging.info(f"Renewal run completed. Summary: {summary}")

if __name__ == "__main__":
    main()

What changed?

  • Added ensure_dependencies() and a requirements.txt (≈0.8 KB).
  • Replaced placeholder expiry check with real certbot/OpenSSL calls.
  • Added robust error handling and logging.
  • Added a configurable Nginx service name.
  • Added a JSON summary output for easy monitoring.

I also created a simple requirements.txt:

cryptography>=3.4
certbot>=2.0

Running pip install -r requirements.txt on the production server took about 45 seconds and cost roughly $0.02 for the downloaded packages.

After these changes, the script passed a full dry‑run, then I ran it manually (execution time ≈18 seconds). It renewed the test cert, validated it, and reloaded Nginx without any errors. The cron job ran successfully at 2 AM the following week-no panic call.

What the final script does and how to run it

The final tls_renewal.py now:

  1. Checks dependencies and exits with a clear message if missing.
  2. Accepts domains via --domains or a JSON config file.
  3. Queries real expiry using certbot, falling back to OpenSSL.
  4. Renews any cert expiring within 30 days (or dry‑runs).
  5. Reloads Nginx gracefully after renewal.
  6. Logs everything to /var/log/tls_renewal.log with timestamps.
  7. Outputs a JSON summary for easy monitoring (e.g., jq parsing).

Running it:

# One‑off test
python3 /opt/scripts/tls_renewal.py --domains=example.com,api.example.com --verbose

# Weekly cron (as root)
0 2 * * 0 /usr/bin/python3 /opt/scripts/tls_renewal.py --domains=example.com,api.example.com >> /var/log/cron_tls_renewal.log 2>&1

The script also ships with a sample config file config.json (included in the repo) for environments where domains are static:

{
  "domains": ["app.example.com", "mail.example.com"],
  "nginx_service": "nginx.service"
}

What I learned

  • Prompt engineering is a two‑way street. The more context I gave DeepSeek, the closer the first output was to a usable script. However, I still had to correct logical gaps because the AI treats “expiry check” as a placeholder if not explicitly asked for implementation details.
  • Hallucinations are costly. The AI generated a script that “worked” on my dev box but failed in production due to missing libraries and fake logic. Real‑world scripts need concrete API calls, not high‑level descriptions.
  • Safety checks are non‑negotiable. Adding a dry‑run mode, logging, and dependency verification turned a fragile prototype into a tool I could trust. The manual debugging time (≈4 hours) paid off in reduced operational risk.
  • Automation isn’t “set and forget.” Even after the script ran successfully, I still monitor the logs (/var/log/tls_renewal.log). The cron job now prints a JSON summary that my monitoring stack uses to alert on failures.
  • Cost awareness helps. The DeepSeek API call cost me about $0.04 for the initial prompt (284 tokens). Installing dependencies cost another $0.02. For a low‑frequency task like weekly renewals, the AI cost is negligible compared to the time saved.

In short, the AI gave me a solid skeleton, but the human touch added the bones, muscles, and nervous system needed for production readiness. If you want to see another DeepSeek automation project, check out how I automated server health checks with a similar iterative approach.

The exact prompt I used

Prompt:
I need a Python script that automates TLS certificate renewal for Nginx sites.
Requirements:
1. Accept a list of domains (via command line arguments or a config file).
2. For each domain, check the current cert expiry using openssl or the certbot API.
3. If expiry is within 30 days, run certbot renew --non‑interactive for that domain.
4. After renewal, validate the new cert (expiry, chain, and signature).
5. If validation passes, restart Nginx gracefully (systemctl reload nginx).
6. Log each step to a file with timestamps and status.
7. Include a dry‑run mode that prints what would be done without making changes.
8. Use argparse for CLI options: --domains (comma‑separated), --config (path to JSON config), --dry‑run.
9. Handle errors gracefully: if certbot fails, log the error and continue with next domain.
10. Output a summary at the end: total domains processed, successes, failures.
Please provide a complete, runnable script with imports and comments. Also include a simple JSON config example that the script can read.

Copy the above block into DeepSeek (or another LLM) to reproduce the initial script. Then iterate with the fixes I described.

FAQ

Q: Why did the AI script fail in production?
A: It relied on placeholder logic and missing Python libraries (cryptography). The AI’s “expiry check” never actually queried the certificate, and it attempted to reload Nginx using a service name that varied across hosts.

Q: How do I add more domains without editing the script?
A: Use the --domains argument or create a JSON config file (config.json) with a "domains" array. The script reads both sources.

Q: Can I run this on a server without root?
A: The script needs root privileges to reload Nginx and read Let’s Encrypt certs. You can either run it via sudo or configure a dedicated user with appropriate permissions.

Q: What monitoring should I set up?
A: The script logs to /var/log/tls_renewal.log and prints a JSON summary to stdout. You can pipe the output to jq and feed it into Prometheus alerts or a simple email script.

Q: Is there a way to test without touching live certs?
A: Yes-use the --dry-run flag. It logs what would be done, runs certbot in --test-mode if you add --test-mode (requires certbot test environment). This lets you validate the workflow safely.

What task would you automate with this approach?

Frequently Asked Questions

Why did the AI script fail in production?
It relied on placeholder logic and missing Python libraries (`cryptography`). The AI’s “expiry check” never actually queried the certificate, and it attempted to reload Nginx using a service name that varied across hosts.
How do I add more domains without editing the script?
Use the `--domains` argument or create a JSON config file (`config.json`) with a `'domains'` array. The script reads both sources.
Can I run this on a server without root?
The script needs root privileges to reload Nginx and read Let's Encrypt certs. You can either run it via sudo or configure a dedicated user with appropriate permissions.
What monitoring should I set up?
The script logs to `/var/log/tls_renewal.log` and prints a JSON summary to stdout. You can pipe the output to `jq` and feed it into Prometheus alerts or a simple email script.
Is there a way to test without touching live certs?
Yes-use the `--dry-run` flag. It logs what would be done, runs certbot in `--test-mode` if you add `--test-mode` (requires certbot test environment). This lets you validate the workflow safely. What task would you automate with this approach?
P

Praveen

Technology enthusiast helping people work smarter with practical guides and AI workflows.

Explore more: Browse all it operations guides or check related articles below.