# cPanel cron jobs — KomoonRSSCrawler

The rename to **KomoonRSSCrawler** affects the local repo only; production paths are unchanged. The cPanel app still lives at `crawler.komoon.app/app`.

**Always use the full Node binary** from the app virtualenv — bare `node` fails in cron (`node: command not found`).

```text
/home/bytescorp/nodevenv/crawler.komoon.app/app/20/bin/node
```
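
Before saving a cron entry, it can help to confirm over SSH that the binary path is executable and that a bare command name really would not resolve for cron. The sketch below is a minimal check, not part of the app: both helper names are hypothetical, and `/usr/bin:/bin` is only an assumed stand-in for the reduced `PATH` that cron typically provides.

```bash
# Hypothetical helper: succeeds only if the given path is an executable file.
is_executable_file() {
  [ -f "$1" ] && [ -x "$1" ]
}

# Hypothetical helper: succeeds only if a bare command name resolves under a
# minimal PATH. /usr/bin:/bin is an assumption about cron's environment.
# Runs in a subshell so the caller's PATH is left untouched.
resolves_in_minimal_path() (
  PATH=/usr/bin:/bin
  command -v "$1" >/dev/null 2>&1
)

NODE_BIN=/home/bytescorp/nodevenv/crawler.komoon.app/app/20/bin/node

if is_executable_file "$NODE_BIN"; then
  echo "full path OK: $NODE_BIN"
else
  echo "not executable or missing: $NODE_BIN"
fi

if resolves_in_minimal_path node; then
  echo "bare 'node' resolves under the minimal PATH"
else
  echo "bare 'node' would not resolve under the minimal PATH"
fi
```

If the second check reports that bare `node` does not resolve, that is consistent with the `node: command not found` error described above.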

**App directory:**

```text
/home/bytescorp/crawler.komoon.app/app
```

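Enter each cron command on **a single line**; crontab entries do not support `\` line continuations. If you split a command with `\` while testing in a shell, leave no trailing space after the `\`.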

---

## 1. Zacatecas state (state-wide providers)

Runs the state-wide providers: those with an **empty municipality** for the state.

**Schedule:** `*/30 * * * *` (every 30 minutes)

```bash
cd /home/bytescorp/crawler.komoon.app/app && NODE_OPTIONS="--disable-wasm-trap-handler" CRAWL_COUNTRY=MX CRAWL_STATE=Zacatecas /home/bytescorp/nodevenv/crawler.komoon.app/app/20/bin/node src/runBatch.js
```

---

## 2. JSON cleanup (monthly)

Deletes date-folders under `data/` that are older than the retention period (default 30 days). Does **not** remove `sync_manifest.json` or `pending_sync.ndjson`.

**Schedule:** `0 3 3 * *` (03:00 on the 3rd of each month) — adjust if you prefer daily `0 3 * * *`.

```bash
cd /home/bytescorp/crawler.komoon.app/app && /home/bytescorp/nodevenv/crawler.komoon.app/app/20/bin/node src/cleanOldNewsJson.js
```

> **Fix:** The previous cron used bare `node`; replace with the full `nodevenv` path above.
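
To preview which folders a cleanup run might remove, a read-only listing can be useful. The sketch below is not what `src/cleanOldNewsJson.js` does internally. It assumes the date-folders are named `YYYY-MM-DD`, which this document does not confirm, and the helper name `list_old_date_dirs` is hypothetical. It only prints names and deletes nothing.

```bash
# Hypothetical helper: print the names of date-folders under a data directory
# that are strictly older than a cutoff date.
# Assumption: folders are named YYYY-MM-DD. Anything else (plain files such as
# sync_manifest.json, or folders with other names) is ignored.
list_old_date_dirs() (
  data_dir="$1"
  cutoff="$2"   # YYYY-MM-DD
  cutoff_num="$(printf '%s' "$cutoff" | tr -d '-')"

  for dir in "$data_dir"/*/; do
    [ -d "$dir" ] || continue
    name="$(basename "$dir")"
    case "$name" in
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
      *) continue ;;
    esac
    # Strip the dashes so the dates compare as plain integers (YYYYMMDD).
    name_num="$(printf '%s' "$name" | tr -d '-')"
    if [ "$name_num" -lt "$cutoff_num" ]; then
      printf '%s\n' "$name"
    fi
  done
)

# Example call (the cutoff is passed explicitly so the result is predictable):
#   list_old_date_dirs /home/bytescorp/crawler.komoon.app/app/data 2024-02-01
```

Passing the cutoff as an argument avoids depending on a particular `date` implementation to compute "30 days ago".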

---

## 3. Jerez municipality

Only providers with `municipality` matching `Jerez`.

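**Schedule (current):** `* 1 * * *` (every minute during 01:00–01:59, i.e. 60 runs per day). If this is too aggressive, consider `*/15 * * * *` or `0,30 * * * *`; note that both of those run around the clock rather than only during the 01:00 hour.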

```bash
cd /home/bytescorp/crawler.komoon.app/app && NODE_OPTIONS="--disable-wasm-trap-handler" CRAWL_COUNTRY=MX CRAWL_STATE=Zacatecas CRAWL_MUNICIPALITY=Jerez /home/bytescorp/nodevenv/crawler.komoon.app/app/20/bin/node src/runBatch.js
```

---

## 4. Zacatecas municipality (city)

Only providers with `municipality` set to `Zacatecas` (not state-wide empty-municipality providers).

**Schedule (current):** `* 2 * * *` (every minute during 02:00–02:59).

```bash
cd /home/bytescorp/crawler.komoon.app/app && NODE_OPTIONS="--disable-wasm-trap-handler" CRAWL_COUNTRY=MX CRAWL_STATE=Zacatecas CRAWL_MUNICIPALITY=Zacatecas /home/bytescorp/nodevenv/crawler.komoon.app/app/20/bin/node src/runBatch.js
```
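
The `runBatch.js` commands in sections 1, 3 and 4 differ only in whether `CRAWL_MUNICIPALITY` is set. As a way to avoid copy-paste mistakes when adding another municipality, the sketch below prints the one-line command for a given state and optional municipality. The function name `crawl_cmd` is hypothetical; the paths and variables are the ones used in this document.

```bash
NODE_BIN=/home/bytescorp/nodevenv/crawler.komoon.app/app/20/bin/node
APP_DIR=/home/bytescorp/crawler.komoon.app/app

# Hypothetical helper: print the single-line cron command for a crawl.
#   $1 = state (required), $2 = municipality (optional; omit for state-wide)
# Values are inserted unquoted, so this sketch assumes they contain no spaces.
crawl_cmd() (
  state="$1"
  municipality="${2:-}"
  env_vars="NODE_OPTIONS=\"--disable-wasm-trap-handler\" CRAWL_COUNTRY=MX CRAWL_STATE=$state"
  if [ -n "$municipality" ]; then
    env_vars="$env_vars CRAWL_MUNICIPALITY=$municipality"
  fi
  printf 'cd %s && %s %s src/runBatch.js\n' "$APP_DIR" "$env_vars" "$NODE_BIN"
)

crawl_cmd Zacatecas            # state-wide command (section 1)
crawl_cmd Zacatecas Jerez      # Jerez command (section 3)
crawl_cmd Zacatecas Zacatecas  # Zacatecas city command (section 4)
```

Each printed line can be pasted into the cPanel command field as-is; the schedule is still set separately.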

---

## Manual test (SSH)

```bash
source /home/bytescorp/nodevenv/crawler.komoon.app/app/20/bin/activate && cd /home/bytescorp/crawler.komoon.app/app
NODE_OPTIONS="--disable-wasm-trap-handler" CRAWL_COUNTRY=MX CRAWL_STATE=Zacatecas node src/runBatch.js
```

## Deploy updated code

Upload the changed files from the local `KomoonRSSCrawler` repo to `/home/bytescorp/crawler.komoon.app/app` (via FTP or Git), then **Restart** the Node app in cPanel if needed. Cron jobs do not need path changes unless you relocate the server directory.
