Status Command¶

The manta_node status command displays information about running node instances, including resource usage, task execution, and health metrics.

Overview¶

The status command provides:

List of running node instances
Process IDs and runtime information
Resource usage (CPU, memory)
Task execution status
Health and connectivity state

Synopsis¶

manta_node status [options]

Options¶

--plain

Use plain text output instead of formatted table

Simpler format for scripting
One node per section
Machine-readable output

--all

Show all instances (default behavior)

Same as running without options
Included for consistency

Usage Examples¶

Basic Status Check¶

View all running nodes:

$ manta_node status
┌─ Running Node Instances ────────────────────────────────────┐
│ Alias      │ Config     │ PID   │ Status  │ CPU % │ Memory │
├────────────┼────────────┼───────┼─────────┼───────┼────────┤
│ prod-gpu-1 │ production │ 12345 │ running │ 45.2% │ 2048MB │
│ dev-node   │ default    │ 12346 │ running │ 12.1% │ 512MB  │
│ test-1     │ test       │ 12347 │ running │ 78.9% │ 4096MB │
└────────────┴────────────┴───────┴─────────┴───────┴────────┘

Plain Text Output¶

For scripting and automation:

$ manta_node status --plain
Found 3 running instance(s):

Instance: production-a3f2c891
  Alias: prod-gpu-1
  Config: production
  PID: 12345
  Status: running
  CPU: 45.2%
  Memory: 2048 MB
  Started: 2024-03-15 10:30:15

Instance: default-b7d4f932
  Alias: dev-node
  Config: default
  PID: 12346
  Status: running
  CPU: 12.1%
  Memory: 512 MB
  Started: 2024-03-15 11:45:30

No Running Nodes¶

When no nodes are active:

$ manta_node status
No running node instances found.

Start a node with:
  manta_node start [config_name]

Output Format¶

Table Format¶

Default rich table display shows:

Alias: Node friendly name
Config: Configuration file used
PID: Process identifier
Status: Current state (running, stopping, error)
CPU %: CPU usage percentage
Memory (MB): Memory usage in megabytes
Started: Timestamp when node started

Plain Format¶

Plain text output includes:

Instance: Full instance identifier
Alias: Node alias
Config: Configuration name
PID: Process ID
Status: Running state
CPU: CPU usage (if available)
Memory: Memory usage (if available)
Started: Start timestamp

Instance Information¶

Instance Tracking¶

Nodes are tracked via JSON files in ~/.manta/nodes/instances/:

{
  "instance_id": "production-a3f2c891",
  "alias": "prod-gpu-1",
  "config_name": "production",
  "pid": 12345,
  "start_time": "2024-03-15T10:30:15",
  "status": "running",
  "manager_host": "localhost",
  "manager_port": 50051,
  "log_file": "~/.manta/logs/nodes/prod-gpu-1.log"
}
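
As a rough illustration of how such records could be read from a script, the sketch below loads every *.json file from the instances directory. The helper name load_instances is hypothetical and not part of manta_node; the directory and field names are taken from the example above.

```python
import json
from pathlib import Path

# Directory quoted on this page. load_instances() is a hypothetical helper
# written for illustration; it is not part of the manta_node API.
INSTANCES_DIR = Path.home() / ".manta" / "nodes" / "instances"


def load_instances(directory=INSTANCES_DIR):
    """Return a list of instance records read from *.json files."""
    records = []
    directory = Path(directory)
    if not directory.is_dir():
        return records
    for path in sorted(directory.glob("*.json")):
        try:
            with path.open(encoding="utf-8") as handle:
                record = json.load(handle)
        except (OSError, json.JSONDecodeError):
            # Skip unreadable or partially written files.
            continue
        if isinstance(record, dict):
            records.append(record)
    return records


if __name__ == "__main__":
    for record in load_instances():
        print(record.get("alias"), record.get("pid"), record.get("status"))
```

Unreadable files are skipped rather than raising, so a half-written instance file does not abort the whole listing.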

Process Verification¶

The status command:

Reads instance files: Loads all *.json files from the instances directory
Verifies processes: Checks whether each recorded PID is still running
Collects metrics: Gathers CPU/memory usage (if psutil is available)
Cleans stale files: Removes files for dead processes
Formats output: Displays in table or plain text
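
The verification step can be sketched roughly as follows. This is a minimal illustration for POSIX systems, not the command's actual implementation: pid_is_running and split_live_and_stale are hypothetical names, and the records are assumed to be dictionaries with a "pid" key as in the instance file shown earlier.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX signal 0 probe)."""
    if pid <= 0:
        # Non-positive PIDs address process groups, never a single process.
        return False
    try:
        os.kill(pid, 0)  # Signal 0 only performs existence/permission checks.
    except ProcessLookupError:
        return False  # No such process.
    except PermissionError:
        return True  # Process exists but belongs to another user.
    return True


def split_live_and_stale(records):
    """Partition instance records by whether their PID is still running."""
    live, stale = [], []
    for record in records:
        pid = record.get("pid")
        if isinstance(pid, int) and pid_is_running(pid):
            live.append(record)
        else:
            stale.append(record)
    return live, stale
```

This probe should not be used on Windows, where os.kill treats signal 0 differently. A zombie process still answers the probe, so it counts as live here.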

Resource Metrics¶

With psutil installed:

CPU percentage: Current CPU usage
Memory usage: RSS (Resident Set Size) in MB
Process time: Creation timestamp
Thread count: Number of active threads

Without psutil:

Shows “N/A” for metrics
Still displays PID and status
Basic functionality maintained
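
One way to get this fallback behavior is an optional import. The sketch below is an assumption about how such a collector could look, not the command's actual code; collect_metrics is a hypothetical name, and only documented psutil calls are used.

```python
from datetime import datetime

try:
    import psutil  # Optional third-party dependency.
except ImportError:
    psutil = None


def collect_metrics(pid):
    """Return CPU/memory details for a PID, or "N/A" placeholders."""
    unavailable = {"cpu": "N/A", "memory": "N/A", "started": "N/A", "threads": "N/A"}
    if psutil is None:
        return unavailable
    try:
        proc = psutil.Process(pid)
        started = datetime.fromtimestamp(proc.create_time())
        return {
            # Sample CPU usage over a short blocking interval.
            "cpu": f"{proc.cpu_percent(interval=0.1):.1f}%",
            # RSS is reported in bytes; convert to whole megabytes.
            "memory": f"{proc.memory_info().rss // (1024 * 1024)} MB",
            "started": started.strftime("%Y-%m-%d %H:%M:%S"),
            "threads": proc.num_threads(),
        }
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        # The process vanished or cannot be inspected by this user.
        return unavailable
```

The function returns the same keys whether or not psutil is installed, so the formatting code needs no special case for missing metrics.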

Status States¶

Node States¶

running

Node is active and healthy
Accepting and executing tasks
Connected to manager

stopping

Shutdown initiated
Completing active tasks
Not accepting new tasks

error

Node encountered critical error
May need restart
Check logs for details

Process States¶

Active

Process exists and is responding
PID verified via signal 0
Resources being consumed

Zombie

Process terminated but not reaped
Shows as “defunct” in ps
Requires parent cleanup

Stale

Instance file exists but the process is dead
Automatically cleaned up
File removed on detection

Integration with Other Commands¶

Status Before Starting¶

Check before starting new nodes:

# Check what's running
manta_node status

# Start a node only if none are running
if manta_node status --plain | grep -q "No running node instances"; then
    manta_node start production
fi

Status After Stopping¶

Verify nodes stopped:

# Stop all nodes
manta_node stop --all

# Verify stopped
manta_node status
# Should show: "No running node instances found"

Monitor During Execution¶

Watch node status:

# Monitor status every 5 seconds
watch -n 5 manta_node status

# Or in a loop
while true; do
    clear
    manta_node status
    sleep 5
done

Scripting and Automation¶

Parse Status Output¶

Extract information for scripts:

# Get PIDs of all running nodes
manta_node status --plain | grep "PID:" | awk '{print $2}'

# Count running nodes
count=$(manta_node status --plain | grep -c "Instance:")
echo "Running nodes: $count"

# Check if a specific node is running (match the alias field exactly)
if manta_node status --plain | grep -q "Alias: prod-gpu-1$"; then
    echo "Production node is running"
fi

Health Check Script¶

#!/bin/bash
# health_check.sh - Monitor node health

# Get status
output=$(manta_node status --plain)

# Check if any nodes running
if echo "$output" | grep -q "No running node instances"; then
    echo "WARNING: No nodes running"
    exit 1
fi

# Check CPU usage
while IFS= read -r line; do
    if [[ $line =~ CPU:\ ([0-9]+\.[0-9]+)% ]]; then
        cpu="${BASH_REMATCH[1]}"
        if (( $(echo "$cpu > 90" | bc -l) )); then
            echo "WARNING: High CPU usage: $cpu%"
        fi
    fi
done <<< "$output"

Python Integration¶

import subprocess

def get_node_status():
    """Get status of all running nodes."""
    result = subprocess.run(
        ['manta_node', 'status', '--plain'],
        capture_output=True,
        text=True
    )

    nodes = []
    current_node = {}

    for line in result.stdout.splitlines():
        # Split each "Key: value" line once; lines without a colon are skipped
        key, sep, value = line.strip().partition(':')
        if not sep:
            continue
        if key == 'Instance':
            # An "Instance:" line starts the next node record
            if current_node:
                nodes.append(current_node)
            current_node = {'instance': value.strip()}
        elif current_node:
            current_node[key.lower()] = value.strip()

    if current_node:
        nodes.append(current_node)

    return nodes

# Example usage
nodes = get_node_status()
for node in nodes:
    print(f"{node['alias']}: {node.get('cpu', 'N/A')}")

Performance Monitoring¶

Resource Tracking¶

Monitor resource trends:

# Log resource usage over time
while true; do
    timestamp=$(date '+%Y-%m-%d %H:%M:%S')
    manta_node status --plain | grep -E "(CPU|Memory):" | \
        while IFS= read -r line; do
            echo "$timestamp $line" >> node_metrics.log
        done
    sleep 60
done
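
The shell loop above records each metric line without the instance it belongs to. The sketch below is one possible alternative that writes one CSV row per node. It assumes nodes is a list of dictionaries shaped like those returned by get_node_status() in the Python Integration section. The append_metrics name and the column layout are hypothetical.

```python
import csv
from datetime import datetime
from pathlib import Path

# Hypothetical column layout for the metrics log.
FIELDS = ["timestamp", "instance", "alias", "cpu", "memory"]


def append_metrics(nodes, log_path, now=None):
    """Append one CSV row per node, writing a header for a new file."""
    log_path = Path(log_path)
    is_new = not log_path.exists() or log_path.stat().st_size == 0
    timestamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        for node in nodes:
            writer.writerow({
                "timestamp": timestamp,
                "instance": node.get("instance", ""),
                "alias": node.get("alias", ""),
                # Values are stored as reported, including "N/A".
                "cpu": node.get("cpu", "N/A"),
                "memory": node.get("memory", "N/A"),
            })
    return len(nodes)
```

Calling append_metrics(get_node_status(), "node_metrics.csv") once a minute from a scheduler would build a per-node history that can be loaded into a spreadsheet or plotting tool.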

Alert on High Usage¶

# Alert when resources exceed threshold
check_resources() {
    manta_node status --plain | while IFS= read -r line; do
        if [[ $line =~ Memory:\ ([0-9]+)\ MB ]]; then
            mem="${BASH_REMATCH[1]}"
            if [ "$mem" -gt 8192 ]; then
                echo "ALERT: High memory usage: ${mem}MB"
                # Send notification
            fi
        fi
    done
}

Troubleshooting¶

Missing Metrics¶

If CPU/Memory show “N/A”:

# Install psutil for metrics (into the same Python environment as manta_node)
pip install psutil

# Verify installation
python -c "import psutil; print(psutil.__version__)"

Stale Instance Files¶

Clean up orphaned files:

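# Manual cleanup (the loop runs only if the directory exists)
cd ~/.manta/nodes/instances && for file in *.json; do
    [ -e "$file" ] || continue   # No instance files present
    pid=$(grep -o '"pid": [0-9]*' "$file" | cut -d' ' -f2)
    [ -n "$pid" ] || continue    # Skip files without a readable PID
    if ! ps -p "$pid" > /dev/null 2>&1; then
        echo "Removing stale: $file"
        rm "$file"
    fi
done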

Permission Issues¶

If status shows permission errors:

# Check file permissions
ls -la ~/.manta/nodes/instances/

# Fix permissions
chmod 644 ~/.manta/nodes/instances/*.json

Display Issues¶

If table formatting is broken:

# Use plain format
manta_node status --plain

# Check terminal width
echo "$COLUMNS"

# Report a wider width for the current shell session
export COLUMNS=120

Best Practices¶

Regular Monitoring¶

Check periodically: Monitor nodes every few minutes
Log metrics: Keep history of resource usage
Set alerts: Notify on high resource usage
Track trends: Identify usage patterns
Clean stale files: Remove orphaned instances

Production Monitoring¶

Use monitoring tools: Integrate with Prometheus/Grafana
Export metrics: Send to monitoring systems
Health endpoints: Create HTTP health checks
Automated alerts: Set up paging for issues
Dashboard displays: Show real-time status
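
As a starting point for exporting metrics, the sketch below renders parsed status data in the Prometheus text exposition format. It assumes node dictionaries with alias, cpu and memory keys holding strings such as "45.2%" and "2048 MB", as in the plain output. The metric names and the to_prometheus helper are hypothetical and not provided by manta_node.

```python
import re

# Hypothetical metric names mapped to the keys of a parsed node dictionary.
METRICS = (
    ("manta_node_cpu_percent", "cpu"),
    ("manta_node_memory_megabytes", "memory"),
)


def _number(text):
    """Extract the leading number from strings such as '45.2%' or '2048 MB'."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", str(text))
    return float(match.group(1)) if match else None


def _escape(label):
    """Escape a label value for the text exposition format."""
    return (str(label).replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n"))


def to_prometheus(nodes):
    """Render node dictionaries as Prometheus text exposition lines."""
    lines = []
    for metric, key in METRICS:
        # Samples of one metric are kept together after its TYPE line.
        lines.append(f"# TYPE {metric} gauge")
        for node in nodes:
            value = _number(node.get(key, ""))
            if value is None:
                continue  # Skip "N/A" or missing values.
            alias = _escape(node.get("alias", ""))
            lines.append(f'{metric}{{alias="{alias}"}} {value}')
    return "\n".join(lines) + "\n"
```

The resulting text could be written to a file for a textfile-style collector or served from a small HTTP endpoint. Which of these fits depends on the monitoring setup.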

Development Usage¶

Quick checks: Use status before/after operations
Resource debugging: Monitor during development
Process verification: Ensure clean stops
Multi-node testing: Track cluster nodes
Performance testing: Monitor under load