Status Command

The manta_node status command displays information about running node instances, including resource usage, task execution, and health metrics.

Overview

The status command provides:

  • List of running node instances

  • Process IDs and runtime information

  • Resource usage (CPU, memory)

  • Task execution status

  • Health and connectivity state

Synopsis

manta_node status [options]

Options

--plain

Use plain-text output instead of the formatted table

  • Simpler format for scripting

  • One node per section

  • Machine-readable output

--all

Show all instances (default behavior)

  • Same as running without options

  • Included for consistency

Usage Examples

Basic Status Check

View all running nodes:

$ manta_node status
┌─ Running Node Instances ────────────────────────────────────┐
│ Alias      │ Config     │ PID   │ Status  │ CPU % │ Memory │
├────────────┼────────────┼───────┼─────────┼───────┼────────┤
│ prod-gpu-1 │ production │ 12345 │ running │ 45.2% │ 2048MB │
│ dev-node   │ default    │ 12346 │ running │ 12.1% │ 512MB  │
│ test-1     │ test       │ 12347 │ running │ 78.9% │ 4096MB │
└────────────┴────────────┴───────┴─────────┴───────┴────────┘

Plain Text Output

For scripting and automation:

$ manta_node status --plain
Found 3 running instance(s):

Instance: production-a3f2c891
  Alias: prod-gpu-1
  Config: production
  PID: 12345
  Status: running
  CPU: 45.2%
  Memory: 2048 MB
  Started: 2024-03-15 10:30:15

Instance: default-b7d4f932
  Alias: dev-node
  Config: default
  PID: 12346
  Status: running
  CPU: 12.1%
  Memory: 512 MB
  Started: 2024-03-15 11:45:30

No Running Nodes

When no nodes are active:

$ manta_node status
No running node instances found.

Start a node with:
  manta_node start [config_name]

Output Format

Table Format

Default rich table display shows:

  • Alias: Node friendly name

  • Config: Configuration file used

  • PID: Process identifier

  • Status: Current state (running, stopping, error)

  • CPU %: CPU usage percentage

  • Memory (MB): Memory usage in megabytes

  • Started: Timestamp when node started

Plain Format

Plain text output includes:

  • Instance: Full instance identifier

  • Alias: Node alias

  • Config: Configuration name

  • PID: Process ID

  • Status: Running state

  • CPU: CPU usage (if available)

  • Memory: Memory usage (if available)

  • Started: Start timestamp

Instance Information

Instance Tracking

Nodes are tracked via JSON files in ~/.manta/nodes/instances/:

{
  "instance_id": "production-a3f2c891",
  "alias": "prod-gpu-1",
  "config_name": "production",
  "pid": 12345,
  "start_time": "2024-03-15T10:30:15",
  "status": "running",
  "manager_host": "localhost",
  "manager_port": 50051,
  "log_file": "~/.manta/logs/nodes/prod-gpu-1.log"
}
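As an illustration of how such a record might be consumed, the following sketch parses the text of one instance file into a typed object. `NodeInstance` and `parse_instance` are hypothetical names and not part of manta_node. The field names come from the example above, and the assumption that `start_time` is an ISO 8601 timestamp rests only on that example.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class NodeInstance:
    """Hypothetical container for the core fields of one instance record."""
    instance_id: str
    alias: str
    config_name: str
    pid: int
    start_time: datetime
    status: str


def parse_instance(text: str) -> NodeInstance:
    """Parse the JSON text of one instance file.

    Typically raises ValueError for malformed JSON or badly formed values,
    and KeyError when a required field is missing.
    """
    data = json.loads(text)
    return NodeInstance(
        instance_id=data["instance_id"],
        alias=data["alias"],
        config_name=data["config_name"],
        pid=int(data["pid"]),
        # Assumes ISO 8601 timestamps, as in the example record.
        start_time=datetime.fromisoformat(data["start_time"]),
        status=data["status"],
    )


# The example record shown above, reproduced as a string.
SAMPLE = """{
  "instance_id": "production-a3f2c891",
  "alias": "prod-gpu-1",
  "config_name": "production",
  "pid": 12345,
  "start_time": "2024-03-15T10:30:15",
  "status": "running",
  "manager_host": "localhost",
  "manager_port": 50051,
  "log_file": "~/.manta/logs/nodes/prod-gpu-1.log"
}"""

node = parse_instance(SAMPLE)
print(node.alias, node.pid, node.start_time.isoformat())
# prints: prod-gpu-1 12345 2024-03-15T10:30:15
```

Fields the sketch does not need (`manager_host`, `manager_port`, `log_file`) are simply ignored, so extra keys in a record do not cause a failure.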

Process Verification

The status command:

  1. Reads instance files: Loads all *.json files

  2. Verifies processes: Checks if PIDs are still running

  3. Collects metrics: Gathers CPU/memory usage (if psutil available)

  4. Cleans stale files: Removes files for dead processes

  5. Formats output: Displays in table or plain text
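The read, verify, and clean steps listed above can be sketched roughly as follows. This is not the actual manta_node implementation: `scan_instances` is a hypothetical helper, and the liveness check is passed in as the `is_alive` predicate so that the sketch makes no assumption about how PIDs are really verified. Metric collection and output formatting (steps 3 and 5) are left out.

```python
import json
from pathlib import Path


def scan_instances(directory, is_alive):
    """Return records of live nodes and delete instance files of dead ones.

    directory: a pathlib.Path that holds the *.json instance files.
    is_alive:  a callable taking a PID and returning True if it is running.
    """
    live = []
    for path in sorted(directory.glob("*.json")):    # step 1: read files
        try:
            record = json.loads(path.read_text())
            pid = int(record["pid"])
        except (OSError, ValueError, KeyError, TypeError):
            continue  # unreadable or malformed file: leave it untouched
        if pid > 0 and is_alive(pid):                # step 2: verify process
            live.append(record)
        else:
            path.unlink(missing_ok=True)             # step 4: remove stale file
    return live
```

Malformed files are skipped rather than deleted in this sketch, on the assumption that a file which cannot be parsed should be inspected by hand instead of being removed silently.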

Resource Metrics

With psutil installed:

  • CPU percentage: Current CPU usage

  • Memory usage: RSS (Resident Set Size) in MB

  • Process time: Creation timestamp

  • Thread count: Number of active threads

Without psutil:

  • Shows “N/A” for metrics

  • Still displays PID and status

  • Basic functionality maintained
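A minimal sketch of this optional-dependency pattern is shown below. `collect_metrics` and its dictionary keys are hypothetical names chosen for illustration. The psutil calls are real, but whether manta_node uses exactly these calls is an assumption.

```python
import os

try:
    import psutil  # optional third-party dependency
except ImportError:
    psutil = None


def collect_metrics(pid):
    """Return CPU, memory, start time, and thread count for a PID.

    Every value falls back to the string "N/A" when psutil is missing
    or the process cannot be inspected.
    """
    metrics = {
        "cpu_percent": "N/A",
        "memory_mb": "N/A",
        "create_time": "N/A",
        "threads": "N/A",
    }
    if psutil is None:
        return metrics
    try:
        proc = psutil.Process(pid)
        # A short sampling interval gives a meaningful CPU reading.
        metrics["cpu_percent"] = proc.cpu_percent(interval=0.1)
        # RSS is reported in bytes; convert to MB (1024 * 1024 bytes here).
        metrics["memory_mb"] = proc.memory_info().rss / (1024 * 1024)
        metrics["create_time"] = proc.create_time()  # seconds since the epoch
        metrics["threads"] = proc.num_threads()
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        pass  # process vanished or is not accessible: keep remaining placeholders
    return metrics


print(collect_metrics(os.getpid()))
```

Keeping the placeholders in the same dictionary means the formatting code does not need to know whether psutil was available.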

Status States

Node States

running
  • Node is active and healthy

  • Accepting and executing tasks

  • Connected to manager

stopping
  • Shutdown initiated

  • Completing active tasks

  • Not accepting new tasks

error
  • Node encountered critical error

  • May need restart

  • Check logs for details

Process States

Active
  • Process exists and is responding

  • PID verified via signal 0

  • Resources being consumed

Zombie
  • Process terminated but not reaped

  • Shows as “defunct” in ps

  • Requires parent cleanup

Stale
  • Instance file exists but the process is dead

  • Automatically cleaned up

  • File removed on detection
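The three process states can be told apart roughly as in the sketch below, which is an assumption about one possible approach and not the tool's actual code. `pid_exists` and `classify` are hypothetical names. The signal 0 probe is POSIX-only (on Windows, `os.kill` with an ordinary signal number terminates the process), and the zombie check reads `/proc`, which exists on Linux but not on macOS.

```python
import os


def pid_exists(pid):
    """Probe a PID with signal 0, which checks existence without signalling."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not a PID
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


def classify(pid):
    """Return "active", "zombie", or "stale" for a recorded PID."""
    if not pid_exists(pid):
        return "stale"
    try:
        # Linux-specific: /proc/<pid>/stat reads "pid (comm) state ...".
        # Split after the last ")" because comm may itself contain spaces.
        with open(f"/proc/{pid}/stat", errors="replace") as handle:
            fields = handle.read().rpartition(")")[2].split()
    except OSError:
        return "active"  # no /proc available: zombies cannot be detected here
    return "zombie" if fields and fields[0] == "Z" else "active"


print(classify(os.getpid()))
```

Signal 0 alone succeeds for zombie processes as well, since a zombie still occupies a PID, which is why the sketch adds a separate state check.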

Integration with Other Commands

Status Before Starting

Check before starting new nodes:

# Check what's running
manta_node status

# Start the production node only if it is not already listed
if ! manta_node status --plain | grep -q "Config: production$"; then
    manta_node start production
fi

Status After Stopping

Verify nodes stopped:

# Stop all nodes
manta_node stop --all

# Verify stopped
manta_node status
# Should show: "No running node instances found"

Monitor During Execution

Watch node status:

# Monitor status every 5 seconds
watch -n 5 manta_node status

# Or in a loop
while true; do
    clear
    manta_node status
    sleep 5
done

Scripting and Automation

Parse Status Output

Extract information for scripts:

# Get PIDs of all running nodes
manta_node status --plain | grep "PID:" | awk '{print $2}'

# Count running nodes
count=$(manta_node status --plain | grep -c "Instance:")
echo "Running nodes: $count"

# Check if specific node is running
if manta_node status --plain | grep -q "Alias: prod-gpu-1$"; then
    echo "Production node is running"
fi

Health Check Script

#!/bin/bash
# health_check.sh - Monitor node health

# Get status
output=$(manta_node status --plain)

# Check if any nodes running
if echo "$output" | grep -q "No running node instances"; then
    echo "WARNING: No nodes running"
    exit 1
fi

# Check CPU usage
while IFS= read -r line; do
    if [[ $line =~ CPU:\ ([0-9]+\.[0-9]+)% ]]; then
        cpu="${BASH_REMATCH[1]}"
        if (( $(echo "$cpu > 90" | bc -l) )); then
            echo "WARNING: High CPU usage: $cpu%"
        fi
    fi
done <<< "$output"

Python Integration

import subprocess

def get_node_status():
    """Get status of all running nodes as a list of dicts."""
    result = subprocess.run(
        ['manta_node', 'status', '--plain'],
        capture_output=True,
        text=True
    )

    nodes = []
    current_node = {}

    for line in result.stdout.splitlines():
        line = line.strip()
        if line.startswith('Instance:'):
            # An "Instance:" line starts the next node section
            if current_node:
                nodes.append(current_node)
            current_node = {'instance': line.partition(':')[2].strip()}
        elif current_node and ':' in line:
            # "Key: value" field; partition keeps any ':' inside the value
            key, _, value = line.partition(':')
            current_node[key.strip().lower()] = value.strip()

    if current_node:
        nodes.append(current_node)

    return nodes

# Example usage
nodes = get_node_status()
for node in nodes:
    print(f"{node.get('alias', '?')}: {node.get('cpu', 'N/A')}")

Performance Monitoring

Resource Tracking

Monitor resource trends:

# Log resource usage over time
while true; do
    timestamp=$(date '+%Y-%m-%d %H:%M:%S')
    manta_node status --plain | grep -E "(CPU|Memory):" | \
        while read -r line; do
            echo "$timestamp $line" >> node_metrics.log
        done
    sleep 60
done

Alert on High Usage

# Alert when resources exceed threshold
check_resources() {
    manta_node status --plain | while read -r line; do
        if [[ $line =~ Memory:\ ([0-9]+)\ MB ]]; then
            mem="${BASH_REMATCH[1]}"
            if [ "$mem" -gt 8192 ]; then
                echo "ALERT: High memory usage: ${mem}MB"
                # Send notification
            fi
        fi
    done
}
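For environments where Python is preferred over shell, the same threshold check can be sketched as a pure function over the plain-text output. `find_alerts` is a hypothetical helper, and the sample text uses invented values in the layout of the `--plain` example. The thresholds mirror the ones used in the shell scripts in this section.

```python
import re

# Invented sample in the layout of `manta_node status --plain`.
SAMPLE = """\
Found 2 running instance(s):

Instance: production-a3f2c891
  Alias: prod-gpu-1
  CPU: 95.5%
  Memory: 9000 MB

Instance: default-b7d4f932
  Alias: dev-node
  CPU: 12.1%
  Memory: 512 MB
"""


def find_alerts(plain_output, cpu_limit=90.0, mem_limit_mb=8192):
    """Return (alias, message) pairs for nodes above either threshold."""
    alerts = []
    alias = None
    for raw in plain_output.splitlines():
        line = raw.strip()
        cpu = re.fullmatch(r"CPU:\s*(\d+(?:\.\d+)?)%", line)
        mem = re.fullmatch(r"Memory:\s*(\d+)\s*MB", line)
        if line.startswith("Alias:"):
            alias = line.partition(":")[2].strip()
        elif cpu and float(cpu.group(1)) > cpu_limit:
            alerts.append((alias, f"high CPU usage: {cpu.group(1)}%"))
        elif mem and int(mem.group(1)) > mem_limit_mb:
            alerts.append((alias, f"high memory usage: {mem.group(1)} MB"))
    return alerts


for alias, message in find_alerts(SAMPLE):
    print(f"ALERT [{alias}]: {message}")
```

Lines whose metric is "N/A" match neither pattern and are skipped, so the function behaves the same with or without psutil installed on the node host.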

Troubleshooting

Missing Metrics

If CPU/Memory show “N/A”:

# Install psutil for metrics
pip install psutil

# Verify installation
python -c "import psutil; print(psutil.__version__)"

Stale Instance Files

Clean up orphaned files:

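# Manual cleanup
for file in ~/.manta/nodes/instances/*.json; do
    [ -e "$file" ] || continue    # no instance files: the glob did not expand
    pid=$(grep -o '"pid": *[0-9]*' "$file" | grep -o '[0-9][0-9]*$')
    [ -n "$pid" ] || continue     # no PID found: leave the file alone
    if ! ps -p "$pid" > /dev/null 2>&1; then
        echo "Removing stale: $file"
        rm "$file"
    fi
done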

Permission Issues

If status shows permission errors:

# Check file permissions
ls -la ~/.manta/nodes/instances/

# Fix permissions
chmod 644 ~/.manta/nodes/instances/*.json

Display Issues

If table formatting is broken:

# Use plain format
manta_node status --plain

# Check terminal width
echo $COLUMNS

# Override the reported terminal width
export COLUMNS=120

Best Practices

Regular Monitoring

  1. Check periodically: Monitor nodes every few minutes

  2. Log metrics: Keep history of resource usage

  3. Set alerts: Notify on high resource usage

  4. Track trends: Identify usage patterns

  5. Clean stale files: Remove orphaned instances

Production Monitoring

  1. Use monitoring tools: Integrate with Prometheus/Grafana

  2. Export metrics: Send to monitoring systems

  3. Health endpoints: Create HTTP health checks

  4. Automated alerts: Set up paging for issues

  5. Dashboard displays: Show real-time status

Development Usage

  1. Quick checks: Use status before/after operations

  2. Resource debugging: Monitor during development

  3. Process verification: Ensure clean stops

  4. Multi-node testing: Track cluster nodes

  5. Performance testing: Monitor under load

See Also