Stop Command¶
The manta_node stop command gracefully shuts down running node instances, ensuring clean task termination and resource cleanup.
Overview¶
The stop command:
Sends termination signals to running nodes
Waits for graceful shutdown of active tasks
Cleans up resources and temporary files
Removes instance tracking files
Supports both individual and bulk stops
Synopsis¶
manta_node stop [instance] [options]
Arguments¶
instance
Instance ID or alias of the node to stop
Optional: If omitted, stops all nodes
Can be partial match (e.g., “prod” matches “production-a3f2c891”)
Options¶
--all
Stop all running node instances
Same as omitting instance argument
Confirms before stopping multiple nodes
--force, -f
Force immediate termination (SIGKILL)
Skips graceful shutdown
May cause data loss
Use only when normal stop fails
--timeout, -t <seconds>
Time to wait for graceful shutdown
Default: 10 seconds
After timeout, forces termination
Set higher for nodes with long-running tasks
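To make the synopsis and options concrete, here is one way the documented arguments could be modeled with Python's argparse. This is only a sketch for readers who wrap the command in their own tooling; it is not the parser manta_node itself uses, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Model the documented `manta_node stop` arguments (illustrative only)."""
    parser = argparse.ArgumentParser(prog="manta_node stop")
    # Optional positional: when omitted, the documented behavior is "stop all".
    parser.add_argument("instance", nargs="?", default=None,
                        help="instance ID or alias (partial match allowed)")
    parser.add_argument("--all", action="store_true",
                        help="stop all running node instances")
    parser.add_argument("--force", "-f", action="store_true",
                        help="force immediate termination (SIGKILL)")
    parser.add_argument("--timeout", "-t", type=int, default=10,
                        metavar="seconds",
                        help="time to wait for graceful shutdown")
    return parser


args = build_parser().parse_args(["long-task-node", "-t", "60"])
print(args.instance, args.force, args.timeout)
```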
Usage Examples¶
Stop Specific Node¶
Stop a node by alias or ID:
$ manta_node stop production
Sent termination signal to node 'production' (PID: 12345)
Waiting up to 10 seconds for graceful shutdown...
✓ Node 'production' stopped gracefully
Stop with Partial Match¶
Use partial instance ID:
$ manta_node stop prod-a3f2
Found matching instance: production-a3f2c891
Sent termination signal to node 'production' (PID: 12345)
✓ Node stopped successfully
Stop All Nodes¶
Stop all running instances:
$ manta_node stop --all
Found 3 running instance(s):
- production (PID: 12345)
- development (PID: 12346)
- test-node (PID: 12347)
Stopping all instances...
✓ All 3 instances stopped successfully
Force Stop¶
Force immediate termination:
$ manta_node stop production --force
Force killed node 'production' (PID: 12345)
Warning: Forced termination may cause data loss
Custom Timeout¶
Allow more time for shutdown:
$ manta_node stop long-task-node --timeout 60
Sent termination signal to node 'long-task-node'
Waiting up to 60 seconds for graceful shutdown...
✓ Node stopped gracefully after 45 seconds
Shutdown Process¶
Graceful Shutdown¶
Normal stop sequence:
Send SIGTERM: Node receives termination signal
Stop accepting tasks: Refuses new task assignments
Complete active tasks: Allows running tasks to finish
Disconnect services: Closes gRPC and MQTT connections
Cleanup resources: Removes temporary files
Update status: Marks instance as stopped
Exit cleanly: Process terminates with code 0
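The sequence above can be pictured as a signal handler that walks through each step in order. The sketch below is a simplified stand-in, not the node's real implementation: NodeSketch, its fields, and install_handlers are hypothetical, tasks are plain strings, and the service and cleanup steps are only logged.

```python
import signal


class NodeSketch:
    """Toy node that records the graceful shutdown steps in order."""

    def __init__(self):
        self.accepting_tasks = True
        self.active_tasks = ["task-1", "task-2"]
        self.log = []

    def handle_shutdown(self, signum, frame):
        # Stop accepting tasks before draining the ones already running.
        self.accepting_tasks = False
        self.log.append("stop accepting tasks")
        while self.active_tasks:
            self.log.append(f"finished {self.active_tasks.pop(0)}")
        self.log.append("disconnect services")
        self.log.append("clean up resources")
        self.log.append("mark instance stopped")
        # A real process would now exit with status 0, e.g. via sys.exit(0).


def install_handlers(node):
    """Register the handler for SIGTERM and SIGINT (call from the main thread)."""
    signal.signal(signal.SIGTERM, node.handle_shutdown)
    signal.signal(signal.SIGINT, node.handle_shutdown)


node = NodeSketch()
# Simulate signal delivery by calling the handler directly.
node.handle_shutdown(signal.SIGTERM, None)
print(node.log)
```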
Forced Shutdown¶
When using --force or after the timeout expires:
Send SIGKILL: Immediate process termination
No cleanup: Tasks and connections terminated abruptly
Remove tracking: Instance file deleted
Potential data loss: Incomplete operations lost
Instance Management¶
Instance files in ~/.manta/nodes/instances/ are:
Checked: Verify process is actually running
Updated: Mark as stopping during shutdown
Removed: Deleted after successful stop
Cleaned: Stale files removed automatically
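The check-and-clean behavior can be approximated in a few lines. The sketch below assumes each instance file is JSON with a "pid" field (as the jq example under Troubleshooting suggests) and runs on a POSIX system; pid_is_running and remove_stale_instances are hypothetical helpers, not manta_node functions.

```python
import json
import os
from pathlib import Path


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only performs existence/permission checks
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_stale_instances(instance_dir):
    """Delete instance files whose recorded PID is gone; return their names."""
    removed = []
    for path in sorted(Path(instance_dir).glob("*.json")):
        pid = json.loads(path.read_text())["pid"]
        if not pid_is_running(pid):
            path.unlink()
            removed.append(path.name)
    return removed


# Example call (deletes files, so it is left commented out):
# remove_stale_instances(Path.home() / ".manta" / "nodes" / "instances")
```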
Signal Handling¶
Signal Types¶
- SIGTERM (15)
Default termination signal
Allows graceful shutdown
Caught by node for cleanup
- SIGKILL (9)
Force termination signal
Cannot be caught or ignored
Immediate process death
- SIGINT (2)
Interactive interrupt (Ctrl+C)
Same as SIGTERM for nodes
Graceful shutdown
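The difference between catchable and uncatchable signals can be checked directly from Python on a POSIX system. The helper can_catch below is a hypothetical name used only for this demonstration; it briefly installs a no-op handler and restores the previous one.

```python
import signal

# Signal numbers as listed above (values on Linux and macOS).
for sig in (signal.SIGTERM, signal.SIGKILL, signal.SIGINT):
    print(sig.name, int(sig))


def can_catch(sig):
    """Return True if a Python handler can be installed for `sig`."""
    def handler(signum, frame):
        pass
    try:
        previous = signal.signal(sig, handler)
    except (OSError, ValueError):
        # The OS rejects handlers for SIGKILL; ValueError covers being
        # called outside the main thread.
        return False
    if previous is not None:
        signal.signal(sig, previous)  # restore the original handler
    return True


print("SIGKILL catchable:", can_catch(signal.SIGKILL))
```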
Timeout Behavior¶
During graceful shutdown:
Time | Action
-----|--------------------------------------------------
0s | SIGTERM sent, shutdown initiated
5s | Check if process still running
10s | Default timeout reached
| If still running: Send SIGKILL
| If stopped: Cleanup complete
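The timeline above follows the common terminate-then-kill pattern. A minimal sketch of that pattern with Python's subprocess module is shown below; it assumes a POSIX system, stop_process is a hypothetical helper, and the child process is a plain sleeping Python interpreter standing in for a node.

```python
import subprocess
import sys


def stop_process(proc, timeout=10.0):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=timeout)
        return "graceful"
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL on POSIX
        proc.wait()   # reap the process so it does not linger as a zombie
        return "forced"


# Stand-in for a node: a child that just sleeps and exits on SIGTERM.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
print(stop_process(child, timeout=5))
```

A process that ignores or blocks SIGTERM would take the "forced" branch once the timeout elapses.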
Error Handling¶
No Running Instances¶
$ manta_node stop
No running node instances found.
Instance Not Found¶
$ manta_node stop unknown-node
Error: Node instance 'unknown-node' not found
Running instances:
- production (PID: 12345)
- development (PID: 12346)
Process Already Stopped¶
$ manta_node stop production
Warning: Node 'production' was already stopped
Cleaning up stale instance file
Permission Denied¶
$ manta_node stop system-node
Error: Permission denied (PID: 1234)
Try running with sudo or as the user who started the node
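The "already stopped" and "permission denied" cases correspond to two distinct errors the operating system reports when a signal is sent. The sketch below shows how they can be told apart on a POSIX system; request_stop and its return strings are hypothetical and do not reflect manta_node's internal names or exit codes.

```python
import os
import signal


def request_stop(pid):
    """Send SIGTERM to `pid` and classify the outcome."""
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        # No such process: the instance file is stale.
        return "already_stopped"
    except PermissionError:
        # The process exists but belongs to another user.
        return "permission_denied"
    return "signal_sent"
```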
Bulk Operations¶
Stop Multiple Specific Nodes¶
Stop nodes sequentially:
# Stop specific nodes
for node in prod-1 prod-2 prod-3; do
    manta_node stop "$node"
done
Stop Cluster Nodes¶
Stop all cluster nodes:
# Stop cluster (nodes with '-cluster-' in name)
manta_node cluster stop
Conditional Stops¶
Stop based on criteria:
# Stop all GPU nodes (assumes the node name is the first field of each status line)
manta_node status | grep gpu | while read -r node _; do
    manta_node stop "$node"
done
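The same filtering can be done in Python when the selection logic grows beyond a single grep. The sketch below assumes one node per line with the name in the first whitespace-separated field; the sample text is made up for illustration, since the real status output format is not shown here, and select_nodes is a hypothetical helper.

```python
def select_nodes(status_text, keyword):
    """Return the first field of every non-empty line containing `keyword`."""
    names = []
    for line in status_text.splitlines():
        fields = line.split()
        if fields and keyword in line:
            names.append(fields[0])
    return names


# Invented sample; substitute the real output of `manta_node status`.
sample = """\
gpu-node-1 running
cpu-node-1 running
gpu-node-2 running
"""
print(select_nodes(sample, "gpu"))
```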
Cleanup Operations¶
Automatic Cleanup¶
On successful stop:
Instance tracking files removed
Temporary directories cleaned
Docker containers stopped
Network connections closed
Log files finalized
Manual Cleanup¶
If automatic cleanup fails:
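# Remove stale instance files (this deletes every tracking file, so run it only when no nodes are running)
rm ~/.manta/nodes/instances/*.json
# Check for orphaned processes (the [m] keeps grep from matching itself)
ps aux | grep '[m]anta_node'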
# Clean Docker containers
docker ps -a | grep manta
docker rm -f <container_id>
Best Practices¶
Production Environments¶
Always graceful: Avoid force stops in production
Increase timeout: Allow time for task completion
Monitor stops: Check logs after stopping
Scheduled stops: Plan maintenance windows
Verify cleanup: Ensure resources are freed
Development Environments¶
Quick iteration: Use force stop for faster development
Bulk stops: Stop all nodes when done testing
Auto-cleanup: Let system clean stale instances
Check status: Verify all nodes stopped
Emergency Procedures¶
When nodes won’t stop normally:
Try graceful first:
manta_node stop <node>
Increase timeout:
manta_node stop <node> -t 30
Force if needed:
manta_node stop <node> --force
Manual kill (last resort):
kill -9 <pid>
Clean up: Remove instance files manually
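The first three steps of this ladder lend themselves to automation. The sketch below builds the commands from the flags documented above and tries them in order; it assumes a zero exit code means the node stopped (the script examples in this page rely on the same convention). escalation_steps and stop_with_escalation are hypothetical helpers, and the demonstration uses a fake runner so nothing is actually executed.

```python
import subprocess


def escalation_steps(node, longer_timeout=30):
    """Commands to try in order: default stop, longer timeout, then force."""
    return [
        ["manta_node", "stop", node],
        ["manta_node", "stop", node, "-t", str(longer_timeout)],
        ["manta_node", "stop", node, "--force"],
    ]


def stop_with_escalation(node, run=None):
    """Return the first command that succeeds, or None if all fail."""
    if run is None:
        # Default runner: execute the command and report its exit code.
        run = lambda cmd: subprocess.run(cmd).returncode
    for cmd in escalation_steps(node):
        if run(cmd) == 0:
            return cmd
    return None


# Fake runner for demonstration: only the forced stop "succeeds".
fake_run = lambda cmd: 0 if "--force" in cmd else 1
print(stop_with_escalation("production", run=fake_run))
```

The manual kill and instance file cleanup steps are deliberately left out of the automation, since they bypass the tool entirely.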
Integration with Scripts¶
Bash Script Example¶
#!/bin/bash
# safe_stop.sh - Safely stop all nodes
echo "Stopping all manta nodes..."
# Get list of running nodes
nodes=$(manta_node status --plain | grep "running" | cut -d' ' -f1)
# Stop each node with timeout
for node in $nodes; do
    echo "Stopping $node..."
    if manta_node stop "$node" --timeout 30; then
        echo "✓ $node stopped"
    else
        echo "✗ Failed to stop $node"
        exit 1
    fi
done
echo "All nodes stopped successfully"
Python Script Example¶
import subprocess

def stop_node(instance_id, timeout=10, force=False):
    """Stop a manta node instance; return True if the command exits with 0."""
    cmd = ['manta_node', 'stop', instance_id]
    if force:
        cmd.append('--force')
    else:
        cmd.extend(['--timeout', str(timeout)])
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0

# Stop all nodes gracefully ('--all' is passed in place of an instance ID)
if not stop_node('--all', timeout=30):
    print("Graceful stop failed, forcing...")
    stop_node('--all', force=True)
Troubleshooting¶
Node Won’t Stop¶
If a node refuses to stop:
Check process:
ps aux | grep manta_node
View logs:
manta_node logs <instance>
Check tasks: Look for hanging tasks
Network issues: Verify manager connectivity
Force stop: Use the --force flag
Zombie Processes¶
Clean up zombie processes:
# Find zombie processes
ps aux | grep defunct | grep manta
# Kill parent process
kill -9 <parent_pid>
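To find the parent PID to target, the process table can be parsed instead of read by eye. The sketch below parses text in the format produced by `ps -eo pid,ppid,stat,comm` and reports zombie entries (state starting with Z). The sample output is invented for illustration, and find_zombies is a hypothetical helper.

```python
def find_zombies(ps_output, name="manta"):
    """Return (pid, ppid) pairs for zombie processes whose command has `name`."""
    zombies = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        pid, ppid, stat, comm = fields
        if stat.startswith("Z") and name in comm:
            zombies.append((int(pid), int(ppid)))
    return zombies


# Invented sample in the layout of `ps -eo pid,ppid,stat,comm`.
sample = """\
  PID  PPID STAT COMMAND
12345     1 Sl   manta_node
12400 12345 Z    manta_node <defunct>
12500     1 Ss   sshd
"""
print(find_zombies(sample))
```

A zombie is already dead and cannot be killed itself; it disappears once its parent reaps it or exits, which is why the parent PID is the useful value.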
Stale Instance Files¶
Remove orphaned tracking files:
# List instance files
ls ~/.manta/nodes/instances/
# Verify processes
for file in ~/.manta/nodes/instances/*.json; do
    [ -e "$file" ] || continue  # skip when the glob matches nothing
    pid=$(jq -r .pid "$file")
    if ! ps -p "$pid" > /dev/null; then
        echo "Removing stale: $file"
        rm "$file"
    fi
done
See Also¶
Start Command - Start node instances
Status Command - Check running nodes
Cluster Command - Manage node clusters
Logs Command - View node logs
Identity Configuration - Configuration reference