Troubleshooting Guide

1. Overview

This guide provides systematic troubleshooting procedures and solutions for the NE503 AIPC platform. The platform uses a microservice architecture where services communicate via Unix Sockets and follow a specific startup sequence. When encountering issues, follow this general workflow:

  1. Confirm the issue symptom
  2. Check related logs
  3. Refer to the corresponding section
  4. Execute the recommended solution

2. General Troubleshooting Workflow

3. Service Startup Failure Troubleshooting

3.1 Check systemd Status

# View all AIPC service statuses
systemctl status ai-runtime camera-daemon app-manager event-bus device-control device-discovery platform-api

# View a specific service status
systemctl status ai-runtime.service

# View failed services
systemctl --failed

# View service dependencies
systemctl list-dependencies platform-api.service
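
The status checks above can be condensed into a one-line-per-service summary. The following is a minimal sketch, not part of the platform tooling: the function name report_services and the variable AIPC_SERVICES are hypothetical, and the service list is simply the one used in the first command of this section.

```shell
# Services named in section 3.1 (assumption: adjust if your image ships a different set).
AIPC_SERVICES="ai-runtime camera-daemon app-manager event-bus device-control device-discovery platform-api"

# Print "<service> <state>" for each service.
# Returns non-zero if any service is not in the "active" state.
report_services() {
  local svc state rc=0
  for svc in $AIPC_SERVICES; do
    # is-active prints the state (active, inactive, failed, ...) on stdout
    state="$(systemctl is-active "$svc" 2>/dev/null)"
    [ -n "$state" ] || state="unknown"
    [ "$state" = "active" ] || rc=1
    printf '%-18s %s\n' "$svc" "$state"
  done
  return "$rc"
}
```

Calling report_services prints seven lines; a non-zero exit status indicates that at least one service needs a closer look with systemctl status.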

3.2 Check if Unix Socket Exists

# List /run/aipc directory
ls -la /run/aipc/

# The platform has 7 sockets:
# ai-runtime.sock - AI inference service
# app-manager.sock - Container app management
# device-control.sock - Device peripheral control
# event-bus.sock - Event bus
# device-discovery.sock - Device discovery
# camera.sock - camera-daemon frame publisher (fd zero-copy)
# camera-control.sock - camera-daemon control (lens/HAL)
ls -la /run/aipc/*.sock

# Test Socket connection
nc -U /run/aipc/ai-runtime.sock
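
To avoid comparing the directory listing against the expected names by eye, the check can be scripted. This is a sketch under stated assumptions: check_sockets and AIPC_SOCKETS are hypothetical names, and the socket list is copied from the seven names above.

```shell
# Socket names listed in section 3.2.
AIPC_SOCKETS="ai-runtime app-manager device-control event-bus device-discovery camera camera-control"

# Print OK/MISSING for each expected socket under a directory (default /run/aipc).
# Returns non-zero if any expected path is absent or is not a socket file.
check_sockets() {
  local dir="${1:-/run/aipc}" name rc=0
  for name in $AIPC_SOCKETS; do
    if [ -S "$dir/$name.sock" ]; then
      echo "OK      $name.sock"
    else
      echo "MISSING $name.sock"
      rc=1
    fi
  done
  return "$rc"
}
```

A MISSING line usually points at the service that owns that socket, so its systemd status and logs are the next place to look.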

3.3 View Logs with journalctl

# View service logs in real time
journalctl -u ai-runtime -f

# View logs from the last 1 hour
journalctl -u camera-daemon --since "1 hour ago"

# View logs containing error keywords
journalctl -u app-manager | grep -i "error\|failed\|fatal"

# View detailed startup failure errors
journalctl -u app-manager -b --no-pager

# Filter by error level
journalctl -u event-bus -p err
journalctl -u device-control -p warning

3.4 Common Startup Issues

3.5 Socket Permission Check

# Check Socket directory and file permissions
ls -ld /run/aipc/
ls -la /run/aipc/*.sock
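
The ls output can be hard to scan for mode bits. The sketch below prints the octal mode and owner of the directory and of each socket using GNU stat; socket_perms is a hypothetical helper name. On Linux, a client needs write permission on a socket file to connect to it, so the mode and group columns are the ones to compare against the user a client runs as.

```shell
# Print "<octal mode> <owner>:<group> <path>" for the directory and every *.sock entry in it.
# Relies on GNU stat's -c format option.
socket_perms() {
  local dir="${1:-/run/aipc}"
  stat -c '%a %U:%G %n' "$dir" "$dir"/*.sock 2>/dev/null
}
```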

4. Video Streaming Troubleshooting

4.1 RTSP Connection Failure

Diagnostic commands:

# Check RTSP service status
systemctl status camera-daemon

# View RTSP logs
journalctl -u camera-daemon -f

# Test RTSP connection (replace <device-ip> with the actual device IP)
ffmpeg -rtsp_transport tcp -i rtsp://<device-ip>:8554/main -t 10 -f null -
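
Before running the ffmpeg test from a client machine, it can help to confirm that the RTSP port is reachable at all. The following is a minimal sketch: tcp_port_open is a hypothetical helper, it relies on bash's /dev/tcp pseudo-device and the coreutils timeout command, and port 8554 is taken from the ffmpeg URL above.

```shell
# Return 0 if a TCP connection to <host>:<port> succeeds within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
tcp_port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

For example, `tcp_port_open <device-ip> 8554 && echo reachable || echo unreachable`. An unreachable port suggests a network, firewall, or service problem rather than a stream or codec problem.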

For Web Console WebSocket disconnection troubleshooting (video playback layer), see Application Troubleshooting - Video Stream Integration.

5. Device Control Troubleshooting

| Symptom | Possible Cause | Diagnostic Command |
|---|---|---|
| Lens control abnormality | Focus/zoom/iris motor fault | grpcurl ... DeviceControl/GetLensStatus; grpcurl ... DeviceControl/LensResetZero |
| UART communication failure | Baud rate/wiring/voltage issue | ls -la /dev/ttyS*; stty -F /dev/ttyS0 921600 |

The complete gRPC interface is defined in platform/device-control/proto/device.proto.

6. Web Console Troubleshooting

6.1 Browser Compatibility

| Browser | Minimum Version | Support Level | Known Issues | Solution |
|---|---|---|---|---|
| Chrome | 88+ | Full support | -- | -- |
| Firefox | 78+ | Basic support | No WebCodecs | Use MSE playback |
| Safari | 14+ | Partial support | No WebCodecs | Degrade to MSE |
| Edge | 88+ | Full support | -- | -- |
| Mobile browsers | -- | Limited support | Performance issues | Use desktop |

Chrome 88+ or Edge 88+ is recommended for the best experience. Safari automatically falls back to MSE playback, with slightly lower performance.

6.2 WebSocket and Video Playback Troubleshooting

Video playback relies on WebSocket to transport H.264 frames. Common issues:

| Symptom | Possible Cause | Solution |
|---|---|---|
| WebSocket 1006 | Abnormal connection close | Check if platform-api is running, firewall allows port 8080 |
| WebSocket 401/403 | Invalid or expired token | Re-login to get a new token |
| Black screen | WebSocket not established / SPS-PPS not received | Refresh page, check WebSocket connection status |
| Artifacts/mosaic | Network packet loss / decoder incompatibility | Switch browser or check network quality |
| High latency | Network latency / buffer too large | Ensure sufficient LAN bandwidth, reduce encoding GOP |
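
For the WebSocket 1006 case, one quick check on the device is whether anything is listening on port 8080. The sketch below parses the output of `ss -ltn` (iproute2); port_listening is a hypothetical helper name, and it assumes the usual column layout in which the fourth column is the local address and port.

```shell
# Read `ss -ltn` output on stdin; exit 0 if some socket listens on the given port.
# The fourth column is "Local Address:Port"; the port is the part after the last colon.
port_listening() {
  awk -v p="$1" '
    { n = split($4, a, ":"); if (a[n] == p) found = 1 }
    END { exit !found }
  '
}
```

Usage: `ss -ltn | port_listening 8080 && echo listening || echo "nothing on 8080"`. If nothing is listening, check platform-api with systemctl status before looking at the firewall.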

6.3 API Request Failures

| Status Code | Meaning | Solution |
|---|---|---|
| 401 | Authentication failed | Clear token and re-login |
| 403 | Insufficient permissions | Check user permissions |
| 404 | Resource not found | Check API path |
| 500 | Server error | Check /var/log/aipc/platform-api.log |
| 503 | Service unavailable | Check service status, restart if needed |
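
The table above can be turned into a small lookup for scripted checks. This is an illustrative sketch only: api_hint is a hypothetical function, and the hint strings are copied from the table.

```shell
# Map an HTTP status code to the action suggested in the table above.
api_hint() {
  case "$1" in
    2??) echo "OK" ;;
    401) echo "Clear token and re-login" ;;
    403) echo "Check user permissions" ;;
    404) echo "Check API path" ;;
    500) echo "Check /var/log/aipc/platform-api.log" ;;
    503) echo "Check service status, restart if needed" ;;
    *)   echo "No specific guidance for status $1" ;;
  esac
}
```

It can be combined with curl, which prints only the status code when invoked as `curl -s -o /dev/null -w '%{http_code}' <url>`. For example: `api_hint "$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/api/v1/media/status)"`.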

7. Log Level Adjustment

7.1 Temporarily Adjust Log Level

# journalctl only reads logs; it has no --log-level option and cannot change a service's log level.
# To get debug output, set log_level to debug in the service configuration file (see 7.2).
# Follow the service logs in real time
sudo journalctl -u ai-runtime -f

# View error level and above logs
sudo journalctl -u camera-daemon -p err

7.2 Modify Configuration File

The actual config files on the device are located at /opt/aipc/etc/*.yaml (the path specified in systemd ExecStart). The configs/ directory in the source repo is only a template.

# /opt/aipc/etc/ai-runtime.yaml - adjust log_level
service:
  name: ai-runtime
  listen: unix:///run/aipc/ai-runtime.sock
  log_level: debug # debug, info, warn, error
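
Editing the file by hand works, but when switching levels repeatedly a small helper reduces typos. The following is a sketch under assumptions: set_log_level is a hypothetical function, it relies on GNU sed's -i option, and it assumes the file contains a single log_level key as in the example above.

```shell
# Rewrite the log_level value in a YAML file in place, keeping the indentation
# and any trailing comment. Accepts only debug, info, warn, or error.
set_log_level() {
  local file="$1" level="$2"
  case "$level" in
    debug|info|warn|error) ;;
    *) echo "invalid level: $level" >&2; return 2 ;;
  esac
  sed -i -E "s/^([[:space:]]*log_level:[[:space:]]*)[a-z]+/\1$level/" "$file"
}
```

Usage: `sudo` a shell that has the function defined, run `set_log_level /opt/aipc/etc/ai-runtime.yaml debug`, then restart the service with `sudo systemctl restart ai-runtime`. The restart step assumes the service reads its configuration only at startup.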

7.3 Log Level Reference

| Level | Description |
|---|---|
| debug | Detailed debugging information |
| info | Key runtime status |
| warn | Non-fatal warnings |
| error | Critical errors |

7.4 Log Analysis Tips

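# Count log lines containing "error" in the last hour
journalctl -u ai-runtime --since "1 hour ago" | grep -ci "error"

# View most frequent errors (-o cat drops timestamps so identical messages group together)
journalctl -u ai-runtime -o cat | grep -i "error" | sort | uniq -c | sort -nr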

# Filter specific errors
journalctl -u ai-runtime | grep -E "(timeout|connection refused|permission denied)"
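
Messages that differ only in a number (a latency, a PID, a frame counter) will not group under uniq. The sketch below normalizes digit runs before counting; top_errors is a hypothetical helper, and it expects message-only input such as the output of `journalctl -o cat`.

```shell
# Read log messages on stdin and print the N most frequent error lines (default 10).
# Digit runs are replaced with "N" so messages differing only in numbers are grouped.
top_errors() {
  grep -i 'error' \
    | sed -E 's/[0-9]+/N/g' \
    | sort | uniq -c | sort -rn \
    | head -n "${1:-10}"
}
```

Usage: `journalctl -u ai-runtime -o cat --since "1 hour ago" | top_errors 5`.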

8. Performance Monitoring

8.1 System Resource Monitoring

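# Monitor CPU usage (pgrep -d, joins multiple PIDs with commas, as top -p expects)
top -p "$(pgrep -d, -f ai-runtime)"

# Monitor memory usage (the [a] bracket keeps grep from matching its own process)
free -h && ps aux | grep "[a]i-runtime"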

# Monitor disk I/O
iostat -x 1 5

# Monitor network
iftop -i eth0

8.2 Service Performance Metrics

# AI Runtime statistics
grpcurl -plaintext -d '{}' unix:///run/aipc/ai-runtime.sock aipc.inference.InferenceService/GetStats

# Container statistics
aipc-cli app info <app-id>

# Device status
grpcurl -plaintext -d '{}' unix:///run/aipc/device-control.sock aipc.device.DeviceControl/GetDeviceStatus

8.3 Real-time Monitoring Script

#!/bin/bash
# Monitoring script example

while true; do
    echo "=== $(date) ==="
    echo "CPU Usage:"
    top -bn1 | grep "Cpu(s)" | sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | awk '{print 100 - $1}'
    echo "Memory Usage:"
    free | grep Mem | awk '{printf "%.2f%%\n", $3/$2 * 100.0}'
    echo "Disk Usage:"
    df /opt/aipc | tail -1 | awk '{print $5}'
    echo "NPU Temperature:"
    hailortcli fw-control --temperature | awk '{print $3}'
    sleep 5
done
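
The loop above only prints values. If a yes/no signal is more useful, for example to warn before storage fills up, a threshold check can be layered on the same df output. This is a sketch: disk_usage_pct and disk_over_threshold are hypothetical helpers, and the 90 percent figure in the usage line is an arbitrary example, not a platform limit.

```shell
# Print the used-space percentage (digits only) of the filesystem holding <path>.
# df -P keeps each filesystem on one line, so the value is always column 5 of line 2.
disk_usage_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 if usage of <path> is at or above <threshold> percent.
disk_over_threshold() {
  [ "$(disk_usage_pct "$1")" -ge "$2" ]
}
```

Usage: `disk_over_threshold /opt/aipc 90 && echo "WARNING: /opt/aipc is over 90% full"`.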

9. Common Diagnostic Commands Quick Reference

| Scenario | Command | Description |
|---|---|---|
| View service status | systemctl status ai-runtime camera-daemon app-manager | Check core platform services |
| View service logs | journalctl -u <service-name> -f | View service logs in real time |
| Check Socket | ls -la /run/aipc/ | View Unix Socket files |
| Check system resources | top -p $(pidof -s <service-name>) | Monitor service resource usage |
| View container status | aipc-cli app list | List all container applications |
| Test network connection | curl http://localhost:8080/api/v1/media/status | Test API endpoint |
| View model status | grpcurl -plaintext -d '{}' unix:///run/aipc/ai-runtime.sock aipc.inference.InferenceService/ListModels | List registered models |
| Check NPU status | hailortcli scan | View Hailo device status |
| View event logs | aipc-cli event-log list | View event bus logs |
| Check disk usage | df -h /opt/aipc | Check disk space |
| Check memory usage | free -h | Check system memory |

10. Error Code Reference

The following are business error codes returned by platform-api. The full definition is in platform/platform-api/handlers/response.go.

Code ranges: 1xxx General/Request · 2xxx Auth · 3xxx Service/Infra · 4xxx Resource · 5xxx AI/Model · 6xxx App Manager · 7xxx Device · 8xxx File/Storage · 9xxx SSH · 10xxx Process.

| Code | Meaning | Code | Meaning | Code | Meaning |
|---|---|---|---|---|---|
| 0 | Success | 1000 | Unknown error | 1001 | Invalid request |
| 1002 | Invalid JSON | 1003 | Missing parameter | 1004 | Invalid parameter |
| 2000 | Unauthorized | 2001 | Forbidden | 2002 | Token expired |
| 2003 | Invalid token | 3000 | Service unavailable | 3001 | Service timeout |
| 3002 | Service error | 3003 | gRPC error | 3004 | Database error |
| 4000 | Not found | 4001 | Already exists | 4002 | Resource exhausted |
| 4003 | Operation failed | 5000 | Model not found | 5001 | Model load failed |
| 5002 | Inference error | 5003 | Invalid model format | 6000 | App not found |
| 6001 | App install failed | 6002 | App start failed | 6003 | App stop failed |
| 6004 | App running | 6005 | App not running | 7000 | Device error |
| 7001 | PTZ error | 7002 | Camera error | 7003 | GPIO error |
| 8000 | File not found | 8001 | File upload failed | 8002 | File delete failed |
| 8003 | Storage full | 8004 | Access denied | 9000 | SSH config error |
| 9001 | SSH service error | 10000 | Process not found | 10001 | Process kill failed |
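
Because the thousands digit determines the category, a code seen in a log or API response can be classified without consulting the full table. The sketch below implements only the range mapping stated above; error_category is a hypothetical helper and is not part of platform-api.

```shell
# Map a platform-api business error code to its category using the code ranges above.
error_category() {
  local code="$1"
  case "$code" in
    ''|*[!0-9]*) echo "Invalid"; return 1 ;;   # reject empty or non-numeric input
  esac
  if [ "$code" -eq 0 ]; then echo "Success"; return 0; fi
  case $((code / 1000)) in
    1)  echo "General/Request" ;;
    2)  echo "Auth" ;;
    3)  echo "Service/Infra" ;;
    4)  echo "Resource" ;;
    5)  echo "AI/Model" ;;
    6)  echo "App Manager" ;;
    7)  echo "Device" ;;
    8)  echo "File/Storage" ;;
    9)  echo "SSH" ;;
    10) echo "Process" ;;
    *)  echo "Unknown range" ;;
  esac
}
```

For example, `error_category 2002` prints Auth, which narrows the investigation to token handling before any service logs are opened.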

11. Troubleshooting Summary

  1. Check service status first -- Use systemctl status to confirm if services are running
  2. View error logs -- Use journalctl to view detailed error information
  3. Verify network connections -- Check if Sockets and ports are normal
  4. Check resource usage -- Ensure system resources are sufficient
  5. Troubleshoot module by module -- Verify progressively from low-level hardware to upper-level applications
  6. Preserve complete logs -- Save sufficient log information before and after failures
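
For the last step, preserving logs, a snapshot of each service's journal can be written to a directory before any restart. This is a minimal sketch: collect_logs and AIPC_SERVICES are hypothetical names, the service list is the one used in section 3.1, and the default window of one hour is an arbitrary choice.

```shell
# Services named in section 3.1 (assumption: adjust to match your image).
AIPC_SERVICES="ai-runtime camera-daemon app-manager event-bus device-control device-discovery platform-api"

# Write one <service>.log file per service into <outdir>,
# covering the journal since <since> (default: "1 hour ago").
collect_logs() {
  local outdir="$1" since="${2:-1 hour ago}" svc
  mkdir -p "$outdir" || return 1
  for svc in $AIPC_SERVICES; do
    journalctl -u "$svc" --since "$since" --no-pager > "$outdir/$svc.log" 2>&1
  done
}
```

Usage: `collect_logs "/tmp/aipc-logs-$(date +%Y%m%d-%H%M%S)" "2 hours ago"`. The resulting directory can be archived with tar and attached to a bug report.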