Intermediate, Scenario
10 min
Node CPU Thrashing
Linux, Performance, Troubleshooting
Interview Question
One node in your cluster shows 100% CPU usage with spikes in context switching. How do you troubleshoot it?
Key Points to Cover
- Run top/htop/mpstat to confirm CPU saturation
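- Check for high context-switch or interrupt rates (for example, the cs and in columns of vmstat 1, or pidstat -w per process)
- Identify runaway processes or noisy neighbors
- Adjust CPU limits or reschedule workloads onto other nodes
- Tune kernel parameters if the problem is systemic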
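The first two checks can also be scripted. The following is a minimal Python sketch, assuming a Linux node where /proc/stat is readable. The helper names (parse_proc_stat, cpu_and_ctxt_rate, sample_node) are hypothetical and exist only for this illustration. It takes two samples of the aggregate "cpu" line and the "ctxt" counter and derives CPU utilization and context switches per second over the interval.

```python
import time


def parse_proc_stat(text):
    """Return (busy_jiffies, total_jiffies, context_switches) from /proc/stat text."""
    busy = total = ctxt = 0
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "cpu":
            # Field order: user nice system idle iowait irq softirq steal guest guest_nice
            values = [int(v) for v in fields[1:]]
            # Guest time is already counted inside user/nice, so sum only the first 8.
            total = sum(values[:8])
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            busy = total - idle
        elif fields[0] == "ctxt":
            ctxt = int(fields[1])
    return busy, total, ctxt


def cpu_and_ctxt_rate(before, after, interval_s):
    """Return (cpu_utilization_percent, context_switches_per_second) between two samples."""
    busy_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    util = 100.0 * busy_delta / total_delta if total_delta else 0.0
    ctxt_rate = (after[2] - before[2]) / interval_s
    return util, ctxt_rate


def sample_node(interval_s=1.0):
    """Hypothetical helper: sample the live node twice, interval_s seconds apart."""
    with open("/proc/stat") as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval_s)
    with open("/proc/stat") as f:
        after = parse_proc_stat(f.read())
    return cpu_and_ctxt_rate(before, after, interval_s)
```

Calling sample_node() on the affected node returns a (utilization, switches-per-second) pair. What counts as a "spike" depends on the workload, so compare the rate against a healthy node of the same type rather than a fixed threshold.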
Evaluation Rubric
- Uses CPU monitoring tools (30% weight)
- Finds process/neighbors causing thrash (30% weight)
- Mitigates via limits/rescheduling (20% weight)
- Mentions tuning or safeguards (20% weight)
Hints
- 💡 Noisy-neighbor issues are common in shared clusters.
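One way to spot the victims of a noisy neighbor is to look at per-process context-switch counters. The sketch below is illustrative only and assumes a Linux node: it parses the voluntary_ctxt_switches and nonvoluntary_ctxt_switches fields that /proc/<pid>/status exposes, then ranks processes by nonvoluntary switches, which accumulate when the scheduler preempts a task. The function names are hypothetical.

```python
def parse_ctxt_switches(status_text):
    """Extract the two context-switch counters from /proc/<pid>/status text."""
    counts = {"voluntary_ctxt_switches": 0, "nonvoluntary_ctxt_switches": 0}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in counts:
            counts[key] = int(value.strip())
    return counts


def rank_by_nonvoluntary(status_by_pid):
    """Return [(pid, nonvoluntary_count), ...] sorted from most to least preempted.

    status_by_pid maps a pid to the text of that process's status file.
    """
    ranked = [
        (pid, parse_ctxt_switches(text)["nonvoluntary_ctxt_switches"])
        for pid, text in status_by_pid.items()
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


def read_status(pid):
    """Hypothetical helper: read the live status file for one pid."""
    with open(f"/proc/{pid}/status") as f:
        return f.read()
```

The counters are cumulative since process start, so in practice you would sample twice and compare the deltas. A process with a fast-growing nonvoluntary count is being preempted often, which suggests it is competing for CPU rather than causing the contention by itself.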
Potential Follow-up Questions
- ❓ How do you prevent CPU starvation?
- ❓ What about CPU pinning?