Locating Exceptions in a Cloud Environment
Keywords: cloud environment, corefile
Business challenge
After a service is moved to the cloud, game processes may crash and produce a core dump. Because pods drift between nodes, among other reasons, it is difficult to find the corefile and to debug with breakpoints.
BKCI advantage
BKCI automatically starts a debug pod and emails the coredump content to the backend developers. If breakpoint debugging is needed, developers can log in to the debug pod from the master node of the cluster to debug and locate the problem.
Solution
The overall process is as follows:
● Each node monitors whether a corefile is generated
Idea: write a corefile monitoring script and run it on a schedule with the timed task function of the BK "job platform". When a new corefile appears, the script remotely triggers the BKCI pipeline.
Core part of the script (sample, for reference only):
file_list=$(find /data/corefile -mmin -3 -name "core_*")
● After the BKCI pipeline is triggered remotely, it deploys the debug pod based on the node IP address and image version
● After the debug pod starts, run the kubectl command to obtain the coredump content
● Notify developers by email or bot
● Developers log in to the debug pod and troubleshoot the problem
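The monitoring step in the first bullet could look roughly like the sketch below. It keeps the directory and the "core_*" pattern from the sample line, and assumes the script runs every few minutes as a timed task. TRIGGER_URL and the JSON field names are placeholders introduced for illustration only; they are not a documented BKCI endpoint, so substitute whatever remote trigger your pipeline actually exposes.

```shell
#!/usr/bin/env bash
# Sketch of a corefile monitor meant to run as a scheduled job on each node.
# COREFILE_DIR and the "core_*" pattern follow the sample line above.
# TRIGGER_URL is a placeholder (assumption), not a real BKCI endpoint.
COREFILE_DIR="${COREFILE_DIR:-/data/corefile}"
WINDOW_MINUTES="${WINDOW_MINUTES:-3}"
TRIGGER_URL="${TRIGGER_URL:-}"

# Print corefiles modified within the last WINDOW_MINUTES minutes, one per line.
find_new_corefiles() {
  find "$COREFILE_DIR" -type f -mmin "-${WINDOW_MINUTES}" -name 'core_*' 2>/dev/null
}

# Hypothetical trigger: POST the corefile name and node IP to TRIGGER_URL.
trigger_pipeline() {
  local corefile="$1" node_ip="$2"
  curl -fsS -X POST "$TRIGGER_URL" \
    -H 'Content-Type: application/json' \
    -d "{\"corefile\": \"$(basename "$corefile")\", \"node_ip\": \"${node_ip}\"}"
}

# Trigger one pipeline run per new corefile found on this node.
monitor_once() {
  local node_ip="$1" corefile
  find_new_corefiles | while IFS= read -r corefile; do
    trigger_pipeline "$corefile" "$node_ip"
  done
}
```

A timed task would then call monitor_once with the node's own IP address. The scan window should match the schedule interval; otherwise a corefile may be missed or reported twice.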
BKCI pipeline configuration
● Parsing file names
By parsing the corefile name, obtain the namespace, file name, image version number, and other information.
● Start the debug pod
Start the debug pod based on parameters such as the node and image version.
● Get coredump content
Run the kubectl command to obtain the coredump content:
kubectl -n NAMESPACE logs POD_NAME -c corefile-debug
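The first two configuration steps above can be sketched as two small shell functions. The corefile naming convention used here (core_<namespace>_<pod>_<image-version>_<timestamp>, with no underscores inside a field) is an assumption made for illustration, since the real format depends on how core_pattern is set on your nodes. The manifest likewise assumes that the node name is known, that corefiles live under /data/corefile on the host, and that the debug image prints the coredump content on startup so that the kubectl logs command has something to return.

```shell
# Parse a corefile name under an ASSUMED convention:
#   core_<namespace>_<pod>_<image-version>_<timestamp>
parse_corefile_name() {
  local name prefix namespace pod version timestamp extra
  name=$(basename "$1")
  IFS=_ read -r prefix namespace pod version timestamp extra <<EOF
$name
EOF
  if [ "$prefix" != "core" ] || [ -z "$timestamp" ] || [ -n "$extra" ]; then
    echo "unexpected corefile name: $name" >&2
    return 1
  fi
  printf 'NAMESPACE=%s\nPOD_NAME=%s\nIMAGE_VERSION=%s\nCOREFILE=%s\n' \
    "$namespace" "$pod" "$version" "$name"
}

# Render a debug pod manifest pinned to the node that produced the corefile.
# The container name matches the "-c corefile-debug" argument used above.
# The hostPath mount of /data/corefile is an assumption.
render_debug_pod() {
  local namespace="$1" pod_name="$2" node_name="$3" image="$4"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${pod_name}
  namespace: ${namespace}
spec:
  nodeName: ${node_name}
  restartPolicy: Never
  containers:
    - name: corefile-debug
      image: ${image}
      volumeMounts:
        - name: corefile
          mountPath: /data/corefile
  volumes:
    - name: corefile
      hostPath:
        path: /data/corefile
EOF
}
```

One possible use in a pipeline script step, with every argument supplied by the pipeline:
  render_debug_pod NAMESPACE POD_NAME NODE_NAME IMAGE | kubectl apply -f -
Pinning the pod with nodeName places it on the node that holds the corefile, so node drift of the original pod no longer matters.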