Inconsistent POD status reporting #107713
Comments
/sig api-machinery
Refers to istio/istio#11659
/triage accepted
This is a display issue in the CLI.
@ehashman: Guidelines

Please ensure that the issue body includes answers to the following questions:
For more details on the requirements of such an issue, please see here and ensure that they are met. If this request no longer meets these requirements, the label can be removed.

In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Hi @ehashman, do you think this is a good issue to start contributing with? If so, I'd like to assign myself to it.
This may be caused by the error reason being null. I fixed a similar issue for initContainers.
/assign
I think pod status will be 'Completed' only when
I presume the error happens because the POD status is evaluated from the status of the last container in the array. But the POD status should be
Yeah, I agree. So I added
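The direction being discussed can be sketched as follows (a minimal, hypothetical model; `containerStatus` and `displayStatus` are illustrative names, not the real `v1.ContainerStatus` or the actual printer code): if any container terminated with a non-zero exit code, report `Error` instead of letting the last container's reason win.

```go
package main

import "fmt"

// containerStatus is a simplified stand-in for v1.ContainerStatus;
// only the fields relevant to this issue are kept.
type containerStatus struct {
	Name     string
	ExitCode int32
	Reason   string // e.g. "Completed" or "Error"
}

// displayStatus sketches the proposed fix: instead of letting the last
// container in the slice decide the displayed status, report "Error"
// as soon as ANY container terminated with a non-zero exit code.
func displayStatus(statuses []containerStatus) string {
	reason := ""
	for _, s := range statuses {
		if s.ExitCode != 0 {
			// A single failed container should make the whole pod show Error.
			return "Error"
		}
		reason = s.Reason
	}
	return reason
}

func main() {
	statuses := []containerStatus{
		{Name: "task-main", ExitCode: 1, Reason: "Error"},
		{Name: "sidecar", ExitCode: 0, Reason: "Completed"},
	}
	fmt.Println(displayStatus(statuses)) // prints "Error"
}
```

With this ordering the failed `task-main` is no longer masked by a successful sidecar listed after it.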
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
@kkkkun Sure:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: wrong-pod-status
spec:
  backoffLimit: 0
  ttlSecondsAfterFinished: 600
  template:
    spec:
      containers:
      - name: task-main
        image: busybox:latest
        command:
        - /bin/sh
        - -c
        - |
          # Simulate some useful job
          sleep 1
          # Simulate job failure
          false
        resources:
          limits:
            cpu: 100m
            memory: 256Mi
          requests:
            cpu: 10m
            memory: 256Mi
      - name: sidecar
        image: busybox:latest
        command:
        - /bin/sh
        - -c
        - |
          # Simulate sidecar work
          sleep 2
          # Sidecar successful exit
          true
        resources:
          limits:
            cpu: 100m
            memory: 256Mi
          requests:
            cpu: 10m
            memory: 256Mi
      restartPolicy: Never
      securityContext:
        runAsUser: 65000
        runAsGroup: 65000
```

POD status should display `Error`.
What's the phase of the pod status from kube-apiserver? Is it correctly fetched from kube-apiserver?
Short POD status:

```
$ kubectl get pod wrong-pod-status-smz7x
NAME                     READY   STATUS      RESTARTS   AGE
wrong-pod-status-smz7x   0/2     Completed   0          4m
```

Extended status:

```yaml
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-08-08T07:57:09Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-08-08T07:57:12Z"
    message: 'containers with unready status: [task-main sidecar]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-08-08T07:57:12Z"
    message: 'containers with unready status: [task-main sidecar]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-08-08T07:57:09Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://10a04e3623c42ddb81c9d40acd56e2dfee7dca5727262a68ac21bbb526992f89
    image: docker.io/library/busybox:latest
    imageID: docker.io/library/busybox@sha256:ef320ff10026a50cf5f0213d35537ce0041ac1d96e9b7800bafd8bc9eff6c693
    lastState: {}
    name: sidecar
    ready: false
    restartCount: 0
    started: false
    state:
      terminated:
        containerID: containerd://10a04e3623c42ddb81c9d40acd56e2dfee7dca5727262a68ac21bbb526992f89
        exitCode: 0
        finishedAt: "2022-08-08T07:57:12Z"
        reason: Completed
        startedAt: "2022-08-08T07:57:10Z"
  - containerID: containerd://13974698f28f35eff7378b6c73cbca32b78f9e23b7477c85beca2b50fdea35b2
    image: docker.io/library/busybox:latest
    imageID: docker.io/library/busybox@sha256:ef320ff10026a50cf5f0213d35537ce0041ac1d96e9b7800bafd8bc9eff6c693
    lastState: {}
    name: task-main
    ready: false
    restartCount: 0
    started: false
    state:
      terminated:
        containerID: containerd://13974698f28f35eff7378b6c73cbca32b78f9e23b7477c85beca2b50fdea35b2
        exitCode: 1
        finishedAt: "2022-08-08T07:57:11Z"
        reason: Error
        startedAt: "2022-08-08T07:57:10Z"
  hostIP: 10.10.140.142
  phase: Failed
  podIP: 10.10.176.223
  podIPs:
  - ip: 10.10.176.223
  qosClass: Burstable
  startTime: "2022-08-08T07:57:09Z"
```

EDIT: As of k8s 1.23.7
@kkkkun Hi, any news?
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
This issue has not been updated in over 1 year, and should be re-triaged. You can:
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
/triage accepted
Any progress here?
If you can enable the SidecarContainers feature gate, which is relatively new (1.28: alpha, 1.29: beta), you may be able to avoid this issue by configuring the sidecar container as an init container with `restartPolicy: Always`.
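A minimal sketch of that approach, assuming the SidecarContainers feature gate is enabled (the Job name, image, and commands below are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: native-sidecar-job
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      initContainers:
      - name: sidecar
        image: busybox:latest
        # restartPolicy: Always marks this init container as a native
        # sidecar: it starts before task-main and is stopped by the
        # kubelet after task-main exits, so it no longer needs to be
        # shut down from inside the main container and no longer
        # contributes a misleading "Completed" to the pod status.
        restartPolicy: Always
        command: ["/bin/sh", "-c", "sleep 3600"]
      containers:
      - name: task-main
        image: busybox:latest
        command: ["/bin/sh", "-c", "sleep 1; false"]
```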
True, but it is a workaround, and it still has to be supported by multiple projects like Istio.
What happened?
I run a service mesh sidecar (in my case Istio) in PODs that are created/controlled by Jobs. The "main" container shuts down the Istio sidecar via an API call to 127.0.0.1 and exits with the actual application exit code. The issue is that when the "main" container finishes with an error, the POD status often displays `Completed` when called via `kubectl get pod`.

What did you expect to happen?
It should return status `Error` when one of the containers in the POD fails. I believe that is because the POD Status field is calculated incorrectly (it takes the value of the reason of the last container in the `pod.Status.ContainerStatuses` array):
kubernetes/pkg/printers/internalversion/printers.go, lines 812 to 813 in 5c99e2a
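The failure mode described above can be illustrated with a simplified sketch (not the actual printers.go code; the type, function name, and iteration order here are hypothetical): when the displayed reason is simply overwritten per container, a successful sidecar processed after the failed main container masks the failure.

```go
package main

import "fmt"

// containerStatus is a simplified stand-in for v1.ContainerStatus.
type containerStatus struct {
	Name     string
	ExitCode int32
	Reason   string // e.g. "Completed" or "Error"
}

// lastContainerReason reproduces the problematic pattern: the loop
// overwrites the displayed reason on every iteration, so whichever
// container happens to be processed last wins, regardless of whether
// an earlier container failed.
func lastContainerReason(statuses []containerStatus) string {
	reason := ""
	for _, s := range statuses {
		if s.Reason != "" {
			reason = s.Reason
		}
	}
	return reason
}

func main() {
	// task-main failed, but the sidecar is processed after it, so its
	// "Completed" reason overwrites "Error" in the displayed status.
	statuses := []containerStatus{
		{Name: "task-main", ExitCode: 1, Reason: "Error"},
		{Name: "sidecar", ExitCode: 0, Reason: "Completed"},
	}
	fmt.Println(lastContainerReason(statuses)) // prints "Completed"
}
```

This also explains why the alphabetical-naming workaround below changes the result: it only changes which container's reason is overwritten last.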
The workaround for this situation is to name the actual application container with first letters like `abc` and the sidecar with last ones like `xyz`.

How can we reproduce it (as minimally and precisely as possible)?
Test job
Anything else we need to know?
Istio version used - 1.12.2
Kubernetes version
1.22.5
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)