-
Notifications
You must be signed in to change notification settings - Fork 97
🌱 Remove usage of FailureReason and FailureMessage (baremetal) #1716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
> Deprecated: This field is deprecated and is > going to be removed when support for v1beta1 > will be dropped. Please see > https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20240916-improve-status-in-CAPI-resources.md > for more details.
…n controller would need to be changed.
Before the test did not realy test something because the state is none at the beginning (before provisioning), too.
…into tg/remove-failure-reason--baremetal
…into tg/remove-failure-reason--baremetal
…e-reason--baremetal--on-top-main
…e-reason--baremetal--on-top-main
…into tg/remove-failure-reason--baremetal
api/v1beta1/conditions_const.go
Outdated
| const ( | ||
| // RemediationSucceededCondition is: | ||
| // - False when the corresponding CAPI Machine has the "cluster.x-k8s.io/remediate-machine" annotation set and will be remediated by CAPI soon. | ||
| // - True otherwise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it shouldn't be true otherwise. It should be only true IMO after it succeeded. Otherwise it should not be set at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, what about deleting the condition 6 hours after it got set to True?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think setting it to true is fine, once it actually got remediated. Then we can think about what you propose, but that would be an optimization that we could do later IMO. However, what I really would not like, and I think we currently do it, is to set the condition on true, even though we didn't even ever remediate the machine.
| ctx = ctrl.LoggerInto(ctx, log) | ||
|
|
||
| remediateConditionOfHbmm := conditions.Get(hetznerBareMetalMachine, infrav1.NoRemediateMachineAnnotationCondition) | ||
| if remediateConditionOfHbmm != nil && remediateConditionOfHbmm.Status == corev1.ConditionFalse { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question: Does it work properly in case the hmbh is mirroring from hbmm, because in the other direction the hbmm is already mirroring from hbmh
| remediateConditionOfHbmm := conditions.Get(hbmMachine, infrav1.RemediationSucceededCondition) | ||
| if remediateConditionOfHbmm != nil && remediateConditionOfHbmm.Status == corev1.ConditionFalse { | ||
| // The hbmm will be deleted. Do not reconcile it. | ||
| log.Info("hbmm has RemediationSucceededCondition=False. Waiting for deletion") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the condition is also set on false if it is still in progress, no? I don't understand this code snippet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@janiskemper maybe we should rename it to DeleteInfraMachineSucceeded. Because that is the intention. We do not want to reboot the machine.
This will be done via remediation, but this is just the path which need to be followed, because the controller can't delete the capi-machine. Remediation is not really the intention, the machine should get deleted.
Does this make more sense now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't understand it yet. We have the RemediationSucceeded condition, which is set on false with the reason "remediation ongoing". At least this is how we do it on hbmh. Do we have some different logic here on hbmm?
| return reconcile.Result{}, s.setOwnerRemediatedConditionToFailed(ctx, | ||
| "exit remediation because hbmm has no host annotation") | ||
| } | ||
| // if SetErrorAndRemediate() was used to stop provisioning, do not try to reboot server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the name of the function. SetErrorAndRemediate sounds to me as if you explicitly want this remediation file to trigger. However, you instead do the opposite. Shouldn't we rename this function SetErrorAndRemediate so that readers are not let to the wrong direction?
If I understand correctly, you want to set an error so that remediation via reboot is skipped right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, SetErrorAndDeleteMachine() might be a better name. Do you agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. Just to confirm again: You wanna use this function in the cases when a remediation via reboot should not be done?
This condition is only mirrored from hbmm to hbmh, so I think this is fine. |
Baremetal PR
Related HCloud PR 1717
Here’s a shorter, clean PR description:
Summary
This PR removes all usage of the deprecated
FailureReasonandFailureMessagefields from the bare-metal (BM) provider. Status handling is now fully aligned with the newer CAPI conditions-based model.Why
Upstream CAPI is deprecating these fields. Removing them keeps our provider consistent with the updated status API and prevents reliance on legacy status patterns.
Changes
FailureReasonandFailureMessagein BM controllers and API types.bmMachineName. Capi-machine and hbmm are exptected to have the same name. This was adapted, so test data structures match real structures.