qemu-s390x

Re: [PATCH 5/5] s390x: Enable and document boot device fallback on panic


From: Jared Rossi
Subject: Re: [PATCH 5/5] s390x: Enable and document boot device fallback on panic
Date: Wed, 5 Jun 2024 10:48:52 -0400
User-agent: Mozilla Thunderbird


diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index c977a52b50..de3d1f0d5a 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -43,6 +43,7 @@ typedef unsigned long long u64;
  #include "iplb.h"
    /* start.s */
+extern char _start[];
  void disabled_wait(void) __attribute__ ((__noreturn__));
  void consume_sclp_int(void);
  void consume_io_int(void);
@@ -88,6 +89,11 @@ __attribute__ ((__noreturn__))
  static inline void panic(const char *string)
  {
      sclp_print(string);
+    if (load_next_iplb()) {
+        sclp_print("\nTrying next boot device...");
+        jump_to_IPL_code((long)_start);
+    }
+
      disabled_wait();
  }

Honestly, I am unsure whether this is a really cool idea or a very ugly hack ... but I think I tend towards the latter, sorry. Jumping back to the startup code might cause various problems, e.g. pre-initialized variables don't get their values reset, causing different behavior when the s390-ccw BIOS runs a function a second time this way. Thus this sounds very fragile. Could we please try to get things cleaned up correctly, so that functions return with error codes instead of panicking when we can continue with another boot device? Even if it's more work right now, I think this will be much more maintainable in the future.

 Thomas
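
The re-entry concern above can be illustrated outside the BIOS. The following is a minimal user-space sketch, not code from pc-bios/s390-ccw: `probes_left` and `bios_entry()` are hypothetical names, and calling `bios_entry()` a second time stands in for jumping back to `_start`, on the assumption that initialized data is not reloaded by that jump.

```c
/* Hypothetical stand-in for BIOS state that lives in initialized data.
 * It receives its value when the image is loaded, not each time
 * execution re-enters the startup code. */
static int probes_left = 1;

/* Hypothetical entry point. Returns how many probes ran in this pass. */
int bios_entry(void)
{
    int probed = 0;

    while (probes_left > 0) {
        probes_left--;   /* consumes state that is never restored */
        probed++;
    }
    return probed;
}
```

The first call probes once; a second call finds `probes_left` already at zero and does nothing, so the same function behaves differently on re-entry.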


Thanks Thomas, I appreciate your insight.  Your hesitation is perfectly understandable as well.  My initial design was as you suggest, with the functions returning instead of panicking, but the issue I ran into is that netboot uses a separate image, which we jump into at the start of IPL from a network device (see zipl_load() in pc-bios/s390-ccw/bootmap.c).  I wasn't able to come up with a simple way to return to the main BIOS code if a netboot fails other than by jumping back.  So, it seems to me that netboot throws a monkey wrench into the basic idea of reworking the panics into returns.

I'm open to suggestions on a better way to recover from a failed netboot, and it's certainly possible I've overlooked something, but as far as I can tell a jump is necessary in that particular case at least.  Netboot could perhaps be handled as a special case where the jump back is permitted whereas other device types return, but I don't think that actually solves the main issue.
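
To make the special-case idea concrete, here is a rough user-space sketch of a boot loop in which ordinary devices report failure through a return code, while a failed netboot is reported with a distinct code marking where a jump back would still be needed.  All names here (`boot_dev`, `try_boot()`, `boot_first_available()`, the result codes) are hypothetical and exist only for illustration; none of them are part of the s390-ccw BIOS.

```c
#include <stddef.h>

/* Hypothetical device kinds and result codes, for illustration only. */
enum dev_kind { DEV_DISK, DEV_NET };
enum boot_result { BOOT_OK = 0, BOOT_FAILED = -1, BOOT_NEEDS_RESTART = -2 };

struct boot_dev {
    enum dev_kind kind;
    int bootable;           /* 1 if this simulated device would boot */
};

/* Hypothetical per-device attempt that returns instead of panicking. */
static enum boot_result try_boot(const struct boot_dev *dev)
{
    if (dev->bootable) {
        return BOOT_OK;
    }
    /* Netboot runs in a separate image and cannot simply return to the
     * caller, so its failure is modeled as a distinct code. */
    return dev->kind == DEV_NET ? BOOT_NEEDS_RESTART : BOOT_FAILED;
}

/* Try each device in order.  Returns the index of the first device that
 * boots, or -1 if none does.  *restarts counts the failures that would
 * have required a jump back to the main BIOS code. */
int boot_first_available(const struct boot_dev *devs, size_t n, int *restarts)
{
    for (size_t i = 0; i < n; i++) {
        switch (try_boot(&devs[i])) {
        case BOOT_OK:
            return (int)i;
        case BOOT_NEEDS_RESTART:
            (*restarts)++;
            break;
        case BOOT_FAILED:
            break;
        }
    }
    return -1;
}
```

The sketch only shows the control flow; it does not address how the netboot image would hand control back, which is the open question here.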

What are your thoughts on this?

Thanks,

Jared Rossi




