After an unstable overclock of an AMD EPYC 7763 from 2.45 GHz to 4.0 GHz, the IPMI (BMC) got stuck in a reset loop.
Potential causes:
- faulty USB cable causing short circuits, leading the motherboard to reset the whole USB bus and all attached devices
- USB autosuspend incorrectly applying to the IPMI/BMC and causing it to suspend/resume/reset
- virtual keyboard/mouse appearing as genuine hardware
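To check the autosuspend theory, one option is to look at the `power/control` attribute that the kernel exposes in sysfs for each USB device. The sketch below is only an illustration: the function name `find_bmc_usb_power` is made up for these notes, and it assumes the BMC's virtual keyboard/mouse is the device with idVendor=046b, idProduct=ff10 seen in the kernel log.

```python
from pathlib import Path

# Assumption: these IDs identify the BMC's virtual keyboard/mouse,
# taken from the "New USB device found" line in the kernel log.
AMI_VENDOR_ID = "046b"
AMI_PRODUCT_ID = "ff10"


def find_bmc_usb_power(sysfs_root="/sys/bus/usb/devices"):
    """Return {device name: power/control value} for matching USB devices."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return {}
    results = {}
    for dev in sorted(root.glob("*")):
        vendor_file = dev / "idVendor"
        product_file = dev / "idProduct"
        control_file = dev / "power" / "control"
        # Interface entries such as 1-2.4:1.0 have no idVendor file; skip them.
        if not (vendor_file.is_file() and product_file.is_file()):
            continue
        if vendor_file.read_text().strip() != AMI_VENDOR_ID:
            continue
        if product_file.read_text().strip() != AMI_PRODUCT_ID:
            continue
        if control_file.is_file():
            results[dev.name] = control_file.read_text().strip()
        else:
            results[dev.name] = "unknown"
    return results


if __name__ == "__main__":
    for name, control in find_bmc_usb_power().items():
        # "auto" lets the kernel autosuspend the device; "on" keeps it awake.
        print(f"{name}: power/control = {control}")
```

A value of `auto` would mean the kernel is allowed to autosuspend the device, which is consistent with (but does not prove) the autosuspend theory; `on` would make that cause less likely.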
Possible fixes:
community-testing/acpi_call 1.2.2-73
community/acpi_call 1.2.2-72
Sep 13 04:21:34.864387 hostname kernel: ipmi_si IPI0001:00: Invalid return from get global enables command: 3 1c 2f 0
Sep 13 04:21:34.864520 hostname kernel: ipmi_si IPI0001:00: Cannot check clearing the rcv irq: -22
Sep 13 04:21:34.874389 hostname kernel: usb 1-2.4: new low-speed USB device number 5 using xhci_hcd
Sep 13 04:21:34.994824 hostname kernel: snd_hda_intel 0000:47:00.1: enabling device (0000 -> 0002)
Sep 13 04:21:34.995018 hostname kernel: snd_hda_intel 0000:47:00.1: Force to non-snoop mode
Sep 13 04:21:34.995141 hostname kernel: usb 1-2.4: New USB device found, idVendor=046b, idProduct=ff10, bcdDevice= 1.00
Sep 13 04:21:34.995190 hostname kernel: usb 1-2.4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Sep 13 04:21:34.995907 hostname kernel: usb 1-2.4: Product: Virtual Keyboard and Mouse
Sep 13 04:21:34.997742 hostname kernel: snd_hda_intel 0000:49:00.4: enabling device (0000 -> 0002)
Sep 13 04:21:34.997938 hostname kernel: usb 1-2.4: Manufacturer: American Megatrends Inc.
And
Sep 13 19:16:38.507628 hostname kernel: ipmi_si: Invalid return from get global, enables command, not enable the event buffer
And
Sep 13 04:21:35.874391 hostname kernel: ipmi_si IPI0001:00: IPMI message handler: BMC returned incorrect response, expected netfn 7 cmd 42, got netfn 7 cmd 0
Sep 13 04:21:35.874526 hostname kernel: ipmi_si IPI0001:00: IPMI kcs interface initialized
Sep 13 04:21:35.881060 hostname kernel: usbcore: registered new interface driver uas
Sep 13 04:21:35.884393 hostname kernel: ipmi_ssif: IPMI SSIF Interface driver
According to Red Hat, this can be caused by the IPMI attempting to read from non-existent sensors: https://bugzilla.redhat.com/show_bug.cgi?id=559299
dmesg logs:
Sep 21 12:42:01.831211 hostname kernel: usb 3-2: USB disconnect, device number 2
Sep 21 12:42:01.831606 hostname kernel: usb 3-2.2: USB disconnect, device number 3
Sep 21 12:42:02.141310 hostname kernel: usb 3-2.4: USB disconnect, device number 4
Sep 21 12:42:02.767777 hostname kernel: usb 1-2.4: new low-speed USB device number 8 using xhci_hcd
Sep 21 12:42:02.877771 hostname kernel: usb 1-2.4: New USB device found, idVendor=046b, idProduct=ff10, bcdDevice= 1.00
Sep 21 12:42:02.878071 hostname kernel: usb 1-2.4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Sep 21 12:42:02.878249 hostname kernel: usb 1-2.4: Product: Virtual Keyboard and Mouse
Sep 21 12:42:02.878410 hostname kernel: usb 1-2.4: Manufacturer: American Megatrends Inc.
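To see how often the BMC re-enumerates, the log lines above can be scanned for the "Virtual Keyboard and Mouse" product string. This is a rough sketch, not a tested tool: the names `ENUM_RE`, `bmc_enumerations` and `gaps_seconds` are invented here, the regex assumes the exact timestamp layout of these excerpts, and month names are parsed assuming an English locale.

```python
import re
from datetime import datetime

# Matches lines like:
# "Sep 21 12:42:02.878249 hostname kernel: usb 1-2.4: Product: Virtual Keyboard and Mouse"
ENUM_RE = re.compile(
    r"^(?P<ts>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d+) \S+ kernel: "
    r"usb (?P<port>[\d.-]+): Product: Virtual Keyboard and Mouse"
)


def bmc_enumerations(lines):
    """Return a list of (timestamp, usb port) for each BMC enumeration."""
    events = []
    for line in lines:
        match = ENUM_RE.match(line)
        if match:
            # The log lines carry no year, so 2000 is used as a placeholder.
            ts = datetime.strptime(
                "2000 " + match.group("ts"), "%Y %b %d %H:%M:%S.%f"
            )
            events.append((ts, match.group("port")))
    return events


def gaps_seconds(events):
    """Seconds between consecutive enumerations."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]


# Two lines copied from the excerpts in these notes.
SAMPLE = [
    "Sep 13 04:21:34.995907 hostname kernel: usb 1-2.4: Product: Virtual Keyboard and Mouse",
    "Sep 21 12:42:02.878249 hostname kernel: usb 1-2.4: Product: Virtual Keyboard and Mouse",
]

if __name__ == "__main__":
    events = bmc_enumerations(SAMPLE)
    print(f"{len(events)} enumerations, gaps (s): {gaps_seconds(events)}")
```

Fed a full kernel log instead of `SAMPLE`, short and regular gaps between enumerations would point to a reset loop rather than a one-off reconnect.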
What didn’t work:
- BIOS factory reset
- Clear CMOS
BMC reset fix:
The only thing that worked was logging into the IPMI administrative console over HTTP and restoring the AST2500 to factory settings:
- Log into IPMI panel.
- Settings -> Reset to Factory settings.