MIPS64 GlobalISel Crash: Empty Function Bug

by Admin

Introduction

Hey everyone! 👋 Today we're digging into a fascinating (and frustrating!) crash in the LLVM compiler infrastructure. It happens when GlobalISel (GISel), LLVM's newer instruction selection framework, is used for a MIPS64 target. The bug shows up in about the simplest case imaginable: a function whose body does nothing but return. That matters because it points at a gap in how the MIPS backend's GlobalISel support handles register classes, not at some exotic corner case. This article walks through the bug report: the reproducer, the error message and stack trace, the likely root cause, and the workarounds available while you work on MIPS64 with GlobalISel. It stays high-level enough to follow if you're new to compiler development.

The Problem: A Crash in the Compiler

The core of the issue is a crash reported when running llc (the LLVM static compiler) with the -global-isel flag. This flag switches code generation to the GlobalISel pipeline, a newer instruction selection framework in LLVM that runs as a series of passes (IRTranslator, Legalizer, RegBankSelect, InstructionSelect). The crash is triggered by a seemingly trivial piece of code: a function that does nothing but return. The error message is very specific: Target needs to handle register class ID 0x35. For those unfamiliar, a register class is a group of registers that can be used interchangeably for a given operand, and a register bank is a coarser grouping that GlobalISel uses (for example, "general-purpose" versus "floating-point"). Each register class is expected to map to a bank. The message means that the MIPS backend was asked for the bank of a register class it has no mapping for, and the stack trace shows this happening during the RegBankSelect pass. GlobalISel is designed to be more generic and flexible than the older selector, but it still relies on complete target-specific information, and here that information has a hole.
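To make the failure mode concrete, here is a toy Python model of a class-to-bank lookup whose "no match" branch is treated as a fatal error. It is only a sketch of the general pattern, not LLVM's generated code: the table contents, the IDs 0x01 and 0x02, and the bank names are hypothetical, and only the ID 0x35 and the message text come from the crash report.

```python
# Toy model of a register-class -> register-bank lookup.
# Everything here is hypothetical except the ID 0x35 and the message text,
# which are taken from the crash report.

class UnhandledRegisterClass(RuntimeError):
    """Stands in for the abort that llvm_unreachable() causes in the compiler."""

# Hypothetical table: register class ID -> register bank name.
CLASS_TO_BANK = {
    0x01: "GPRB",  # pretend this is a general-purpose class
    0x02: "FPRB",  # pretend this is a floating-point class
}

def reg_bank_from_reg_class(class_id):
    """Return the bank for a class ID, or fail loudly if the ID is unmapped."""
    try:
        return CLASS_TO_BANK[class_id]
    except KeyError:
        raise UnhandledRegisterClass(
            f"Target needs to handle register class ID {class_id:#x}"
        ) from None

if __name__ == "__main__":
    print(reg_bank_from_reg_class(0x01))
    try:
        reg_bank_from_reg_class(0x35)
    except UnhandledRegisterClass as err:
        print(err)
```

The point of the sketch is that nothing is wrong with the lookup logic itself; the table simply has no entry for the class it was asked about.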

Reproducing the Bug

The Reproducer: A Minimal Example

To understand the issue fully, let's look at the reproducer from the bug report, a small piece of LLVM IR (Intermediate Representation). The simpler the reproducer, the easier it is to pinpoint the problem, and this one is about as simple as it gets: a single function that immediately returns.

target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "mips64el-unknown-unknown-elf"

define i64 @func_3() local_unnamed_addr #0 {
entry:
  ret i64 undef
}

attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="mips64r2" "target-features"="+mips64r2,-noabicalls" }

This code defines a function @func_3 that takes no arguments and returns an undefined 64-bit integer. It's about as basic as a function can get, although it is not literally empty: the i64 return value still has to be handed back in a physical register, which becomes relevant when we read the stack trace. The attributes specify details about the function, such as optimization hints and which target features to enable (here the mips64r2 CPU). The important part is that this minimal example still triggers the crash when compiled with llc and the -global-isel flag.

Compilation Steps

To reproduce the issue, you would typically use a command like this:

llc -o output.s -x86-asm-syntax=intel -O3 -global-isel -global-isel-abort=2 < source.ll

Here's a breakdown of the command:

  • llc: The LLVM static compiler. It takes LLVM IR as input and produces assembly code for the target named in the IR (here mips64el).
  • -o output.s: Specifies the output file name (in this case, output.s).
  • -x86-asm-syntax=intel: Selects Intel syntax for x86 assembly output. It is an x86 option and has no bearing on a MIPS target. It most likely appears because the command was captured from Compiler Explorer (note the paths in the stack trace), and it is unrelated to the crash.
  • -O3: Optimization level 3. The crash does not appear to depend on the optimization level, but the flag is kept to match the original report.
  • -global-isel: Enables the GlobalISel pipeline in place of the default instruction selector. This is the flag that exposes the crash.
  • -global-isel-abort=2: Controls what happens when GlobalISel cannot handle something. A value of 1 aborts, 0 falls back to the default selector, and 2 falls back but also emits a diagnostic. The failure here is an llvm_unreachable, which aborts the process regardless of this setting.
  • < source.ll: Redirects the LLVM IR file to llc's standard input, so your test case is used as the input.

Running this command with the provided LLVM IR will result in the crash.
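If you want to script the reproduction, a minimal sketch might look like the following. It assumes an llc binary is on your PATH and that the IR above is saved as source.ll (both assumptions, as are the helper names build_llc_command and run_reproducer). It passes the input file as a positional argument instead of redirecting stdin, and leaves out -x86-asm-syntax=intel because that option only concerns x86 output.

```python
import shutil
import subprocess
from typing import List, Optional

def build_llc_command(source: str, output: str, global_isel: bool = True) -> List[str]:
    """Build an llc command line modelled on the one in the report."""
    cmd = ["llc", "-o", output, "-O3"]
    if global_isel:
        cmd += ["-global-isel", "-global-isel-abort=2"]
    cmd.append(source)
    return cmd

def run_reproducer(source: str = "source.ll", output: str = "output.s") -> Optional[int]:
    """Run llc if available; return its exit code, or None if llc is not installed."""
    if shutil.which("llc") is None:
        return None
    result = subprocess.run(
        build_llc_command(source, output), capture_output=True, text=True
    )
    # The crash report contains this marker on stderr.
    if "UNREACHABLE executed" in result.stderr:
        print("crash reproduced")
    return result.returncode

if __name__ == "__main__":
    print(" ".join(build_llc_command("source.ll", "output.s")))
```

Calling run_reproducer() on a machine with an affected llc build should print "crash reproduced"; on a fixed or unaffected build it simply returns llc's exit code.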

Analyzing the Crash

The Error Message and Stack Trace

The most important pieces of evidence are the error message and the stack trace. The stack trace is a roadmap of the calls that were active when the compiler aborted. Let's take a closer look at the key parts of the crash report:

Target needs to handle register class ID 0x35
UNREACHABLE executed at /root/build/lib/Target/Mips/MipsGenRegisterBank.inc:193!

This is the core of the problem! It states the error and pinpoints where it occurs: line 193 of MipsGenRegisterBank.inc, a file that TableGen generates from the MIPS register bank description at build time. The message tells us that the generated lookup has no entry for register class ID 0x35 (53 in decimal). "UNREACHABLE executed" is what llvm_unreachable prints: the code marks a path that its authors assumed could never be taken, so reaching it means an assumption in the backend has been violated. Let's look at the stack trace:

Stack dump:
0. Program arguments: /opt/compiler-explorer/clang-assertions-trunk/bin/llc -o /app/output.s -x86-asm-syntax=intel -O3 -global-isel -global-isel-abort=2 <source>
1. Running pass 'Function Pass Manager' on module '<source>'.
2. Running pass 'RegBankSelect' on function '@func_3'
 #0 0x000000000419b988 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-assertions-trunk/bin/llc+0x419b988)
 #1 0x0000000004198834 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
 #2 0x000070a60e042520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #3 0x000070a60e0969fc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x969fc)
 #4 0x000070a60e042476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476)
 #5 0x000070a60e0287f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
 #6 0x00000000040da8fa (/opt/compiler-explorer/clang-assertions-trunk/bin/llc+0x40da8fa)
 #7 0x0000000001a3a145 llvm::MipsGenRegisterBankInfo::getRegBankFromRegClass(llvm::TargetRegisterClass const&, llvm::LLT) const (/opt/compiler-explorer/clang-assertions-trunk/bin/llc+0x1a3a145)
 #8 0x0000000001a3dc0c llvm::MipsRegisterBankInfo::TypeInfoForMF::setTypesAccordingToPhysicalRegister(llvm::MachineInstr const*, llvm::MachineInstr const*, unsigned int) (/opt/compiler-explorer/clang-assertions-trunk/bin/llc+0x1a3dc0c)
 #9 0x0000000001a43713 llvm::MipsRegisterBankInfo::TypeInfoForMF::visitAdjacentInstrs(llvm::MachineInstr const*, llvm::SmallVectorImpl<llvm::MachineInstr*>&, bool, llvm::MipsRegisterBankInfo::InstType&)
#10 0x0000000001a4256d llvm::MipsRegisterBankInfo::TypeInfoForMF::visit(llvm::MachineInstr const*, llvm::MachineInstr const*, llvm::MipsRegisterBankInfo::InstType&) (.part.0) MipsRegisterBankInfo.cpp:0:0
#11 0x0000000001a43942 llvm::MipsRegisterBankInfo::TypeInfoForMF::determineInstType(llvm::MachineInstr const*)
#12 0x0000000001a449d5 llvm::MipsRegisterBankInfo::getInstrMapping(llvm::MachineInstr const*) const
#13 0x000000000488c9d7 llvm::RegBankSelect::assignInstr(llvm::MachineInstr&)
#14 0x000000000488cf75 llvm::RegBankSelect::assignRegisterBanks(llvm::MachineFunction&)
#15 0x000000000488d1b6 llvm::RegBankSelect::runOnMachineFunction(llvm::MachineFunction&)
#16 0x000000000307eca9 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (.part.0) MachineFunctionPass.cpp:0:0
#17 0x00000000036ce866 llvm::FPPassManager::runOnFunction(llvm::Function&)
#18 0x00000000036cec11 llvm::FPPassManager::runOnModule(llvm::Module&)
#19 0x00000000036cf47f llvm::legacy::PassManagerImpl::run(llvm::Module&)
#20 0x00000000008fa6c3 compileModule(char**, llvm::LLVMContext&) llc.cpp:0:0
#21 0x00000000007c7846 main
#22 0x000070a60e029d90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
#23 0x000070a60e029e40 __libc_start_main
#24 0x00000000008efb05 _start

The stack trace shows the chain of calls that led to the crash. Reading from the pass driver upward: RegBankSelect::assignInstr asks the target for an instruction mapping (MipsRegisterBankInfo::getInstrMapping). The MIPS code then inspects neighbouring instructions to decide on a type (determineInstType, visit, visitAdjacentInstrs) and, judging by the name setTypesAccordingToPhysicalRegister, finds one that involves a physical register. It then asks which bank that register's class belongs to. The most relevant frame is llvm::MipsGenRegisterBankInfo::getRegBankFromRegClass: the lookup has no case for the class and hits the UNREACHABLE. RegBankSelect is the GlobalISel pass that assigns a register bank to each virtual register, so a crash here points at incomplete MIPS64 register bank information, not at a problem with the input IR.
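Long traces like this one are easier to read once the libc and signal-handler frames are filtered out. The sketch below does that with a small regular expression; the helper names and the "llvm::Mips" marker are my own choices for illustration, and the sample lines are abbreviated copies of frames from the trace above.

```python
import re
from typing import List, Optional, Tuple

# Matches frames such as "#7 0x0000000001a3a145 llvm::Foo::bar(int) const (...)".
FRAME_RE = re.compile(r"^\s*#(\d+)\s+0x[0-9a-fA-F]+\s+(.*)$")

def symbolized_frames(trace: str) -> List[Tuple[int, str]]:
    """Return (frame number, symbol text) for frames inside the llvm:: namespace."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if match and match.group(2).startswith("llvm::"):
            frames.append((int(match.group(1)), match.group(2)))
    return frames

def first_target_frame(trace: str, marker: str = "llvm::Mips") -> Optional[int]:
    """Return the number of the innermost frame whose symbol starts with marker."""
    for number, symbol in symbolized_frames(trace):
        if symbol.startswith(marker):
            return number
    return None

SAMPLE = """\
 #5 0x000070a60e0287f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
 #7 0x0000000001a3a145 llvm::MipsGenRegisterBankInfo::getRegBankFromRegClass(llvm::TargetRegisterClass const&, llvm::LLT) const
#13 0x000000000488c9d7 llvm::RegBankSelect::assignInstr(llvm::MachineInstr&)
"""

if __name__ == "__main__":
    print(first_target_frame(SAMPLE))
```

On the sample, the first MIPS-specific frame is #7, which is the getRegBankFromRegClass call discussed above.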

Root Cause Analysis

The root cause of the crash is missing or incomplete register bank information in the MIPS backend's GlobalISel support. RegBankSelect asks for the bank of a register class (ID 0x35) that the generated lookup does not cover. The report does not say which class that is, so the candidates below are hypotheses, not confirmed findings:

  • Incomplete Register Bank Information: The register class exists, but no register bank is declared as covering it. This is the explanation the stack trace supports most directly, since the abort happens inside getRegBankFromRegClass.
  • Missing Register Class Definition: The register class might not be fully defined or correctly mapped to the physical registers on the MIPS64 architecture.
  • Unimplemented Instruction Selection: Later stages might also lack support for instructions that use registers from this class. Note that the crash happens in RegBankSelect, before the InstructionSelect pass runs, so this would be a follow-on gap and not the immediate trigger.

One plausible reading, which the report itself does not confirm, is that the i64 return value is placed in a 64-bit physical register whose class the MIPS register bank table does not include. Confirming this means examining the MIPS target code in LLVM: the generated MipsGenRegisterBank.inc file named in the error message, the register class definitions, and the register bank definitions for MIPS.
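A practical first step in that investigation is to find out which register class ID 0x35 actually is. Class IDs are assigned by TableGen and can change between LLVM versions, so the lookup has to be done against the generated files of the same build that crashed. The sketch below assumes the generated register info header (typically MipsGenRegisterInfo.inc in the build tree) lists IDs on lines shaped like "NameRegClassID = 53,". That line shape, the helper name, and the class names in the sample are assumptions for illustration, not values taken from the report.

```python
import re
from typing import Optional

# Assumed line shape in the generated header: "  SomeNameRegClassID = 53,"
ENUM_RE = re.compile(r"\b(\w+)RegClassID\s*=\s*(\d+)\s*,")

def class_name_for_id(inc_text: str, class_id: int) -> Optional[str]:
    """Return the register class name whose enum value equals class_id, if any."""
    for match in ENUM_RE.finditer(inc_text):
        if int(match.group(2)) == class_id:
            return match.group(1)
    return None

# Hypothetical excerpt; real names and values depend on the LLVM build.
SAMPLE_INC = """
  HypotheticalARegClassID = 52,
  HypotheticalBRegClassID = 53,
"""

if __name__ == "__main__":
    print(class_name_for_id(SAMPLE_INC, 0x35))
```

With the real file contents read from disk in place of SAMPLE_INC, the returned name tells you which class needs to be covered by a register bank.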

Potential Workarounds

Avoiding GlobalISel

The simplest workaround is to avoid the -global-isel flag. llc then uses its default instruction selector for the target, which does not run RegBankSelect and so never reaches the failing lookup. This is a practical solution if you can't wait for the bug to be fixed. The cost is that you give up whatever you wanted from GlobalISel in the first place, such as its compile-time characteristics or the ability to test the newer pipeline.
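In a build script, this workaround can be automated by trying GlobalISel first and retrying without it if llc fails. The following is a minimal sketch under the assumption that a non-zero exit code signals the failure; the function name and the injectable run parameter are hypothetical conveniences so the logic can be exercised without llc installed.

```python
import subprocess
from typing import Callable, List

def _run_llc(cmd: List[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(cmd).returncode

def compile_with_fallback(
    source: str,
    output: str,
    run: Callable[[List[str]], int] = _run_llc,
) -> str:
    """Try llc with GlobalISel, then without it; return the selector that worked."""
    base = ["llc", "-o", output, source]
    attempts = [
        ("global-isel", base + ["-global-isel"]),
        ("default", base),
    ]
    for name, cmd in attempts:
        if run(cmd) == 0:
            return name
    raise RuntimeError("llc failed with and without -global-isel")
```

Injecting a fake runner that fails only when -global-isel is present makes compile_with_fallback return "default", which mirrors what you would expect on an affected build.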

Using a Different LLVM Version

Another approach is to try a different version of LLVM. The crash was reported against a trunk build with assertions enabled, so the bug might already be fixed in a newer build, or might not be present in an older stable release. If you're using a development build, consider switching to a stable release and checking whether the reproducer still crashes.

Modifying the Code (If Possible)

In some cases, code that triggers the crash can be simplified or rewritten so that it avoids the problematic register class. Here that is unlikely to help much: the reproducer is already a function that does nothing but return a 64-bit value, so there is very little left to rewrite.

Conclusion

This article has explored a specific crash in LLVM's GlobalISel pass when compiling a simple empty function for MIPS64. The analysis shows that this is due to the compiler's inability to handle a specific register class. To summarize, here are the main points:

  • The Problem: llc crashes in the RegBankSelect pass when GlobalISel is enabled for MIPS64, even for a function that only returns.
  • The Cause: The MIPS backend's register bank information is incomplete: the lookup from register class to register bank has no entry for class ID 0x35.
  • The Impact: With -global-isel enabled, even the most basic code fails to compile for this target. Builds that use the default instruction selector are not affected.
  • The Workarounds: Avoid GlobalISel, use a different LLVM version, or, if possible, modify the code to avoid the problematic register class.

This is a clear example of the challenges and complexities of compiler development. The good news is that by reporting bugs, providing minimal reproducible examples, and contributing to the open-source community, we can collectively work to resolve such issues. If you encounter similar issues, always provide as much information as possible, including a minimal reproducible example, the compiler version, and the target architecture. This helps developers identify and resolve problems more quickly, contributing to a more stable and reliable compiler infrastructure. Remember to report the bug at the LLVM project's issue tracker! 💪

I hope this article was helpful! If you have any questions or comments, feel free to drop them below. Thanks for reading!