Graf Zahl wrote:Since you seem to already know a bit of this stuff, can you give me a quick summary of what code from LLVM I need so that I can quickstart the whole thing?
Sure. First you need to boot LLVM. I'd grab the LLVMProgram class from my branch to do this. The main classes we need to use after this step are:
- llvm::LLVMContext - Owns LLVM's shared state such as types and constants, so that separate threads can each use their own context. We only ever have one.
- llvm::ExecutionEngine - The JIT compiler. We use it at the end to get pointers to our functions so we can call them.
- llvm::Module - The container for our functions. Its equivalent in C++ is an .obj file.
- llvm::Function - A function in a module. Create one for each function in decorate.
- llvm::BasicBlock - A list of instructions that ends with a jump or return statement. All functions start with a basic block.
- llvm::IRBuilder<> - Used to emit/insert instructions into a basic block.
- llvm::Value - Immutable values. Constants, the result of an instruction, function arguments.
- llvm::Type - Integers, floats, structs, etc.
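To make the llvm::Value bullet a bit more concrete, here is a rough sketch in plain C++ (no LLVM involved) of what "immutable values" means in practice: every intermediate result gets exactly one definition and is never reassigned. The function name scale_and_offset and the numbers in it are made up for illustration only.

```cpp
// Hypothetical helper, written so that every intermediate has exactly one definition,
// which is roughly how values behave in SSA form.
constexpr int scale_and_offset(int x)
{
    const int three = 3;             // a constant
    const int scaled = x * three;    // the result of one instruction
    const int shifted = scaled + 1;  // the result of the next instruction
    return shifted;                  // nothing above was ever modified
}

static_assert(scale_and_offset(4) == 13, "4 * 3 + 1");
```

Anything that has to change over time, such as a loop counter, does not fit this pattern and needs a memory slot instead.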
Let's say we want to create a simple function that in C++ would be: "int foobar(int a, int b) { return a + b * 42; }". The LLVM code emitting that would look like this:
Code:
// Header paths as of LLVM 3.5 and later:
#include <llvm/IR/IRBuilder.h>
#include <llvm/IR/LLVMContext.h>
#include <llvm/IR/Module.h>
#include <llvm/IR/Verifier.h>
#include <llvm/ExecutionEngine/ExecutionEngine.h>

void CreateFoobar(llvm::LLVMContext *context, llvm::Module *module)
{
    using namespace llvm;
    Type *int32Type = Type::getInt32Ty(*context);
    // Describe function arguments and return type:
    Type *returnType = int32Type;
    Type *parameterTypes[] = { int32Type, int32Type };
    FunctionType *funcType = FunctionType::get(returnType, parameterTypes, false);
    // Create function in module and its entry point basic block:
    Function *func = Function::Create(funcType, Function::ExternalLinkage, "foobar", module);
    BasicBlock *entryBB = BasicBlock::Create(*context, "entry", func);
    // Grab argument values:
    auto argIterator = func->arg_begin();
    Value *a = &*argIterator++;
    Value *b = &*argIterator++;
    // Constant value for 42:
    Value *constant42 = ConstantInt::get(*context, APInt(32, 42, true));
    // Setup builder to emit instructions into the basic block:
    IRBuilder<> builder(*context);
    builder.SetInsertPoint(entryBB);
    // Emit mul and add instructions (a + b * 42) into basic block and return the result:
    Value *result = builder.CreateAdd(a, builder.CreateMul(b, constant42));
    builder.CreateRet(result);
    // Verify that we didn't emit nonsense (verifyFunction returns true on error):
    if (verifyFunction(*func))
        I_FatalError("verifyFunction failed");
}

auto GetFoobarAddress(llvm::ExecutionEngine *engine, llvm::Module *module) -> int (*)(int, int)
{
    using namespace llvm;
    // Tell JIT compiler to mark the memory pages executable:
    engine->finalizeObject();
    // Get pointer from JIT compiler:
    Function *func = module->getFunction("foobar");
    return reinterpret_cast<int(*)(int, int)>(engine->getPointerToFunction(func));
}
That's it. Generates a simple foobar function.
There are really only two things missing from the above example: stack variables and branching. As SSA values are effectively constants, changing a variable requires storing it on the stack and loading it back. Branching requires jump statements. All jumping in LLVM works by doing a conditional check at the end of a basic block that jumps to another basic block. A for loop function: int sum(int count, int *values) { int val = 0; for (int i = 0; i < count; i++) val += values[i]; return val; } looks like this:
Code:
void CreateForLoop(llvm::LLVMContext *context, llvm::Module *module)
{
    [... create func like in first example ...]
    // Grab argument values:
    auto argIterator = func->arg_begin();
    Value *count = &*argIterator++;
    Value *values = &*argIterator++;
    // Setup builder to emit instructions into the basic block:
    IRBuilder<> builder(*context);
    builder.SetInsertPoint(entryBB);
    Value *constant0 = ConstantInt::get(*context, APInt(32, 0, true));
    Value *constant1 = ConstantInt::get(*context, APInt(32, 1, true));
    // Allocate working variables on the stack and place values in them
    // (CreateStore takes the value first, then the pointer):
    Value *stackIndex = builder.CreateAlloca(int32Type, constant1);
    Value *stackVal = builder.CreateAlloca(int32Type, constant1);
    builder.CreateStore(constant0, stackIndex);
    builder.CreateStore(constant0, stackVal);
    // Create three basic blocks in the function:
    // one for the conditional check, one for the loop, and one for what should happen when the loop ends
    BasicBlock *conditionBB = BasicBlock::Create(*context, "condition", func);
    BasicBlock *loopBB = BasicBlock::Create(*context, "loop", func);
    BasicBlock *endBB = BasicBlock::Create(*context, "end", func);
    // Jump to condition basic block
    builder.CreateBr(conditionBB);
    // Emit to condition basic block, grab index from stack and do a conditional jump
    // (newer LLVM releases want the element type as the first argument to CreateLoad and CreateGEP)
    builder.SetInsertPoint(conditionBB);
    Value *index = builder.CreateLoad(stackIndex);
    builder.CreateCondBr(builder.CreateICmpSLT(index, count), loopBB, endBB);
    // In the loop we grab the val from the stack, add to it and store result back on the stack
    builder.SetInsertPoint(loopBB);
    Value *val = builder.CreateLoad(stackVal);
    Value *newVal = builder.CreateAdd(val, builder.CreateLoad(builder.CreateGEP(values, index)));
    builder.CreateStore(newVal, stackVal);
    // Increment the index by one and store it back
    builder.CreateStore(builder.CreateAdd(index, constant1), stackIndex);
    builder.CreateBr(conditionBB);
    // After the loop, return the accumulated value
    builder.SetInsertPoint(endBB);
    builder.CreateRet(builder.CreateLoad(stackVal));
}
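As an aside, if the block layout is hard to picture, it may help to see the same control flow written as plain C++ with labels standing in for basic blocks. This is only a sketch for illustration, not something LLVM produces: the name sum_as_blocks is made up, and the local variables play the role of the two stack slots and the SSA temporaries.

```cpp
#include <cassert>

// Plain C++ mirror of the basic block layout. Labels stand in for basic blocks,
// every block ends in a goto or a return, and stackIndex/stackVal are the stack slots.
// The name sum_as_blocks is hypothetical.
int sum_as_blocks(int count, const int *values)
{
    assert(count <= 0 || values != nullptr);

    // Temporaries are declared up front so that no goto jumps over an initialization.
    int index, val, newVal;

    // entry block: set up the two stack slots, then branch unconditionally.
    int stackIndex = 0;
    int stackVal = 0;
    goto condition;

condition:
    index = stackIndex;               // load
    if (index < count) goto loop;     // conditional branch ...
    goto end;                         // ... with its else target

loop:
    val = stackVal;                   // load
    newVal = val + values[index];     // address computation + load + add
    stackVal = newVal;                // store
    stackIndex = index + 1;           // store
    goto condition;                   // unconditional branch back

end:
    return stackVal;                  // load + ret
}
```

Calling sum_as_blocks(4, data) on an array holding 1, 2, 3, 4 walks condition and loop four times before falling through to end. The LLVM version emits the same loads, stores and branches, just through the builder.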
My SSA* family of classes does exactly the above using constructors and operator overloading. I'll let you decide which method is best for the decorate stuff, but just for comparison, the last function looks like this with the SSA classes:
Code:
void CreateForLoop(llvm::LLVMContext *context, llvm::Module *module)
{
    using namespace llvm;
    IRBuilder<> builder(*context);
    SSAScope ssa_scope(*context, module, &builder);
    SSAFunction function("sum");
    function.add_parameter(SSAInt::llvm_type());
    function.add_parameter(SSAIntPtr::llvm_type());
    function.set_return_type(SSAInt::llvm_type());
    function.create_public();
    SSAInt count = function.parameter(0);
    SSAIntPtr values = function.parameter(1);
    SSAStack<SSAInt> stackIndex, stackVal;
    stackIndex.store(0);
    stackVal.store(0);
    SSAForBlock branch;
    SSAInt index = stackIndex.load();
    branch.loop_block(index < count);
    SSAInt val = stackVal.load();
    SSAInt newVal = val + values[index].load();
    stackVal.store(newVal);
    stackIndex.store(index + 1);
    branch.end_block();
    builder.CreateRet(stackVal.load().v);
}
It is just syntactic sugar around the LLVM API. Because each of those lines emits LLVM instructions as it runs, it is important to be aware of which instructions you are generating as you type them.
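To illustrate the general idea of wrappers that emit instructions through operator overloading, here is a toy sketch. It is not my SSA classes and it does not use LLVM at all: ToyEmitter and ToyInt are made-up names, and the "instructions" are just strings collected in a vector so you can see what each expression produces.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an instruction builder: it only records text (hypothetical).
struct ToyEmitter
{
    std::vector<std::string> instructions;
    int nextId = 0;

    // Records one instruction and returns the name of its result.
    std::string emit(const std::string &op, const std::string &lhs, const std::string &rhs)
    {
        std::string name = "%t" + std::to_string(nextId++);
        instructions.push_back(name + " = " + op + " " + lhs + ", " + rhs);
        return name;
    }
};

// Toy wrapper around a named value (hypothetical, not the real SSAInt).
struct ToyInt
{
    ToyEmitter *emitter;
    std::string name;

    ToyInt(ToyEmitter *e, const std::string &n) : emitter(e), name(n)
    {
        assert(e != nullptr);
    }
};

// Each overloaded operator emits exactly one instruction as a side effect.
inline ToyInt operator+(const ToyInt &a, const ToyInt &b)
{
    return ToyInt(a.emitter, a.emitter->emit("add", a.name, b.name));
}

inline ToyInt operator*(const ToyInt &a, const ToyInt &b)
{
    return ToyInt(a.emitter, a.emitter->emit("mul", a.name, b.name));
}
```

With three ToyInt values a, b and c sharing one emitter, writing a + b * c records a mul followed by an add. An expression whose result you throw away still records its instruction, which is the kind of thing to keep in mind with this style of API.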
Wow, that was a long post! I hope I didn't discourage you too much with it.