3.9 KiB
Instructions
stbu
lwzu
addx
addix
addic
addzex
subfx
subfex
subficx
mulli
mullwx
divwux
andix
orx
rlwinmx
rldiclx
extsbx
slwx
srawix
# can be no-op, or @llvm.prefetch
dcbt
dcbtst
twi # @llvm.debugtrap ?
Overflow bits can be set via the intrinsics:
llvm.sadd.with.overflow
/etc
It'd be nice to avoid doing this unless absolutely required. The SDB could
walk functions to see if they ever read or branch on the SO bit of things.
@llvm.expect.i32
/.i64
could be used with the BH bits in branches to
indicate expected values.
Codegen
Make membase/state constant. Ensure optimized code uses constant value.
Branch generation
Change style to match: http://llvm.org/docs/tutorial/LangImpl5.html Insert check code, then push_back the branch block and implicit else after its generated. This ensures ordering stays legit.
Stack variables
Use stack variables for registers.
- All allocas should go in the entry block.
- Lazily add or just add all registers/etc at the head.
- Must be 1 el, int64
- Reuse through function.
- On FlushRegisters write back to state.
- FlushRegisters on indirect branch or call.
/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
/// the function. This is used for mutable variables etc.
static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
const std::string &VarName) {
IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
TheFunction->getEntryBlock().begin());
return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0,
VarName.c_str());
}
// stash result of above and reuse
// on first use in entry get the value from state?
// Promote allocas to registers.
OurFPM.add(createPromoteMemoryToRegisterPass());
// Do simple "peephole" optimizations and bit-twiddling optzns.
OurFPM.add(createInstructionCombiningPass());
// Reassociate expressions.
OurFPM.add(createReassociatePass());
Tracing
Tracing modes:
- off (0)
- syscalls (1)
- fn calls (2)
- all instructions (3)
Inject extern functions into module:
- XeTraceKernelCall(caller_ia, cia, name, state)
- XeTraceUserCall(caller_ia, cia, name, state)
- XeTraceInstruction(cia, state)
Calling convention
Experiment with fastcc? May need typedef fn ptrs to call into the JITted code.
nonlazybind fn attribute to prevent lazy binding (slow down startup)
Indirect branches (ctr/lr)
emit_control.cc XeEmitBranchTo Need to take the value in LR/CTR and do something with it.
Return path:
- In SDB see if the function follows the 'return' semantic:
- mfspr LR / mtspr LR/CTR / bcctr -- at end?
- In codegen add a BB that is just return.
- If a block 'returns', branch to the return BB.
Tail calls:
- If in a call BB check next BB.
- If next is a return, optimize to a tail call.
Fast path:
- Every time LR would be stashed, add the value of LR (some NIA) to a lookup table.
- When doing an indirect branch first lookup the address in the table.
- If not found, slow path, else jump.
Slow path:
- Call out and do an SDB lookup.
- If found, return, add to lookup table, and jump.
- If not found, need new function codegen!
If the indirect br looks like it may be local (no stack setup/etc?) then build a jump table:
Branch register with no link:
switch i32 %nia, label %non_local [ i32 0x..., label %loc_...
i32 0x..., label %loc_...
i32 0x..., label %loc_... ]
%non_local: going outside of the function
Could put one of these tables at the bottom of each function and share
it.
This could be done via indirectbr if branchaddress is used to stash the
address. The address must be within the function, though.
Branch register with link:
check, never local?
Caching of register values in basic blocks
Right now the SSA values seem to leak from the blocks somehow. All caching is disabled.
## Debugging