Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Objective This lab will familiarize you with writing static program analyses using the LLVM compiler infrastructure. LLVM is a collection of compiler and analysis toolchain

Objective This lab will familiarize you with writing static program analyses using the LLVM compiler infrastructure. LLVM is a collection of compiler and analysis toolchain utilities widely used in the software analysis community. You will use LLVM to implement two intra-procedural dataflow analyses, one forward (reaching definitions analysis) and one backward (liveness analysis). In LLVM, these are referred to as passes over the code. General Resources Getting Started with LLVM: http://llvm.org http://releases.llvm.org/3.6.2/docs/GettingStarted.html#anexample- using-the-llvm-tool-chain LLVM Documentation: http://releases.llvm.org/3.6.2/docs/index.html http://releases.llvm.org/3.6.2/docs/LangRef.html http://llvm.org/doxygen/index.html C++ Pointers Tutorial: http://www.cplusplus.com/doc/tutorial/pointers/ SSA Intermediate Representation Form (used in LLVM bitcode): https://en.wikipedia.org/wiki/Static_single_assignment_form Programmers Manual Resources Programmers Manual: http://releases.llvm.org/3.6.2/docs/ProgrammersManual.html Using isa<> to check whether a pointer points to an instance of a particular type: http://llvm.org/docs/ProgrammersManual.html#the-isa-cast-and-dyn-c ast-templates Enumerating basic blocks and instructions in a function: http://releases.llvm.org/3.6.2/docs/ProgrammersManual.html#basic-i nspection-and-traversal-routines http://releases.llvm.org/3.6.2/docs/ProgrammersManual.html# iterating-over-predecessors-successors-of-blocks Important classes: http://releases.llvm.org/3.6.2/docs/ProgrammersManual.html#thevalue- class http://releases.llvm.org/3.6.2/docs/ProgrammersManual.html#theuser- class http://releases.llvm.org/3.6.2/docs/ProgrammersManual.html#theinstruction- class http://llvm.org/docs/doxygen/html/classllvm_1_1ValueMap.html http://llvm.org/docs/doxygen/html/classllvm_1_1SetVector.html Lab Setup 1. Download and extract the lab code found on Canvas in the dataflow.zip archive. 2. Navigate to dataflow/build and run the following commands: cmake .. make clean make You should now see libDataflowPass.so under dataflow/build/Dataflow . 3. Go to the dataflow/example directory and generate LLVM bitcode from the programs we will analyze with the following commands: clang -emit-llvm ArrayDemo.c -c -o ArrayDemo.bc clang -emit-llvm Greatest.c -c -o Greatest.bc 4. Run a Dataflow pass using the command below to ensure everything works as expected for the test program ArrayDemo.c . This pass demonstrates the API by printing the definitions, uses, predecessors, and successors of each instruction. opt -load ../build/Dataflow/libDataflowPass.so -Printer < ArrayDemo.bc > /dev/null In addition to testing your setup, the printer pass (in Printer.cpp ) is a very useful reference for implementing your own passes. Lab Instructions Complete the doAnalysis method in the ReachDefAnalysis.cpp and LivenessAnalysis.cpp (located in dataflow/Dataflow/ ) skeleton files to implement the two analyses. Do not write your analysis code outside of these files, as these files are the only ones you will submit. You may use the C++ Standard Template Library (STL) when implementing your analyses, but you may not use other third party libraries. Your code will need to iterate over the program points in the input program and store the computed dataflow facts in DataflowAnalysis::inMap and DataflowAnalysis::outMap . Both analyses inherit from the base class DataflowAnalysis , which you can find in the header file DataflowAnalysis.h located in the directory dataflow/Dataflow/ . Besides including useful classes such as SetVector and ValueMap , DataflowAnalysis.h also defines useful utility functions such as getPredecessors , getSuccessors , and isDef . LLVM passes are performed an intermediate representation (LLVM bitcode) generated from the program source code. LLVM bitcode is generated in Static Single Assignment (SSA) form, a common simplification used by compiler infrastructure. In SSA form, each variable is assigned exactly once, and every variable is defined before it is used. Variables are represented directly by the instruction defining it. In fact, the Instruction class is a subtype of the Value class. You can check whether an instruction defines a variable by checking getType()->isVoidTy() . Since each variable is uniquely defined, variables will never be redefined in SSA form. This will affect the generation of KILL sets in your implementation of reaching definitions analysis, specifically, they will always be empty. After completing the two doAnalysis methods, rebuild the analyses using the commands from setup step 2, and then rerun the analyses using following commands to print the results of the analyses on the ArrayDemo program: opt -load ../build/Dataflow/libDataflowPass.so -ReachDef < ArrayDemo.bc > /dev/null opt -load ../build/Dataflow/libDataflowPass.so -Liveness < ArrayDemo.bc > /dev/null If your implementation is correct, your output will match the example output in ArrayDemo_ReachDef and ArrayDemo_Liveness found in dataflow/example/ . The order of elements in the IN and OUT sets does not matter, but the number of elements and the values should match exactly. We have also included another program, Greatest.c , and its expected outputs for testing your implementation. You can use commands similar to those above to analyze this program. We also encourage students to develop their own test cases / corresponding output and share them on Piazza, although we will not be validating them for correctness. Additionally, we have provided images of the control flow graphs (CFGs) for both programs, named array_demo_graph.png and greatest_graph.png . Each instruction is represented by a node, and edges represent an instructions successor instruction(s). These graphs are useful for reasoning about the correctness of your IN and OUT sets for the sample program analyses. We recommend that you take a moment to review the graph and ensure you understand the structure of each program. Pay close attention to the instructions that have multiple successor and/or predecessor instructions. Helpful LLVM API Information ValueMap has a similar interface to std::map . For example, you can access or insert an element using the [] operator: ValueMap vm; Instruction* I = /*...*/; vm[I] = 5; // inserts to the map LLVM will generate all of the necessary objects for your analyses according to its object model (see DataflowAnalysis.cpp ). You will not need to create new elements to insert into DataflowAnalysis::inMap or DataflowAnalysis::outMap , your code should add the pointer to the object to the ValueMap . SetVector has a similar interface to std::vector , except that it does not permit inserting duplicate elements. The == operator returns false for two SetVector objects containing the same elements in different orders. Also, functions in the library from the STL work with SetVector . For example: SetVector sv; for( int i = 0; i < 5; i++ ) sv.insert(i); sv.insert(0); // has no effect if (std::all_of(sv.begin(), sv.end(), isPositive)) // all_of is from printf("All numbers in sv are positive!"); assuming isPositive is defined elsewhere as: bool isPositive(int i) {return i > 0;}

// ***************** FILE ReachDefAnalysis

#include "DataflowAnalysis.h"

namespace dataflow{ struct ReachDefAnalysis: public DataflowAnalysis{ static char ID; ReachDefAnalysis() : DataflowAnalysis(ID){} protected: /** * Implement your analysis in this function. Store your results in DataflowAnalysis::inMap and * DataflowAnalysis:outMap. */ void doAnalysis(Function &F) override{

//write code in here

} virtual std::string getAnalysisName() override{ return "ReachDef"; } }; char ReachDefAnalysis::ID = 1; static RegisterPass X("ReachDef", "Reach Definition Analysis", false /* Only looks at CFG */, false /* Analysis Pass */); }

// ***************** FILE LivenessAnalysis

#include "DataflowAnalysis.h"

namespace dataflow{ struct LivenessAnalysis: public DataflowAnalysis{ static char ID; LivenessAnalysis() : DataflowAnalysis(ID){} protected: /** * Implement your analysis in this function. Store your results in DataflowAnalysis::inMap and * DataflowAnalysis:outMap. */ void doAnalysis(Function &F) override{

} virtual std::string getAnalysisName() override{ return "Liveness Analysis"; } }; char LivenessAnalysis::ID = 1; static RegisterPass X("Liveness", "Liveness Analysis", false /* Only looks at CFG */, false /* Analysis Pass */); }

// ***************** FILE DataflowAnalysis.cpp

#include "DataflowAnalysis.h"

using namespace llvm;

namespace dataflow { DataflowAnalysis::DataflowAnalysis(char ID) : FunctionPass(ID){} bool DataflowAnalysis::runOnFunction(Function &F){ errs() << "Running " << getAnalysisName() << " on " << F.getName() << " "; for (inst_iterator I = inst_begin(F), E= inst_end(F); I != E; ++I){ inMap[&(*I)] = new SetVector; outMap[&(*I)] = new SetVector; } doAnalysis(F); for (inst_iterator I = inst_begin(F), E= inst_end(F); I != E; ++I){ errs() << "Instruction: " << *I << " "; errs() << "In set: "; SetVector* inSet = inMap[&(*I)]; errs() << "["; for(SetVector::iterator V = inSet->begin(), VE = inSet->end(); V != VE; ++V){ errs() << **V << "; "; } errs() << "] "; errs() << "Out set: "; SetVector* outSet = outMap[&(*I)]; errs() << "["; for(SetVector::iterator V = outSet->begin(), VE = outSet->end(); V != VE; ++V){ errs() << **V << "; "; } errs() << "] "; errs() << " "; } for (inst_iterator I = inst_begin(F), E= inst_end(F); I != E; ++I){ delete inMap[&(*I)]; delete outMap[&(*I)]; } return false; } }

// ***************** FILE DataflowAnalysis.h

#include "llvm/Pass.h"

#include "llvm/IR/Function.h"

#include "llvm/Support/raw_ostream.h"

#include "llvm/IR/CFG.h"

#include "llvm/IR/ValueMap.h"

#include "llvm/ADT/SetVector.h"

#include

#include "llvm/IR/InstIterator.h"

using namespace llvm;

namespace dataflow{

/**

* The skeleton class for Dataflow analyses.

*/

struct DataflowAnalysis : public FunctionPass {

ValueMap*> inMap;

ValueMap*> outMap;

DataflowAnalysis(char ID);

bool runOnFunction(Function &F) override;

protected:

virtual void doAnalysis(Function &F) = 0;

virtual std::string getAnalysisName() = 0;

};

/**

* Get the predecessors of a given instruction in the control-flow graph.

*/

inline std::vector getPredecessors(Instruction* I){

std::vector ret;

BasicBlock* BB = I->getParent();

for(BasicBlock::reverse_iterator i = BB->rbegin(), e = BB->rend(); i != e; ++i){

if (&(*i) == I){

++i;

if(i == e){

for(pred_iterator pre = pred_begin(BB), BE = pred_end(BB); pre != BE; ++pre)

ret.push_back(&(*((*pre)->rbegin())));

}

else{

ret.push_back(&(*i));

}

break;

}

}

return ret;

}

/**

* Get the successors of a given instruction in the control-flow graph.

*/

inline std::vector getSuccessors(Instruction* I){

std::vector ret;

BasicBlock* BB = I->getParent();

for(BasicBlock::iterator i = BB->begin(), e = BB->end(); i != e; ++i){

if (&(*i) == I){

++i;

if(i == e){

for(succ_iterator succ = succ_begin(BB), BS = succ_end(BB); succ != BS; ++succ)

ret.push_back(&(*((*succ)->begin())));

}

else{

ret.push_back(&(*i));

}

break;

}

}

return ret;

}

/**

* Check whether a given instruction defines a local variable. Note since

* LLVM's IR is in SSA, local variables are directly represented by the

* instructions defining them.

*/

inline bool isDef(Instruction* I){

return !(I->getType()->isVoidTy());

}

}

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Murach's SQL Server 2012 For Developers

Authors: Bryan Syverson, Joel Murach, Mike Murach

1st Edition

1890774693, 9781890774691

More Books

Students also viewed these Databases questions