You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Graph Processing Framework with SniperSIM/GEM5/OpenMP
Overview
ECG builds on OpenGraph by integrating it with simulators like SNIPER/GEM5. It is an open source graph processing framework, designed as a modular benchmarking suite for graph processing algorithms. It provides an end to end evaluation infrastructure which includes the preprocessing stage (forming the graph structure) and the graph algorithm. The OpenMP part of ECG has been developed on Ubuntu 20.04, with PowerPC/Intel architecture taken into account.
ECG is coded using C giving the researcher full flexibility with modifying data structures and other algorithmic optimizations.
Presentations that explains end-to-end graph processing (implementation is inspired from these sources)
Ref: Malicevic, Jasmina, Baptiste Lepers, and Willy Zwaenepoel. "Everything you always wanted to know about multicore graph processing but were afraid to ask." 2017 USENIX Annual Technical Conference. Proceedings of the 2015 USENIX Annual Technical Conference, pages 375-386.
[Relabeling the graph], this step achieves better cache locality (better performance) with preprocessing overhead.
Ref: J. Arai, H. Shiokawa, T. Yamamuro, M. Onizuka, and S. Iwamura. Rabbit Order: Just-in-time Parallel Reordering for Fast Graph Analysis. IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2016.
Ref:P. Faldu and J. Diamond and B. Grot, "A Closer Look at Lightweight Graph Reordering," in Proceedings of the International Symposium on Workload Characterization (IISWC), November 2019.
Graph Algorithm step depends on the direction of the data (Push/Pull):
[BFS example], although it doesn't show direction optimized. But we discusses the Push and Pull approach separately.
[Ref]: Scott Beamer, Krste Asanović, David Patterson. The GAP Benchmark Suite. arXiv:1508.03619 [cs.DC], 2015.
Ref: J. Arai, H. Shiokawa, T. Yamamuro, M. Onizuka, and S. Iwamura. Rabbit Order: Just-in-time Parallel Reordering for Fast Graph Analysis. IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2016.
Installation and Dependencies
CPU Mode
OpenMP
Judy Arrays
open@graph:~$ sudo apt-get install libjudy-dev
OpenMP is already a feature of the compiler, so this step is not necessary.
From the root directory you can modify the Makefile with the (parameters) you need for sniper:
open@graph:~ECG$ make clean; make run-sniper
Simulation results are output to ECG/00_graph_bench/sniper-results
To clean simulation stats
open@graph:~ECG$ make clean-stats
You can pass parameters modifying Makefile parameters (easiest way) - cross reference with (SniperSim) to pass the correct values.
PARAMETER
FUNCTION
SNIPER_ARGS
arguments passed to sniper simulator
gem5 Simulator
Coming soon
Graph structure Input (Edge list)
If you open the Makefile you will see the convention for graph directories : BENCHMARKS_DIR/GRAPH_NAME/graph.wbin.
.bin stands to unweighted edge list, .wbin stands for wighted, In binary format. (This is only a convention you don't have to use it)
The reason behind converting the edge-list from text to binary, it is simply takes less space on the drive for large graphs, and easier to use with the mmap function.
convert to binary format and add random weights, for this example all the weights are 1.
--graph-file-format is the type of graph you are reading, --convert-format is the type of format you are converting to.
NOTE: you can read the file from text format without the convert step. By adding --graph-file-format 0 to the argument list. The default is 1 assuming it is binary. please check --help for better explanation.
--stats is a flag that enables conversion. It used also for collecting stats about the graph (but this feature is on hold for now).
ECG can handle multiple representations of the graph structure in memory, each has their own theoretical benefits and shortcomings.
Regular unsorted Edge-list as input.
CSR (Compressed Sparse Row)
Grid
Array-List
Linked-List
ECG Options
Usage: open-graph-openmp [OPTION...]
-f <graph file> -d [data structure] -a [algorithm] -r [root] -n
[num threads] [-h -c -s -w]
ECG is an open source graph processing framework, it is designed to be a
benchmarking suite for various graph processing algorithms using pure C.
-a, --algorithm=[DEFAULT:[0]-BFS]
[0]-BFS, [1]-Page-rank, [2]-SSSP-DeltaStepping,
[3]-SSSP-BellmanFord, [4]-DFS,[5]-SPMV,
[6]-Connected-Components,
[7]-Betweenness-Centrality, [8]-Triangle Counting,
[9-BUGGY]-IncrementalAggregation.
-b, --delta=[DEFAULT:1] SSSP Delta value [Default:1].
-c, --convert-format=[DEFAULT:[1]-binary-edgeList]
[serialize flag must be on --serialize to write]
Serialize graph text format (edge list format) to
binary graph file on load example:-f <graph file>
-c this is specifically useful if you have Graph
CSR/Grid structure and want to save in a binary
file format to skip the preprocessing step for
future runs. [0]-text-edgeList [1]-binary-edgeList
[2]-graphCSR-binary.
-d, --data-structure=[DEFAULT:[0]-CSR]
[0]-CSR, [1]-Grid, [2]-Adj LinkedList, [3]-Adj
ArrayList [4-5] same order bitmap frontiers.
-e, --tolerance=[EPSILON:0.0001]
Tolerance value of for page rank
[default:0.0001].
-f, --graph-file=<FILE> Edge list represents the graph binary format to
run the algorithm textual format change
graph-file-format.
-F, --labels-file=<FILE> Read and reorder vertex labels from a text file,
Specify the file name for the new graph reorder,
generated from Gorder, Rabbit-order, etc.
-g, --bin-size=[SIZE:512] You bin vertices's histogram according to this
parameter, if you have a large graph you want to
illustrate.
-i, --num-iterations=[DEFAULT:20]
Number of iterations for page rank to converge
[default:20] SSSP-BellmanFord [default:V-1].
-j, --verbosity=[DEFAULT:[0:no stats output]
For now it controls the output of .perf file and
PageRank .stats (needs --stats enabled)
filesPageRank .stat [1:top-k results] [2:top-k
results and top-k ranked vertices listed.
-k, --remove-duplicate Removers duplicate edges and self loops from the
graph.
-K, --Kernel-num-threads=[DEFAULT:algo-num-threads]
Number of threads for graph processing kernel
(critical-path) (graph algorithm)
-l, --light-reorder-l1=[DEFAULT:[0]-no-reordering]
Relabels the graph for better cache performance
(first layer). [0]-no-reordering [1]-out-degree
[2]-in-degree [3]-(in+out)-degree [4]-DBG-out
[5]-DBG-in [6]-HUBSort-out [7]-HUBSort-in
[8]-HUBCluster-out [9]-HUBCluster-in
[10]-(random)-degree [11]-LoadFromFile
-L, --light-reorder-l2=[DEFAULT:[0]-no-reordering]
Relabels the graph for better cache performance
(second layer). [0]-no-reordering [1]-out-degree
[2]-in-degree [3]-(in+out)-degree [4]-DBG-out
[5]-DBG-in [6]-HUBSort-out [7]-HUBSort-in
[8]-HUBCluster-out [9]-HUBCluster-in
[10]-(random)-degree [11]-LoadFromFile
-M, --mask-mode=[DEFAULT:[0:disabled]]
Encodes [0:disabled] the last two bits of
[1:out-degree]-Edgelist-labels
[2:in-degree]-Edgelist-labels or
[3:out-degree]-vertex-property-data
[4:in-degree]-vertex-property-data with hot/cold
hints [11:HOT]|[10:WARM]|[01:LUKEWARM]|[00:COLD]
to specialize caching. The algorithm needs to
support value unmask to work.
-n, --pre-num-threads=[DEFAULT:MAX]
Number of threads for preprocessing (graph
structure) step
-N, --algo-num-threads=[DEFAULT:MAX]
Number of threads for graph processing (graph
algorithm)
-o, --sort=[DEFAULT:[0]-radix-src]
[0]-radix-src [1]-radix-src-dest [2]-count-src
[3]-count-src-dst.
-O, --light-reorder-l3=[DEFAULT:[0]-no-reordering]
Relabels the graph for better cache performance
(third layer). [0]-no-reordering [1]-out-degree
[2]-in-degree [3]-(in+out)-degree [4]-DBG-out
[5]-DBG-in [6]-HUBSort-out [7]-HUBSort-in
[8]-HUBCluster-out [9]-HUBCluster-in
[10]-(random)-degree [11]-LoadFromFile
-p, --direction=[DEFAULT:[0]-PULL]
[0]-PULL, [1]-PUSH,[2]-HYBRID. NOTE: Please
consult the function switch table for each
algorithm.
-r, --root=[DEFAULT:0] BFS, DFS, SSSP root
-s, --symmetrize Symmetric graph, create a set of incoming edges.
-S, --stats Write algorithm stats to file. same directory as
the graph.PageRank: Dumps top-k ranks matching
using QPR similarity metrics.
-t, --num-trials=[DEFAULT:[1 Trial]]
Number of trials for whole run (graph algorithm
run) [default:1].
-w, --generate-weights Load or Generate weights. Check ->graphConfig.h
#define WEIGHTED 1 beforehand then recompile using
this option.
-x, --serialize Enable file conversion/serialization use with
--convert-format.
-z, --graph-file-format=[DEFAULT:[1]-binary-edgeList]
Specify file format to be read, is it textual edge
list, or a binary file edge list. This is
specifically useful if you have Graph CSR/Grid
structure already saved in a binary file format to
skip the preprocessing step. [0]-text edgeList
[1]-binary edgeList [2]-graphCSR binary.
-?, --help Give this help list
--usage Give a short usage message
-V, --version Print program version
Organization
00_graph_bench
include - Major function headers
graphalgorithms - supported Graph algorithms
openmp - OpenMP integration
BFS.h - Breadth First Search
DFS.h - Depth First Search
SSSP.h - Single Source Shortest Path
bellmanFord.h - Single Source Shortest Path using Bellman Ford
incrementalAgreggation.h - Incremental Aggregation for clustering
pageRank.h - Page Rank Algorithm
SPMV.h - Sparse Matrix Vector Multiplication
preprocessing - preprocessing graph structure
countsort.h - sort edge list using count sort
radixsort.h - sort edge list using radix sort
reorder.h - cluster reorder the graph for better cache locality
sortRun.h - chose which sorting algorithm to use
structures - structures that hold the graph in memory
graphAdjArrayList.h - graph using adjacency list array with arrays
graphAdjLinkeList.h - graph using adjacency list array with linked lists
graphCSR.h - graph using compressed sparse matrix
graphGrid.h - graph using Grid
src - Major function Source files
graphalgorithms - supported Graph algorithms
openmp - OpenMP integration
BFS.c - Breadth First Search
DFS.c - Depth First Search
SSSP.c - Single Source Shortest Path
bellmanFord.c - Single Source Shortest Path using Bellman Ford
incrementalAgreggation.c - Incremental Aggregation for clustering
pageRank.c - Page Rank Algorithm
SPMV.c - Sparse Matrix Vector Multiplication
preprocessing - preprocessing graph structure
countsort.c - sort edge list using count sort
radixsort.c - sort edge list using radix sort
reorder.c - cluster reorder the graph for better cache locality
sortRun.c - chose which sorting algorithm to use
structures - structures that hold the graph in memory
graphAdjArrayList.c - graph using adjacency list array with arrays
graphAdjLinkeList.c - graph using adjacency list array with linked lists