Trait that classes should implement to read in graphs whose nodes have ids of type T.
The reader class is required to implement iteratorSeq, a method which returns a sequence of functions that themselves return an Iterator over NodeIdEdgesMaxId (see its type signature below as well). It is also required to provide a nodeNumberer[T].
NodeIdEdgesMaxId is a case class defined in ArrayBasedDirectedGraph that stores 1) the id of a node, 2) the ids of its neighbors, and 3) the maximum id of itself and its neighbors.
One useful reference implementation is AdjacencyListGraphReader.
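The shape of that contract can be sketched in a few lines. This is only an illustration under stated assumptions: the NodeIdEdgesMaxId below is a simplified stand-in written for this example (not the definition in ArrayBasedDirectedGraph), the ids are plain Ints, and IteratorSeqSketch, record and the parts parameter are hypothetical names.

```scala
// Simplified stand-in for the case class described above (assumption, not the library's definition).
case class NodeIdEdgesMaxId(id: Int, edges: Array[Int], maxId: Int)

object IteratorSeqSketch {
  // Build one record; maxId is the maximum over the node id and all neighbor ids.
  def record(id: Int, edges: Array[Int]): NodeIdEdgesMaxId =
    NodeIdEdgesMaxId(id, edges, edges.foldLeft(id)((a, b) => math.max(a, b)))

  // Each element of the result is a function that returns a fresh iterator
  // over one part of the graph (for example, one input file).
  def iteratorSeq(parts: Seq[Seq[(Int, Array[Int])]]): Seq[() => Iterator[NodeIdEdgesMaxId]] =
    parts.map { part => () => part.iterator.map { case (id, edges) => record(id, edges) } }
}
```

Returning functions rather than iterators means each part can be traversed more than once, since calling the function again yields a new iterator.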
A subtrait of GraphReader that reads files whose names are specified by a prefix and a containing directory.
Represents an arbitrarily large sequence of bytes which can be interpreted as ints or longs.
Only reads node labels where the key is of type int. Label values can be of type int and string.
ASSUMES that the label files are named as follows: collPrefix_anything_labelName_labelValueType.txt
That is, the file name starts with an identifier that marks this collection of labels to be read. It also assumes that each line has an id followed by a single space followed by the int value of the label: <id> <labelValue>
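As a rough illustration of the naming convention and line format above, the following sketch splits a file name into its parts and parses one int-valued label line. LabelFileSketch, LabelFileName, parseFileName and parseIntLabelLine are hypothetical names for this example, and the sketch assumes that collPrefix, labelName and labelValueType contain no underscores themselves.

```scala
object LabelFileSketch {
  // Hypothetical holder for the meaningful pieces of a label file name.
  case class LabelFileName(collPrefix: String, labelName: String, valueType: String)

  // Expects collPrefix_anything_labelName_labelValueType.txt; returns None otherwise.
  def parseFileName(fileName: String): Option[LabelFileName] = {
    if (!fileName.endsWith(".txt")) None
    else {
      val parts = fileName.stripSuffix(".txt").split("_")
      if (parts.length < 4) None
      // First token is the collection prefix; the last two are label name and value type.
      else Some(LabelFileName(parts.head, parts(parts.length - 2), parts.last))
    }
  }

  // Parses "<id> <labelValue>" where both fields are ints separated by a single space.
  def parseIntLabelLine(line: String): (Int, Int) = {
    val space = line.indexOf(' ')
    (line.substring(0, space).toInt, line.substring(space + 1).toInt)
  }
}
```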
Reads in a multi-line list of edges from multiple files in a directory. Each edge is on its own line and is of the form source-id<separator>destination-id, where separator is a single character.
One can optionally specify which files in a directory to read. For example, one may have files starting with "part-" that one would like to read, perhaps containing subgraphs of one single graph.
One can optionally specify two additional operations during reading:
- to remove duplicate edges
- to sort the list of adjacent nodes
For a default version for Int graphs, see the ListOfEdgesGraphReader.forIntIds builder method.
In each file, a directed edge is defined by a pair of ids of type T: from and to.
For example, we use String ids with a " " (space) separator when reading the file:
    a b
    b d
    d c
    a e
    ...
In this file, node a has two outgoing edges (to b and e), node b has an outgoing edge to node d, and node d has an outgoing edge to node c.
Note that it is recommended to use AdjacencyListGraphReader because of its efficiency.
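The edge-list format and the two optional operations described above can be sketched as follows. This is a minimal in-memory illustration, not the reader's actual implementation: EdgeListSketch, parseEdges and its parameter names are hypothetical, ids are kept as Strings, and every line is assumed to contain the separator.

```scala
object EdgeListSketch {
  // Groups "from<separator>to" lines into a map from each source id to its out-neighbors.
  def parseEdges(lines: Iterator[String],
                 separator: Char = ' ',
                 removeDuplicates: Boolean = false,
                 sortNeighbors: Boolean = false): Map[String, Seq[String]] = {
    val edges = lines.map(_.trim).filter(_.nonEmpty).map { line =>
      val i = line.indexOf(separator)
      (line.substring(0, i), line.substring(i + 1))
    }.toVector
    edges.groupBy(_._1).map { case (from, pairs) =>
      val neighbors = pairs.map(_._2) // keeps the order in which edges were read
      val deduped = if (removeDuplicates) neighbors.distinct else neighbors
      from -> (if (sortNeighbors) deduped.sorted else deduped)
    }
  }
}
```

Applied to the four example lines, the map has keys a, b and d; nodes that only appear as destinations (c and e) have no entry in this sketch.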
Wraps a sequence of FileChannels to enable random access on a memory mapped file of arbitrary size. Motivation: a single FileChannel.map call only supports 2GB at a time.
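The general idea of addressing a large file through several smaller mappings can be sketched as below. This is an illustration of the technique only, not this class's implementation: it maps consecutive regions of a single channel rather than wrapping several channels, ChunkedMappingSketch and its methods are hypothetical names, and chunkSize is assumed to be positive and at most Int.MaxValue. Multi-byte values that straddle a chunk boundary are not handled here.

```scala
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Path, StandardOpenOption}

object ChunkedMappingSketch {
  // Translate an absolute byte position into (chunk index, offset within that chunk).
  def locate(position: Long, chunkSize: Long): (Int, Int) =
    ((position / chunkSize).toInt, (position % chunkSize).toInt)

  // Map the file read-only as consecutive chunks of at most chunkSize bytes each.
  def mapChunks(path: Path, chunkSize: Long): Vector[MappedByteBuffer] = {
    val channel = FileChannel.open(path, StandardOpenOption.READ)
    try {
      val size = channel.size()
      (0L until size by chunkSize).map { start =>
        channel.map(FileChannel.MapMode.READ_ONLY, start, math.min(chunkSize, size - start))
      }.toVector
    } finally channel.close() // mappings stay valid after the channel is closed
  }

  // Random access to a single byte at an absolute position.
  def byteAt(chunks: Vector[MappedByteBuffer], position: Long, chunkSize: Long): Byte = {
    val (chunk, offset) = locate(position, chunkSize)
    chunks(chunk).get(offset)
  }
}
```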
Utility class for writing a graph object to a Writer output stream, such that it could be read back in by a GraphReader.
Reads in a multi-line adjacency list from multiple files in a directory, where ids are of type T. Does not check for duplicate edges or nodes.
You can optionally specify which files in a directory to read. For example, you may have files starting with "part-" that you'd like to read. Only these will be read in if you specify that as the file prefix.
In each file, a node and its neighbors are defined by a first line containing that node's id and its number of neighbors, followed by that number of ids on subsequent lines. For example, when ids are Ints:
    241 3
    2
    4
    1
    53 1
    241
    ...
In this file, node 241 has 3 neighbors, namely 2, 4 and 1. Node 53 has 1 neighbor, 241.
Similarly, when ids are Strings, the input file should follow this example:
    Alice 2
    Bob
    Chris
    Bob 1
    Chris
    Chris 1
    Bob
    ...
In this file, Alice has 2 directed edges, to Bob and Chris, Bob has an edge to Chris, and Chris has an outgoing edge to Bob.
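The multi-line adjacency format described above can be parsed with a simple loop: read a header line, then consume as many neighbor lines as the header announces. The sketch below is a hedged illustration rather than the reader's actual code; AdjacencyListSketch and parse are hypothetical names, ids are kept as Strings, and well-formed input is assumed (a malformed header or a missing neighbor line throws). Like the reader, it does not check for duplicate edges or nodes.

```scala
object AdjacencyListSketch {
  // Returns (node id, neighbor ids) pairs in the order they appear in the input.
  def parse(lines: Iterator[String]): Vector[(String, Vector[String])] = {
    val in = lines.map(_.trim).filter(_.nonEmpty)
    val result = Vector.newBuilder[(String, Vector[String])]
    while (in.hasNext) {
      // Header line: "<id> <number of neighbors>"
      val header = in.next().split("\\s+")
      val id = header(0)
      val count = header(1).toInt
      // The next `count` lines each hold one neighbor id.
      val neighbors = Vector.fill(count)(in.next())
      result += ((id, neighbors))
    }
    result.result()
  }
}
```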