This is a place for theory about graph layout, hopefully creating some more understanding about graphs and making them easier to work with.

"We are drowning in information but starved for knowledge." -- John Naisbitt, Megatrends

A Google search on this topic has a problem: it mostly shows uninteresting spring embedders or Graphviz,
while the other directed-graph programs and libraries are really very difficult to find.

The theory in science papers can be hard to read; this guide to mathematical notation helps:
https://github.com/Jam3/math-as-code

GDEA has many graph drawing papers at http://gdea.informatik.uni-koeln.de/

Graph theory is also used in machine learning and is explained in an accessible way in
"demystify graph deep learning" at https://ericmjl.github.io/essays-on-data-science/machine-learning/graph-nets/
or in this graphlearning.pdf

There is a free course including graph algorithms, with video lectures, from the Open Source Society University at https://github.com/ossu/computer-science
Graph layout is also used in chip design, including with AI; see the example at https://www.nature.com/articles/s41586-021-03544-w

Graph algorithms are used to improve robot designs and movements; see https://people.csail.mit.edu/jiex/papers/robogrammar/index.html#

The book by Kozo Sugiyama is available from bookstore sites and is recommended.

The book about the EDGE graph tool by Paulisch is still interesting because it covers graph layout with added constraints. The constraints are implemented in the final positioning step, where they get a higher priority than dummy nodes: constrained nodes are handled first, then dummy nodes, then the other nodes.

Here is a free PDF about graph data structures and algorithms by Jeff Erickson: http://jeffe.cs.illinois.edu/teaching/algorithms/

A local copy of the Jeff Erickson book is in this Algorithms-JeffE.pdf

A few links to graph theory math can be found in this math list:
https://github.com/rossant/awesome-math#graph-theory

Also freely available is a GNU GPL book on algorithmic graph theory at https://code.google.com/archive/p/graphbook/

A local copy of the graphbook is in this latest-r1991.pdf

Wikibooks are free information books for example on graph theory at https://en.wikibooks.org/wiki/Category:Book:Graph_Theory

On freeCodeCamp, graphs are explained in i-dont-understand-graph-theory

The Brandes node placement algorithm is detailed in brandes.html

Pseudocode of a few graph algorithms, with Java code: https://github.com/mayankraghuwanshi/Graph-Theory
A flow-graph optimizer in C: https://github.com/satukoskinen/lem_in

There are many free graph PDFs available; they can be found using https://scholar.archive.org/

For example, this old PDF, ADA173877.pdf, shows how a graph layout works. The layout assigns the nodes to relative vertical levels, using a DFS or another method depending on how the nodes are connected.
The nodes within one level get relative horizontal positions, numbered from 1 to the last node. The Sugiyama algorithm sorts the nodes of a level by their barycenter values, and the order of the sorted nodes sets the new relative x position of each node.
The graph is scanned level by level, from top to bottom and vice versa, considering two adjacent levels at every iteration step.
The barycenter of a node is the average of the positions of its incoming or outgoing neighbor nodes, depending on the scan direction, or the average of both values.
In a second phase, nodes with equal barycenter values are swapped in x position.
During these scans the number of edge crossings is determined at every step; when the reduction in edge crossings drops below some threshold, the algorithm can stop, because the graph may otherwise keep changing its edge crossings without improving. The graph can be constructed using lists in C or via a matrix like this:

```
/* This algorithm is for routing hierarchies of elements.  A "good route" is
   one that has a minimum number of link crossings.  An algorithm that was
   truly optimal (for minimizing link crossings) would be combinatorial and
   therefore cost prohibitive; therefore, this algorithm uses a heuristic
   approach that finds a route with close to the minimum number of crossings
   in a reasonable amount of time.

   This algorithm assumes that all the elements form a directed acyclic graph
   (DAG), which means (1) that links flow one way between elements and (2) for
   any given node there is no way to get back to the node if, starting at the
   node, you traverse the links going from node to node.  This algorithm also
   assumes that AT MOST only ONE link may exist between a pair of nodes.

   ----------------------------------------------------------------------------
   OVERVIEW OF ALGORITHM

   All elements that do not have any parents are placed in the first row
   (row 0).  Elements are assigned to rows, where the row number for each
   child is equal to the [maximum(row number of all its parents) + 1].
   Crossovers are determined by examining links between elements on adjacent
   rows, so if a parent is in a row that is not adjacent to its child's row,
   "dummy" nodes are created on the rows in between the parent and child, and
   the parent and child are connected via these dummy nodes.

   Once the elements (now called nodes) are assigned to individual rows, the
   rows are sorted (repeatedly) in order to minimize link crossings.  The sort
   criteria involve attempting both to center children under parents and to
   center parents over children.  The sort orders are then tweaked by swapping
   nodes that have the same sort value.  After the column orders are
   finalized, the nodes are spread out so they are more or less centered above
   their children and below their parents.  When centering children below
   parents, a row of children is sorted by which node has the greatest number
   of parents.  These get first choice of where to be placed under the parents
   (actually, dummy nodes get first preference, then all of the others).
   Centering parents above children is analogous.

   When done with node placement, there may be some empty columns, and the
   numbering scheme may not start at 0.  Therefore, the empty columns must be
   eliminated and every node needs its column renumbered, starting at 0.
   Then you are done.

   ----------------------------------------------------------------------------
   REALIZATION MATRIX

   When it comes to both sorting nodes and horizontally spacing the nodes, two
   adjacent rows are always involved.  For example, if we are sorting row[i]
   based on the children of row[i]'s nodes, then row[i+1] is also involved at
   this step.  These two rows are called the "i-th realization", and form the
   "i-th realization matrix".  A realization matrix shows the parent-child
   relationships between adjacent rows, with the parents on the rows and the
   children on the columns.  If there is a parent-child relationship, a 1 is
   stored in the matrix at the position, else a 0 is stored.  An example:

   (drawing: parents A, B, C, and D above children E, F, G, and H, with the
   crossing links A-F, A-G, B-E, B-H, C-E, C-F, and D-H)

         E  F  G  H
      A  0  1  1  0     In this example, parent 'A' has children 'F' and 'G',
      B  1  0  0  1     parent 'B' has children 'E' and 'H',
      C  1  1  0  0     parent 'C' has children 'E' and 'F',
      D  0  0  0  1     and parent 'D' has child 'H'.

   ----------------------------------------------------------------------------
   ROW AND COLUMN BARYCENTERS

   Two other important concepts are the "row barycenter" and the "column
   barycenter" for a node.  The "row barycenter" is basically the average of
   the positions of a node's children.  The "column barycenter" is the average
   of the positions of a node's parents.  These numbers tell us where a node
   would like to be positioned in its row, depending on whether we are
   positioning relative to children or parents.

   For example, using the above realization matrix, we can calculate the row
   barycenters for A, B, C, and D, and the column barycenters for E, F, G,
   and H.  Since the row barycenter of a node is equal to the sum of the
   positions of the node's children divided by the number of children of the
   node, the row barycenter for A is (1 + 2)/2 = 1.5.  This assumes that we
   start numbering rows and columns at 0.  Similarly, the column barycenter of
   a node is equal to the sum of the positions of the node's parents divided
   by the number of parents of the node.  So, the column barycenter of F is
   (0 + 2)/2 = 1.0.  The complete example is as follows:

      Row   |  E    F    G    H  | Barys
     -------+--------------------+------
       A    |  0    1    1    0  |  1.5
       B    |  1    0    0    1  |  1.5
       C    |  1    1    0    0  |  0.5
       D    |  0    0    0    1  |  3.0
     -------+--------------------+------
      Col   | 1.5  1.0  0.0  2.0 |
      Barys |                    |

   If we were to sort the child nodes here by their column barycenters, the
   new order would be G, F, E, H.  If we were to sort the parent nodes here by
   their row barycenters, then the order would be C, A, B, D (if two or more
   nodes have the same value, be sure to keep the same precedence that already
   exists between them, e.g., make sure that the order after sorting is not
   C, B, A, D).

   If a node has no parents then it can't have a column barycenter.  This case
   should never happen, as all nodes that have no parents should be in the
   first level of the hierarchy, and these nodes would only be represented in
   realization matrix 0, and they would only have row barycenters.  If a node
   has no children then it can't have a row barycenter.  In this case, while
   sorting based on row barycenters, sort AROUND these nodes, i.e., do not
   change their positions at all.  For example, if we had the following:

      Row   |  W    X    Y    Z  | Barys
     -------+--------------------+------
       Q    |  0    1    1    1  |  2.0
       R    |  0    0    0    0  |  ???
       S    |  1    0    0    0  |  0.0
       T    |  0    1    0    1  |  2.0
     -------+--------------------+------
      Col   | 2.0  1.5  0.0  1.5 |
      Barys |                    |

   and we were to sort by row barycenters, the resulting order should be
   S, R, Q, T.  Notice how R stayed in its position, and even though Q and T
   had the same barycentric value, Q stayed before T.  The whole reason for
   sorting rows and columns by their barycenters is to decrease the number of
   crossovers.

   ----------------------------------------------------------------------------
   CROSSOVERS

   The realization matrix is also used to count the number of crossovers
   between two adjacent rows of nodes.  For each row, starting with the second
   row, if a row element has a 1, then sum up all of the matrix elements that
   are above AND to the right of this element.  Looking again at the first
   example (the realization matrix for parents A..D over children E..H):

   Starting with the second row (parent B's row), position B-E has a 1.
   Looking at positions above and to the right, we see positions A-F and A-G
   both have a 1, so the number of crossovers is currently = 2.  Position B-H
   has a 1, but there are no elements above and to the right, so crossovers is
   still = 2.  For the parent row of C, position C-E crosses over with B-H,
   A-F, and A-G, so crossovers = 5.  C-F crosses over with B-H and A-G, so
   crossovers = 7.  For parent row D, position D-H doesn't cross over with any
   other link.  So for this realization matrix representing these two rows,
   the number of crossovers is 7.  The total number of crossovers for the
   whole graph would be the sum of the crossovers from each matrix
   realization.

   ----------------------------------------------------------------------------
   NODE CENTERING

   After the nodes for each row have their final sort order, the nodes need to
   be assigned to grid positions.  Their initial grid position will be their
   column position, by which we mean their array position in the row.  From
   now on, when we take a row or column barycenter, we will be using grid
   positions instead of column positions.

   Note: The following examples will be based on centering children with
   respect to their parents' positions.  Centering parents based on their
   children's positions is analogous.

   When positioning the nodes on a row based on their parents' positions, the
   nodes must be initially sorted to see which nodes get first choice.  The
   dummy nodes go first, and the rest of the nodes are sorted in descending
   order based on the number of parents the node has.  If a dummy node has a
   parent that has multiple dummy nodes, all of these dummy nodes are again
   sorted by how close to the center of the parent's children they are.  This
   is a confusing statement, best illustrated by example:

      P1              P2
        \             |
         \  __________^__________
          \ |    |    |    |    |
           C1   D1   D2   C2   D3

   Here, parent P1 has one child, C1.  Parent P2 has five children, and three
   of the child nodes are dummy nodes: D1, D2, and D3.  C1 is child 0 of P2,
   D1 is child 1 of P2, D2 is child 2 of P2, C2 is child 3 of P2, and D3 is
   child 4 of P2.  The child midpoint underneath the parent is equal to
   (the number of children - 1) / 2, so (5 - 1) / 2 = 2.  Since the dummy
   nodes go first, the initial order is D1, D2, D3, C1 (because it has 2
   parents), and finally C2.  All of the dummy nodes have the same parent, so
   we will sort them based on how far away they are from the parent's child
   midpoint.  D1 is child 1 of P2, so it is 1 away.  D2 is child 2 of P2, so
   it is 0 away.  D3 is child 4 of P2, so it is 2 away.  Therefore, the final
   order for choosing positions is D2, D1, D3, C1, C2.

   In a situation similar to the dummy nodes, if a non-dummy node has only one
   parent, and that parent has other children with just one parent, then these
   one-parent child nodes that have the same parent need additional sorting in
   the exact same manner that we just did the dummy nodes.  The whole purpose
   behind this is so that the leftmost node doesn't get first choice.  If it
   did, we would get graphs that look like:

      A                            A
      |                            |
      |_________    instead of  _____^_____
      |    |    |               |    |    |
      B    C    D               B    C    D

   Anyway, once we have a sort order for the nodes of a row, we place the
   nodes in their preferred positions.  Using the previous example, assume
   that P1 is in grid position 2 and P2 is in grid position 5.  D2 gets first
   choice, and its column barycenter (based now on parent grid positions, not
   column positions) is 5, so we place D2 in position 5.  D1 is next, its
   barycenter is also 5.  We can't give it 5 since that position is occupied,
   so we give it the closest possible position we can, which in this case is
   4.  D3 is next, and its barycenter is also 5.  The closest position that we
   can give it is position 7, since we must allow room for C2.  C1 is next,
   and its barycenter is (2 + 5)/2 = 3.5, which we round to 3.  Position 3 is
   open, so we go ahead and give it position 3.  C2 is last, and its
   barycenter is 5.  However, the first position available to it based on its
   left neighbor is position 6, so we assign it position 6.

   ----------------------------------------------------------------------------
   GOING 'UP' OR 'DOWN' THE GRAPH

   "Going down the graph" means taking each realization matrix situation,
   starting with Realization Matrix 0, and performing some action on it, then
   going to the next realization matrix, and continuing until all of the
   realization matrices have been visited.  "Going up the graph" is analogous,
   except you start at the bottom with the last realization matrix and work up
   to Realization Matrix 0. */
```

Also explained like this:

``````
The Sugiyama layout algorithm
-----------------------------
This layout algorithm's goal is to reduce edge crossings in the graph.

The Sugiyama layout algorithm consists of four phases:

Phase 1 does a topological sorting of the graph, which assigns nodes to levels.
Each level will be displayed at the same vertical position.
There may be edges which span more than a single level. These edges,
called `long' edges, are split into a series of shorter ones by
inserting special nodes, called `dummy' nodes, on each intermediate
level. The result is called a `proper' graph.

The label of the long edge is copied to (only) the first edge in the
series (i.e.  between source and the first dummy node).  The long
edge is marked invisible.  When a new layout is requested, the
procedure MakeImproper is called.  MakeImproper removes all dummy
nodes and the edges between them and restores the long invisible edges
back to visible.  The topological sort is then done on this graph
and then the graph is made back into a proper one (MakeProper).
The Sugiyama layout uses what we call the `level' structure when
laying out the graph.  This is basically a linked list of levels
where each level contains a linked list of all the nodes in that
level.  NewLevelStruct generates such a level structure from the
graph.

Phase 2 does a repeated barycentric ordering of the nodes on all levels
starting on level 0, going down the level structure and then up again.
For barycentric ordering one always considers only two adjacent levels.
Barycentric ordering while going down/up orders the nodes on
the lower/upper level according to the average position of their
predecessors/successors. Nodes with equal barycenters keep their order.
This heuristic method works quite well to reduce the number of
edge crossings.  One can either simply sort the nodes in this level
according to their barycenter values or one can first check whether
the new (sorted) arrangement leads to fewer crossings (and only if
so, change the ordering).  This option is specified in the fourth
(sortcc) parameter to the algorithm (0=don't count crossings,
1=do count crossings).  Phase 2 is repeated until either the number
of edge crossings remains the same or a maximum number of iterations
is reached (the first (BC_I) parameter specifies this maximum).

Phase 3 reorders nodes which have equal barycenters and then calls phase 2
again.  With the second parameter you can give the maximum number of
iterations of phase 3.  Again, there is an option that controls how
the crossings should be counted in this phase.  When considering
a level, one can have it count the crossings in both the level
above and below or only in the next level (depending on whether
you're going up or down).  The fifth (CC) parameter is 1 to count
in one direction only or 2 to count in both directions.

Phase 4 does the fine-tuning of the graph by a procedure similar to phase 2
(up/downward passes in the graph).  This phase calculates the
x,y position of each node.  The relative positions of the nodes
in each level, however, are preserved.  Each node in a level is
assigned a priority and nodes with the highest priority are set first.
The position of a node is set to the average position of its successors
and predecessors, with the restriction that a node can never cause
a previously set node which has higher priority to move.  The priority
of a node is defined as follows: Dummy nodes have the highest priority.
This has the effect that long edges crossing several levels are drawn
straight. For other nodes the priority is the number of incoming or
outgoing edges.  We have added our own modification to the Sugiyama
algorithm in that we assign dummy nodes which are in the middle
of the graph an even higher priority than those near the edges.  This
tends to make the graph a bit wider and much more symmetrical.  Doing
phase 4 one time usually is sufficient.  Nevertheless, with the third
parameter you can give the number of repetitions.  Every repetition
tends to spread the graph a little.  Downward passes tend to make
the bottom nodes appear centered, and upward passes vice versa.
After making a series of down/up passes, the final pass is made from
the widest level downwards.  This should make both the top and the
bottom of the graph appear nicely centered.  This pass is usually
the cause of any unexplainable bend in the long (should be straight)
dummy edges.
``````
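The barycenter and crossover bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration of the realization-matrix idea, using the worked example from the text; the function and variable names are made up for this sketch, and it is not code from any of the tools mentioned.

```python
# Minimal sketch of two core Sugiyama operations on a realization matrix.

def barycenters(matrix):
    """Row barycenter = average column of a node's children; column
    barycenter = average row of a node's parents (0/1 realization matrix)."""
    row_b = []
    for row in matrix:
        cols = [j for j, v in enumerate(row) if v]
        row_b.append(sum(cols) / len(cols) if cols else None)
    col_b = []
    for j in range(len(matrix[0])):
        rows = [i for i in range(len(matrix)) if matrix[i][j]]
        col_b.append(sum(rows) / len(rows) if rows else None)
    return row_b, col_b

def count_crossings(matrix):
    """For every 1 (from the second row down), add up the 1s strictly above
    and to the right of it, per the crossover-counting rule."""
    total = 0
    for i in range(1, len(matrix)):
        for j, v in enumerate(matrix[i]):
            if v:
                total += sum(matrix[a][b]
                             for a in range(i)
                             for b in range(j + 1, len(matrix[i])))
    return total

# Parents A..D (rows) over children E..H (columns), as in the example.
m = [[0, 1, 1, 0],   # A -> F, G
     [1, 0, 0, 1],   # B -> E, H
     [1, 1, 0, 0],   # C -> E, F
     [0, 0, 0, 1]]   # D -> H

row_b, col_b = barycenters(m)
print(row_b)               # [1.5, 1.5, 0.5, 3.0]
print(col_b)               # [1.5, 1.0, 0.0, 2.0]
print(count_crossings(m))  # 7
```

Sorting the children by `col_b` gives the order G, F, E, H from the text; a stable sort (like Python's) automatically keeps the existing order for equal barycenters.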

// TransitiveReduction performs the transitive reduction of graph g in place.
// The transitive reduction of a graph is a graph with as few edges as
// possible with the same reachability as the original graph. This means
// that if there are three nodes A => B => C, and A connects to both
// B and C, and B connects to C, then the transitive reduction is the
// same graph with only a single edge between A and B, and a single edge
// between B and C.
//
// The graph must be valid for this operation to behave properly. If
// Validate() returns an error, the behavior is undefined and the results
// will likely be unexpected.
//
// Complexity: O(V(V+E)), or asymptotically O(VE)
func (g *AcyclicGraph) TransitiveReduction() {
	// For each vertex u in graph g, do a DFS starting from each vertex
	// v such that the edge (u,v) exists (v is a direct descendant of u).
	//
	// For each v-prime reachable from v, remove the edge (u, v-prime).
	for _, u := range g.Vertices() {
		uTargets := g.downEdgesNoCopy(u)

		g.DepthFirstWalk(g.downEdgesNoCopy(u), func(v Vertex, d int) error {
			shared := uTargets.Intersection(g.downEdgesNoCopy(v))
			for _, vPrime := range shared {
				g.RemoveEdge(BasicEdge(u, vPrime))
			}

			return nil
		})
	}
}
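The same idea can be sketched in plain Python, assuming a simple `{vertex: successor-set}` adjacency representation (hypothetical; this is not the API of the Go package above):

```python
# Transitive reduction of a DAG given as {vertex: set of direct successors}.

def reachable(graph, start):
    """Every vertex reachable from start via DFS (start itself excluded)."""
    seen, stack = set(), [start]
    while stack:
        v = stack.pop()
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def transitive_reduction(graph):
    """Drop every edge (u, w) where w is already reachable through one of
    u's other direct successors."""
    reduced = {u: set(vs) for u, vs in graph.items()}
    for u, targets in graph.items():
        for v in targets:
            reduced[u] -= reachable(graph, v)
    return reduced

# A => B => C plus the shortcut A => C: the shortcut edge is removed.
g = {"A": {"B", "C"}, "B": {"C"}, "C": set()}
print(transitive_reduction(g))  # {'A': {'B'}, 'B': {'C'}, 'C': set()}
```

Like the Go version, this assumes a valid DAG; on a cyclic graph the result is undefined.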

The median is the middle value: half of the data is above it and half below. The mean (aka average) is defined as sum(values) / count(values), and the mode is the most common or most frequently occurring value.

The median, or the middle value, is also known as the 50th percentile (the middle percentile out of 100). This is the value at which 50% of the data is less than the value, and 50% is greater than the value (or equal to it).

This leads us to percentiles: a percentile is defined as the value where x percent of the data falls below the value.

For example, if we call something “the 10th percentile,” we mean that 10% of the data is less than the value and 90% is greater than (or equal to) the value.

Simply put, using averages can have a dramatic (and negative) impact on how values are reported, while percentiles can help you get closer to the “truth.”
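A quick illustration with Python's standard `statistics` module: a single outlier drags the mean far above typical values, while the median (the 50th percentile) barely moves. The latency numbers here are made up for the example.

```python
import statistics

# Nine response times in ms, with one outlier (200).
latencies = [10, 11, 12, 12, 13, 14, 15, 16, 200]

print(statistics.median(latencies))          # 13
print(statistics.mode(latencies))            # 12
print(round(statistics.mean(latencies), 2))  # 33.67

# quantiles(n=10) returns 9 cut points; the last is the 90th-percentile
# boundary, below which (roughly) 90% of the data falls.
p90 = statistics.quantiles(latencies, n=10)[-1]
```

Reporting the median or a high percentile of these latencies describes typical and worst-case behaviour far better than the mean does.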

The breadth-first search algorithm of the chopping phase performs a limited
exploration of the graph.
During that exploration, it disconnects multiple small subgraphs from the graph.

Graph chopping algorithm

Input:      a forgetting graph G = (V, E), a root vertex r ∈ V
Parameters: n, the number of sites to extract; s, the maximum number of nodes in a site
Output:     a graph containing a maximum of n sites

Let Gr = (Vr, Er) be a graph.

while |V| > 0 and n > 0 do
    Let q1 be a queue.
    Let q2 be a queue.
    Let g = (N, A) be a graph.
    push r into q1
    while |N| < s and |q1| > 0 do
        empty q2
        for all x ∈ q1 do
            if |N| > s then break end if
            for all y such that (x, y) ∈ E do
                if |N| > s then break end if
                if y ∈ N then A ← A ∪ {(x, y)} end if
                N ← N ∪ {y}
                A ← A ∪ {(x, y)}
                push y into q2
            end for
        end for
        q1 ← q2
    end while
    V ← V \ N
    E ← E \ {(u, v) | u, v ∈ N}
    r ← first vertex left in V
    Vr ← Vr ∪ N
    Er ← Er ∪ A
    n ← n - 1
end while

return Gr
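A runnable Python sketch of the chopping loop, assuming a `{vertex: successor-list}` adjacency input. The names follow the pseudocode; where the pseudocode is ambiguous (re-visiting an already-collected node), this sketch only records the edge and moves on.

```python
from collections import deque

def chop(graph, root, n, s):
    """Repeated BFS 'chopping': cut out up to n sites of at most s nodes
    each from graph ({vertex: list of successors}), starting at root."""
    remaining = {v: list(ws) for v, ws in graph.items()}
    sites = []
    r = root
    while remaining and n > 0:
        nodes, edges = {r}, set()
        q1 = deque([r])
        while len(nodes) < s and q1:
            q2 = deque()
            for x in q1:
                if len(nodes) >= s:
                    break
                for y in remaining.get(x, ()):
                    if len(nodes) >= s:
                        break
                    edges.add((x, y))
                    if y in nodes:
                        continue  # already collected: only record the edge
                    nodes.add(y)
                    q2.append(y)
            q1 = q2
        # disconnect the extracted site from the remaining graph
        for v in nodes:
            remaining.pop(v, None)
        for v in remaining:
            remaining[v] = [w for w in remaining[v] if w not in nodes]
        sites.append((nodes, edges))
        if remaining:
            r = next(iter(remaining))  # first vertex left in V
        n -= 1
    return sites

# Two chained triangles; with s=3 each chop extracts one triangle.
g = {1: [2, 3], 2: [3], 3: [4], 4: [5, 6], 5: [6], 6: []}
for nodes, edges in chop(g, 1, n=2, s=3):
    print(sorted(nodes))  # [1, 2, 3] then [4, 5, 6]
```

Each extracted site is a (nodes, edges) pair; the union of all sites corresponds to the output graph Gr in the pseudocode.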