Algorithm for placing nodes on a circle considering their distance to eachother

Tag: algorithm Author: TZZY0988 Date: 2009-08-12

I have the following problem. I have brain regions and correlations between them. The brain regions I know the distances of. Now, we expect the correlations to be negatively correlated to the distance between the brain regions. So when we increase distance correlation goes down to zero. The expectation is that this is by 1/D^2.

I want to visualize my correlation matrix to check for abnormalities. I have already some other implementations like Taiyun's correlation matrix visualization and a simple 2D scatterplot with the 1/D^2 curve as a blue line.

Next I want to have something based on correlation circles.

The brain regions I have created a Node class for. So my brain regions are nodes. I mimic correlation with Edges. My Edges have a sourceNode and a destinationNode and also a correlation and distance so I can couple them to the correct Node. The distance and correlation are needed for table lookup (backcoupling to regionID and regionName etc).

Now what I want is to place all nodes on a circle so that the nodes which have a small distance to eachother are placed close together, and nodes far away from eachother are placed further away. This way the strong edges (which are thick) are close to eachother. And when you have a very strong edge crossing the circle it is awkward and the eye spots it easily. Of course I seek an optimum, as pointed out below a single real answer does not excist.

I have been searching google but since I do not have a clue what to search for I have found no results. I suspect there is a name for a standard algorithm for this but i do not know it. A link to such an algorithm is okay too.

The thing I came up with so far is to arrange the nodes on the circle in such a way that the SUM of all distances is smallest. But for this I need to make a sort of point system so regions which are close to each other and placed close to each other get for instance some +points and points close to each other but placed further away from each other get some downpoints. Now optimize the point algorithm and get highest outcome.

Any tips on this matter? My math is not that great ;). I'm currently googling on circles, nodes, weights..


If you have any other bright ideas to visualize the matrix be sure to PM me about it, or comment here :).

So you are looking for a way to easily spot nodes, where the correlation is far off the expectation (1/d^2)? Or are you looking for a region of the brain where this is more often the case? Or single grahps?
Indeed looking for visualizations where the correlation is far of the expectation (easily done with a scatterplot + 2D 1/D^2 curve, everything above curve = interesting). It never hurts to test as many options as you can think off. You might see iteresting things with other visualizations. The basic reason behind it is indeed quite simple. Spot abnormalities in correlations based on the rule r = 1/D^2. Next to this, other visualizations might give insight in general pictures. So comparing the visualizations with eachother might give insight, forgetting about the 1/d^2 rule.

Best Answer

The general problem that you describe doesn't have a solution because you're trying to make a map from a 2D surface to a 1D line that preserves all distances, and this isn't possible. If there was a particular region that you wanted to compare to all others, you could then put all the others around a circle so their distance match the distance to this region (but then the distance between these other regions will be distorted).

But you can certainly do better than just random in approximating the distance. Here's an approach: The first step would be to do multiple random arrangements and then pick the best of these. The next improvement would be to optimize each of these arrangements against some cost function by moving the regions around in small steps until they settle into a local minimum, and then pick the best of these local minima. The results of this is shown in the plots below, and the Python code is further down.

alt text

import pylab as px
import numpy as nx
import numpy.random as rand
rt2 = nx.sqrt(2)

N = 10 # number of brain regions

# make brain region locations r=1
regions = []
s = 2.
while len(regions)<N:
    p = 2*s*rand.rand(2)-s
    if nx.sqrt(,p))<s:
regions = nx.array(regions)

for i in range(len(regions)):
    px.text(regions[i,0], regions[i,1], `i`, fontsize=15)
px.xlim(-1.1*s, 1.1*s)
px.ylim(-1.1*s, 1.1*s)
px.title("inital positions")

# precalc distance matrix for future comparisons
dm = nx.zeros((N,N), dtype=nx.float)
for i in range(N):
    for j in range(N):
        dm[i,j] = nx.sqrt(nx.sum((regions[i,:]-regions[j,:])**2))

def randomize_on_circle(n):
    """return array of n random angles"""
    return 2*nx.pi*rand.rand(n)

def cost_fcn(d_target, d_actual): # cost for distances not matching
    return abs(d_target-d_actual)

def calc_cost(angles):
    """calc cost for the given arrangement    """
    c = 0.
    for i in range(N-1):
        for j in range(i, N):
            # sqrt(...) is distance between two pts on a circle (I think)
            c += cost_fcn(dm[j, i], rt2*nx.sqrt(1-nx.cos(angles[i]-angles[j])))
    return c

def optimize_step(a, shift=2*nx.pi/360):
    """try shifting all points a bit cw and ccw, and return the most beneficial"""
    max_benefit, ref_cost = None, None
    best_i, best_shift = None, None
    for imove in range(N): # loop through the regions and try moving each one
        cost0 = calc_cost(a)
        for da in (shift, -shift):
            a_temp = nx.array(a)
            a_temp[imove] += da
            cost = calc_cost(a_temp)
            benefit = cost0 - cost  # benefit if moving lowers the cost
            if max_benefit is None or benefit > max_benefit:
                max_benefit, best_i, best_shift, ref_cost = benefit, imove, da, cost
    return max_benefit, best_i, best_shift, ref_cost       

lowest_cost, best_angles = None, None
cost_initials, cost_plateaus = [], []
for i in range(30):  # loop though 20 randomized placements on the circle
    angles = randomize_on_circle(N)
    costs = []
    benefits = []
    # optimize each original arrangement by shifting placements one-by-one in small steps
    count_benefits_neg = 0
    count_total, max_total = 0, 2000
    while count_benefits_neg < 10: # better to do a variable step size
        b, i, s, c = optimize_step(angles)
        angles[i] += s
        if b < 0:
            count_benefits_neg += 1
        count_total += 1
        if count_total > max_total:
            print count_total, b, costs[-20:], benefits[-20]
            raise "not finding an equilibrium"
    if lowest_cost is None or c < lowest_cost:
        lowest_cost = c
        best_angles = nx.array(angles)
        cost_graph = costs[:]
        benefit_graph = nx.array(benefits)

px.subplot(2, 2, 2)
px.plot(cost_graph, 'o') # make sure the cost is leveling off
px.title("cost evoloution of best")
px.subplot(2, 2, 3)
px.plot(cost_initials, 'o')
px.plot(cost_plateaus, 'd')
px.title("initial and final costs")

px.subplot(2, 2, 4)
for i in range(len(best_angles)):
    px.text(nx.cos(best_angles[i]), nx.sin(best_angles[i]), `i`, fontsize=15)
px.xlim(-1.2, 1.2)
px.ylim(-1.2, 1.2)
px.title("positioned on circle")

Interestingly, this seems to have resulted in far things being far-ish, and near things being near-ish, but with the mid-range orders messed up, so maybe this will do what you want? (This also illustrates the fundamental problem in going from 2D to 1D. For example, on the circle the 4 wants to be further from the 9, but it can't do that without getting closer to the other numbers, whereas in 2D it could just go out to the side.)

You'll probably want to modify cost_fnc which specifies the penalty for having the distances of points on the circle not match the distance from the 2D arrangement. Changing this in a way to increase the costs for large errors (say a quadradic), or to emphasise a cost for the large distance being right, say d_target*(abs(d_actual-d_target)), etc, might help.

Also, changing the size of the circle relative to the size of the 2D data will change the look of this quite a lot, and you probably will want to circle somewhat smaller than the data, as I've done here, as this will spread the points around the circle more. (Here the circle has R=1, so just scale the data appropriately.) Also note that this will make a quantitative assessment of the cost to be not very meaningful as the best arrangements never get very to very low cost since some regions can never be as far apart as in the 2D data.

The point of running multiple random starts is that the evolving arrangement can get stuck in local minima. This technique seems to be useful: settling helps in getting the distance right and the costs down (plot #3, blue dots=initial random, diamonds=local minimum) and it helps some initial arrangements much more than others, so it's good to try multiple initial arrangements. Also, since a number of these seem to settle to around 15 it gives some confidence that this arrangement might be representative.


Therefor i seek an optimum, which does have a solution. I seek an answer where as many as possible are placed correctly and some not.
I totally agree with the importance of visualization. But to me it seems that the approximations required are beyond what you can expect others to supply since only you can choose which ways to warp and tear the space to keep consistent what's important to you (and mathematically, distance can't be one of these things). I suggest that you might have more luck if you rephrase the problem in a more open way (like your last sentence), and rather than ask for an impossible mapping, ask for suggestions of visualizations that might show you what you want to know.
No, I def. want this option since it is my mind now :). It is possible but maybe my explanation is not right. I do seek an optimum, if totally random is the best option that would be optimum. But first I want to seek for a better placement. I think it must be possible to at least have a majority of points which are in real distance close together, together on the circle.
Wow, thanks for the shear effort of putting this together in such detail. I will def. test this when I'm home. I will not mark it for an answer until I have my final routine (so I can post my own if it performs better) but credits will be given :).
No problem... I was trying to explain how to do it, and then it just seemed easier to do it. (Though it's certainly not the easiest 20 pts I've ever earned on SO, but clearly that wasn't the point. It's just and interesting question.)

Other Answer1

I suggest you use an energy minimization algorithm to place your nodes on the circle in a way that minimizes something like square(distance on circle - distance in brain). Then thicken the edges as you describe and the visualization should be complete.

Other Answer2

It's possible that GraphViz has some links to algorithms. Alternatively, you may be able to get your data into a format that GraphViz accepts and run it through that.


At first I want to ty and keep it inside my app, without using GraphViz. But thanks for the hint. Could not find anything really interesting except a link to a book.