Sep 3, 2014


Make an Introduction today.

Close a triangle.

/\  ->  ∆

Jul 8, 2014

Organizational Dynamics…
Adapting Old Structures to New Challenges

"We may not be interested in chaos, but chaos is interested in us." - Robert Cooper

When change was slow, and the future was pretty much like the present, hierarchical organizations were perfect structures for business and government. The world is no longer as predictable, nor are solutions as obvious. Old structures are no longer sufficient for new complex challenges.

Businesses have noticed the changes and are adapting. From GE's boundaryless organization to Toyota's amazingly flexible supply web, agility and adaptability are the mantra. Unfortunately most governments are not as quick and creative. Instead of the out-of-the-box thinking found increasingly in the business world, governments are busy shuffling boxes on the organization chart.

Figure 1 below is a sample organization chart of a generic hierarchical organization. Two nodes are connected by a gray link if there is a formal reporting relationship and information flow. The nodes on the bottom row represent sub-organizations, while the top two rows are executives.
Figure 1 - Original Hierarchy

Assume the above organizational chart roughly represents the U.S. intelligence community. Node 001 is the President and nodes 007 to 016 are various intelligence agencies. Nodes 002 to 006 are the leaders of those various agencies.

In 2003, the U.S. government was facing a dual problem in the intelligence community:
  • improve accuracy -- WMD in Iraq?
  • improve agility -- stop terror attacks
One of the solutions being discussed was adding a new formal position to the intelligence community. This new box would be an 'intelligence czar' to which all other intelligence leaders and their agencies would report. The thinking behind this proposed solution was for there to be one aggregation point for all intelligence. Node 017 in Figure 2 now represents this possible new position.
Figure 2 - Adding the Intelligence Czar[017] to Original Hierarchy

Another solution to integrate intelligence was to connect the various agencies to each other and start to demand and reward knowledge sharing between them. This does not require a new position. It does require the leaders of the agencies to share knowledge and information and to propagate this new culture down in their organizations. Yet, cultural change is much more difficult than shuffling boxes on the organization chart. This may require new leaders who are open to connecting the stovepipes and making the intelligence community a true community! Interconnecting the intelligence leadership is displayed by the horizontal green links in Figure 3a below. (We moved nodes 003, 004, and 005 so that all green links would be visible -- their new positions on the chart have no further significance.)

Figure 3a - Horizontal Links added to the Original Hierarchy

Which solution is better? The new formal position or the interconnecting of existing positions? It depends on your goal. If you want to focus on accountability and budget responsibility then the hierarchy will do. But, if you want a smart, agile learning organization -- able to adapt to a changing enemy -- then the interconnected structure will probably perform better. The interconnected structure spans various organizations with diverse data and perspectives allowing for cross-pollination of information and insights leading to learning.

We apply the small-world network metrics of Watts & Strogatz to Figures 1, 2, and 3a above. One of the key metrics in the small-world model is the average path length, for individuals and for the network overall. A good score for an individual means that he/she is close to all of the others in the network -- they can reach others quickly without going through too many intermediaries. A good score for the whole network indicates that everyone can reach everyone else easily and quickly. The shorter the information paths for everyone, the quicker the information arrives and the less distorted it is when it arrives. Another benefit of multiple short paths is that most members of the network have good visibility into what is happening in other parts of the network -- a greater awareness. They have  wide reach, a broad network horizon, which is useful for combining key pieces of distributed intelligence into insights and patterns. In an environment where it is difficult to distinguish signal from noise, it is important to have many perspectives involved in the sense-making process.

                                       President's Path   Overall Path
Figure 1 - Original Hierarchy
Figure 2 - With Intelligence Czar
Figure 3a - Interconnected Agencies

Table 1 - Small World Metrics

Table 1 shows the path length metrics for each of the 3 networks above. Since the President is the key destination for intelligence, his distance from the rest of the network is critical. The average path length of the whole network is important for sense-making and learning within the whole intelligence community. Figure 1 is our starting point, the baseline against which we compare the other metrics. Figure 2 is OK for the overall group, but it increases the President's path length by almost one step over Figure 1. We want our President to be closer to the intelligence community, not further away! Even if the Czar is top notch, we are still distorting and delaying the information flows to the President by adding this position. We can see that Figure 3a is a win-win improvement over Figure 1. Both the President's average path length and the group average path length are reduced (everyone is closer to the discussions). Information flows quicker, with less distortion, and the President is more involved.
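The two path-length metrics can be sketched in a few lines of Python. The layout below is an assumption made only for illustration: a President (node 1), five leaders (2-6), two agencies under each leader (7-16), a Czar (node 17) inserted between the President and the leaders, and horizontal links taken to be a full mesh among the five leaders. The actual links in Figures 1, 2, and 3a may differ, so these numbers should not be expected to match Table 1.

```python
from collections import deque
from itertools import combinations


def build(edges):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def distances_from(adj, start):
    """Breadth-first search: hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def node_avg_path(adj, node):
    """Average distance from one node to all others (assumes a connected network)."""
    dist = distances_from(adj, node)
    return sum(dist.values()) / (len(dist) - 1)


def network_avg_path(adj):
    """Average distance over all pairs of nodes (assumes a connected network)."""
    return sum(node_avg_path(adj, n) for n in adj) / len(adj)


# Hypothetical layout, not taken from the figures.
LEADERS = [2, 3, 4, 5, 6]
agency_links = [(leader, 7 + 2 * i + k)
                for i, leader in enumerate(LEADERS) for k in (0, 1)]

hierarchy = [(1, leader) for leader in LEADERS] + agency_links
with_czar = [(1, 17)] + [(17, leader) for leader in LEADERS] + agency_links
interconnected = hierarchy + list(combinations(LEADERS, 2))

for name, edges in [("hierarchy", hierarchy),
                    ("with czar", with_czar),
                    ("interconnected", interconnected)]:
    adj = build(edges)
    print(f"{name:15s} president={node_avg_path(adj, 1):.2f} "
          f"overall={network_avg_path(adj):.2f}")
```

In this toy layout the Czar adds close to one step to the President's average path, while the leader-to-leader links shorten the overall average without adding a node. The President's own distance does not change here because the assumed green links only join leaders; the real Figure 3a presumably contains links this sketch does not model.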

It looks like new connections win out over new nodes!

Figure 3b below gives us insight into why interconnecting the stovepipes is a better option. We redisplay the organization in Figure 3a by connectivity (those nodes that are more connected to each other are moved closer together) and we see a totally new perspective. Figures 3a and 3b have exactly the same connections -- 3b is a new view/angle of 3a -- the emergent network perspective of the new organization. By adding the horizontal ties we have transformed a simple hierarchy into an interconnected group. Recent research by psychologist Patrick Laughlin of the University of Illinois shows that groups outperform even the best individuals in decision making. Intelligence information is rarely clear or complete -- a key reason for having many perspectives and diverse experiences for cross-pollination and sense-making.
Figure 3b - Emergent Organization 

The Report to the President of the United States by The Commission on the Intelligence Capabilities of the United States Regarding Weapons of Mass Destruction came to the same conclusion. Based on their findings: "In sum, today's threats are quick, quiet, and hidden" they concluded, "What we need is an intelligence community that is truly integrated … The strengths of our distinct collection agencies must be brought to bear together on the most difficult intelligence problems."

In a similar finding, this RAND report explores organizational culture, integration, and the drawbacks of compartmentalizing information instead of sharing it.

Alas, G.W. Bush chose not to improve the overall network, but to add a new position to the hierarchy -- a common approach in the 20th century, and the non-risky choice. Most world leaders are just starting to become network thinkers -- a pattern that will be more common by the end of the 21st century. 

Jun 16, 2014

Music for Network Thinking

Emergent electronic music by Valdis Krebs

When we think of music we often think of songs or symphonies.  These have an expected structure and distinct themes and musical keys. There is a clear beginning and clear ending and they are related. No matter who plays the tune, we instantly recognize it.  We are expected to pay attention throughout the piece and follow the composer's intentions.  This is prescribed music.

There is also emergent music. This music is not rigidly structured. It is not random or chaotic, it is complex -- more like a network, with many connections, paths, possibilities and outcomes. Various initial conditions do not ensure the same progression or the same result. The music morphs, yet can be molded by the performer. Surprise trumps predictability and prescription.

A popular style of emergent music is called Ambient and has many flavors.  A pioneer of electronic music, Brian Eno, describes ambient music this way:

"Ambient Music must be able to accommodate many levels of listening attention without enforcing one in particular; it must be as ignorable as it is interesting."

Emergent music is not intended for a concentrated, full attention, listening session.  It is intended to provide a sonic background to help you think or do something else.  It will waft in and out of your consciousness, but not interfere with what you are doing.  It will help you accomplish what you are working on.

At Orgnet, LLC, we listen to emergent electronic music during periods of computer programming, data wrangling, writing, browsing the Net, and assembling presentations. Echoes is created via interconnected electronic synthesizers that have been programmed with simple sequences, in specific musical modes. When these simple sequences interact you get a complex, evolving soundscape. Once the process is started, the composer (programmer?) is just as surprised as the audience by what comes next.

Conversations are also emergent -- different listeners follow different themes and threads. Patterns appear, patterns repeat, and often they combine in new and interesting ways.  Again, you never know what path(s) the conversation will take, or how it will end.

Interaction, flow, intersection, re-combination. Something new emerges.

Enjoy the emergence!

May 10, 2014

Organizational X-Rays and MRIs

What happens when we x-ray an organization and see how it is structured -- how it really works?  Here, the organizational x-ray is performed via social network analysis [SNA], using InFlow software.

We normally view our organizations as hierarchies, or pyramids, with the top executive at the top of the structure.  Figure 1 below is a hierarchical view of the top 3 layers in an I/T department in California.  

Figure 1 - Hierarchy

People are represented by nodes, the color of each node represents that person's level in the organization, and the black lines designate reporting relationships. The larger magenta-colored node is the CIO, the green nodes are the I/T Directors and the blue nodes are the I/T managers. No lower levels were mapped for this project. Names of people and departments have been hidden for privacy.

Instead of a hierarchy as a pyramid, view it as a network! 

Most hierarchies are viewed as a pyramidal structure with the highest executive on top.  Some have tried new views of hierarchies, such as tilting them on their side, or showing them upside down to support their new theories of management.  These views have not caught on.  A better alternative to viewing a hierarchy as a pyramid, is to view it as a network!  Mathematicians label hierarchical networks as trees.  The tree view is shown in Figure 1 above.  We can move our point of observation, without changing the organizational structure, and see our I/T organization as a Hub-and-Spoke network in Figure 2 below.

Figure 2 - Hub and Spoke

Both views above show the exact same organization with the exact same ties -- we have just moved our point of observation. Now the CIO is in the center of Figure 2 instead of on top in Figure 1. The black lines still show reporting relationships, and the node colors reveal organization level, as before. We have added gold frames around each department in Figure 2, but otherwise Figure 1 and Figure 2 show the exact same structure! We now realize that a Hierarchy and a Hub-and-Spoke network are the exact same thing, just from a different angle.
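The claim that the two views are the same structure can be made concrete: a drawing is a set of links plus a set of screen positions, and only the links define the organization. Here is a minimal sketch, with made-up node names and coordinates (none of them come from the client data):

```python
def same_structure(edges_a, edges_b):
    """Two drawings show the same network if their undirected link sets match."""
    def normalize(edges):
        return {frozenset(e) for e in edges}
    return normalize(edges_a) == normalize(edges_b)


def is_tree(edges):
    """A hierarchy is a tree: connected, with exactly (nodes - 1) distinct links."""
    links = {frozenset(e) for e in edges}
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    if not adj or len(links) != len(adj) - 1:
        return False
    seen, stack = set(), [next(iter(adj))]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(adj[node] - seen)
    return len(seen) == len(adj)


# Hypothetical reporting links: one CIO, two directors, three managers.
reporting = [("CIO", "Dir1"), ("CIO", "Dir2"),
             ("Dir1", "Mgr1"), ("Dir1", "Mgr2"), ("Dir2", "Mgr3")]

# Pyramid view: CIO on top. Hub-and-spoke view: CIO in the center.
# Only these coordinates differ between the two pictures.
pyramid_xy = {"CIO": (0, 2), "Dir1": (-1, 1), "Dir2": (1, 1),
              "Mgr1": (-1.5, 0), "Mgr2": (-0.5, 0), "Mgr3": (1, 0)}
hub_xy = {"CIO": (0, 0), "Dir1": (-1, 0), "Dir2": (1, 0),
          "Mgr1": (-2, 1), "Mgr2": (-2, -1), "Mgr3": (2, 0)}

# The same links, listed in another order and direction.
hub_links = [(b, a) for a, b in reversed(reporting)]

print(same_structure(reporting, hub_links))  # the link sets are compared, not the positions
print(is_tree(reporting))
```

Adding even one cross-department work tie would make `is_tree` return False, which is the step from a pure hierarchy toward the richer networks discussed next.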

X-rays are usually good for revealing the bones and hard tissue inside the human body.  Our skeleton and cartilage are like the rigid formal structure of the organization -- the prescribed network of who works where, and who reports to whom.  Our bodies have other important membranes and systems and so do organizations! Here is where SNA really shines -- it not only works as an X-Ray, but also as an MRI.  Like an MRI, SNA reveals the soft, emergent, action membranes that support how things really work in an organization.  Other key systems inside an organization include: 
  • Work Flow/Collaboration
  • Idea Flow/Innovation
  • Knowledge Exchange/Learning
  • Advice-seeking/Expertise
  • Voice of the Customer/Feedback
  • Friendship/Social Nets
Figure 3 below shows our I/T department with both Hierarchy and Work Flow -- who actually works with whom. Just like an MRI is more complex (shows more information) than an X-Ray, Figure 3 is more complex than Figures 1 and 2. We have also added in Node numbers (as substitutes for employee names) in Figure 3 so that we can better discuss and make sense of the emergent patterns revealed.

Figure 3 - Wirearchy

The gray lines, added to our Hub-and-Spoke view of the organization reveal the actual working relationships between the management staff in this I/T department.  The gray links show the strong work ties between a pair of managers.  The CIO was very happy to see the criss-crossing work ties between the various departments.  He was afraid his organization was stuck in silos, created by the many mergers his company had gone through.  He was glad to see that the managers from acquired companies A, B, and C were now working together well.

We see that the hierarchy is just another network in an organization -- the networks all work together for the organization to execute its goals.  This combination of prescribed and emergent networks is called a Wirearchy, by Jon Husband.

Finally, if we want to look at just the soft tissue inside an organization, we can remove the prescribed network [hierarchy] and just look at the emergent networks.  The InFlow software re-arranges the organization based on the actual links between people and we see an "MRI" of the emergent organization in Figure 4 below.

Figure 4 - Emergent Organization

The organization view above is a layout based on actual work connections -- people that work together, and work with common others, show up nearby on the map. This allows us to see clusters, core members, and outliers. With a hierarchy/pyramid we talk about Top and Bottom. In a network we talk about Core and Periphery. These structural locations are similar, yet different -- one is assigned, and the other is earned. Your location in the hierarchy is based on your job title. Your network location is based on your Capabilities, Connections, and Communication [C3] -- regardless of your job title.

Another benefit of SNA is that it not only draws maps of the organization, but measures them also! Key network locations are revealed by network measures. Figure 4 above shows the individual betweenness measures for the employees in the network. The betweenness measure reveals the brokers and connectors within the organization. Again, we use node numbers instead of names to protect privacy. But, if our clients want to see the actual names it is just a click of the mouse to change the node labels to employee names.
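For readers who want to see what a betweenness measure computes, here is a minimal sketch. It uses Brandes' algorithm, a widely used method for unweighted networks; the source does not say how InFlow calculates its scores, so this is an illustration of the concept, not a description of the software. The seven-node network is hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected network.

    adj maps each node to the set of its neighbors. The score of a node is
    the number of shortest paths between other pairs that pass through it
    (fractional when a pair has several equally short paths).
    """
    score = {v: 0.0 for v in adj}
    for source in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        paths = {v: 0 for v in adj}      # count of shortest paths from source
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each end.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical network: two tight trios (1-2-3 and 5-6-7) joined only via node 4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5},
       5: {4, 6, 7}, 6: {5, 7}, 7: {5, 6}}

scores = betweenness(adj)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, scores[node])
```

Node 4 has only two ties, yet every path between the two trios runs through it, which is why a broker can outrank better-connected colleagues on this measure.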

Would you like to see inside your organization?

We can apply an X-ray/MRI to Big Data also!

May 5, 2014

Capital in the 21st Century

Birds of a feather flock together.  So do books of a feather.

Who flocks with the #1 bestseller?

Thomas Piketty's economic tome "Capital in the Twenty-First Century" is causing quite a stir. What is driving the world-wide excitement about an economics book?

  • Is it the 200 years of economic data he gathered? 
  • Is it the simple prose, without the obfuscating calculus of most economic papers? 
  • Is it the clear graphics that are understood by non-technical readers? 
  • Or, is it the right topic at the right time?  

This book is popular and widely discussed because of all of the above.

The theme of economic inequality is currently a popular conversation in both Europe and the United States.  It may soon become a global conversation. Piketty's Capital has attracted both positive and negative feedback — and most agree it is a great work of research. The liberals like it, and the conservatives can't quite kill it.  A topic — inequality — that was long avoided in capitalistic economies, is now in the forefront of political and economic discourse.

I wondered... What other books are fans of Capital reading?  Is the inequality theme limited to a handful of books, or are there many in the flock?  Amazon reveals the network relationship between books via: Customers Who Bought This Item Also Bought…  We retrieved Amazon data to reveal the egocentric network map around Piketty's Capital.  The map of the book's network neighborhood was created by InFlow software and is shown below.

Piketty's book is the magenta-colored node in the center of the network.  The green nodes are books which Amazon reports as having been frequently also-bought by the readers of Capital.  The blue nodes are books bought by the readers of the green books —but not directly bought by Capital readers. Two books are joined by a gray link if Amazon reported that they were often bought by the same customers.
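The coloring rule above amounts to sorting books by their distance from the central book. A minimal sketch follows, assuming the also-bought lists have already been collected into a plain dictionary; the dictionary, the placeholder titles, and the function name are all hypothetical, and no Amazon interface is implied.

```python
COLORS = {0: "magenta", 1: "green", 2: "blue"}


def ego_network(also_bought, ego):
    """Classify books by steps from the ego book: 0 = ego, 1 = direct, 2 = two steps.

    also_bought maps a title to the titles reported as bought with it.
    Returns (steps, links), where links are undirected pairs of mapped books.
    """
    steps = {ego: 0}
    for first in also_bought.get(ego, []):
        steps.setdefault(first, 1)
    for first in [book for book, d in steps.items() if d == 1]:
        for second in also_bought.get(first, []):
            steps.setdefault(second, 2)   # keeps the shorter distance if already seen
    links = {frozenset((a, b))
             for a in steps
             for b in also_bought.get(a, [])
             if b in steps and a != b}
    return steps, links


# Hypothetical also-bought data with placeholder titles.
also_bought = {
    "Capital": ["Book A", "Book B"],
    "Book A": ["Capital", "Book B", "Book C"],
    "Book B": ["Book A", "Book D"],
}

steps, links = ego_network(also_bought, "Capital")
for book, d in steps.items():
    print(f"{book:8s} {COLORS[d]}")
print(len(links), "links")
```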

Executing InFlow's cluster analysis algorithm reveals three overlapping clusters of books. Each of the three clusters appears within a gold frame. Books/Nodes appearing in two clusters show up in two frames. The intersecting frames reveal a type of Venn diagram of the high level network structure of these similar, yet different books. Each cluster contains a sub-theme on the topic of inequality and is labelled with that description. The 19th century Karl Marx book -- Das Kapital -- does not appear to cluster with any of the other also-bought books, although its theme is the same as Piketty's book.

The Inequality network neighborhood seems robust —it includes many best sellers on Amazon, including Flash Boys, The Divide, All the President's Bankers and A Fighting Chance.  Each of these books tells the story of how one group excludes others for its own benefit.

Politicians in the upcoming 2014 U.S. midterm elections, and the 2016 U.S. presidential election, may want to pay close attention to this network of books. I suspect the flock will grow larger before the elections. The theme of the upcoming U.S. elections may well be: "It's the Inequality, Stupid!" -- a play off the Bill Clinton winning theme of "It's the Economy, Stupid." The centrality of Elizabeth Warren's book, A Fighting Chance, may give her a nudge to be a candidate for President in 2016. Warren's book emerged in that network location not from its content, but from the choices the buying public made on Amazon.

Valdis Krebs, « Proxy Networks », Bulletin de méthodologie sociologique, 79 | 2003, 61-70.

Mar 24, 2014

A A A Organization

A few weeks ago I was talking to a potential client, CEO, about his concerns that his business was not keeping up with changes in his market-space.  I told him about our research into adaptive/agile/resilient organizations and how it could help his organization.  He said that sounded great and he understood that, but wondered how he could sell those concepts to both his Board and his leadership team.  He said, “After all, they do not have an MBA, like me, and do not read Harvard Business Review.  Your words are just consultant-speak to them.  You would lose them after the second sentence.”  Not the first time I had heard that. Been there, done that.  

My peer group uses and understands terms and concepts such as complex adaptive systems, emergence, self-organizing, resilience, adaption, crowdsourcing, enterprise social networks, and acronyms such as ESN, SNA, ONA, CRM, and SMO. Yet, these are not for client conversations. Potential clients want to hear consultants talk in language they understand and use every day. Now, instead of talking about Organization Adaptability Quotients, I talk about the Triple A (AAA) Organization. Everyone understands that AAA signifies the highest possible financial rating an investment can receive. This financial metric has morphed into other business spheres and is commonly understood as an adjective signifying the best of something.

A A A is more than a rating, it also describes the path and components to a successful organization.
  • Awareness
  • Alternatives
  • Action

For an organization to be agile and adaptive, the people in it need to be aware of what is happening around them, have alternative pathways to gather information and knowledge, and must be allowed to act to meet/solve both local and global goals/problems.  They need to both work in their hierarchy and in a self-organizing network simultaneously!

A wide, radial band of awareness by each employee allows them to adapt to what others are doing.  This awareness is both within the company, and also extends outside to customers, suppliers, and the organization’s marketplace/ecosystem.  Employees know what others are capable of, who the experts are, and what goals and pressures others have.  The more people a person has within his/her sphere of awareness (a.k.a. network horizon) the better s/he will function — as will those connected to that person.
Business process improvement taught us to get rid of all redundancies in the workplace. Yet, collaboration, innovation and change happen best when there are some alternative/redundant pathways available to get things done and make sense of what is happening. Paths in the organization consist of the prescribed network (the hierarchy) and of the emergent networks: self-organized connections formed by employees amongst themselves to gather information, knowledge, expertise and advice to accomplish their goals. The emergent networks in an organization provide alternate, and often more direct, paths from the need to the source. It is not enough that alternatives are available; people need to know where they are, and what they provide. Not only is it important to be well located in the flow of things, but it is important to know the flow around you.

Awareness and alternatives are useless without the ability to take action on them. Does the leadership of the organization allow and trust employees to self-organize around tasks and goals? Can you seek advice from someone outside of your department or project team? Can you connect two people that should know each other because they are working on similar goals and have complementary skills/knowledge? In other words, does management keep a tight, rigid hierarchy or allow for looser adaptive structures that change with needs?

How well do the three As — Awareness, Alternatives, and Action — function in your organization?  Are they tuned for maximum harmony?  Do your hierarchy and networks work together in a Wirearchy?  Have you measured the wiring in your organization?  Do you know how your organization compares to others?  Do your employees know what to do and how to do it?  Do you know which tune-up(s) can be performed to improve your organization’s performance in the changing market-space?

Dec 10, 2013

Many Merry Connections!

Happy Holidays from Orgnet, LLC!

This year, again, give the Gift of Connection... 
Introduce two people you know, that would benefit from knowing one another!

This year's holiday network is based on the ancient Latvian Puzurs,
usually made from straw, for the winter holidays.

Network Puzurs artwork/design: Copyright © 2013 Silvija Krebs

Nov 25, 2013

Mapping Contagions with Social Network Analysis

A contagion passed by human contact, such as MERS, TB, or SARS, spreads through human networks based on how infectious and susceptible each party is. Multiple contacts with infectious people play a role in the probability of infection. Public health officials perform contact tracing to map the spread of the infection and manage its diffusion.
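A rough way to see why multiple contacts matter: if each contact carried its own chance of transmission, and the contacts were independent of one another, the chance of at least one transmission would grow with every exposure. Both the independence assumption and the per-contact risk values below are simplifications chosen for illustration; they are not figures from the CDC work.

```python
from math import prod


def infection_probability(per_contact_risks):
    """Chance of at least one transmission, assuming independent contacts.

    per_contact_risks is a list of probabilities, one per contact with an
    infectious person. The result is 1 - product(1 - p).
    """
    return 1 - prod(1 - p for p in per_contact_risks)


# Hypothetical per-contact risk of 10%, repeated over more and more contacts.
for contacts in (1, 2, 5, 10):
    risk = infection_probability([0.10] * contacts)
    print(f"{contacts:2d} contacts -> {risk:.2f}")
```

A person who sits between many groups collects many such exposures and, once infectious, hands them on to many others, which is the network argument for paying attention to highly connected individuals.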

The network map shown here was created at the epidemiology unit of 
The Centers for Disease Control [CDC] in the United States.  It 
shows the spread of an airborne infectious disease -- Tuberculosis. The map 
was created using actual contact tracing data from the community in which the 
outbreak was occurring.  

  1. "Transmission Network Analysis to Complement Routine Tuberculosis Contact Investigations" by McKenzie Andre, Kashef Ijaz, Jon D. Tillinghast, Valdis E. Krebs, Lois A. Diem, Beverly Metchock, Theresa Crisp, Peter D. McElroy [PDF]
  2. "Embracing collaboration: A novel strategy for reducing bloodstream infections in outpatient hemodialysis centers" by Curt Lindberg, Gemma Downham, Prucia Buscell, Erin Jones, Pamela Peterson, Valdis Krebs [PDF]

Black nodes are persons with clinical disease (and are potentially infectious). 
Pink nodes represent exposed persons with incubating (or dormant) infection 
who are not infectious. Green nodes represent exposed persons with no infection, 
who are also not infectious. The grey nodes have been found to be members of the 
human network but have not yet been evaluated by medical personnel.

Unfortunately the 'social butterfly' in this community, the black node in the 
center of the network, is also the most infectious -- a super spreader.  
Current procedures focus on inoculating the vulnerable -- often the very young 
and the very old.  Network analysis reveals that it may be smarter, and more 
efficient, to focus on the spreaders -- those with many contacts to many 
diverse groups.

For more information on how social network analysis [SNA] assists health care 
professionals to manage and discover contagious disease outbreaks, see these
two papers co-authored with the CDC...

Nov 4, 2013

Making Sense of Emergent Patterns in Networks

One of the most used functions of social network analysis software is to discover and display clusters and communities in networks -- the dense sub-networks, where there are more links internally, than externally.

It is easy for the common person to spot dense clusters of connection in a network visualization.  Yet, this is a difficult problem for algorithms.  Early cluster discovery and community detection algorithms took the easy way -- they forced every node into one, and only one cluster, because the math was easier.  It was like the college physics course I took where all of the problems we did were in a vacuum and there was no friction to be accounted for.  This taught the basic principles, but did not carry over into real life.

Sociologists were not happy with the early community detection algorithms because they did not reflect how humans naturally cluster, connect and group themselves. We are members of many clusters, and through that multiple/partial membership in many groups, we cause those clusters to overlap -- groups are not distinct with just unique members in each. Group boundaries are porous and fuzzy.

Today there are dozens of community detection algorithms, and many allow for overlap and multiple cluster membership. Community detection is still a hard problem. Smart network scientists don't always agree on what is in each community, as we shall see.

Figure 1 is a diagram of a simple network of 16 people modeled in InFlow software. Symmetric (non-directional) connections are shown by the green links in the diagram. This first network layout is just a circle with nodes in numerical order clockwise.

Figure 1

We first apply the simple community algorithm which puts everyone in a single group only -- often resulting in some funny groupings.  The algorithm finds us 4 unique groups/clusters, and at first glance produces a nice picture.

Figure 2

Upon further inspection we see nodes 1 and 4 have more connections outside of their assigned cluster than inside -- why were they forced into that group?  They look like good candidates for membership in multiple groups.
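The inspection described above, comparing each node's ties inside its assigned cluster with its ties outside, is easy to automate. Here is a minimal sketch on a hypothetical six-node network (not the 16-person network in the figures); the function name and the cluster labels are made up for illustration.

```python
def misfit_nodes(edges, cluster_of):
    """Return nodes with more ties outside their assigned cluster than inside.

    edges is a list of undirected (a, b) pairs; cluster_of maps each node
    to the single cluster it was forced into.
    """
    inside = {n: 0 for n in cluster_of}
    outside = {n: 0 for n in cluster_of}
    for a, b in edges:
        same = cluster_of[a] == cluster_of[b]
        for n in (a, b):
            if same:
                inside[n] += 1
            else:
                outside[n] += 1
    return sorted(n for n in cluster_of if outside[n] > inside[n])


# Hypothetical network: two trios, with node 3 tied to everyone in the other trio.
edges = [(1, 2), (2, 3), (1, 3),
         (4, 5), (5, 6), (4, 6),
         (3, 4), (3, 5), (3, 6)]
cluster_of = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}

print(misfit_nodes(edges, cluster_of))
```

Nodes flagged this way are the natural candidates for membership in more than one group once overlap is allowed.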

Next we allow for cluster/group overlap -- multiple memberships -- and we are surprised. There is more than one answer! In complex systems, such as human groups, communities and organizations, there is usually no one right answer, or one best way of doing things -- there are often several good answers. It might be impossible to choose the best answer ahead of time! The next set of diagrams (Figures 3-5) all show reasonable clusterings found in the data above. They all show 4 clusters, each cluster enclosed in a gold frame. If a node shows up inside more than one frame, it is a member of more than one cluster.

Figure 3

Figure 4

Figure 5

Next we run another algorithm and find not 4 clusters, but 3 emergent communities. 

Figure 6

Yet another algorithm gives us just two overlapping emergent groups.

Figure 7

Which of those above do you like?  Which do you think best represents the natural groupings in this toy network?

My favorite patterns are next. There is no rule that says all nodes have to be assigned to at least one cluster! Some nodes play the role of connector: they sit on many of the shortest paths between other nodes (they have high betweenness). Maybe they are a liaison between many groups without belonging to any one of them? One of the settings on the cluster analysis algorithm in InFlow assigned all nodes to a group except for node 4 -- s/he is a connector of groups, but not a member of groups.

Figure 8

Adjusting the cluster algorithm a little more, we now get two nodes -- 4 and 5 -- that are not members of any cluster. They are the connectors in this emergent network. My favorite rendering of this emergent network is in Figure 9 below.
Figure 9

One of the properties of human relationships is that they are messy, inexact, and complex.  We should not expect to find one perfect way to group or cluster a network of human relationships.  If we do find such a perfect solution, maybe we have over-simplified the problem, like in Figure 1?

One thing we see in the various clusters above is that nodes 1, 4, and 5 are often the linch-pins that hold two or more clusters together (the clusters overlap around these nodes).  If we run various network centrality metrics on this network, we consistently find nodes 1, 4, and 5 at the top of the list, no matter which metric we choose -- 4 being at the very top, most of the time.

Finding logical and plausible clusters in complex systems is not a simple task -- there is no one simple answer. This is not like accounting, where everything should add up correctly every time, and you do get one right answer. Finding clusters in networks is often about sense-making: what are the logical patterns we see, and what might they tell us? In our human relationships, we always want "neat and clean", but we always get "messy and fuzzy." The right software will help you through the messy, and help you make sense of it -- it will not provide simple answers.

What patterns do you see?

Sep 11, 2013

Tracking Two Known Terrorists... Rather Than Everybody

Social Network Analysis [SNA] is a mathematical method for mapping and measuring human networks.  SNA helps us 'connect the dots' of complex human behavior.

Early in 2000, the CIA was informed of two terrorist suspects linked to al-Qaeda. Nawaf Alhazmi and Khalid Almihdhar were photographed attending a meeting of known terrorists in Malaysia. After the meeting they returned to Los Angeles, where they had already set up residence in late 1999.

What do you do with these suspects? Arrest or deport them immediately? No, we need to use them to discover more of the al-Qaeda network. Once suspects have been discovered, we can use their daily activities to uncloak their network. Just like they used our technology against us, we can use their planning process against them. Watch them, and listen to their conversations to see...
  1. who they call / email (i.e., meta-data)
  2. who visits with them locally and in other cities
  3. where their money comes from
The structure of their extended network begins to emerge as data is discovered via surveillance. A suspect being monitored may have many contacts -- both accidental and intentional. We must always be wary of 'guilt by association'. Accidental contacts, like the mail delivery person, the grocery store clerk, and neighbor may not be viewed with investigative interest. Intentional contacts are like the late afternoon visitor, whose car license plate is traced back to a rental company at the airport, where we discover he arrived from Toronto (got to notify the Canadians) and his name matches a cell phone number (with a Buffalo, NY area code) that our suspect calls regularly. This intentional contact is added to our map and we start tracking his interactions -- where do they lead? As data comes in, a picture of the terrorist organization slowly comes into focus.

How do investigators know whether they are on to something big? Often they don't. Yet in this case there was another strong clue that Alhazmi and Almihdhar were up to no good -- the attack on the USS Cole in October of 2000. One of the chief suspects in the Cole bombing [Khallad] was also present [along with Alhazmi and Almihdhar] at the terrorist meeting in Malaysia in January 2000.

Figure 2 shows the two suspects and their immediate ties. All direct ties of these two hijackers are colored green, and link thickness indicates the strength of connection.

Once we have their direct links, the next step is to find their indirect ties -- the 'connections of their connections'. Discovering the nodes and links within two steps of the suspects usually starts to reveal much about their network. Key individuals in the local network begin to stand out. In viewing the network map in Figure 2, most of us will focus on Mohammed Atta because we now know his history. The investigator uncloaking this network would not be aware of Atta's eventual importance. At this point he is just another node to be investigated.
Figure 3 shows the direct connections of the original suspects as green links, and their indirect connections as grey links. We now have enough data for two key conclusions:

  1. All 19 hijackers were within 2 steps of the two original suspects uncovered in 2000!
  2. Social network metrics reveal Mohammed Atta emerging as the local leader
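
The "direct ties, then connections of their connections" step is a breadth-first search that stops after two steps. Below is a minimal sketch; the contact map and its labels are hypothetical placeholders, not the hijacker data.

```python
from collections import deque

def within_steps(adj, seeds, max_steps):
    """Return {node: steps} for every node within max_steps of any seed."""
    dist = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == max_steps:
            continue  # do not expand beyond the step limit
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

# Hypothetical toy contact map; labels are placeholders, not real data.
adj = {
    "suspect_a": {"suspect_b", "c1", "c2"},
    "suspect_b": {"suspect_a", "c2", "c3"},
    "c1": {"suspect_a", "d1"},
    "c2": {"suspect_a", "suspect_b", "d2"},
    "c3": {"suspect_b", "d2"},
    "d1": {"c1", "e1"},
    "d2": {"c2", "c3"},
    "e1": {"d1"},
}

reach = within_steps(adj, ["suspect_a", "suspect_b"], max_steps=2)
direct = sorted(n for n, d in reach.items() if d == 1)    # the "green" ties
indirect = sorted(n for n, d in reach.items() if d == 2)  # the "grey" ties
print(direct, indirect)
```

Checking whether a set of names falls within two steps of the original suspects then reduces to testing membership in `reach`. In the toy map, `e1` sits three steps out and is not included.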

With hindsight, we have now mapped enough of the 9-11 conspiracy to stop it. Again, the investigators are never sure they have uncovered enough information while they are in the process of uncloaking the covert organization! They also have to contend with superfluous data. This data was gathered after the event, so the investigators knew exactly what to look for. Before an event, it is not so easy.

As the network structure emerges, a key dynamic that needs to be closely monitored is the activity within the network. Network activity spikes when a planned event approaches. Is there an increase of flow across known links? Are new links rapidly emerging between known nodes? Are money flows suddenly going in the opposite direction? When activity reaches a certain pattern and threshold, it is time to stop monitoring the network, and time to start removing nodes.
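
One simple way to turn "activity reaches a certain threshold" into something testable is to compare the most recent activity with a baseline. The sketch below is an assumption, not a description of any agency's method: the weekly counts are made up, and the rule (baseline mean plus `k` standard deviations) is just one common choice.

```python
from statistics import mean, pstdev

def activity_spike(weekly_counts, recent_weeks=1, k=3.0):
    """Flag a spike when recent activity exceeds baseline mean + k * std dev.

    weekly_counts: contacts observed across known links, oldest first.
    Returns (is_spike, threshold).
    """
    baseline = weekly_counts[:-recent_weeks]
    recent = weekly_counts[-recent_weeks:]
    threshold = mean(baseline) + k * pstdev(baseline)
    return mean(recent) > threshold, threshold

# Hypothetical weekly message counts across the known links.
counts = [4, 5, 3, 6, 4, 5, 4, 19]
spiking, threshold = activity_spike(counts)
print(spiking, round(threshold, 2))
```

A real monitor would also watch for new links between known nodes and reversed money flows, which a single count series cannot capture.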

IMHO this bottom-up approach of uncloaking a network around known suspects is more effective than a top-down search for terrorist needles in the public haystack -- and it is less invasive of the general population, resulting in far fewer "false positives".

In early 2002 I wrote an academic article describing how I mapped the network of the 19 hijackers using  public (open source) data.  

Sep 5, 2013

Vacuuming the Internet

As part of the NSA surveillance revelations, there have been accusations that many popular consumer internet companies such as Google, Apple and Facebook have allowed the NSA to "directly attach to their servers" and vacuum up all of the data going in and out of these servers.  The management of these companies has vehemently denied giving the NSA unfettered access to their customers' data. This CNET article has a good summary of what has happened so far on this particular aspect of the NSA surveillance.

Network thinkers know that to effectively monitor a network, you don't seek out the edge nodes, you find the central hubs and monitor them — through them you will have access to most of what is flowing through the net. In a hub-and-spoke system the spokes are all dependent on their local hub to route information/data/bits -- in and out.  In the complex networks like the Internet, hubs are connected to other hubs (but not all).  The pattern of connections amongst the hubs determines which hubs are more central to the overall flow of things throughout the network.

Security expert Bruce Schneier writes...
"The primary way the NSA eavesdrops on internet communications is in the network. That's where their capabilities best scale. They have invested in enormous programs to automatically collect and analyze network traffic. Anything that requires them to attack individual endpoint computers is significantly more costly and risky for them, and they will do those things carefully and sparingly."

Below is a network map of the Autonomous Systems [AS] that form the backbone of the internet.  It is easy to find the central hubs in this network.  Load the 20,000+ nodes [each AS is represented by a node] and 48,000+ links [a data flow between two ASes is represented by a link] into a social network analysis software program and have it run the Betweenness or Connector metric.  These two network metrics reveal how central any node is in keeping everything interconnected.  The hubs will be revealed by the network metrics.  In the diagram below the hubs are sized by their Connector score -- the higher the score, the larger the node, and the more network paths flow through this node.  The colors are randomly assigned and have no meaning.
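
For readers who want to see what a betweenness calculation does, here is a minimal sketch of Brandes' algorithm for an unweighted, undirected graph in plain Python. The seven-node hub-and-spoke graph is a hypothetical stand-in for the AS data, and only Betweenness is sketched; the Connector metric is not reproduced here.

```python
from collections import deque

def betweenness(adj):
    """Shortest-path betweenness (Brandes) for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}    # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical toy backbone: two connected hubs, each with its own spokes.
edges = [("H1", "H2"), ("H1", "a"), ("H1", "b"), ("H1", "c"),
         ("H2", "d"), ("H2", "e")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

scores = betweenness(adj)
hubs = sorted(scores, key=scores.get, reverse=True)[:2]
print(hubs, scores["H1"], scores["H2"])
```

The spokes score zero because no shortest path runs through them, while the two hubs carry every path between their spokes and the rest of the graph.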

Most of the large Internet hubs are located in North America. 

The largest hubs [AS] are mostly telecom companies, internet infrastructure providers, and organizations of the US government.  You can get a pretty good picture of what is flowing through the whole internet by monitoring just a dozen or two of the largest hubs.  An example of how these main hubs can be tapped, and utilized, is told in the story of Room 641A of SBC Communications in San Francisco.

Whether the NSA has a direct tap into your favorite social network, or search engine, we may never know.  Maybe they don't need the direct connect to capture all of the information flowing on the Net?  

How will the rest of the world view their dependence on the internet, with the U.S.A. owning and monitoring the key hubs (key intersections of information flow) in the Net?

Jul 27, 2013

Dancing the Bunny Hop with the NSA

One hop, two hops, three hops... forward!

According to an NSA executive, 2-3 hops/steps is the social distance that the NSA uses to look outward, into social space, from a known terrorist suspect.  When they find a suspect's phone number or email, they investigate the network neighborhood around it.  Why?  They are trying to determine if a suspect is part of a group -- does s/he have co-conspirators?

A simple 3-hop (or 3 step) chain is shown below in Figure 1 -- each green link is a hop or a step and shows contact between the two persons, or nodes, in the network.

Figure 1

Analyzing a specific person's immediate network is also known as contact chaining -- we are connected to many chains via family, friends, colleagues and contacts we communicate with.  Many of these chains intersect and overlap creating a network with multiple paths to most nodes in the network.

Before we do contact-chaining, or any other network analysis, we must first determine: what is a "contact"? Many studies of on-line behavior set the bar too low for what a "contact" is.  Sites like Facebook and LinkedIn often contain far too many spurious ties -- people whose link you have "approved", but whom you do not really know.  Facebook is famous for people having hundreds, if not thousands of "friends."  Facebook's own study of their user behavior shows that the average active friend circle (people you actually interact with, and maintain a relationship with) is between 40 and 60 people -- a far cry from hundreds or thousands.

Consider a terror plot, as a project.  People have to communicate and work together to accomplish their project goals.  They need to organize the process, share information, meet deadlines, and adapt to changes and setbacks.  This requires regular communication and coordination. If this project activity is performed at a distance, it is trackable and mappable by electronic surveillance.  In the email and phone meta-data, the NSA is looking for a project team connected to a known suspect.

After the NSA executive let it slip that they were interested in all 2-3 step contacts around a suspect, many folks tried to estimate how many people would be affected by the multitudes of 3 step chains we are all a part of.  One estimate was that 2.5 million people would be affected by each suspect's 3 step network chains. While the estimator picked a good starting value for a typical American -- each person has about 40 unique and active friends -- he multiplied once too often, ending up in the millions instead of the tens of thousands.

Soon, an even larger estimate appeared in the blogosphere -- 27 million people would be caught in the dragnet around each suspected terrorist!  This estimate started with a 300 person social circle around each individual -- which is twice the Dunbar number of 150!  This estimate also did not take into account the overlapping friendship networks we all have (many of our friends are also friends with each other).
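
The arithmetic behind both estimates is easy to check. The sketch below uses the naive rule "contacts per person raised to the number of hops", which counts only the outermost ring and ignores overlap. Reading the 2.5 million figure as 40 multiplied four times is an assumption based on the "multiplied once too often" remark above.

```python
def naive_reach(contacts_per_person, hops):
    """Naive estimate: everyone has the same number of contacts, no overlap."""
    return contacts_per_person ** hops

three_hops_40 = naive_reach(40, 3)    # 64,000: tens of thousands
four_hops_40 = naive_reach(40, 4)     # 2,560,000: roughly the "2.5 million"
three_hops_300 = naive_reach(300, 3)  # 27,000,000: the "27 million"

print(three_hops_40, four_hops_40, three_hops_300)
```

Both large figures come from the same formula. They differ only in the assumed circle size and, in the first case, one extra multiplication.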

Both of these estimates erred on who was in the center of the network.  The average American may have around 40 on-line contacts, and the social media guru may have 300, but the domestic or international terrorist is trying to hide, not be discovered, on the Net.  Terrorists, and others behaving covertly, tend to have very small networks of people they trust -- no casual acquaintances to balloon network size!  From my experience of analyzing and mapping human networks for over 20 years, and from mapping the 9-11 hijackers and other covert and criminal networks, these two estimates seemed alarmingly high.

So, how many people could be "persons of interest" in a terrorist search?

Based on my post 9-11 experience, I looked for some social network data in my archives that might better illustrate what the 3 hop network neighborhood around a suspect might really look like.  I found data from a group that mixed both task and trust ties -- similar to what we may find in a covert network -- a limited trust radius (trust only a few), yet with many tasks to accomplish.  The members of this network were not surveyed and asked to list who they viewed as colleagues and friends -- all data was gathered from their on-line activity -- it did not matter who they knew, it mattered who they actually contacted on-line.

Figure 2 shows the immediate network of a typical member of the group. The links show actual contact between two people/nodes.  The suspect, highlighted in the middle of the graph, has 13 observed contacts, many of whom also contact each other.  Terrorists, criminals, and others involved in covert activities keep their network small, for fear of discovery -- they keep only ties they can deeply trust.  Of course, living in the world, they have incidental ties, with local shopkeepers, neighbors, and delivery people.  But these incidental ties are usually face-to-face and do not show up via electronic surveillance.  A terrorist, whether domestic or international, will not readily share his/her phone number, email or other id with merchants and other locals.

Figure 2

All nodes are linked to the Suspect, who is in the center and highlighted in pink. Contacts, who also had observable interactions with each other, are also connected with a grey link.  This "ego network" -- showing 1 hop/step from the suspect -- is typical of many we see, with a clustering coefficient from 0.4 to 0.6 (your friends/colleagues are often friends/colleagues with each other).  
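
The clustering coefficient of an ego network is the share of possible ties among the ego's contacts that are actually observed. A minimal sketch, using a hypothetical ego with four contacts rather than the 13-contact network in Figure 2:

```python
from itertools import combinations

def local_clustering(adj, ego):
    """Fraction of pairs of the ego's contacts that are tied to each other."""
    nbrs = adj[ego]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)

# Hypothetical ego network: 4 contacts, 3 of the 6 possible ties among them.
adj = {
    "ego": {"a", "b", "c", "d"},
    "a": {"ego", "b"},
    "b": {"ego", "a", "c"},
    "c": {"ego", "b", "d"},
    "d": {"ego", "c"},
}

coefficient = local_clustering(adj, "ego")
print(coefficient)  # 0.5
```

A value of 0.5 means half of the possible contact-to-contact ties exist, which falls inside the 0.4 to 0.6 range mentioned above.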

Group structures are hard to spot in 1 step networks, which is why we go out 2 and 3 steps in order to find any emergent groups amongst this collection of nodes.  At 2 hops/steps from the Suspect, we start to see some clustering of nodes.  Below are the interactions at 1 and 2 steps from the Suspect.

The magenta colored nodes are the same step 1 nodes seen in Figure 2.  The green nodes are two steps from the suspect.  We notice that some neighbors have more connections than others.  Again, the links only show the observed/recorded interactions.  The network begins to show some clustering.  What is interesting about the network at this point is where it starts to fold back into itself -- which 2 step contacts interact with each other and with various 1 step contacts?  The green nodes with more than one connection, especially to various magenta-colored nodes are probably more important than those who are just single spokes around a magenta hub.

Next we bring in the third hop, shown in Figure 4 by the blue nodes.
Figure 4

The network has grown much larger than the 1 hop network in Figure 2.  We have gone from 14 nodes to 185 nodes in three steps.  The suspect had 13 observed contacts in Figure 2.  Many would naively estimate the suspect's 3 step network to be 13 x 13 x 13 = 2,197.  But many friends/colleagues are also friends/colleagues with each other -- we have overlapping networks with those we are connected to.  Each node in the above networks represents one unique person.
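
The gap between the naive multiplication and the count of unique people can be reproduced on any graph with overlapping circles. The sketch below uses a hypothetical ring lattice, a deliberately overlap-heavy toy in which each person is tied to the three neighbors on either side. It is not the data behind Figure 4.

```python
from collections import deque

def ring_lattice(n, reach):
    """Hypothetical toy graph: n people on a ring, tied to `reach` neighbors per side."""
    return {i: {(i + d) % n for d in range(-reach, reach + 1) if d != 0}
            for i in range(n)}

def hop_counts(adj, source, max_hops):
    """counts[h] = number of unique nodes exactly h hops from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue
        for nbr in adj[node]:
            if nbr not in dist:  # each person is counted once, at first contact
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    counts = [0] * (max_hops + 1)
    for d in dist.values():
        counts[d] += 1
    return counts

adj = ring_lattice(200, 3)
counts = hop_counts(adj, 0, 3)
naive = len(adj[0]) ** 3           # 6 x 6 x 6 = 216
unique_contacts = sum(counts[1:])  # unique people within 3 hops
print(counts, naive, unique_contacts)
```

With six contacts each, the naive estimate is 216, yet only 18 unique people sit within three hops because the friend circles overlap so heavily. Real networks overlap less than this toy, but the direction of the error is the same.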

Now that we have expanded the network out 3 steps, what do we do?  We shrink the network!  The NSA wants to find groups that the suspect may belong to, and find other key nodes in his/her network -- that is why they gather the 2-3 hop contact data.  Rather than investigate all 184 contacts of the suspect, we want to now reduce the network to its core, around the suspect.  The core network, of 47 nodes, is shown in Figure 5 below.  
Figure 5

We see that most of the 1 step nodes remain, with a good portion of the two-step nodes (green), but a very small percentage of the three-step nodes (blue).  The key nodes to focus on have been highlighted in yellow -- they are important to the structure and the flows in the network.  
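
The post does not say how the 47-node core was extracted. One common way to shrink a network to its densely connected middle is k-core pruning: repeatedly drop every node with fewer than k remaining ties. The sketch below shows that technique on a hypothetical nine-edge graph, as a stand-in rather than the method actually used.

```python
def k_core(adj, k):
    """Return the subgraph in which every remaining node has at least k ties."""
    core = {v: set(nbrs) for v, nbrs in adj.items()}
    changed = True
    while changed:
        changed = False
        weak = [v for v, nbrs in core.items() if len(nbrs) < k]
        for v in weak:
            for nbr in core.pop(v):
                if nbr in core:
                    core[nbr].discard(v)  # removing v may weaken its neighbors
            changed = True
    return core

# Hypothetical toy: a tight group of four, plus a dangling chain and a leaf.
edges = [("s", "a"), ("s", "b"), ("s", "c"), ("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("a", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

core = k_core(adj, 2)
print(sorted(core))  # ['a', 'b', 'c', 's']
```

The single-tie spokes fall away first, then the nodes that were held in only by those spokes. This mirrors how most three-step nodes drop out while the one-step nodes remain.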

Next, let's extract the clusters, and their overlaps, in the core.  It appears that there are 4 clusters with several of them overlapping via 4 nodes.  We erase the links and draw a Venn diagram in Figure 6 showing the four clusters and four nodes which act as linchpins holding the various clusters together.  
Figure 6

The four connecting nodes (linchpins) are probably the ones that will be investigated first, followed by the other nodes that were highlighted in yellow (see Figure 5).
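
Once clusters have been identified, finding the linchpins is a matter of counting memberships: any node that appears in more than one cluster is a connector. A minimal sketch, with hypothetical cluster names and node numbers (the cluster detection itself is assumed to have been done already):

```python
from collections import Counter

def linchpins(clusters):
    """Return the nodes that belong to more than one cluster."""
    membership = Counter(node for members in clusters.values()
                         for node in set(members))
    return {node for node, count in membership.items() if count > 1}

# Hypothetical clusters; node numbers are placeholders.
clusters = {
    "A": {1, 2, 3, 4},
    "B": {4, 5, 6},
    "C": {1, 6, 7, 8},
    "D": {8, 9, 10},
}

connectors = linchpins(clusters)
print(sorted(connectors))  # [1, 4, 6, 8]
```

These overlap nodes are the ones a Venn diagram like Figure 6 places in the intersections.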

So, we have gone from millions of nodes, to thousands of nodes, now to dozens of nodes affected by each terrorist suspect tracked.  According to an NSA slide released by Edward Snowden, the NSA currently has over 117,000 suspects.  With the previous 3 hop estimates, 117,000 suspects would pull most of the world population into the NSA dragnet (accounting for overlaps).  With dozens of "persons of interest" we end up with about 1.5 million people within the sphere of analysis -- still, a lot of "false positives" (false alarms) to sort through.  1.5 million is a lot less than the 27 million estimate which was based on false assumptions about both 1) covert social circles, and 2) how human networks overlap.

Update July 2014: based on this Washington Post report on actual NSA data provided by Edward Snowden, even my low estimate of 1.5 million was a little high -- based on that slice of NSA data, the Washington Post estimates that around 1 million people in total are caught up in the network analysis of 90,000 suspects/targets.  I had used an earlier estimate of over 117,000 suspects.

The method I described is a logical approach for an experienced social network analyst.  It is probably not the method(s) used by the NSA.  Their methods may be similar, because they are looking for groups/clusters and trying to identify which nodes need closer scrutiny.

I applied a similar approach to mapping two initial suspects in the 9-11 attacks -- after the event.