3/15/2013

Big (Network) Data

The excitement about "Big Data" is usually around the access to lots of data -- thousands, or millions, of records.  Below is a hairball diagram of lots of social data (nodes and links revealing a network of connections) from the WWW. Social data is relational/interdependent, not discrete/independent like most statistical data about individual people/objects.

This picture is not that interesting!  What we want is interesting and useful, not BIG.
When investigating social/relational data, it is usually not the forest that is useful, but the clusters of various trees, and their relationships, inside the ecosystem. We not only want to "see the forest for the trees", but also see the patterns/clusters of trees in the forest!

Big data often contains small clusters -- especially with social data.  Human networks usually contain dozens or hundreds of nodes -- we usually do not have time/energy for thousands or millions of friends/colleagues.  The goal is to find the significant clusters amongst all of the data.  When looking at big social data it is important to set the bar correctly for what is a link of significance/importance. The first step in mining big social data is to eliminate the noise -- find the natural human groups in your sea of data.  Where are the islands of interesting patterns?
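Here is a minimal sketch of that first step, assuming the data arrives as weighted links (person, person, number of interactions).  The names, the weights and the `min_weight` bar are all made up for illustration; it simply drops the weak links and then collects the connected groups that remain.

```python
from collections import defaultdict

def significant_components(weighted_links, min_weight):
    """Drop links below min_weight, then return the connected groups
    (as sets of nodes) that remain, largest first."""
    neighbors = defaultdict(set)
    for a, b, weight in weighted_links:
        if weight >= min_weight:
            neighbors[a].add(b)
            neighbors[b].add(a)

    seen, components = set(), []
    for start in list(neighbors):
        if start in seen:
            continue
        # Depth-first walk collects everything reachable from `start`.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)

# Hypothetical data: (person, person, number of interactions).
links = [
    ("ann", "bob", 5), ("bob", "cat", 4), ("ann", "cat", 3),
    ("dan", "eve", 6),
    ("cat", "dan", 1),   # weak tie: treated as noise when the bar is 2
    ("eve", "fay", 1),
]
clusters = significant_components(links, min_weight=2)
everything = significant_components(links, min_weight=1)
```

With the bar at 2 this toy data splits into two islands (ann/bob/cat and dan/eve); with the bar at 1 everything merges back into a single hairball.  Where to set the bar is a judgment call that depends on what a link means in your data.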

Network components reveal much within interlinked data. As we zoom in, we can begin to answer some useful questions...
  • Who is here?  
  • How are they clustered? 
  • How are they connected? 
  • Who are the key connectors?  
  • Who is in the thick of things?
We put an MRI to our big data above.  We see various subsets of the above ecosystem.  These network maps show various slices/parts of the whole, and how they are connected. Notice the network components displayed below are all in the size range of dozens or hundreds of nodes.  We now see patterns worth investigating.
and many more slices...

At 1000 meters, big data is not that interesting.  At 100 meters, we start to see interesting patterns/components. At 10 meters we can play with the patterns and really start to learn what is happening inside the social ecosystem.  In Big Data, the important numbers are not the millions, but the groups of dozens, and hundreds, that reveal meaning and give us insight.

What is happening in the "social forest" inside your ecosystem?


2/09/2013

Arrows on Twitter

Twitter is a social network that network scientists refer to as an asymmetric network -- the links are directional, they are drawn with arrows.  Links between people on Twitter show direction of intent. The arrows are drawn from source to target.


Looking at a social graph from Twitter we can tell a lot by following the arrows...
  • who is aware of whom/what?
  • whom/what is getting attention?
  • who is involved in conversations on specific topics?
  • who is central, and who is peripheral to the discussions?
This past week I was invited to a Twitter Chat (a.k.a. Tweetchat) on the topic of Serendipity.  Two separate chat groups (#innochat and #ideachat) came together on a topic of overlapping interest.  Twitter chats last for 1 hour and use a pre-determined Twitter hash-tag to track all of the tweets in the on-line conversation. 

When we draw a network map (a.k.a. social graph) we see people as nodes and their connections/conversations as links.  In this case, the links have arrows showing who is referring to whom.

Let's first look at outgoing links -- who is linking out, to whom.  Links on Twitter can be of a broadcast nature (X announces something to all of her followers).  Links can also be directed at a specific target -- Y aims a message at a specific person, or Z re-tweets (RT) something X has posted.  Although an RT is a broadcast, it is also a message back to the originator of the tweet -- I am aware of what you tweeted, and I choose to pass it on to others (not necessarily an endorsement).  

The network map below shows participants from the Serendipity Tweetchat.  Two nodes are linked if the source node, RT'ed, MT'ed or @-messaged the target node more than once during the chat session.  The node colors show: blue - general chat participant, purple - chat facilitators, green - invited guest.  The Twitter ID of each participant is shown beneath their node.  The node size in this first map is determined by a network metric called Awareness -- it looks at all local, outgoing, direct and indirect, links surrounding a node.  The higher the awareness metric, the larger the node.  Larger nodes should be more aware of what is happening in the surrounding network, than the smaller nodes.


Next we look at the same map -- same nodes, same links -- but different node sizes.  This time the nodes are sized by incoming links.  The network metric used here is called Attention, it looks at all local, incoming, direct and indirect, links surrounding a node. It is good to have many incoming links, but it is even better to have incoming links from others who have many incoming links!  Those with a nice pattern of incoming links are what Malcolm Gladwell referred to as mavens in The Tipping Point.

Notice that some of the node sizes have changed drastically -- some with low Awareness, have high Attention and vice versa.  The metrics help reveal the roles people play on Twitter -- some engage many others, while some prefer/wait to be engaged (targeted).  The node of the lead chat organizer, @blogbrevity, is large for both network metrics -- a proper pattern for an effective facilitator!

Our third network graph shows node size based on both incoming and outgoing links.  The network metric, Integration, shows how "in the thick of things" a node is.  A Twitter node with a high Integration score is probably posting interesting tweets, noticing other peoples' tweets, getting retweeted  and participating in many conversations.

Notice how both facilitators (purple nodes) have the largest nodes -- they were very active in moving this very rapid chat conversation forward.  They were interacting with the invited guest, with newcomers and regulars, all the while asking questions, and RT'ing key tweets of the ongoing conversation.  Those with a high integration metric can play the role of connectors, as described in Gladwell's Tipping Point.
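For readers who want to experiment with their own chat data, here is a rough sketch of the three node sizes.  The exact formulas behind Awareness, Attention and Integration are not given in this post, so the code uses a stand-in: it counts how many other people a node can reach within two steps along outgoing links (a proxy for awareness) or along incoming links (a proxy for attention), and adds the two as a proxy for integration.  The handles and the sample log are hypothetical.

```python
from collections import Counter, defaultdict

def build_links(mentions, min_count=2):
    """Keep source -> target pairs that occur at least min_count times
    (the "more than once" rule used for the maps)."""
    out_links, in_links = defaultdict(set), defaultdict(set)
    for (source, target), count in Counter(mentions).items():
        if count >= min_count:
            out_links[source].add(target)
            in_links[target].add(source)
    return out_links, in_links

def reach(links, node, max_steps=2):
    """Count the other nodes reachable from node within max_steps hops."""
    seen, frontier = {node}, {node}
    for _ in range(max_steps):
        frontier = {nxt for cur in frontier for nxt in links.get(cur, ())} - seen
        seen |= frontier
    return len(seen) - 1

# Hypothetical chat log: one (source, target) pair per RT, MT or @-message.
mentions = (
    [("host", "guest")] * 2 + [("host", "p1")] * 2 + [("p1", "guest")] * 3
    + [("p2", "host")] * 2 + [("p3", "host")] * 2 + [("guest", "host")] * 2
    + [("p3", "p2")]          # happened only once, so no link is drawn
)
out_links, in_links = build_links(mentions)
people = sorted({name for pair in mentions for name in pair})

# Stand-in metrics (assumptions, not the formulas used for the maps above).
awareness = {p: reach(out_links, p) for p in people}     # follows outgoing arrows
attention = {p: reach(in_links, p) for p in people}      # follows incoming arrows
integration = {p: awareness[p] + attention[p] for p in people}
```

In this toy log, p3 reaches three people by following outgoing arrows but nobody points back at p3, while host is reached by four people.  That is the same kind of contrast the first two maps show.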

Twitter is not just about person-to-person interactions, it is used to broadcast messages to large groups -- either followers or those tracking a hash-tag.  Many of the tweets in the chat were aimed at no one in particular; they were broadcast to the whole group. This network map is different than the others, because it shows only broadcast messages to the whole group -- it does not show interactions between the participants.  The whole group of chat participants is represented by the large red node in the center -- it is the hub.  The spokes around the hub represent various participants that shared more than one tweet with the whole gathering.  The thickness of each link indicates how many tweets each person/spoke sent to the group/hub.

This chat network formed, emerged, and disbanded all in the space of several hours.  Yet, it revealed the pattern of many long-term networks -- a core-periphery structure, mavens, connectors, and leaders.  Many of the participants in this chat, already knew each other on Twitter, especially through previous Twitter chat events.  This was really an old network reconvening -- with a few new members joining in.

The core members of a group are easy to spot, they are the ones with many arrows, all pointing to each other -- a sub-network where everyone seems to know, and interact with, everyone else.  The network map below shows the core of the network -- all nodes have at least 4 connections to everyone else and they all have incoming and outgoing arrows.  The core was so tightly packed that we removed the arrow heads so that the graph was easier to read.
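One common way to pull out a core like this is a k-core: keep peeling away nodes until everyone who is left has at least k links to the others who are left.  The sketch below assumes that reading of "at least 4 connections" (it is my interpretation, not necessarily how the map above was produced), ignores arrow direction, and uses a small made-up network with k=3 to keep the example short.

```python
def k_core(links, k):
    """Peel away nodes until everyone left has at least k links inside the group."""
    neighbors = {}
    for a, b in links:
        if a != b:                       # ignore self-links
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    while True:
        weak = [n for n, adj in neighbors.items() if len(adj) < k]
        if not weak:
            return set(neighbors)
        for node in weak:
            # Remove the node and erase it from its remaining neighbors.
            for other in neighbors.pop(node):
                neighbors[other].discard(node)

# Hypothetical network: four people who all interact, plus two on the edge.
links = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("d", "e"), ("e", "f"),
]
core = k_core(links, k=3)
```

Here e and f fall away and a, b, c, d remain as the core.  Raising k to 4 on real chat data would give a stricter core in the spirit of the map described above.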


Finally, we look at all of the data from this chat -- aggregate all of the arrows, and combine all of the maps above -- to find out which participants were most involved in this Twitter chat.  The list, sorted high to low, shows the fifteen (15) most connected people over the hour long chat on Serendipity.  

@blogbrevity
@orgnet
@InnovationFixer
@DrewCM
@Renee_Hopkins
@juneholley
@CreativeSage
@innovate
@thehealthmaven
@AndreaMeyer
@kiporama
@deb_lavoy
@OBX_Harvey
@wbendle
@jawbrain

Next time you look at a map of a human network, look for the arrows.  Who are they going to?  Where are they coming from?  Where is a cluster of arrows, all pointing to each other?  Ask the analyst what the links mean.  What do the node colors/sizes mean?  Soon you will be able to make sense of the map and zero in on key clusters of activity, along with key connectors in getting things done.

To connect the dots, follow the arrows!


Acknowledgements: One of the chat facilitators -- Andrew Marshall (@DrewCM) -- provided us with the history of the chat from Tweetchat.com.  My friend, and colleague, Zee Spenser converted the PDF to CSV network mapping data.  Zee shares the data and his code on github.




12/24/2012

Merry Connections

O, Network Tree 
O, Network Tree 
How green your geodesics... 

This year give the Gift of Connection... 
Introduce two people you know, that would benefit from knowing one another!

Network Tree artwork/design: Copyright © 2012 Silvija Krebs

10/16/2012

2012 Political Book Network

I have been mapping political book networks since before the 2004 U.S. presidential election. These network maps are like a social graph of books.   The data is gathered from Amazon.com -- their list of top political books.  Two books are linked if they were often bought together, or by the same buyer.  These are also-bought pairs -- people who bought this book also bought that book.

During the 2008 election the political book map reflected the deep divide in the country between conservative (RED) voters and liberal (BLUE) voters.  There were no connections, nor any intermediaries between red and blue books -- each cluster was completely closed off to the other. There was a separate cluster of people reading books on the then new candidate -- Obama, but they were not interested in reading/purchasing other political books (upper left corner of network map below).

2008 Political Book Network Map


I expected a similar pattern for 2012 -- a big chasm between right and left.  I thought the map would show each group boning up on their side's talking/debating points and ignoring books of non-conforming opinions.  I was surprised: the two clusters in October 2012 were connected by several books!  The hub in the center of the network, with spokes to many blue and red books, is The Price of Politics by Bob Woodward.  Woodward is viewed as a center-right journalist, and this book is about politics in general, so it makes sense that both sides would be reading his usually excellent prose.  No Easy Day, by one of the Navy SEALs that took out bin Laden, reads more like a novel than a history book, attracting readers from all political persuasions.

The third bridging book was a surprise! The Little Blue Book is intended for a progressive audience -- it is a handbook for how to argue effectively with the right wing.  So, you would expect it to be firmly in the center of the dense blue cluster, right?  Wrong!  It has both blue and red readers!  I checked all editions of the book -- hardback, paperback and Kindle.  For the Kindle version, The Little Blue Book was connected (also bought) with other blue books, as expected.  It was with the paperback edition where I was surprised -- 4 of the first 10 also-bought books were red books!  Amazon shows their also-boughts by decreasing count/volume, therefore there were many instances of readers of certain red books buying The Little Blue Book.  Why is this so?  Maybe the right wing is trying to understand the left wing and reading their blue handbook -- similar to how they read the far left book Rules for Radicals during the 2008 election campaign.

2012 Political Book Network Map


This year we also have books about the candidates -- their biographies and positions on major issues.  Obama has the same set of books as last election, Romney has his No Apology series, and Romney's running mate is written up in the Young Guns book.  Potential voters appear to be reading books about both of the candidates -- Amazon readers are buying books about Romney and Obama together!  See books in upper left frame (2012 Candidate biographies) above.

Another pattern is different in 2012 than in 2008. Now, people reading about the candidates, are also reading other political books.  The pattern is positive for Romney -- people reading about him are reading other red books -- not so, for Obama.  People reading his positive biographies and position books are also reading polemics attacking Obama.  The most influential anti-Obama book in the above network is Obama's America -- it is read by potential voters who are reading about both Obama and Romney.  See the link patterns in the upper left corner of the above diagram.

Even though the two book networks are connected, we still have a polarized voter base -- those are two strongly defined communities.  Running one of the network metrics from InFlow software, reveals two tightly defined clusters.  The E/I Ratio (External/Internal) is near -1.0 for both the blue and red groups indicating two exclusionary communities.  Polarization persists in America.
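For those who want to try this on their own data, the E/I measure is commonly computed as (E - I) / (E + I), where E counts a group's links to outsiders and I counts links inside the group.  A value of -1.0 means all links stay inside the group and +1.0 means all links go outside.  The sketch below assumes that standard formula (InFlow's exact calculation may differ) and uses made-up book labels, not the real 2012 data.

```python
def ei_index(links, group_of, group):
    """(external - internal) / (external + internal) for one group's links."""
    internal = external = 0
    for a, b in links:
        a_in, b_in = group_of[a] == group, group_of[b] == group
        if a_in and b_in:
            internal += 1          # both ends inside the group
        elif a_in or b_in:
            external += 1          # link crosses the group boundary
    total = internal + external
    return (external - internal) / total if total else 0.0

# Hypothetical also-bought links: four red books, four blue books, one bridge.
links = [
    ("r1", "r2"), ("r1", "r3"), ("r2", "r3"), ("r3", "r4"),
    ("b1", "b2"), ("b1", "b3"), ("b2", "b3"), ("b2", "b4"), ("b3", "b4"),
    ("r4", "b1"),
]
group_of = {name: ("red" if name.startswith("r") else "blue")
            for pair in links for name in pair}

red_ei = ei_index(links, group_of, "red")     # (1 - 4) / (1 + 4) = -0.6
blue_ei = ei_index(links, group_of, "blue")   # (1 - 5) / (1 + 5)
```

Even with a bridge between the clusters, both scores stay well below zero.  A real map with dozens of books per side and only a handful of bridging books would sit much closer to -1.0.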

Can we use these network maps to predict the election?  Probably not.  The main insight I get from these maps is that the 2008 election provided a more clear cut choice for voters.  Although supporters of each candidate today would also say the choice is clear this time around (they always say that), the data does not support that. The bad news for Obama is that in 2008 people were mostly reading positive books about him and in 2012 they are reading both positive and negative books about him. Are these that small percentage of undecided voters who will likely decide this close election?  I bet each campaign would love to know who these Amazon readers are... and these readers may want to know who each other is!

10/12/2012

Community Networks


Most community and organizational networks we see in our consulting practice have a shape similar to the network above.  These are called core-periphery networks.

The core is made up of people with many connections to each other -- it is the dense center of the network, represented here by the pink/magenta nodes.  The periphery, shown by the grey nodes, are people that provide data, information and knowledge to the core, but are usually outside of the dense work ties in the core.  They are resources to the core, allowing access to data, information and knowledge that does not reside in the core, but is necessary for the core members to accomplish their goals.  The grey links show who has collaborated with whom on a project in the last year.

The three blue nodes in the above network are the formal leaders of the network -- they are the social entrepreneurs who formed a non-profit organization to support green businesses in NE Ohio.  The disconnected groups, in the upper right of the map, are also involved in sustainability projects, but they have not worked with any of the core members yet.  These are satellites of the current core-periphery network.  There may be network weaving opportunities to connect the satellites to parts of the network where they can best provide information, expertise, and experience.  For more on how networks form and evolve see our white paper "Building Smart Communities through Network Weaving"

On-line communities also have core-periphery structures and many satellite clusters.  They also have a unique set of passive members -- people who share an interest with the core, but who are not active in the network (they have not collaborated with anyone in the network), they just observe what is happening.  They are often called lurkers.  For more on the network patterns in online communities see our post "Connecting the Community"

Knowing the net, helps us knit/nudge/navigate the net!

These network maps help community managers build more innovative and resilient social networks.  First you see the present structure of the network... where are the gaps, where are the bridges, who are the linchpins that keep things together, who is in the core, and who is in the periphery?  Knowing the net, helps us knit the net!  The maps show us where we are today, allowing the community (along with their consultants) to plan where they want to be tomorrow.

What possibilities do you want to be ready for?






9/11/2012

Infrastructure Resilience


Infrastructure is an increasingly popular term these days, whether we are talking about our failing infrastructures because of the current economic crisis, or worrying about targeted infrastructure when discussing cyber-war and terror attacks.  All advanced societies depend on infrastructure, and the more advanced the society, the higher that dependency -- and the consequences of failure.

Recently, I have been reading a fascinating book, Resilience, by Andrew Zolli.  He writes about resilience across many levels, but also takes an in-depth dive into the resilience of infrastructure and how it affects our lives in many ways.  In reading the book, I was reminded of some of the lessons I learned after revealing terrorist networks and analyzing infrastructure as networks.

Below is an infrastructure network in a prominent country.  This network is critical to this economy's international trade, helping create a trade surplus for this country.

The layout of this network is typical for many man-made networks -- they are built for efficiency, and not resilience.   Efficiency affects the network two ways -- when it is functioning as intended, and when it is failing.  The nodes above are connected with an almost minimum number of links so as to avoid redundancy -- engineers often want to keep materials and choices to a minimum in complex systems.  If the network is hit by a random failure, often the result is not catastrophic.  There are many places the network can fail with just local effects.

Unfortunately, our strong focus on efficiency can be used against us in terror attacks and cyber-war.  Although efficient networks can handle random failures, they are extremely brittle when faced with intelligent, targeted attacks on nodes and links that maintain the network's connectivity.  The network above is both highly efficient and highly brittle.  You need to disable only two nodes for the network to break apart into unconnected components stopping the flow through the system.  Local geographic attacks can be successful with an attack on a single node or link.  If you were planning to disable the above network, which two nodes would you choose? (Put your answers in the Comments)
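If you want to check your answer on a network of your own, a brute-force test is enough for small maps: remove each candidate set of nodes and see whether the rest still hangs together.  The sketch below does that for a small made-up network (it is not the infrastructure pictured above); large networks would call for a proper cut-point algorithm rather than trying every combination.

```python
from itertools import combinations

def build_neighbors(links):
    """Turn an undirected link list into a node -> neighbors mapping."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors

def stays_connected(neighbors, removed):
    """True if the nodes that survive the removal can all still reach each other."""
    survivors = [n for n in neighbors if n not in removed]
    if not survivors:
        return True
    seen, stack = {survivors[0]}, [survivors[0]]
    while stack:
        for nxt in neighbors[stack.pop()]:
            if nxt not in removed and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(survivors)

def breaking_sets(neighbors, size):
    """Every set of `size` nodes whose removal splits the network."""
    return [set(combo) for combo in combinations(sorted(neighbors), size)
            if not stays_connected(neighbors, set(combo))]

# Hypothetical network: two hubs, each serving a small local cluster.
net = build_neighbors([
    ("a", "b"), ("a", "h1"), ("b", "h1"),
    ("h1", "h2"),
    ("h2", "c"), ("h2", "d"), ("c", "d"),
])
single_hits = breaking_sets(net, size=1)
double_hits = breaking_sets(net, size=2)
```

In this toy network either hub alone is enough to split the flow, which is exactly the brittleness that an efficient, minimally linked layout invites.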

Our systems must be designed with the often conflicting goals of efficiency and resilience.  Resilience often requires redundancy -- a common enemy of efficiency.  Redundancy provides alternate pathways in the network -- if one path is blocked/disabled, others are available to continue the flow.  In the diagram above, in most places, if your path is blocked you have no alternatives -- you sit and wait until the network is repaired.

Resiliency requires Redundancy

Network thinkers know that resiliency requires redundancy — we need alternative choices when we encounter failures.  Some redundancy is a good thing — just not too much!  Nature uses redundancy in living systems to help them adapt to change.  Most client networks we examine, that are effective at getting things done, have a reasonable amount of redundancy in the paths available throughout the structure.

The secret to resiliency is alternative paths in the network.  But, how do we know where to put the alternative paths?  We can use network analysis to determine our easy points of failure.  Other factors, such as geography, can help determine the most easily attacked nodes/links.  Although we can not plan for all possible attacks, or natural breakdowns, we can build some alternatives into our systems to make them more robust.  We want systems that degrade gradually after an attack, not brittle systems that fall apart after a few intelligently focused hits.
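One simple way to look for easy points of failure is to list the links that have no alternative: take each link away in turn and see whether the network still holds together.  The sketch below does this for a made-up four-station pipeline and then shows how a single redundant link changes the picture.  It is a brute-force illustration under those assumptions, not a full vulnerability analysis.

```python
def connected(nodes, links):
    """True if every node can reach every other node over the given links."""
    neighbors = {n: set() for n in nodes}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbors[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(nodes)

def links_without_backup(nodes, links):
    """Links whose loss alone would split the network."""
    return [link for link in links
            if not connected(nodes, [other for other in links if other != link])]

# Hypothetical pipeline: four transfer points in a single line.
stations = {"a", "b", "c", "d"}
pipeline = [("a", "b"), ("b", "c"), ("c", "d")]

before = links_without_backup(stations, pipeline)
after = links_without_backup(stations, pipeline + [("a", "d")])   # add one loop-closing link
```

Before the extra link, every one of the three links is a single point of failure; after it, none is.  On a real map the same check points to where a modest amount of redundancy buys the most.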

In a world of increasingly interconnected and interdependent systems and networks, we must learn to build these structures in new ways, that not only focus on efficiency, but also on robustness, recovery and the "ability to bounce back" as Andrew Zolli says.

Where would you add new links to make the network above more resilient?  Remember, the nodes are not people, but transfer points for whatever is flowing through this infrastructure (gas, oil, electricity, water, etc). (Put your answers in the Comments)

8/31/2012

Wirearchy

Wirearchy is another way of looking at organizations today.  It includes both the hierarchy structure and the emergent task, social, and political connections that happen as we work together in organizations. Wirearchy was originally defined by Jon Husband in 1999 as:
"Wirearchy is a dynamic two-way flow of power and authority based on:
  • knowledge,
  • trust,
  • credibility, 
  • a focus on results
 enabled by interconnected people and technology"
 Jon Husband, 1999

An Organizational Network Analysis (ONA) reveals the wirearchy in an organization — the prescribed and emergent networks in an organization — the enterprise's social & task networks.  We find that even a hierarchy is a network! Mathematicians classify a hierarchy as a special type of network called a tree — a network without cycles/loops.  It is one of many overlapping organizational networks.
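The tree property is easy to check on an org chart.  The sketch below assumes the hierarchy is stored as a simple "who reports to whom" lookup with made-up job titles; it confirms there is exactly one top node and that following the reporting line upward never loops back on itself.

```python
def is_tree(reports_to):
    """reports_to maps each employee to their manager (None for the top)."""
    tops = [person for person, boss in reports_to.items() if boss is None]
    if len(tops) != 1:
        return False
    for person in reports_to:
        seen = set()
        # Walk up the chain of command; a repeat means a cycle/loop.
        while person is not None:
            if person in seen or person not in reports_to:
                return False
            seen.add(person)
            person = reports_to[person]
    return True

# Hypothetical org charts.
hierarchy = {"ceo": None, "cfo": "ceo", "cto": "ceo", "dev": "cto"}
tangled = {"ceo": None, "x": "y", "y": "x"}      # x and y report to each other

hierarchy_ok = is_tree(hierarchy)
tangled_ok = is_tree(tangled)
```

The collaboration links in a work network would fail this test almost immediately, since people who work together form loops.  That is the difference between the prescribed tree and the emergent network.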

In the network map below we see two networks.  The hierarchy, drawn with black arrows, shows who reports to whom.  The work network, drawn with grey undirected links, shows who collaborates with whom — many within the hierarchy and many not.  The work network shows one of the emergent organizational structures — no one explicitly designed it like they did with the prescribed hierarchy.


The nodes in the network map are colored by location of the office where the employee works (actual employee names are hidden, replaced by numbers).  Similar organizational functions are grouped together by location — executive and administrative (Finance, HR, IT, etc.) functions are performed at the purple location (HQ).

Some may look at this map and ask why there are not more connections between the green, blue and red locations.  Others may look at the map and see the preponderance of links within a location and conclude that management has accurately placed the employees that need to work together.

As outsiders, looking at a network diagram of someone else's organization, we can not judge what is right and what should be.  A smart, external consultant with knowledge of many organizations, works with executives of the client organization, who have inside knowledge of how things work and could be, to decide what is the ultimate organizational shape and structure to meet management's goals.  An ONA is to management like an x-ray/CAT scan is for doctors — it helps to reveal the invisible, but does not provide easy answers.

The organizational wiring — both the prescribed and emergent — is molded by the strategy, contexts, and history the organization is embedded in.  There are no one-size-fits-all solutions.  Apple is not structured like Mobil, yet they both succeed with different patterns of wirearchy appropriate for them and their business ecosystem.

For more on organizations, hierarchies and networks see: Adapting Old Structures to New Challenges


7/04/2012

Social Capital... the Key to Success in the Connected Age

In the knowledge economy, knowledge and content are no longer sufficient – everyone has access to many sources of content and knowledge. You cannot compete on what everyone knows. As you move up the hierarchy, it becomes more difficult to compete on individual competency – everyone is highly skilled and experienced at the top. It is hard to compete when everyone is so similar.
You cannot compete on what everyone knows.
The new advantage is context – how internal and external content is interpreted, combined, made sense of, and converted to new products and services. Creating competitive context requires social capital – the ability to find, utilize and combine the skills, knowledge and experience of others, inside and outside of your organization. Social capital is derived from employees’ personal and professional networks.

In the diagram below is a large U.S. company that has great local social capital (good connectivity within each regional office) but poor social capital between regions/offices. This is a typical pattern in so-called "siloed organizations."  Both the network map (on the left in the diagram) and the network metrics (on the right in the diagram) reveal this ineffective pattern.



Innovation happens at the intersections -- innovative organizations have many more intersections of diverse thinking and approaches than we see above.  Competing effectively in the connected economy is based on combining (and re-combining) unique knowledge from different parts of the business ecosystem (both within and outside of the organization). 
Innovation happens at the intersections!
Does your organization have the connections for both implementation and innovation?  Does your organization's human network provide both internal knowledge and advice, along with external ideas and possibilities?

For more, read this original article by Valdis Krebs.

5/17/2012

Social Networks as Art


Textile art by Gundega Strautmane based on the network maps of Valdis Krebs — Relational Ornaments.

Gundega works with vision impaired folks in Riga Latvia — creating art that they can touch & feel, allowing them to see & sense data visualizations.  As opposed to most art objects, this is art you are invited to touch! Feel the pins, the threads.

The network, pictured above, is a TB outbreak.  Ideas and interests, like disease, spread person-to-person and can be contagious. Go spread some Art!

Enjoy!


4/27/2012

The Social Life of Code

I was hired by a major U.S. company to do network diagrams of their major computer systems and how they were connected — who updated whom?  It was 1998 and the year 2000 was fast approaching — they were worried about "Y2K" (when all legacy computers would crash, because they would not know how to handle a year that did not begin with "19") and how it might affect their business-critical systems.

The goal of the network analysis was to map out which business computer system updated which other system.  We gathered data on several link-sets, or networks -- anyone remember working with IBM JCL? One of the link sets was only for date-based updates. Of course, we drew an asymmetric network -- system X updates/influences system Y. The nodes on the network map were either major modules of major systems (Payroll), or complete business systems (Pension Calculation).  We colored the nodes by whether the system was Y2K compliant yet.  In a way, this was a lot like the contagion analysis that I would do with the Centers for Disease Control many years later -- who is sick, and who is coming in contact with whom else?  This analysis was also similar to the many social network analysis projects being done today -- looking for influencers in social media, or looking for bad guys in terror or corruption networks.
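The contagion idea can be sketched in a few lines: start from every system that is not yet compliant and follow the "updates" arrows downstream to see who is exposed.  Payroll and Pension Calculation are the node types mentioned above; the other system names, the links and the compliance flags are invented for illustration and are not the client's data.

```python
def exposed_systems(updates, compliant):
    """Systems downstream of any non-compliant system, following X -> Y updates."""
    downstream = {}
    for source, target in updates:
        downstream.setdefault(source, set()).add(target)
        downstream.setdefault(target, set())
    exposed = set()
    # Start the walk from every system not flagged as compliant.
    stack = [system for system in downstream if not compliant.get(system, False)]
    while stack:
        for target in downstream[stack.pop()]:
            if target not in exposed:
                exposed.add(target)
                stack.append(target)
    return exposed

# Hypothetical date-based update links: (system X, system Y) means X updates Y.
updates = [
    ("Timekeeping", "Payroll"),
    ("Payroll", "General Ledger"),
    ("Payroll", "Pension Calculation"),
    ("Benefits", "Pension Calculation"),
]
compliant = {
    "Timekeeping": True, "Payroll": False, "General Ledger": True,
    "Pension Calculation": True, "Benefits": True,
}
at_risk = exposed_systems(updates, compliant)
```

In this made-up example a single non-compliant Payroll module puts two otherwise compliant systems at risk, which is why the direction of the arrows mattered so much.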

Recently a client of mine sent me a large network map claiming he had found a new use for InFlow.  He sent a map of code for a program he was writing.  He said the network map, and InFlow's ability to store text in the nodes (he would put notes and code there), helped him stay organized in this very complex project.  He also mentioned that the various sub-network maps he can create helped immensely in discussions and problem-solving sessions with coders and system designers.

Computer systems are man-made networks, but they sometimes have similar attributes to living networks such as human, biological and ecosystem networks.  Therefore, we can apply our learning from one domain to the other -- innovation happens at the intersections!  In this article, The Social Life of Routers, I apply social network analysis thinking to designing a network of routers for computer networks.

We are always sending updates to each other, whether man or machine...   Ping!

4/14/2012

The Next Big Thing


While Facebook prepares to go public, Apple Pings into the void, LinkedIn focuses on resumes & recruiters, Myspace circles the drain, Twitter becomes more complex, Pinterest distracts, Google+ goes around in circles, and Instagram loses focus, the next big Interests & Passions network is being built... under the radar.

Amazon, with their public/private highlights/notes from Kindle readers is creating a knowledge & interests ecosystem that will aggregate what the world is interested in, and what the world finds important... and what the world wants to buy more of.  And, of course, they are making it social, by connecting to many of those they will eventually replace (mentioned above). It is not just the also-bought data that matters (which books bought by same customer), it is what we specifically find interesting and useful in those books that reveals deep similarities between people -- the hi-lites, bookmarks and the notes will be the connectors.  Our choices reveal who we are, and who we are like!

...it is what we specifically find interesting and useful in those books that reveals deep similarities between people 

We will connect with each other via our similarities and profit from our differences... 
and so will Amazon!   We are all nodes in the Amazon network/jungle.

 
*    *    *    *    *    *    *    *    *


Above is a 30,000 foot view of the jungle.  When we descend to the jungle floor, we see individual plants and what is growing near them. 

Below is a network map (via social network analysis) of a very interesting new book -- Too Big to Know [2B2K] by internet scholar David Weinberger. David's book is shown by the magenta node in the center of the network.  Directly connected to his book are the books that Amazon mentions that customers also bought [green nodes], in addition to 2B2K. These books are probably more similar than different to 2B2K. The blue nodes are books that are 2 steps away from 2B2K; they are probably more different from 2B2K, but retain similarities. The arrows show the direction of the majority of also-bought activity.  If you find 2B2K interesting, you will probably find a pleasant read in one of the green books or possibly a blue book -- depending upon your desire for difference.
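A map like this can be grown outward from one book.  The sketch below assumes you already have an also-bought lookup (book -> list of books); the titles "Book A" through "Book E" are placeholders, not real Amazon data.  It walks out two steps from the focus book and records each book's distance, which is what decides green (1 step) versus blue (2 steps).

```python
def steps_from(also_bought, focus, max_steps=2):
    """Breadth-first walk: distance of each book from the focus book,
    stopping after max_steps."""
    distance = {focus: 0}
    frontier = [focus]
    for step in range(1, max_steps + 1):
        next_frontier = []
        for book in frontier:
            for other in also_bought.get(book, ()):
                if other not in distance:
                    distance[other] = step
                    next_frontier.append(other)
        frontier = next_frontier
    return distance

# Hypothetical also-bought lists; arrows run from a book to its also-boughts.
also_bought = {
    "2B2K": ["Book A", "Book B"],
    "Book A": ["2B2K", "Book C"],
    "Book B": ["Book A", "Book D"],
    "Book C": ["Book E"],          # three steps out, so it never reaches the map
}
distance = steps_from(also_bought, "2B2K")
green = sorted(book for book, d in distance.items() if d == 1)
blue = sorted(book for book, d in distance.items() if d == 2)
```

Stopping at two steps keeps the map at a readable size; each extra step tends to pull in many more books that have less and less in common with the one in the center.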

Today, Amazon introduces you to similar books.    Tomorrow, they will introduce you to similar readers.


We have been analyzing book networks for a long time -- I think we were the first.  Our most popular maps are those of political books in the United States.  We may do another analysis before the 2012 U.S. presidential election.

UPDATE: Twitter has also picked up on the idea of overlapping interests — they use this data to suggest who you should follow on Twitter.  Instead of mining your interests from the books you read (like Amazon), Twitter mines your interests from the web sites you visit — no Kindle purchase required!  Read more about Twitter's building of your interest graph.

2/09/2012

Changing Centuries, Changing Contexts

This week, a new practitioner of network analysis [@BennettResnik] asked me the difference between SNA — social network analysis and ONA — organizational network analysis.  I gave him a flippant answer — the difference between SNA and ONA is the spelling!

There is a funny story behind the origin of ONA. It shows us how context matters and how client  views have changed significantly.

I started working with the IBM Consulting Group (now IBM Global Business Services) in the early 1990s. They licensed InFlow and I trained them in the use of the software and the process of social network analysis.  Soon there were dozens of IBM consultants applying SNA both within the firm and externally with clients [there are currently several hundred IBM consultants applying network analysis].  One of the most productive SNA consultants in IBM at that time was Gerry Falkowski.

One day Gerry sent me a presentation that he was using within IBM to introduce executives to ONA.  I looked it over and it was a great deck, aimed squarely at business people.  He did not use the term social network analysis -- everything was organizational network analysis.

The next time we met face-to-face, over a beer, I asked him why he dropped the term SNA and social network analysis.  Gerry laughed, and shared how he had been presenting his slide deck around the company the last few months and that IBM executives seemed interested, but somewhat skittish about the term: social.  "After all, Gerry, we are a business, there is nothing social here!" said an IBM executive.  This was not the first time he had heard that.

So, Gerry did a smart thing.  He opened up his presentation and did a "replace all," swapping "social" for "organizational."  That is the only change he made, and he went back out on the road to now sell Organizational Network Analysis.  With the exact same presentation, except for the change in wording above, he now got rave reviews from executives inside of IBM.  "Wow, when can we do this!?" was the new response.  Soon Gerry was so busy doing ONA inside of IBM, with great results, that he became the go-to person for ONA throughout the firm.  He just changed one word, and it made all of the difference.

In the 20 years since IBM started using network analysis, things have come full circle.  The corporate consulting context has changed — now it's social this, social that and social everything.  Consultants are now doing the opposite with their old presentations — changing all occurrences of "process", "project" and "team" to social.

Today the focus is on social, and tomorrow... adaptive, agile, innovative, productive... ?

*   *   *   *   *

If you want to read more about the early days of network analysis in organizations see this issue of Esther Dyson's Release 1.0 newsletter or this Corporate Leadership Council best practices study [large PDFs].

1/10/2012

Corruption on the Cuyahoga

The Cuyahoga River runs through NE Ohio into downtown Cleveland, emptying into Lake Erie.  Native Americans named the river "cuyahoga" because it was crooked -- full of bends and turns -- following a serpentine structure.

It was a fortuitous naming — "crooked" has been the theme for politics in the County called Cuyahoga. In 2008, the FBI and IRS raided the County administration offices, and homes of some of the employees, to begin the long process of exposing the corrupt network of favors that controlled business within the County. This corrupt network followed the basic rule of all closed networks: "you have to buy in, to get in."

The network map below shows most of the network (132 people) that has been the focus of the federal probe.  A person is included in the network if they have been charged with a crime, or have been listed on an indictment/information or search warrant as having been "tied" to a suspect.  The nodes in red have been "charged" with a crime, the nodes in gray have no charges (as of January 9, 2012), and the nodes in black have been involved, but have passed away since the investigation started.  The three key nodes in the case (and in terms of network metrics) are highlighted in blue -- Frank Russo, J. Kevin Kelley and Jimmy Dimora. They are the central hubs in the network. The first two have already pleaded guilty and will testify against Dimora in the upcoming trial.



Most of the charged individuals have either pleaded guilty or have been found guilty in a jury trial.  A few contractors have been absolved of their charges by local juries.  Jimmy Dimora pleaded "not guilty" and will now face a federal racketeering trial in Akron, Ohio.

Below is the network graph of the witnesses for the Dimora trial made public by the presiding judge Sara Lioi, and also printed in the Plain Dealer.  Jimmy Dimora is being tried together with co-defendant Michael Gabor -- both highlighted in pink.  The five key witnesses in the trial are all highlighted in green.  People are linked if they were mentioned as tied in an indictment/information or search warrant.  We see how the witnesses are connected to the defendant.




Next we will look at the core of this conspiracy — see the social graph below.  We see that the core is broken into 2 clusters, with the central connector being Frank Russo.  He received the highest social network metric scores for both Power and Structural Holes.




Russo is holding the two clusters together.  As this very central figure, will he get the longest prison sentence for his crimes? We are sure judges don't do social network analysis before sentencing.  Our social network metrics "predict" that Russo will get the longest sentence, followed by Dimora and Kelley, in that order.  Of course both Kelley and Russo have been cooperating with the investigators, so their sentences may be significantly affected by that.  We'll see.

In addition to hundreds of pages of legal documents (network data extracted by researcher Silvija Krebs), I used information from the Cleveland Plain Dealer which has been reporting on this investigation since day one. We will update the charts as we get new information. Thank you all!

UPDATE (March 9, 2012): Dimora guilty on 33/34 counts — mostly for Hobbs Act (racketeering) violations!

12/09/2011

Innovation happens at the Intersections

On December 3rd I attended part of the worldwide OpenData Hackathon, locally in Riga, Latvia.  A few dozen people were there and it was a good mix -- idea people and activists, data analysts, and coders/hackers (data extraction specialists).  The morning turned into a talkathon, but after lunch we started digging for "interesting data" to play with.

Our first data set was the donations made in 2010 and 2011 to various Latvian political parties by various individuals and companies.  After filtering out the smallest donations, we saw a familiar pattern -- most donors chose one party, and one party only, to donate to.  Several donors contributed to multiple parties -- spreading their bets (or "investments").  The donation pattern created a mostly hub-and-spoke network, which is shown below. The green nodes are the parties, and the black nodes are the donors.  A black node is linked to a green node if a contribution above a certain amount was made.  Kind of a nice network for the holidays, eh?
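A rough sketch of that filtering step, for anyone who wants to try it on their own donation data. The records, the MIN_AMOUNT cutoff, and the function name donor_party_links are invented for illustration; they are not the Latvian figures or the scripts we used, and each donation record is checked on its own rather than summed per donor.

```python
from collections import defaultdict

# Hypothetical donation records: (donor, party, amount). Not the Latvian data.
donations = [
    ("donor1", "Party X", 5000),
    ("donor2", "Party X", 1200),
    ("donor2", "Party Y", 3000),
    ("donor3", "Party Y", 50),   # too small, filtered out
    ("donor4", "Party Z", 800),
]

MIN_AMOUNT = 500  # assumed cutoff; the real threshold is a judgment call

def donor_party_links(records, min_amount):
    """Return {donor: set of parties}, keeping only donations >= min_amount."""
    links = defaultdict(set)
    for donor, party, amount in records:
        if amount >= min_amount:
            links[donor].add(party)
    return dict(links)

links = donor_party_links(donations, MIN_AMOUNT)
single_party = sorted(d for d, parties in links.items() if len(parties) == 1)
multi_party = sorted(d for d, parties in links.items() if len(parties) > 1)
print(single_party)  # ['donor1', 'donor4']
print(multi_party)   # ['donor2']
```

The single-party donors become the spokes around each party hub, and the multi-party donors are the few nodes that tie the hubs together.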


Next we dug into the voting patterns of the Latvian parliament, the Saeima.  An early comment was that the data set would probably not show anything interesting -- after all, the parties keep a tight rein on the deputies, strongly encouraging everyone to vote similarly.  Being a long time student of social networks and emergent organization, I was not so sure.  I said, "Let's give it a try and let the data speak."

Even though the data/results of each deputy vote are published on the Saeima web site, the data was not easy to extract for pattern-matching.  Raimonds Simanovskis and Jānis Baiza made a heroic effort and got the voting record data of the 11th Saeima.  Uldis Bojars ran the data through Python scripts to give us network data.

Our first few attempts at interesting visualizations were not successful.  Unfortunately the data contained many counts from procedural votes -- where all deputies usually vote in the affirmative to move legislation along.  We had to filter the procedural votes out so that only the votes on substantial legislation remained. Once that was accomplished we were left with data that would expose voting patterns on important issues.  As often happens in social network analysis, those who thought they knew the data were surprised.  The deputies did not always vote as their parties instructed!  Some interesting political bedfellows emerged.  What was even more interesting were the patterns of the individual parties and where they ended up on the network map.

We filtered the data to show the stronger, higher occurrence patterns.  The first pattern we saw was the Latvian/Russian split in the parliament.  The largely Russian Harmony Centre party was an obvious cluster -- they are the opposition party. The ruling coalition of Unity, Zatler's Reform Party, and the National Alliance -- all majority ethnic Latvian -- formed an integrated cluster.  The remaining party, the Greens/Farmers, had a choice -- 1) isolation, 2) joining one of the two clusters, or 3) bridging the two clusters.  They have decided to be the bridge in the early months of the 11th Saeima.

The network map of deputy voting patterns is below.  Two deputies are connected if they voted the same on many important pieces of legislation.  Harmony Centre are the bright red nodes on the right, the dark green nodes in the middle are the Greens/Farmers.  The darker maroon nodes are the National Alliance, the lighter purple nodes are Zatler's Reform Party and the light green nodes are Unity.  They are all colored by their branding colors.  The number of nodes shown does not add up to the total number of deputies in parliament — some of the deputies did not survive the high cutoff in this data set for number of votes.
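The linking rule above can be sketched in a few lines. This is only an illustration under my own assumptions: the deputies, bills, votes, and the min_agreements cutoff are invented, and these are not the scripts Uldis ran on the Saeima data.

```python
from itertools import combinations

# Hypothetical voting records: deputy -> {bill: vote}. Not the Saeima data.
votes = {
    "dep_a": {"bill1": "yes", "bill2": "no", "bill3": "yes"},
    "dep_b": {"bill1": "yes", "bill2": "no", "bill3": "no"},
    "dep_c": {"bill1": "no", "bill2": "yes", "bill3": "no"},
}

def co_voting_links(votes, min_agreements):
    """Link two deputies if they voted the same way on enough shared bills."""
    links = {}
    for a, b in combinations(sorted(votes), 2):
        shared_bills = votes[a].keys() & votes[b].keys()
        agreements = sum(1 for bill in shared_bills if votes[a][bill] == votes[b][bill])
        if agreements >= min_agreements:
            links[(a, b)] = agreements
    return links

links = co_voting_links(votes, min_agreements=2)
print(links)  # {('dep_a', 'dep_b'): 2}
```

Raising min_agreements is what drops weakly connected deputies off the map, which is why the node count can fall below the size of the parliament.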

The network map above shows the ruling coalition on the left side, with the near opposition in the center, and the far opposition on the right.  The ruling coalition and the far opposition do not have a pair of deputies voting alike.  The in-between party -- Greens/Farmers -- appears to be keeping the network from totally fragmenting.

We will allow the political experts in Latvia to explain the patterns in the above network -- we can provide a copy of the network map with all of the deputy names.  How will this voting pattern affect a small country coming out of a deep economic recession? I, for one, am happy not to see five disconnected clusters.  The bridging ties in the above network give me hope we can work toward a better Latvia for all.

This day of open data exploration was a perfect example of how innovation happens at the intersections!  People of different skills, perspectives, knowledge and goals came together around open government data and at the end of the day had formed an emergent network that was connected and moving forward, yet was not subverting individual talents and goals.  Like I always say:  Connect on your similarities and benefit from your differences!  

Maybe this mantra will play out in the Saeima also?

10/05/2011

Thanks Steve!


In 1988, I programmed the first version of InFlow on an original Macintosh (with added memory), using Prolog.

When I bought that computer in 1984, I knew it would change my life. I just did not know how at that moment... the dots are connected now. Thanks Steve!
"You can't connect the dots looking forward;
you can only connect them looking backwards."

 Steve Jobs
P.S. Think Different

8/30/2011

Circle of Influence

As the political season is now in full bloom, many of us are going to be looking at politicians and who influences them.

An easy way of doing this is looking at the "Circle of Influence" -- a simple network diagram that reveals how money and favors flow in a clockwise direction. A generic example from one of our projects is illustrated below.



Starting at the top (12 o'clock) in a clockwise flow...
• Company A wants to obtain new contracts (without competing in the open market)
• Executive(s) from company A donate(s) to political party X
• Members of political party X vote to award contracts/legislation in favor of company A
• Company A receives a monetary benefit from new contracts
green arrows show money flow, while red arrows show influence/favor

Using these circles of influence we can see how politicians are embedded in networks of indebtedness and favor.

An example of the flow of influence from the Cuyahoga County Corruption Probe is shown here. This flow of influence did not succeed for the company/executive seeking favor.

An interesting insight into influence flows is that the longer they are, the more advantageous they are... for those involved. The more distance (steps in the network flow) a company/executive can put between themselves and the legislation/contracts they want to influence, the harder it is to show an association/pressure. Executives and politicians want to avoid the obvious quid pro quo -- they want the plausible deniability of an indirect quid pro quo.

What flows of influence will you spot in the upcoming elections?



5/25/2011

Who gets Attention on Twitter?


Recently I viewed my Klout "influence score" on Twitter. It was 57. Curious, I checked out my PeerIndex "influence score" on Twitter, it was 60. Hmm, are these two scores becoming similar, measuring the same stuff?

I occasionally look at these scores and noticed that both had fallen in the last few weeks.

Had I grown less influential during my recent travel visiting clients?  I don't think so!

I had spent less time on Twitter, but that does not mean I am less influential today than at the beginning of the month. Hey folks at @klout and @peerindex... I have news for you!  Influence is not like a suntan. It is not dependent on daily exposure/activity on Twitter!!

Influence is not like a suntan, it does not change much based on *daily* exposure!

I looked up a few Twitter friends/colleagues and noticed they had similar scores across Klout and PeerIndex also -- some were closer than others.

Next, we retrieved the Klout and PeerIndex scores for all people [~ 200] I follow on Twitter to see if there were any interesting patterns in this sample. Some of them had almost identical Klout and PeerIndex scores, some were not calculated by one or the other service, and some had divergent scores.

Which score more accurately gauges real influence on Twitter? Is either of these influence scores significantly better than the back-of-the-napkin Twitter metrics [LFR score] I described earlier? How precise are these scores?

I found both Klout and PeerIndex scores for 177 of the 200 people I follow. Of course, there is an LFR score for everyone on Twitter -- it is easily calculated by looking at a person's Followers and Listed counts in their Twitter Profile.
Looking at all three scores we see some difference, but not much.

  • Klout and PeerIndex differ by an average of 13 across the 200 people I follow
  • Klout and LFR differ by 15 on average
  • PeerIndex and LFR differ by 18 on average.
When I remove some of the outliers [less than 10% of the population] the difference between the three scores shrinks noticeably.
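One simple way to compute an "average difference" between two scoring services is the mean absolute difference over the accounts that both services scored. That is my assumption about the arithmetic, shown as a sketch; the handles and score values below are invented, not the real scores of anyone I follow.

```python
def mean_abs_difference(scores_a, scores_b):
    """Average |a - b| over accounts that have a score in both services."""
    shared = scores_a.keys() & scores_b.keys()
    if not shared:
        raise ValueError("no accounts scored by both services")
    return sum(abs(scores_a[k] - scores_b[k]) for k in shared) / len(shared)

# Hypothetical scores; @user3 and @user4 are each missing from one service.
klout = {"@user1": 57, "@user2": 40, "@user3": 72}
peerindex = {"@user1": 60, "@user2": 55, "@user4": 30}

print(mean_abs_difference(klout, peerindex))  # 9.0
```

Accounts scored by only one service are skipped, which matches the situation above where some people were not calculated by one or the other.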

Does it really matter which score we use? How accurately can you measure something as nebulous as influence or attention? Is a several point difference between scores a significant delta?

I still prefer the LFR (List to Follower Ratio) score -- when looking at someone's Twitter profile, I can easily calc LFR in my head and make a quick judgement on whether to follow this person, or add them to a topic list.

To calculate LFR quickly, add a zero (0) to the Listed number and then divide that by the number of Followers. For example, I have 4088 followers and appear on 483 Lists, so my LFR is 4830/4088 = 1.18. A number > 1.00 means people are paying attention to you; a score approaching 2.00 means you have the focused attention of many! Of course, the good news is you do not need to be popular to receive deserved attention!
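The same back-of-the-napkin arithmetic as a tiny function, in case you want to run it over a list of profiles. "Add a zero" is just multiplying the Listed count by 10; the function name lfr and the error check are my own additions for illustration.

```python
def lfr(listed, followers):
    """List-to-Follower Ratio as described above: 10 * listed / followers."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return 10 * listed / followers

# The counts quoted in this post: 483 Lists, 4088 followers.
score = lfr(listed=483, followers=4088)
print(round(score, 2))  # 1.18
print(score > 1.0)      # True
```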

LFR finds us such Twitter gems as @VenessaMiemis (LFR=1.66), @zenext (LFR=1.69), @jhagel (LFR=1.72), and @twliterary (LFR=1.50). Each receives great attention in their respective field, and on Twitter.

What other Twitter influence/attention metrics do you track?

UPDATE1: Interesting interview by Augie Ray with Azeem Azhar, CEO of PeerIndex.
I like Azeem's concept of "cheap" (i.e. following) and "expensive" (i.e. responding) activities on Twitter. I agree, it is more important to look at the expensive activities to gain a more realistic perspective of who/what is really important.   IMHO, the power of LFR is in the very expensive activity of creating and curating Lists on Twitter!

UPDATE2: I have created an LFR Twitter List, that anyone can follow, of people whose LFR > 1.  If your LFR is greater than 1 and you are not on the list, and you have more than 100 followers, let me know!



5/11/2011

Ecosystem Wars

In the connected world of today, companies compete not just on products, but on integrated ecosystems of cooperating products and services. Business today is a war between networks... reach, inclusion, attention, power, control, influence... all those network dynamics are in play.

Great article from Gizmodo about the war between the ecosystems of Apple, Google and Microsoft for internet supremacy: The Dogs of War: Apple vs. Google vs. Microsoft

Gizmodo provides the network diagram below to illustrate the intertwined battle amongst the three titans of the consumer internet.  This is what social network analysts call a two-mode network: connections between companies and markets.


I am a big fan of network visualization, but a spaghetti diagram is not always a good solution. Yet, early in my network analysis career, I also produced spaghetti diagrams of the internet industry!

Visualization should help us quickly see the pattern(s) in a complex dynamic. Is there another way to display the data so that it is easily understandable by a business reader?

I took the data from the Gizmodo diagram, added this week's acquisition of Skype by Microsoft, and a few other missing items, and came up with this alternate view of the internet ecosystem wars below.


Displaying network links as intersections on a Venn diagram clears up the picture quite a bit -- quickly showing where the three big players do, and do not, overlap. In terms of social network analysis, it is easier to see the structural equivalence of the network players with the Venn diagram. Sociologists believe that in networks, the more structurally equivalent two nodes are, the more they will be in fierce competition, and will continue to mimic each other's moves.
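To make the idea concrete, here is a small sketch that treats the two-mode data as company-to-market sets and uses Jaccard overlap as a simple stand-in for structural equivalence. Jaccard is my substitution for illustration, not the measure behind the diagram, and the companies and markets below are placeholders rather than the Gizmodo data.

```python
from itertools import combinations

# Hypothetical two-mode data: company -> set of markets it competes in.
markets = {
    "Company A": {"phones", "music", "browser"},
    "Company B": {"phones", "search", "browser", "maps"},
    "Company C": {"search", "browser", "games"},
}

def jaccard(a, b):
    """Share of markets two companies have in common (0 = none, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

overlap = {
    (x, y): jaccard(markets[x], markets[y])
    for x, y in combinations(sorted(markets), 2)
}
for pair, score in overlap.items():
    print(pair, round(score, 2))

# The center of the Venn diagram: markets where all three overlap.
all_three = set.intersection(*markets.values())
print(all_three)  # {'browser'}
```

The higher the overlap between two players, the more alike their positions in the network, and the fiercer we would expect the competition between them to be.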

Update: Here is another dreadful network visualization from the New York Times.  It could also use the Venn treatment, eh?

4/20/2011

Making Buying Decisions via Social Network Analysis


I have an interest in the recent financial crisis (a.k.a. mortgage meltdown) so I am constantly looking for good reading material about the topic. This morning I was wondering... "What should I read next?" "Which one book will cover most of the angles of this topic?"

Rather than spend time reading reviews of the popular books on the topic, I followed my own suggestion. Using social network analysis [SNA], I created a book network around the topic of "financial crisis". The data was gathered from Amazon and the map of the most frequently mentioned "also bought" books is displayed above. The map does not rank books by sales volume, though it does show the most popular book on the topic of financial crisis.

The network map above shows how books were bought together and by the same customers: A-->B means that people who bought book A also bought book B. This book network helps us see which books are most influential and most integrated in this topic area. I am looking for ONE book to read on the subject, so I will be examining the "integration" scores of each book in the network.

After gathering the data and putting it into my software, the network self-organizes into two clusters. The cluster on the left contains mostly economic perspectives on the financial crisis and the books in the small cluster on the right contain more of a political perspective on the crisis. I am more interested in the economic dynamics of the financial crisis so I will focus on the cluster on the left.

Two books emerge at the top of the list of network integration scores [the nodes in the network map are sized according to relative integration scores] -- Too Big to Fail and All the Devils Are Here. They both have many similar connections to other books, so they occupy similar positions in the network -- I could choose either one, and be happy.

Update: Thanks to Laura C. Tisdel, Editor of Too Big to Fail for sending a copy of the book after reading the above!

3/10/2011

Visualizing Twitter Lists


The network map above shows the links between people on the Twitter List: Network Analysis, maintained by @valdiskrebs. Two individuals are connected on the map if they both follow each other on Twitter.
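The linking rule is easy to sketch: keep a tie only when the follow goes both ways. The handles and follow pairs below are invented, and the function name mutual_links is mine; this is not the data @marc_smith provided.

```python
# Hypothetical "follower -> followed" pairs among list members.
follows = {
    ("@a", "@b"), ("@b", "@a"),   # mutual
    ("@a", "@c"),                 # one-way only
    ("@c", "@d"), ("@d", "@c"),   # mutual
}

def mutual_links(follows):
    """Return undirected ties for pairs who both follow each other."""
    return {
        tuple(sorted((source, target)))
        for source, target in follows
        if source != target and (target, source) in follows
    }

print(sorted(mutual_links(follows)))  # [('@a', '@b'), ('@c', '@d')]
```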

The size of the node corresponds to how "integrated" the person is in the following relationships amongst Network Analysts on the list. This network metric -- Integration -- goes beyond geodesics, and looks at paths of varying lengths in the network. The larger the node the more it is "in the thick of things" of Twitter conversations about Network Analysis.

We assume information and knowledge arrive both directly and indirectly on Twitter, based on who you are paying attention to, and who they are paying attention to. The timing of a Tweet can be very important depending who picks it up and ReTweets it to their Followers. Therefore, it is good that a Tweet/idea follows several paths and arrives at different times.

An interactive version of this social graph can be found on the orgnet.com site.

We thank @marc_smith for the Twitter Following data provided in March 2011.

2/03/2011

Mapping Twitter #Chats


This is a network map of the almost 1000 tweets during the #ideachat 1 hour session in November 2010. Individual participants in the chat are shown as purple nodes and the "whole group" is shown as the large green circular node. If someone tweeted to everyone in the group, at least twice in the session, an arrow would be drawn from their node to the big green node. People who tweeted to each other [@ messages or RTs], at least twice in the 1 hour session, will have arrows drawn from the tweeter node to the subject node. @blogbrevity <--> @cocreatr indicates that they both sent 2 or more tweets to each other during the session. [We do not show the hundreds of single tweets in the session -- we are looking for key participants.]
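Here is a minimal sketch of the "at least twice" rule. The tweet records, the "GROUP" label for whole-group tweets, and the function name key_links are made up for illustration; they are not the #ideachat data.

```python
from collections import Counter

# Hypothetical tweet records: (sender, target). "GROUP" stands for tweets
# addressed to everyone in the chat.
tweets = [
    ("@p1", "@p2"), ("@p1", "@p2"),
    ("@p2", "@p1"), ("@p2", "@p1"), ("@p2", "@p1"),
    ("@p3", "GROUP"), ("@p3", "GROUP"),
    ("@p4", "@p1"),  # a single tweet, dropped by the cutoff
]

def key_links(tweets, min_tweets=2):
    """Keep sender -> target arrows that occur at least min_tweets times."""
    counts = Counter(tweets)
    return {pair: n for pair, n in counts.items() if n >= min_tweets}

arrows = key_links(tweets)
# Two-way arrows: both directions survived the cutoff.
two_way = sorted({tuple(sorted(p)) for p in arrows if (p[1], p[0]) in arrows})
print(arrows)   # {('@p1', '@p2'): 2, ('@p2', '@p1'): 3, ('@p3', 'GROUP'): 2}
print(two_way)  # [('@p1', '@p2')]
```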

Node size on the network map reflects a new network metric we are experimenting with called "attention" which tries to determine both quantity and quality of links pointing at someone. It's not just the number of tweets pointed at you, but who they come from that matters. We will also post an interactive version of this map that will allow you to filter on the type of tweets and their timing during the 1 hour session.

From: blogbrevity's posterous. Thanks, Angela!

12/11/2010

Holiday Connections


Merry Christmas! Priecīgus Ziemassvētkus!

Remember, the holidays are great times to re-energize, re-activate and re-weave your networks! During the holidays we come in contact with people we do not see the rest of the year. They bring us new information, insights and intersections from the different networks they play in. Connect, communicate and celebrate! Let the merriment flow and the overlaps emerge!

Happy New Year! Laimīgu Jauno Gadu!


Original Holiday Network Art - © 2008, Silvija Krebs

10/30/2010

Networks on the Radio

I recently had the opportunity to be interviewed by Nora Young of CBC Radio in Toronto for her wonderful program: Spark. She is a great interviewer -- puts the subject at ease and asks very interesting questions. The interview covers basic aspects of social network analysis, including privacy issues with the data. An MP3 of the interview is available, as well as more information on this web page.

The day before the mid-term election in the USA, November 1st, I will be on the Brian Lehrer Show on WNYC discussing political networks. The network map below shows some of the political ties of the two gubernatorial candidates in New York.


It is not surprising that the political networks of people in politics are not that much different in pattern from the network of political books we read. In my previous mapping of networks of political books, the general pattern has remained the same, though the books [nodes in the network] have changed — strong clusters of Red and Blue with a thin strand connecting them.