Ming the Mechanic:
Incidental Data Analysis

The NewsLog of Flemming Funch
Incidental Data Analysis 2004-05-05 07:13
picture by Flemming Funch

There's one discipline I'd like to be an expert in. Well, there are many subjects and disciplines I'd like to explore, but that I don't get around to.

I'm, however, very intrigued by techniques for figuring things out without direct access to conclusive source data.

Everything leaves traces. Everything that happens leaves some kind of signs in the environment. Nothing real is completely isolated from its surroundings. Every event interacts with its circumstances, and will influence them in ways that can be noticed. Even if you don't have the thing itself, you will always have access to something that was in contact with it. If you're skilled enough, you can find things out indirectly that aren't available directly.

A good example of the principle is the rumor that whenever the Pentagon is planning a big operation, the number of pizzas ordered at night from Washington D.C. pizzerias for delivery to the Pentagon increases. Presumably because a bunch of people are working long nights there, and they're hungry. If you can find out where they order their pizzas, and you watch the graph of how many pizzas are ordered at which times, it can tell you that something is going on.
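Watching that graph can be made a bit more mechanical. Here is a minimal sketch in Python, assuming you somehow had a list of nightly order counts. The numbers are invented for illustration, and `flag_spikes` and its threshold are hypothetical choices of mine, not a method anybody is known to use. Each night is compared against the mean and standard deviation of all the other nights.

```python
from statistics import mean, stdev


def flag_spikes(counts, threshold=3.0):
    """Return the indices of values that lie more than `threshold` sample
    standard deviations above the mean of all the *other* values."""
    flagged = []
    for i, value in enumerate(counts):
        # Leave the night being tested out of its own baseline.
        others = counts[:i] + counts[i + 1:]
        mu, sigma = mean(others), stdev(others)
        if sigma > 0 and (value - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Invented nightly pizza counts for two weeks (illustration only).
nightly_orders = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15, 41, 14, 12, 13]
print(flag_spikes(nightly_orders))  # prints [10]
```

A flag like this only says that a night sticks out, not why it does.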

That also brings up the obvious objection to such an intelligence method. It doesn't really prove anything. It doesn't at first tell you *what* is going on. Maybe they're just doing the yearly audit in accounting. Maybe they just changed pizzerias. Maybe their own kitchen is out of order.

A lot of activities can stay very hidden exactly because such approaches generally aren't officially given much credence, and there are few organized techniques available to the general public for applying them. Which is in part because those people who're doing big things and have big things to hide would prefer that nobody can figure them out easily. And because they use such techniques themselves.

Pattern matching and profiling is one angle of it. If you find the data that sticks out, that doesn't match the normal profile of how something behaves, or that matches a profile of particular kinds of unusual behavior - you know something is going on. Again, you might not know exactly what, but you know where to look. Like how your bank might freeze your credit card because you did something they considered unusual, like buying jewelry in Hong Kong, when normally you only buy groceries in Wisconsin. But they're rather bad at this science, so they make mistakes half the time.

The folks who control large amounts of centralized data have a leg up on everybody else, of course. If one has access to your bank records, your telephone records, your travel records, etc, one could say a whole lot about you.

I suspect the FBI and CIA folks are not very good at it, though, and they're probably just drowning in data that they don't know what to do with. However, I also suspect that there are some groups that are experts at analyzing patterns in such data, but they probably aren't talking about it.

Anyway, the working theory is that any kind of incidental data can lead you to figure out big things that are hidden. Maybe not maliciously hidden, maybe just obscured from view and not yet discovered.

If I stood down on the corner and catalogued all cars that drove by, and I did that for a while, I'd learn something from the patterns. Duh, yeah, how many cars drive by, of course, which is useful for traffic planning. But if I look a little broader, I might see surprising data. Why are there an unusual number of red cars driving by every day around 3 o'clock? That might not provide the answer, but it might tell me what to look for next. Doesn't have to be anything exciting. I might just learn that they're company cars, and that the sales people for a certain company meet for donuts in a certain place at that time. But there's something to find out.
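The street corner catalogue could be tallied along these lines. This is only a sketch, assuming each sighting is written down as an (hour, color) pair. The log below is made up, and `unusual_hours` with its `ratio` and `min_count` cutoffs is a hypothetical helper. It reports the hours where a color takes up a noticeably bigger share of the traffic than it does overall.

```python
from collections import Counter


def unusual_hours(sightings, ratio=1.5, min_count=3):
    """Return (hour, color) pairs where the color's share of that hour's
    traffic is at least `ratio` times its share of all traffic."""
    sightings = list(sightings)
    total = len(sightings)
    by_color = Counter(color for _, color in sightings)
    by_hour = Counter(hour for hour, _ in sightings)
    by_pair = Counter(sightings)

    result = []
    for (hour, color), n in sorted(by_pair.items()):
        overall_share = by_color[color] / total
        hour_share = n / by_hour[hour]
        # Skip tiny counts, where a big share means very little.
        if n >= min_count and hour_share >= ratio * overall_share:
            result.append((hour, color))
    return result


# Invented log of (hour of day, car color) sightings, illustration only.
log = (
    [(9, c) for c in ["blue", "white", "red", "grey", "white", "blue"]]
    + [(12, c) for c in ["white", "grey", "blue", "white", "black", "red"]]
    + [(15, c) for c in ["red", "red", "red", "red", "white", "red", "blue"]]
)
print(unusual_hours(log))  # prints [(15, 'red')]
```

The output is a place to look next, nothing more. The donuts still have to be found on foot.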

We're generally being sold a picture of the world as being very confusing and disjointed. A lot of discrete events that you can't all keep track of, and the only way of making sense of it is to listen to somebody's two minute summary on the news, or to adopt some ideology that summarizes the world in simple ways you can just believe in without examining it too closely. But I don't buy it. I think there are better ways of making much better sense of even a world represented by huge amounts of data.

Obvious techniques are to count things, and to categorize and catalogue them. And cross-relate different kinds of data. And then to look for certain patterns of things being "wrong" or "right" or "off" or "on" about them. E.g. things being out of sequence, misplaced, misnamed, or things fitting unusually well together, or working unusually well. If a hundred carpenters seem to produce a certain typical number of chairs and tables, and one of them produces twice as many, it tells us something. We might discover that he just works twice as long, or we might discover that he has an approach that works better. If a certain news agency produces considerably more erroneous news stories than the other news agencies, there's a story there somewhere.
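The carpenter example can be written down in a few lines. A sketch only, with invented names and numbers, and a hypothetical `stand_outs` helper that compares everybody to the median output, since the median isn't dragged upwards by the one person who stands out.

```python
from statistics import median


def stand_outs(output_by_worker, factor=2.0):
    """Return {name: multiple of the typical output} for everybody who
    produces at least `factor` times the median output."""
    typical = median(output_by_worker.values())
    if typical <= 0:
        return {}
    return {
        name: count / typical
        for name, count in output_by_worker.items()
        if count >= factor * typical
    }


# Invented chairs-per-month figures, illustration only.
chairs = {"Ana": 10, "Ben": 11, "Cho": 9, "Dev": 10, "Eli": 21}
print(stand_outs(chairs))  # prints {'Eli': 2.1}
```

Whether Eli works twice as long or has a better approach is the next question, and the count alone doesn't answer it.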

Incidental data usually isn't random, even if it looks like it at first. Really, most things in the world are connected with each other, so things usually are the way they are because of the way other things are, and there's a bigger picture there.

That also opens the door for apparently nonsensical ways of divining what is going on. If Uncle Joe's left knee hurts before it is going to rain, and that is reliable, we don't really have to know exactly how that comes about. An analysis of his success rate is all we'd need. Likewise, if somebody has a system of divination based on tea leaves, and it happens to work, that is valid data as well. Even if somebody else would like you to believe that you can't say anything useful about things you can't explain as a direct cause-effect relationship.
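An analysis of Uncle Joe's success rate could look something like this. It is a sketch under the assumption that somebody kept a daily record of whether the knee hurt and whether it rained the next day. The record below is invented and `knee_scorecard` is a hypothetical name. The point is to compare how often rain follows a sore knee with how often it rains anyway.

```python
def knee_scorecard(days):
    """days: list of (knee_hurt, rained_next_day) booleans.

    Returns (hit_rate, base_rate): the fraction of sore-knee days that
    were followed by rain, and the fraction of all days followed by rain.
    hit_rate is None if the knee never hurt.
    """
    if not days:
        raise ValueError("need at least one recorded day")
    after_sore_knee = [rained for hurt, rained in days if hurt]
    base_rate = sum(rained for _, rained in days) / len(days)
    hit_rate = (
        sum(after_sore_knee) / len(after_sore_knee) if after_sore_knee else None
    )
    return hit_rate, base_rate


# Invented 20-day record, illustration only.
record = (
    [(True, True)] * 4       # knee hurt, rain followed
    + [(True, False)] * 1    # knee hurt, no rain
    + [(False, True)] * 3    # knee fine, rain anyway
    + [(False, False)] * 12  # knee fine, no rain
)
print(knee_scorecard(record))  # prints (0.8, 0.35)
```

If the hit rate is clearly higher than the base rate over a long enough record, the knee is telling us something, whatever the mechanism. Twenty days is of course far too few to be sure.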

There's a certain centralized power structure that exists in terms of information. Those who appear to hold the centralized stores of the "proper" data, like governments, the police, banks, credit bureaus, media companies, scientists and educational institutions, pretend that they have a monopoly on telling you what is going on. Where I postulate that we've gotten to a point where grassroots networks of regular folks easily could be as informed as any of those. But not just by passing vague rumors and opinions around. Some tools and proven disciplines would help.

Nothing happens in a vacuum. Life leaves tracks. Almost any kind of incidental data source, if analyzed a bit, can tell you where there are tracks. By combining a sufficient number of dimensions of data, you can probably see where the tracks lead, and you can make out the silhouette of what is there.




5 May 2004 @ 16:05 by ming : Intelligence
A good start might be to catalogue publicly available data sources. And maybe work on a uniform way of accessing them.

Even stuff that everybody expects to be public data is probably a little hard to come by. Like, if I want geographical and social statistics for most countries in the world. And I wanted to cross-relate them. Like, how does the median income relate to the amount of pollution, or the birthrate, or the price of a Big Mac. If I have time enough, and somebody is paying me for it, I'll find the data quickly enough. But it would be great if any interested person could decide to spend an hour examining the statistics of the world, to see for themselves how things relate to each other. They'd need the data easily available in a uniform format, and/or they'd need some tools for playing with them, graphing them, etc. And it gets interesting the more free they are to combine different kinds of data with each other.
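Once the data is in a uniform format, the cross-relating itself is not much code. A sketch, assuming each statistic comes as a simple table keyed by country. The country labels and figures here are placeholders I made up, and `cross_relate` is a hypothetical helper that computes a plain Pearson correlation over the countries both tables have in common.

```python
from math import sqrt


def cross_relate(table_a, table_b):
    """Pearson correlation between two {country: value} tables,
    using only the countries that appear in both."""
    shared = sorted(table_a.keys() & table_b.keys())
    if len(shared) < 2:
        raise ValueError("need at least two countries in common")
    xs = [table_a[k] for k in shared]
    ys = [table_b[k] for k in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("one of the statistics does not vary")
    return cov / sqrt(var_x * var_y)


# Placeholder figures, not real statistics.
median_income = {"A": 10, "B": 20, "C": 30, "D": 40}
birthrate = {"A": 4.0, "B": 3.0, "C": 2.0, "E": 1.5}
print(round(cross_relate(median_income, birthrate), 3))  # prints -1.0
```

A number like that says two things move together across countries, not that one causes the other. It is just a pointer to where it might be worth digging.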

Along the lines of Buckminster Fuller's Geoscope (http://bfi.org/node/564) concept. See Tom van Sant's Geosphere (http://www.geosphere.com) project. But the key thing should be direct access to playing with the data, not just that there are pretty visualizations for teaching purposes.

6 May 2004 @ 04:14 by maxtobin : Careful Now
Flemming, if you are not careful you will be able to prove that Bumble Bees fly before you know it!! And open source tools would be a huge bonus for us mere mortals, eh? You and Andrew should follow that thread a bit, where can we find a benefactor for you? If we win the lotto or the Fairy GrandMother fills our bank up I would consider being a patron for such a project. That way we may be able to reliably tell where fires have been started.

6 May 2004 @ 06:38 by ming : Data
Yeah, maybe bumble bees really do fly. I think it would be very useful with better tools for figuring out what things mean, and what's really going on, in the hands of regular people. So, I hope it comes together somehow.  

9 May 2004 @ 16:06 by Jon Husband : Fractal activity driven by purpose?
Excellent post, Ming. Thanks. The patterns that do emerge from watching and absorbing the interconnected activities of people coalescing around this purpose or that purpose, via blogging or using collaborative social software of some sort or other, will of course become more apparent to us (as they have been so doing over the past few years).

The current forms of managerial capitalism, along with a general ethos that is skeptical of sharing (against people's deeper instincts, I submit) are important obstacles to this, as I believe you have often stated.

First we shape our structures, then our structures shape us.

Where are we on that continuum at this point in time ?  

9 May 2004 @ 18:00 by ming : Fractal activity
I guess I'm searching for some keys to bringing out the patterns - to illuminate what is really going on, what the connections are, what works and what doesn't. The scales might tip easily if we could see more clearly. Advertising would be useless if we were sufficiently well informed about what's good and what isn't, and where we can get it most easily. Politics would be a totally different game if we were well enough informed about what the outcomes were of different decisions and who's working towards which outcome. Nobody can manipulate us en masse if we can easily see through their actions and tell each other about it.

But we're still sailing in a fog and we don't know where the rocks are. So those who monopolize the lighthouses are still ahead.

Optimistically, our trends towards higher quality sharing over the net might irreversibly take us to the different world we can sense. Interconnected, collaborative, open source, emergent wirearchy, fractal anarchy. But it is not for sure yet.  

11 May 2004 @ 09:25 by Jon Husband : Fractal - experts and webs
Flemming, what were your insights when you were looking at Britt's XpertWeb ? I always thought that very interesting, but probably doomed because of the amount of marketing it would have taken to get that "model" into a sufficiently wide awareness - kinda like today's insufficient YASN's, no ??  

11 May 2004 @ 10:13 by ming : Xpertweb
I think it is the right idea. It is just how to get the details right, to make it work, and take off, virally. Britt is very clear on how he wants it to work. I probably would prefer to think the design more through first, to be sure we get it right. Like, I think it is tied to some problems that in themselves are a bit harder than how we account for stuff people have done for each other and how they liked it. Like ID. If it isn't tied to a very strong digital ID, where we can be pretty sure that people can't fake most of the stuff that will make them look good, then I'm not sure it would actually work.  

11 May 2004 @ 10:34 by Jon Husband : Xpertweb
What I've never understood is how XW would support all those who have no marketable expertise, per se. And wouldn't a "power law" of sorts emerge pretty darn quickly?

One could argue that blogging "mirrors" this already. I can't think of the number of times some expert who is well-known in certain circles (i.e. Many2Many, or AlwaysOn, or in the tautological blog rolls of many A-listers) has said/written something that I've read here, or elsewhere, or written myself. They get attention, they get invited to speak at conferences, they get involved in interesting projects with interesting connected circles of people, and many many bright minds and good hearts get left looking in ... who would get involved and active in a flash.

I don't think ratings are the answer - I do think there's something in a dialogue-driven and built "agenda" of sorts - a very wide, global type of ongoing Open Space, underpinned by fundamental values that the world has ascribed to ... which would mean the re-definition of today's capitalism (see the Support Economy).

And so on ...  

11 May 2004 @ 11:02 by ming : Value
Well, part of the idea in Xpertweb would be that it would be much easier to figure out the perceived value of somebody's work. Their record would be right there, so if they have a string of unhappy clients, it doesn't really matter if they're otherwise really popular in front of a group. The people who would stand out would be those who have a record of good work. Of course there might be a bit of a power law there, where everybody wants one of the top 10 in a given field. Also, you're right, it doesn't do anything for people who don't have marketable expertise. Makes sense if you're really an expert in something, or a skilled craftsman or something. But less so if you aren't able to state what you do. Although it could still be valuable to have a good record of doing *anything*, no matter what it is. And a market might develop for helping people present themselves better. But the first people it would appeal to would probably be smart knowledge workers, and not unskilled laborers. And it isn't clear how well it would serve the majority of humanity.

21 May 2004 @ 07:07 by Seb : non-secrets
Ok, this is where I was meaning to leave this link.

"if you believe that only secrets matter, then you will tend to not know where to go for the non-secrets."


23 May 2004 @ 08:25 by Seb : KnowItAll
And this one - http://www.cs.washington.edu/research/knowitall/  

23 May 2004 @ 14:10 by ming : Open Source Intelligence
Yeah, OSINT would be what we're talking about here. And that Robert Steele there would be one of the first people I'd think of who knows how to do that well.


Link to this article as: http://ming.tv/flemming2.php/__show_article/_a000010-001234.htm
Main Page: ming.tv