Below is some recent research on visual data mining.
This article by CNET, Why Google is Ditching Search, prompted me to look for empirical research supporting the author’s premise. And I found this gem of a research paper, #TwitterSearch: A Comparison of Microblog Search and Web Search.
The Stanford and Microsoft researchers compared how individuals use search in Twitter versus traditional Web platforms like Google and Bing. What the researchers found:
- Web search can leverage social search to discover additional search queries that are temporally and contextually related, thus delivering a more relevant set of search results
- Social search influences the perception of online reputation
- Web search can leverage the hashtag and tagging concepts central to social search (especially Twitter and del.icio.us) to identify and deliver non-spam results that deep link to further relevant results
- Web search can leverage social search to understand what issues are trending, the nuances of these trends, and then relate these discoveries to search queries and thereby deliver a more relevant result
These findings also suggest increased opportunity within CRM systems. The researchers found that individuals bounce between social and Web search as they narrow their queries. If a brand maintains a presence on social platforms (Twitter, Tumblr, Facebook, etc.) while also focusing on SEO, and consumers consistently encounter that brand in results across both search channels, the likelihood that a consumer will reach out to the brand increases. And if the brand can track the source of the lead (which platform delivered it) in conjunction with the query that generated the lead (what was actually searched), then the brand can engage the consumer with a higher level of insight. A process like this is likely to promote high consumer satisfaction (and increase the likelihood of lead conversion).
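The tracking step described above can be sketched as a simple lead tally; the platform names, queries, and record shape below are hypothetical, not any real CRM's schema:

```python
from collections import defaultdict

# Hypothetical lead records: which platform delivered the lead, and the
# query that generated it. Invented for illustration only.
leads = [
    {"platform": "twitter", "query": "best running shoes"},
    {"platform": "web_search", "query": "best running shoes"},
    {"platform": "twitter", "query": "marathon training"},
]

def leads_by_source_and_query(records):
    """Tally leads per (platform, query) pair so an engagement team
    can see which channel and which search intent produced each contact."""
    tally = defaultdict(int)
    for r in records:
        tally[(r["platform"], r["query"])] += 1
    return dict(tally)

print(leads_by_source_and_query(leads))
```

Even this crude pairing of source and intent is enough to open a conversation with a consumer at a higher level of insight than a bare contact form would allow.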
Photo credit: visualpanic
Data analysis is the new plastics. Remember this scene from the movie The Graduate?
Below is a curated list of articles from this week of innovative social analytics and business intelligence initiatives.
In this article from O’Reilly Radar, we learn that social network analysis is an amalgamation of social science disciplines such as sociology, political science, psychology, and anthropology combined with traditional mathematical measurements. At its core, social network analysis measures relationships between people and organizations. But cutting-edge research is also looking at ways to leverage social network analysis as a form of early warning system for natural disasters. Much social network analysis to date has been retrospective in nature; the future will focus more on real-time analysis.
And speaking of real-time analytics, this article from the Washington Post argues that real-time results may have a significant influence on the upcoming 2012 elections.
“Perry is done,” came a Twitter posting from a viewer called @PatMcPsu, even while the Texas governor struggled to name the third of three federal agencies he said he would eliminate as president. Another, called @sfiorini, messaged, “Whoa? Seriously, Rick Perry? He can’t even name the agencies he wants to abolish. Wow. Just wow.”
The key point to remember is that the “real time citizen” is no longer content to remain passive. Will the “real time citizen” quietly wait for polling stations in other states to close and vote counts to come in before announcing the results of his or her own state? It will be interesting to watch how quietly or loudly Mr. and Mrs. Real Time Citizen react in 2012.
Finally, social app analytics start-up Kontagent snagged $12 million in a Series B round. According to an interview with Kontagent’s founder, what makes Kontagent unique is that it does not perform “traditional” social analytics functions (such as conversation monitoring, tabulating likes, etc.) but performs deep analytics focused on teasing out profitability KPIs. The company also has a team of data analytics and data visualization scientists who help clients understand, interpret, and make informed business decisions based on Kontagent’s proprietary data visualization techniques.
In response to a request by @Gahlord to research the concept of “social proximity” I have found eight articles that broadly sketch the primary issues and principles related to “social proximity”.
In Towards Design Guidelines for Portable Digital Proximities A Case study with Social Net and Social Proximity (.pdf), the authors apparently introduced the concept of social proximity, which they define as:
[T]he relationships between people in space, within social networks, and through time.
In Life in the network: the coming age of computational social science, the authors discuss the rapidly changing pace of computational social science.
In To join or not to join: the illusion of privacy in social networks with mixed public and private user profiles (.pdf) the authors discuss privacy issues related to social media and the natural tension between “public” and “private” information (see also my earlier article relating to this topic).
In Inferring friendship network structure by using mobile phone data, the authors found that it’s possible to infer with 95% accuracy friendships based on mobile data.
In Bridging the Gap Between Physical Location and Online Social Networks (.pdf), the authors demonstrate how to predict friendship between two users using their respective location trails.
In Social distance, heterogeneity, and social interactions (.pdf, and I hope you’re good in mathematics to understand this article), the authors propose a new model to analyze peer group interactions.
In Connectivity Does Not Ensure Community: On Social Capital, Networks and Communities of Place (.pdf), the author proposes that the strongest online communities are those that create a sense of social ownership within the community.
In Semantic Grounding of Tag Relatedness in Social Bookmarking Systems (.pdf) the authors discuss how collaborative tagging systems can be used to derive a global tagging relatedness structure from an uncontrolled tagging folksonomy.
In The anatomy of a large-scale social search engine (.pdf) the authors present Aardvark, a social search engine.
This blog article on “controlled serendipity” spurred me to conduct a little content curating of my own, resulting in this gem of a research paper that documents how the BBC utilizes Linked Data technologies to make it easier for BBC users to navigate its vast programming database.
The first article discusses how the Web collective–the user commons if you will–is benefiting from individual efforts at curating content, done largely as a free service driven by a spirit to share.
Sharing has become a reflex action when people find an interesting video, link or story. Great content going viral isn’t new. But the sharing mentality is no longer confined to the occasional gems. It’s for everything we consume online, large or small.
I think anyone engaged in the social Web would readily agree with this sentiment. It’s what makes participating in this distributed forum so fun. The article also points out, however, that the vast content mines that exist can be somewhat difficult to navigate to find true gems. Thus, the implication is that content providers need to step up to the plate and deliver content systems that make it easier on Web “content curators”.
The research paper referenced above describes how the BBC used a concept called Named Entity Recognition (NER) to extract concepts from textual input. This allowed for more efficient human editorial input to ensure that these concepts were accurate. Once approved these “concepts” were transformed into links appearing on a Web page. This process then allowed the BBC to use the “concept links” to create user journeys through their site. All this is based on semantic web principles. The future looks bright, indeed, for those of us who constantly scour the Web for salient content.
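The BBC’s actual NER stack is not reproduced in the paper summary above, but the extract-review-link pipeline can be sketched with a deliberately naive stand-in that treats runs of capitalized words as candidate concepts. Everything below is illustrative, not the BBC’s implementation:

```python
import re

def candidate_entities(text):
    """Toy NER: treat runs of two or more capitalized words as candidate
    concepts. Production systems use trained models, but the pipeline
    shape is the same: extract, have editors review, then link."""
    pattern = r"\b(?:[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+)\b"
    return re.findall(pattern, text)

def linkify(text, approved, url_for):
    """Turn editor-approved concepts into hyperlinks on the page."""
    for concept in approved:
        link = '<a href="%s">%s</a>' % (url_for(concept), concept)
        text = text.replace(concept, link)
    return text

sentence = ("The programme was produced by the "
            "British Broadcasting Corporation in West London.")
candidates = candidate_entities(sentence)
# Editors approve a subset; approved concepts become navigable links.
html = linkify(sentence, ["British Broadcasting Corporation"],
               lambda c: "/concepts/" + c.replace(" ", "-"))
```

The editorial review step is the key design choice: automated extraction proposes, humans dispose, and only then do concepts become the links that power user journeys through the site.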
Here’s an article that details some interesting issues relative to search, recapping a Xconomy Forum on the Future of Search and Information Discovery panel recently held in Seattle. On the dais were Microsoft, Google, and a couple of University of Washington professors. Here are some salient takeaways:
- It’s still unresolved whether vertical search will significantly impact general search
- The nexus between real-time search, consumer intent, and semantic search is where the search gold resides
- Hurricane Katrina taught Google a lesson about relevance and real time results
- Opportunities to compete with Google and Bing exist, but only on the edge or fringe such as applications that bypass search engines, employ automated content discovery mechanisms, use semantic search, or perfect mobile geo-search
Google is like smoking cigarettes, it’s a habit that’s going to be difficult to give up. So what can you do? You have to think about the problem space. Google’s approach is to get people in and out of search engine quickly with their result. Not the right way to think about it. Right way to think about it is to think about minimizing time of completing a task, not minimize the amount of time to match a query with a url.
[O]rganize the information in a way that synthesizes the task that you want to accomplish.
Mobile is huge. Apple is the big fish at the moment. Android coming on strong. Won’t hold my breath on Microsoft.
Two things which potentially threaten us.  As we become bigger and older, it could become more difficult for Google to innovate… Also worry about diminishment of sense of entrepreneurship.
Chris Brogan interview
Excellent interview with Chris Brogan on how he’d run an airline and implement some social web karma; great insights, well worth the 9:58 investment of your time. The interviewer, Shashank Nigam, CEO, SimpliFlying, asks some really good questions. My comment after listening to the interview: That was seriously cool.
Insightful post on how Dunkin’ Donuts uses the social web to extend its brand engagement. Dunkin’ Donuts’ recently released Dunkin’ Run app is a nice, simple deployment of a social app that has a built-in ROI component: buying doughnuts.
Interesting TechCrunch profile of Vyoom, which is a social networking site that gives you redeemable points for your participation. The more points you accumulate, the more stuff you can buy. Not sure whether this will work as a stand-alone application/concept, but could certainly see this applied in a rewards program under a major brand (e.g., Southwest’s Rapid Rewards program).
Social media is social what?
A call for dropping the term “media” from the phrase “social media”. Compelling argument to drop the fascination with the platforms and concentrate on the quality of the content and product.
Public relations social web tactics
Long list of new products and services pitched to a Kentucky-based director of social media (two of the brands he reps: Maker’s Mark and Knob Creek bourbons). Very interesting list of social media “newness” and implicit insight into public relations 2.0 tactics.
Interviews with semantic search pioneers
Summary of interviews with key semantic web players from Google, Ask, Hakia, Microsoft, Yahoo, and True Knowledge. Some topics: shift from “popularity” based search results to “credibility” based search results.
Semantic technology and artificial intelligence
There’s lots of discussion lately about the semantic web and well-deserved praise over applications like Wolfram Alpha that employ semantic web theories to deliver relevant search results. In 2002, a short article discussed the concept of the “wisdom web” and highlighted many of the innovative concepts we’re seeing applied today. Future applications will likely employ intelligent agents to accomplish much of the “secretarial” type functions manually input today by humans into search engines, social networks, and other Web applications and platforms (here’s a great summary of intelligent agents in the evolution of Web applications).
Let’s assume a situation where intellectual property and licensing issues are properly resolved and set with respect to granting outside developers access to MLS content and data.
If you’ve heard of an MLS (or a broker with a VOW) that has engaged a group of skilled programmers similar to what Washington D.C. did with its content and data, please let me know. Don’t you think something wonderful could happen with real estate search similar to what’s about to happen with bioinformatics?
Dialogue between bioinformaticists and semantic Web developers has been steadily increasing for a number of years now as widespread data integration problems have clearly begun to impede the progress of research.
This is not to say that challenges don’t exist,
[I]f you’re talking about traversing [information and data] computationally, then it’s much more challenging to make sure everything means the same thing and that the object that you’re getting to on the next path has the same persistence, quality, and structure that you’re expecting to operate on.
Nevertheless, the vision for a more collaborative and effective future is vibrant,
Ultimately, what the semantic Web community hopes to have are applications that will make the complexity of the technology as invisible as possible.
The real estate industry has an existing standardization body: RETS. It seems to me that an MLS (or broker VOW) could provide great value to its public and real estate industry stakeholders by adopting a RETS standard (thus, at some level, solving the data standardization issue raised above) while opening its data pantry to a group of developers, similar to what Washington D.C. did with its Apps for Democracy contest held last year (according to the Apps for Democracy website, the city realized a $2,300,000 value, not to mention the fact that the public now has some nifty tools),
The first-prize winner in the organization category was a site called D.C. Historic Tours, developed by Internet marketing company Boalt. The information about area attractions came from the city, but Boalt developers decided how to present it…The site uses Google Maps as the basis for enabling users to build their own walking tours of the city. It pulls information from Wikipedia, the Flickr photo-sharing service and a list of historic buildings.
Imagine a pool of widgets, desktop apps, and apps for iPhones, BlackBerrys, etc., that slice and dice real estate content and data in novel ways. The public would obviously benefit by accessing real estate information in ways that are most meaningful to them. The content/data provider benefits by engaging the public in a deeper, more relevant, and more effective manner. And real estate agents ultimately benefit because a more satisfied, more qualified, and more engaged buyer or seller equates to increased business opportunities.
I stumbled across the Semantic Interoperability of Metadata and Information in unLike Environments (SIMILE) program at MIT. Rather than try to summarize what they’re doing, here are some examples: Music Composer Research Database, click a composer’s name to see what happens; UK Traffic, click a blue dot on the map to see what happens.
Web 2.0 coolness
Excellent interview with Tim O’Reilly by HubSpot CEO Brian Halligan. It discusses baseline concepts of what it means to “be Web 2.0”: a change in thinking, corporate ethos, and individual creed.
Wonderful missive on the nexus between art and Web 2.0. I especially enjoyed the author’s discussion of what “avant-garde” means–as originally put forth in this essay–in the 21st century. Both are meaningful reads because each author broaches core issues relating to a wide cultural shift in collaboration across different societal strata.
Real estate firms that are thinking about implementing social media marketing strategies should pay attention to Charlene Li‘s predictions. Li’s series of five interviews paint a road map of the social media future firms ought to be considering today:
Her thinking mirrors that of Henry Jenkins, Director of the MIT Comparative Media Studies Program, who essentially argues in his white paper “If it doesn’t spread, it’s dead” that marketers need to shed the terms “viral” and “memes” and adopt “spreadability” as a benchmark.
What Jenkins points out, and Li implicitly endorses, is that humans are not passive hosts that propagate marketing messages. Rather, humans take an active role in transferring and transforming marketing messages. Thus, marketers need to rethink their approaches to conceiving of, executing on, and managing marketing campaigns by migrating away from command and control modalities to adopting more of a marketing midwifery role.
I think that Li’s and Jenkins’ thoughts also pertain to CRM definitions. Let’s shed the agri-centric CRM labels like “cultivate,” “nurture,” and “harvest” for terms that recognize a consumer’s role in allowing themselves to become part of a CRM system, rather than passive victims of that system. Terms like “engagement,” “conversation,” and “partner” align with Li’s and Jenkins’ sentiments, seem more respectful of an individual’s role in a CRM system, and are reflective of the fact that consumers are active participants in a firm’s “relationship management” processes. And assuming that a firm’s client services division performs at high levels of consumer satisfaction, this ethos shift also has the potential to empower “engaged” consumers to spread the word of a firm’s client services successes (much like Li relates in her Comcast example in the above “organizational trust” interview).
Here’s a story from DavidHenderson.com about a “Twevent” that happened to a senior level public relations employee. The case involved FedEx (the client) and the following Tweet:
True confession but I’m in one of those towns where I scratch my head and say “I would die if I had to live here!” citation
DavidHenderson.com summarized the ensuing events:
Someone inside FedEx was following…and that person shared the post among the top executives at the FedEx front office, and the company’s corporate communications staff. At that point, a person in the FedEx corporate communications staff apparently took umbrage to the post…and responded [to him].
The public relations executive posted the following Tweet as events ensued over the next couple of days:
This is hard to fit in 140 characters or less so please read here. All about my recent Twitter post citation
I found this FedEx story via this Sun Microsystems blog post which discusses issues surrounding one’s digital legacy. The key take-away, in my opinion, is to understand that crowdsourcing memes can possibly lead to unintended consequences and misinterpreted meanings.
Thus, when asked by real estate professionals about how they should approach social media generally, and Twitter specifically, I talk about defining digital personas and sticking to that persona in every post, Facebook or LinkedIn update, Tweet, etc.
Here are my thoughts regarding managing one’s digital legacy:
- Define the persona you want to convey to your known audience as well as your unknown audience; this will become your digital legacy over time
- Understand that Facebook differs from LinkedIn, which differs from Twitter, etc., and that each social media space has a different environment–ecosystem or culture, if you will–that you must first understand and then integrate with (I say lurk heartily to see how other people use a specific medium, read the FAQs and support sections, etc., then step into the playground once you have a general sense of the rules)
- You can have varied personas for each environment, but each such persona should roll up to support the overall “personal brand” you’re trying to build (think of the different personalities you adopt during client presentations, while at the office, at cocktail parties, etc)
- Think 24 months out from now and ask yourself, “What do I want people to see when they search for me on Google?” Think about what “output” or “outcome” you want in this circumstance, and then work backwards to ensure that your “inputs” (your blog posts, your Facebook and LinkedIn profiles, and the majority of your Tweets) meet your expected outcome
Perhaps I am overthinking this. However, when I read posts like the above, I cannot help but think that a managed approach like the simple process I’ve outlined is a viable approach for real estate professionals (especially agents new to the space) whose livelihood, value, reputation, and expertise will be run through a Google (or some new equivalent) sieve for the foreseeable future.
As always, I am grateful to Owyang for lending his insight and foresight. Here’s another excellent missive on the “Intelligent Web”. In summary, he posits that machines will begin extrapolating relationships and driving connection recommendations from the juxtaposition of “our behaviors, context, and preferences”. Sounds a bit like the semantic web. Spinning through the comments on this post brought me to the Innovation Insight blog, where Guy Hagen explores MIT research related to “reality mining”, which you can find more about on the MIT Web site. And this research paper out of UC Davis demonstrates how the MIT Reality Mining data set was utilized in tracking behavior via mobile phones.
Imagine an iPhone application overlaid on a real estate firm’s listing data set, where the iPhone reports back over time thousands of users’ mobile browsing habits (i.e., driving around looking at homes for sale or rent). Having such data would allow firms to target advertising and Web site promotions, and gain predictive insight over their competitors with respect to fluctuating markets (e.g., patterns will emerge over time that tell a firm which neighborhoods are capturing consumer interest, enabling the firm to deploy marketing and agent resources toward these locations ahead of the competition).
This paper argues that allowing consumers to “co-create” or “co-author” products–i.e., directly engaging and encouraging consumers to participate in new product development processes–taps vast wells of creativity while exploiting certain cost efficiencies in terms of labor. Similarly, this paper explores how Web 2.0 will fundamentally (has fundamentally) changed the manner by which companies must brand themselves. Gone is a command and control ethos. Emerging is an empowerment and transparency ethos:
- engagement replaces interruption
- diversity and self-expression replace conformism and unity
- the media of the masses replace mass media
- granular insights and rich data replace generalisation
- conversations in marketing replace control
TouchGraph is an excellent tool that gives you “visual insight” into a site’s external linking structure and relationships, which is a good starting point for website competitive analysis. Let’s compare Redfin, Zillow, and REALTOR.com.
Redfin’s linking relationships
Zillow’s linking relationships
REALTOR.com’s linking relationships
The visual representation of these relationships allows you to quickly explore the link structure of the “affiliated” sites much faster than conducting such an analysis using Google or Yahoo tools. Thus, you can better assess your weaknesses, strengths, and opportunities in cultivating or disabling the same or similar relationships.
With social networking sites surpassing search engines in terms of popularity, will the marketing value of search engine optimization diminish over time? This article makes a great case that the usefulness of organic search for consumers may eventually wane.
Interesting question: when a social network community provides answers–as opposed to an algorithm–can anyone really “optimize” their website for social networks? In fact, in this context, one can argue that the concept of “optimization” is a legacy marketing principle more akin to “push” marketing concepts as opposed to “engagement” or “Web 2.0” marketing concepts.
Obviously, Google and Wikia will return a faster result than the community, and arguably the time I am waiting for the community to respond to my request (if it responds) I can peruse the myriad results via the two search engines. What I am hoping for, though, is that the community will point me in a direction that’s more pointed and vetted via its collective consciousness.
“You either have high home prices or lower home prices, and lower home prices are what we want, and people shouldn’t be afraid of that,” said Robert Shiller, Yale finance professor, in a Reuters interview. “Most of us care about our children and grandchildren, and these people have to buy houses, so why would we want high home prices? We want economic growth, we don’t want high home prices.”
So, as the slow ride down continues, what’s happening in the realm of social media that will help you when the ride hits bottom and the ascent begins anew? For starters, Business Week Online’s Feb 21, 2008 issue is a great source for ideas.
Go ahead and bellyache about blogs. But you cannot afford to close your eyes to them, because they’re simply the most explosive outbreak in the information world since the Internet itself. And they’re going to shake up just about every business—including yours. It doesn’t matter whether you’re shipping paper clips, pork bellies, or videos of Britney in a bikini, blogs are a phenomenon that you cannot ignore, postpone, or delegate. Given the changes barreling down upon us, blogs are not a business elective. They’re a prerequisite. citation
Here’s a tip elite athletes adhere to: remember that your competition is yourself and those out there who take the time to do one little extra thing, whether it’s one more hand-eye coordination exercise or 55 more stairs to run. It’s that one little extra thing that can separate a winner from a loser.
Ideas circulate as fast as scandal. Potential customers are out there, sniffing around for deals and partners. While you may be putting it off, you can bet that your competitors are exploring ways to harvest new ideas from blogs, sprinkle ads into them, and yes, find out what you and other competitors are up to. citation
Yes, social media will change the way real estate practices are conducted. One way–for the better–is simply by allowing you to engage in more meaningful discussions with clients and potential clients. For a real estate professional, a blog operates as your authority imprimatur. As mainstream media begins to gobble up the blog premise and “commoditize” this presence, you will look out-of-date and “old school” if you don’t similarly innovate your mode(s) of communication.
“Mainstream media companies will master blogs as an advertising tool and take over vast commercial stretches of the blogosphere. Over the next five years, this could well divide winners and losers in media. And in the process, mainstream media will start to look more and more like—you guessed it—blogs.” citation
Two recent research papers shed some light in the cave in terms of mining Web content. Imagine putting your hand in the Yangtze River and trying to catch a sturgeon minnow between two and three inches long. This is akin to conducting a simple keyword search and then singularly perusing each result to discern relevancy (one’s mind conducting semantic correlations to net down relevant results). The challenge is to derive a tool that drives the “semantic sifting” process higher up in the process, thereby making it more efficient to find relevant results.
Jean-Pierre Norguet et al. discuss semantic analysis of website usage and how to apply this analysis to ongoing website development. Norguet’s approach combined web server log files, site content records, content calls by browsers, and TCP/IP packets. The Norguet team then ran these through an ontology-based OLAP tool, deriving a visual representation of interest values pertaining to certain categories of content. This visual representation demonstrated that, regardless of a category’s breadth of presence across a website, interest value indicators provide valuable insight into consumer use patterns. Norguet argues that visually displaying interest values allows for intuitive decision-making, which aligns more accurately with mapping and responding to consumer interests.
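Norguet’s ontology-based OLAP tooling is far richer than this, but the basic shape of an “interest value” can be sketched as each content category’s share of total traffic. The log entries and category names below are invented for illustration:

```python
from collections import Counter

# Hypothetical log entries: (url, category) pairs that a site
# ontology might assign to each requested page.
page_views = [
    ("/news/a", "news"), ("/news/b", "news"), ("/news/c", "news"),
    ("/sport/a", "sport"),
    ("/weather/a", "weather"), ("/weather/b", "weather"),
]

def interest_values(views):
    """Share of total traffic per category: a crude 'interest value'.
    A category occupying many pages but drawing little traffic scores
    low, which is the kind of mismatch Norguet's visuals surface."""
    counts = Counter(cat for _, cat in views)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}

print(interest_values(page_views))
```

Plot these shares against each category’s footprint on the site and the gap between “how much space we give a topic” and “how much consumers actually care” becomes obvious at a glance.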
Michelle L. Gregory et al. explored a framework that allows users to map blog entries, query result sets, understand themes, and see how blog content changes over time. Gregory modified a tool called IN-SPIRE–which uses semantic indexing, among other things, to categorize result sets–to analyze 7,000 blog entries chosen at random. In addition to the powerful filtering and querying aspects explored, Gregory demonstrated how one can use this tool to build multi-lingual analyses using one’s native language. The team also delved into the realm of affect analysis, producing a powerful visual representation of positive versus negative feelings about a particular blog topic (taking the pulse of a slice of the blogosphere on that topic).
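IN-SPIRE’s affect analysis is far more sophisticated than this, but the core scoring idea can be sketched with a toy word lexicon; the word lists and blog entries below are made up for illustration:

```python
# Toy lexicon-based affect analysis: score each blog entry by counting
# positive vs. negative words. Real affect analysis handles negation,
# intensity, and context; this only shows the scoring skeleton.
POSITIVE = {"great", "love", "excellent", "good", "powerful"}
NEGATIVE = {"bad", "hate", "terrible", "poor", "broken"}

def affect_score(entry):
    """Positive-minus-negative word count: >0 leans positive,
    <0 leans negative, 0 is neutral or mixed."""
    words = [w.strip(".,!?").lower() for w in entry.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos - neg

entries = [
    "I love this product, it is excellent.",
    "Terrible release, the update is broken.",
]
scores = [affect_score(e) for e in entries]
```

Aggregate such scores across thousands of entries on one topic and you get exactly the kind of blogosphere “pulse” visualization Gregory describes.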
Some immediate applications of these types of analyses–in one’s native language or across a multi-lingual website–are in improving web product development, mapping political sentiments, or gauging sentiment pertaining to one’s own or a competitor’s product.
Jeremiah Owyang explains the concepts and value of social networks from a marketing perspective in an easily digestible manner. Yang et al (registration required), Battiston et al, and Hill et al discuss the scientific underpinnings of these topics. Juxtaposing these discussions against one another leads to some interesting insights with respect to social media marketing.
Yang notes that in 1967, Stanley Milgram demonstrated that mutual acquaintances drive social network strength. As Yang elaborates:
“[T]he probability that two of someone’s friends know one another is much greater than the probability that two people chosen randomly from the population know one another.”
Yang illustrates this theory by pointing to the success of Hotmail, which grew from 0 to 12 million users in 18 months.
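Milgram’s observation corresponds to what network scientists now call the local clustering coefficient: the fraction of a person’s friend pairs who are also friends with each other. A toy computation on an invented friendship graph:

```python
from itertools import combinations

# Hypothetical friendship graph as an adjacency map (all ties mutual).
friends = {
    "ann": {"bob", "carol", "dave"},
    "bob": {"ann", "carol"},
    "carol": {"ann", "bob"},
    "dave": {"ann"},
}

def clustering(node):
    """Fraction of a node's friend pairs who are also friends with each
    other -- the local clustering coefficient. High values are exactly
    the 'my friends know one another' effect Yang describes."""
    nbrs = friends[node]
    pairs = list(combinations(nbrs, 2))
    if not pairs:
        return 0.0
    linked = sum(1 for u, v in pairs if v in friends[u])
    return linked / len(pairs)

print(clustering("ann"))
```

A network where these values run high is tightly clustered, which is what lets a product like Hotmail spread through mutual-acquaintance chains so quickly.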
Battiston explores how “trust” factors between actors in a social network affect the dynamics of recommendations in that social network.
“Trust plays a crucial role in the functioning of such socio-economic networks, not only by supporting the security of contracts [sic?] between agents, but also because agents rely on the expertise of other trusted agents in their decision-making.”
What Battiston drives towards is that trust-based modes of recommendation have an inverse relationship to traditional modes of recommendation, which are primarily based on the volume of recommendations as opposed to the value of recommendations. Battiston argues that trust-based (or value-based) recommendations are inherently better at promoting more satisfying results to actors within a social network.
This, in turn, promotes the propagation of sub-group cultures within the social network. And as non-trustworthy agents drop out of the network (because prior recommendations did not fulfill specific trust elements as dictated by the requesting actor), the sub-group refines itself over time. As more sub-groups form within a social network, “network neighbors” emerge among members of these sub-groups, operating as conduits between the different sub-groups.
Yang demonstrates that sub-groups outperform all others in terms of marketing results (measured via traditional transaction response-rate metrics).
Accordingly, marketers must seek out sub-group network neighbors. These individuals are the brand influencers and advocates within a social network. Jeremiah Owyang has an excellent post on the visual display of this information. Leverage Software has developed a product that can likely display these sub-group cross-over individuals visually, making the selection of influencers and advocates easier. Perhaps these individuals would make great focus group candidates, “real time” collaborators in product development initiatives, etc.?