The Web 3.0's Pulse : Semantic Web Trends

Currently Hot: Facebook OpenGraph Protocol

Showing posts with label Semantic Search Engine. Show all posts

Friday, August 6, 2010

Facebook Questions - Another Search Frontline

Facebook is testing its new Questions feature - a search engine that finds relevant people to answer other people's questions. Interestingly, the word "relevant" plays a great role here - Facebook must be pulling data and drawing conclusions from it. "Where from?" one may ask. Well, the countless "like"s people click provide that data, which is now semantically annotated. Behind the scenes, though, this application concerns several rivals, including Google and Twitter.

The People Search Engine

There are attempts to revolutionize the way people search today. In a world dominated by authoritative, fast and amazingly complex algorithms for textual search, it seems there is not much to be improved. And for now, people are happy - they just type in what they are interested in, and Google (by saying Google, I also count Yahoo!, Bing and others in) goes: BANG! - it appears in the first 4 results displayed. Today's search engines also subtly show their power by answering well even if the user misspells a word, and by categorizing results into Posts, Tweets, Images etc. Google nowadays is even faster in updating its search index, reaching Real Time Web Informer status.


However, there are situations in which textual searching can't help much. Queries like: "Hey Google, what are the songs that have reached No. 1 in the UK chart in the last 10 years?" or ... "Bing buddy, I am looking for a movie tonight. Basically I want comedies, like 'American Pie' or 'Dumb and Dumber', do you happen to have some recommendations for me?" etc. (There are quite a few examples of such situations, so I will not go any further.) Well, some of these queries might have to wait for the Semantic Search Engine to be built, but the interesting thing is that there is a new trend: to build Social (People) Engines, which will not scan text, but people and their habits. These People Engines will try to discover the right person to answer questions that only a human can interpret, like those above. But to achieve that, these search engines need additional metadata about each person: her skills, interests, friends etc. Having that in mind, one can easily conclude that building such a network is a challenging and expensive task - unless... unless it builds itself, as in the example of Facebook. Facebook is lucky to have such a well-connected network with correctly filled-in information such as people's names, ages, photos, interests, friends etc. In my opinion, Facebook is now preparing to take advantage of the metadata it has: to build an internal network for answering questions - people will answer each other's questions on any topic - and Facebook will only be the platform that finds the "relevant" people to answer them.


Facebook Questions Application


This application is exactly that: a means for people to use "Social Search" instead of "Textual Search". A smart move, I say, because engaging real people in answering topic-specific questions (for free!) is currently the best way to get around the technological gap that prevents us from building software agents to answer the questions for us. Whoever came up with the idea of building this application must have studied people's behavior and concluded that people will react to such questions if they feel connected in some way to the topic of the question - be it through their profession or simply a good or bad experience with some product. We will still need to wait to see how this invention of Facebook's will impact the Tweetosphere and ordinary search.

Google's Social Search Engine Efforts


Surprise, surprise, but Facebook did not invent this whole people-answering-questions thing. Google has been spending time in this field for quite a while, resulting in an experimental application in Google Labs, which returns 20% of the results for your Google queries as answers based on your social graph within Google, and in the acquisition of Aardvark (the closest relative of the Facebook Questions app). Aardvark gathers the interests each person types in, parses the question text, matches entities in the question with people's interests and finds people who might be able to answer it. I have been looking into this application for a while and it has indeed proven itself to be very useful: most of the time it did find a person able to answer my question and ... what I really like about it is that it integrates with the IMs: be it MSN Messenger, GTalk or Skype ... Pretty cool.
Therefore, I think that these kinds of engines do have a bright future, no matter which vendor creates them.
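
To make the matching step a bit more concrete, here is a minimal Python sketch of the idea described above: split the question into terms, intersect them with each person's interests, and return the people with the biggest overlap. This is only my illustration, not Aardvark's actual algorithm; the `PROFILES` data and the function names `extract_terms` and `find_answerers` are made up for the example.

```python
import re

# Hypothetical profiles: each person lists the interests they typed in.
PROFILES = {
    "alice": {"movies", "comedy", "cooking"},
    "bob": {"football", "cars"},
    "carol": {"movies", "music", "travel"},
}


def extract_terms(question):
    """Lowercase the question and split it into a set of word tokens."""
    return set(re.findall(r"[a-z]+", question.lower()))


def find_answerers(question, profiles):
    """Return (person, overlap) pairs with at least one shared term, best first."""
    terms = extract_terms(question)
    scored = [(name, len(terms & interests)) for name, interests in profiles.items()]
    matches = [(name, score) for name, score in scored if score > 0]
    # Highest overlap first; ties are broken alphabetically by name.
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


print(find_answerers("Any good comedy movies for tonight?", PROFILES))
```

A real engine would need proper entity extraction and some notion of who is online and willing to answer, but the core of "match entities with interests" can be as simple as a set intersection.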

Readers, what do you think? Would you use some of these services as your secondary search engines? Do you think they will one day integrate with the textual search engines?

Tuesday, July 6, 2010

Google, meet Facebook's OpenGraph Search

With the introduction of the OpenGraph Protocol, Facebook introduces a new concept of searching throughout the Web. Facebook's search works based on what they call "connections" between resources from their OG ontology. Their algorithms are capable of discovering items related to the input query, based on the individual's social neighborhood or perhaps on how often the (now famous) "Like" button gets hit. Sounds like a bundle of possibilities, doesn't it?

FaceRank, the Social Relevance Algorithm

Of course, this is not something Facebook officially announced (it would sound corny, don't you think?), but the point is that, after 10 long years, there is finally a serious candidate to best PageRank, or at least to complement it. But Google works fine, the whole world searches, people are happy! Why would anyone use Facebook's new lab gadget instead of a tested, proven, mature, lightning-fast and precise tool? Well, because there are queries that Google Search simply cannot satisfy! Moreover, its results are based on statistical methods; no people are involved there. What makes Facebook different is the capability to deliver real-time results, fresh and relevant, without deploying complex calculations. If some event is popular, people will rapidly talk about it. As with Google, it will be up to the webmasters to annotate their web pages with the metadata, but the key differentiating factor here is that Facebook has feedback from the users. It can use the number of "Like" hits to give weight to the popularity of a particular web page. What if someone puts in false metadata? (One of the biggest problems in the Semantic Web, too.) In this case, the answer is simple: people will not like it; they will simply ignore it if it is misleading, hence it will be less popular and will rank lower. Another advantage of this approach is that the metadata now contains the context of the resource, opening the gates for the conventional Semantic Web Dream. Facebook is now able to interpret the user's query - is she searching for related books, movies, sport teams, people... you name it, it finds it... in real time. As written in Times: Google, This Time, Its Personal.
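
Here is a rough Python sketch of that idea: read the `og:` meta tags a webmaster put on a page, keep only the pages whose `og:type` matches what the user is looking for, and order them by the number of likes. The `<meta property="og:...">` markup is how the OpenGraph Protocol annotates pages; everything else (the `PAGES` sample data, the like counts, the `rank_by_likes` function) is my own assumption for illustration and has nothing to do with Facebook's real ranking.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        if prop.startswith("og:"):
            self.properties[prop] = attr_map.get("content") or ""


def parse_open_graph(html):
    """Return a dict of the og: properties found in the given HTML."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


def rank_by_likes(pages, wanted_type):
    """pages is a list of (html, like_count); keep matching og:type, most liked first."""
    results = []
    for html, likes in pages:
        props = parse_open_graph(html)
        if props.get("og:type") == wanted_type:
            results.append((props.get("og:title", ""), likes))
    return sorted(results, key=lambda item: -item[1])


# Made-up sample pages with made-up like counts.
PAGES = [
    ('<meta property="og:type" content="movie">'
     '<meta property="og:title" content="Dumb and Dumber">', 120),
    ('<meta property="og:type" content="movie">'
     '<meta property="og:title" content="Mislabeled Page">', 3),
    ('<meta property="og:type" content="book">'
     '<meta property="og:title" content="Some Book">', 500),
]

print(rank_by_likes(PAGES, "movie"))
```

Notice that the page with misleading metadata is not filtered out by any clever check; it just sinks to the bottom because nobody liked it.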

Hey Mark, Recommend Me a Movie, Please

When someone said: "Yeah, the idea of the Semantic Web is great, but if it is so wonderful, how come there are no applications that massively leverage it? You say the technology has been available for a while.", they usually had a point, but I think not anymore. With Facebook's ultimate way of Social Bookmarking, it becomes easy to calculate what users could want, on an individual level! How, you may ask?
Here is what I am getting at. (This may be a real idea for a semantic application, too.) Suppose you want to watch a movie, but you are not really sure what you want to watch... Naturally, you would ask your friends, or you would search the Internet a bit to see where the movie hype cloud is at the moment... (did you notice I said "at the moment"? Hang on.) Now imagine a widget that simply communicates with Facebook via the OpenGraph API to check what movies you like. The widget also supposes that since you like those movies, you have probably watched them, so it makes no sense to suggest them to you again. But how difficult is it to write a query that says:

"Give me the most popular movies that are related to the comedies I like". We define "related to" as a simple rule: "A movie is related to another if X people that watched the first movie also watched the second. The movie gains ranking in relatedness if at least Y of those people are my friends. The movie gains ranking if there are at least Z pages with more than 50 likes on the Web".

Hmmm, not so difficult to write in a query language. For now some of these aspects are not covered in the OpenGraph ontology (I am referring to the Movie Genre), but undoubtedly, they could easily be added. On the other side, for the application user, it is as simple as logging in to Facebook and pressing the "Recommend" button. Welcome to the Semantic reality, Neo. Btw, how do you write "My favorite movies" in Google? :)
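
Just to show that the "related to" rule is not rocket science, here is a small sketch of it in plain Python instead of a query language. All the data is invented, and the names (`recommend`, `watched`, `page_likes`, the thresholds `x`, `y`, `z`) are my own; a real widget would have to pull this information through the Facebook API, which I am not attempting here.

```python
def recommend(my_likes, friends, watched, page_likes, x=2, y=1, z=1):
    """Score movies related to the ones I like, following the X/Y/Z rule above.

    watched maps a person to the set of movies they watched;
    page_likes maps a movie to the like counts of web pages about it.
    """
    # Movies I already like are assumed watched, so they are never suggested.
    candidates = set().union(*watched.values()) - my_likes
    scores = {}
    for candidate in candidates:
        score = 0
        for liked in my_likes:
            co_watchers = {
                person for person, movies in watched.items()
                if liked in movies and candidate in movies
            }
            if len(co_watchers) < x:
                continue  # fewer than X shared viewers: not related
            score += 1
            if len(co_watchers & friends) >= y:
                score += 1  # at least Y of those viewers are my friends
        if score == 0:
            continue
        popular_pages = sum(1 for likes in page_likes.get(candidate, []) if likes > 50)
        if popular_pages >= z:
            score += 1  # at least Z pages with more than 50 likes
        scores[candidate] = score
    # Best score first; ties are broken alphabetically by title.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Invented sample data.
watched = {
    "ann": {"American Pie", "Road Trip"},
    "ben": {"American Pie", "Road Trip", "Superbad"},
    "cat": {"American Pie", "Superbad"},
    "dan": {"Dumb and Dumber", "Titanic"},
}
my_likes = {"American Pie", "Dumb and Dumber"}
friends = {"ann"}
page_likes = {"Road Trip": [80, 10], "Superbad": [20]}

print(recommend(my_likes, friends, watched, page_likes))
```

With this sample data "Road Trip" comes first: two people who watched "American Pie" also watched it, one of them is a friend, and there is a page about it with more than 50 likes.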

But apart from the interesting search ideas OpenGraph brings, my deepest belief is that Facebook's number one reason to introduce this protocol has e-Marketing roots, i.e. to deliberately interfere with Google's primary business model - with a personalized, perfect ad targeting tool.

What do you think? Will this Facebook API bring new methods of warfare between the web titans? Will it provide better searching for end-users? Will data, ultimately, find us? How will Google eventually respond? Is this the final gate that needed to be opened for semantic applications to be written en masse?

Sunday, October 11, 2009

Meet Sindice - the Semantic Web Index

Sindice is a Semantic Web index with a search engine. (Here is the link: sindice.com). Interestingly, one of its authors is Nova Spivack from Radar Networks, the same guy who runs twine.com. Is this the semantic search engine Twine talks about? Sindice claims to offer semantic search for terms and properties or triples, but personally, I can hardly notice the benefit of using it - it is only capable of indexing things, it is not capable of extracting a term's relations with other terms - something the Semantic Web is about. Why would I need a semantic index or a search through that index? Probably the best application of these services would be building semantic piped tasks. The Sigma search engine aggregates definitions for terms from different sources, and what I really like about it is the ability for users to control the sources of the definitions. Sigma can export the search results in various formats, such as RDF or JSON, but I am really having a hard time seeing the benefit of it without getting the relations between the terms.


In general, Sindice has some fancy marketing, but what it really has under the hood remains to be seen. Personally, I don't think there is much to brag about.

Saturday, September 19, 2009

The Semantic Search Engine : Dream 3.0 ?

Today I read about the latest attempt to fulfill the famous Web Dream: the Semantic Search Engine. Wouldn't it be nice to have such a wonderful tool that can actually understand you? You can ask it about anything, it is the Global Mind, it crunches data and comprehends the whole Web, the largest knowledge management application humanity has ever built. And the best thing is, it learns and gets smarter every day... by itself.
Sounds like a quote from a Science Fiction book, but is it that far off? It's been about 10 years since the publication of the famous paper in Scientific American by Tim Berners-Lee, and yet no (r)evolution has occurred. There is no single killer application fueled by Semantic Technologies. But why?
The whole computer industry has been around for roughly 60 years and the Internet era began in the 1990s, so a period of 10 years is a huge amount of time for the Web. We have the standards, we have the tools, we have the frameworks, the knowledge ...
I have read several articles and it seems there is a logical explanation for this phenomenon: it's the humans that are wrong... (again). The problem is not in making machines understand what we mean (personally I think that sounds like the most exciting part when telling someone what the Semantic Web is all about: computers will understand? Really? Like in the movies? Will I be able to ask them via voice control?). The trouble is that people are lazy. The WWW is the biggest and the fastest growing entity on the whole planet. It is enormous. Annotating all that data across the web takes extra effort from people, and it is tedious and time-consuming (Hey, didn't we invent computers because of that?). So it's the people that don't understand, not the machines. Machines are ready to learn. Another issue comes from the fact that humans are spoiled and selfish - people lie. Yes, they do. There is no authority to force webmasters to embed true information about their web pages. (Remember the keyword stuffing problem?) How will someone even make them want to start annotating the pages? I believe that here lie most of the reasons for the stagnation of the Semantic Web and its applications.

There are efforts to automate the process through Natural Language Processing (NLP), but I wonder if it will ever reach the desired level of automation. Here is a good article about what Oracle does: Oracle & OpenCalais - Semantic Database. This thing really makes me happy because of the burst of hope that the Semantic Web is not an e-Myth.
Back to the search engines. The team at Twine.com has been busy trying to achieve the unimaginable: to produce a true Semantic Search Engine. Here is the original post I found: T2 - Twine's Semantic Search Engine. If this comes true, all the hype will disappear in the mist. I understand why people are sceptical, but personally, guys, you don't know what might happen. Maybe it is possible. The requirements are high - a volatile system, evolving every second, reasoning and comprehending, accurate, fast, robust ... but there is still a chance. The Semantic Engine is one of the most desired applications of the Semantic Web. It will be a major breakthrough - although many find it tough to believe. Will openness and sharing prevail in the end?