Monrai Blog

News about Cypher, Semantic Web, Natural Language Processing, and Computational Linguistics

Saturday, May 30, 2009

Razorbase vs. Parallax

For those who don't know, David Huynh, best known for his work at the SIMILE Project, released a faceted browser for Freebase (now part of the LOD dataset) earlier this year. Much of my work on facets/set-based browsing is based on his. Here's a comparison (see presentation below) of his Parallax browser with razorbase that may be useful.

There are now two new buttons/actions that came out of earlier observations of how people interact with the razorbase UI: mutual connections and descendants.

Mutual Connections: lets you view, with one click, the mutual connections of a certain type linked to the subject. E.g., if you were viewing Someone's friends, this button will take you to their friends.

Descendants: lets you view the descendants of a connection. If you were viewing People who influenced someone, this button will take you to the people they influenced in turn. Use it to pull friends of a friend, a person's ancestry, etc.
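
To make the two buttons concrete, here is a minimal sketch of one possible reading of them over a toy set of triples. Everything in it is an assumption for illustration: the data, the function names, and the interpretation itself (descendants as "follow the same connection one more step", mutual connections as "everything that shares a connection target with the subject"). It is not razorbase's actual implementation.

```python
# Toy triples: (subject, connection, object). All names are made up.
TRIPLES = {
    ("alice", "friend", "bob"),
    ("bob", "friend", "carol"),
    ("dave", "friend", "bob"),
    ("carol", "friend", "erin"),
}


def connected(items, connection):
    """Objects reached from any item in `items` via `connection`."""
    return {o for (s, p, o) in TRIPLES if p == connection and s in items}


def descendants(items, connection):
    """Assumed reading: follow the same connection a second step
    (e.g. friends of friends)."""
    return connected(connected(items, connection), connection)


def mutual_connections(items, connection):
    """Assumed reading: every subject that shares a connection target
    with `items` (the starting items themselves are included)."""
    targets = connected(items, connection)
    return {s for (s, p, o) in TRIPLES if p == connection and o in targets}


friends_of_friends = descendants({"alice"}, "friend")
shares_a_friend = mutual_connections({"alice"}, "friend")
```

With this toy data, `friends_of_friends` holds carol (alice's friend bob is friends with carol), and `shares_a_friend` holds alice and dave, since both list bob as a friend.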

Initial experiences:
I've found that what folks find most useful about the service is the ability to develop their own strategy for finding the information they need. By refining my criteria through trial-and-error browsing, I was able to find valuable Web resources about an esoteric research topic: "Recommendation Engines/technology". The results were several orders of magnitude more precise than what Google and Wikipedia yielded.

I began by looking for things named Recommendation Engines, which returned all things with that phrase in their name. I then drilled into the Documents and Articles category. From there, I examined each article in the list and manually collected a list of companies that interested me.

After pulling a list of companies that develop said technology, I was able to go to the websites that published stories about those companies (e.g. by looking up Companies named ...). Using the mutual connections button, I found the other stories published by those sites (because in most cases, the sites containing the stories dealt specifically with my topic). From there, I figured that those stories probably contained links to stories/companies/web services related to my topic, so using the Information Explorer, I pulled all links referenced by those documents, and got a great list that yielded more companies in that space. Then I figured the links in those documents might also be related. The descendants button allowed me to pull the links two degrees out from the original list, yielding a list that was less relevant but did contain a few precious nuggets.

Given the poverty of data on my esoteric topic, the ability to locate those few nuggets by 1) defining criteria based on Category and other information about my topic, and 2) drilling through and cutting the results, delivered value that I truly cannot find anywhere else on the Web. This was my first real-world experience with the value proposition of the linked data web. The resulting presentation was one I would not have been able to compile otherwise within the time constraints I was given.


Tuesday, May 19, 2009

How to use Razorbase

Razorbase is a browser for discovering and exploring connections between things (people, places, movies, shoes, food, etc). It does this by querying not the World Wide Web (a global network of websites), but the burgeoning Linked Data Web (a global network of databases). Try it for yourself and see whether you can discover the difference between the two.

This is a tutorial for the service. My goal is that the browser be so intuitive, that you could beam a caveman right in front of it, and he could figure out what it's for and how to use it without being told. Well, so much for that :) (Update: Slides are now available here)

Homepage: There are two controls of interest there: the named link and the query text field. Clicking the named link changes the type of query to perform. The options are:
  • things named... (e.g. things with "Bill Clinton" in its name or title or label)
  • things connected to... (e.g. things connected to fencing)
  • things known by the URI http://... (i.e. things known by the URI
Filters: Razorbase allows you to define complex filters to restrict the items in your results (see Navigation and Zooming below). Click the Your query link to view all filters. The Your query section contains a breadcrumb trail so that you never get lost while browsing. Click any node that appears there to go to that subject. Access all nodes by summoning the filters (click 'Your query'). The last breadcrumb in the trail is what you are currently viewing; it's called the subject.

Group results: You can view the categories for the results by clicking the Category Explorer icon (magnifying glass). If you want to see all results ungrouped, click the Back to Results icon (blue left arrow).

All info about something: If you want to know what information is available about the items in the results, click the Information Explorer icon (blue circle with exclamation mark). An other info link appears on the blue bar; click it if you want to see further types of information, then click the main info link to go back to the main information about the results.

Navigation: You navigate through the dataspace by clicking the blue right arrows. Clicking one takes you into whatever it's marking (so Friends >> takes you to the friends of the subject).

Zooming: Zoom in and out of categories by clicking the magnifying glasses next to category names under the Category Explorer icon.

Add filter: The blue plus sign creates a filter on your search by binding the item as the value of a connection (e.g. all people whose email is ...). Be aware that Navigating and Zooming also add filters (but in the case of Navigation the values are unbound). So basically, the blue plus sign does what the blue right arrow does, but instead of being taken to that item, you're taken back to the subject you just left.
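
The difference between a bound and an unbound filter can be sketched as SPARQL-style triple patterns. This is only an illustration under stated assumptions: the helper names, the variable names, the `foaf:` properties, and the example.org address are all made up here, prefix declarations are omitted, and the queries razorbase really sends are not shown in this post.

```python
def triple_pattern(subject, connection, value=None, var="?next"):
    """Return one SPARQL-style triple pattern as a string.

    value given  -> bound filter (the plus-sign case described above)
    value absent -> unbound filter (the navigation case): the object
                    stays a variable
    """
    obj = value if value is not None else var
    return f"{subject} {connection} {obj} ."


def build_query(patterns, select="?s"):
    """Join patterns into a query body (prefix declarations omitted)."""
    body = "\n  ".join(patterns)
    return f"SELECT DISTINCT {select} WHERE {{\n  {body}\n}}"


# Hypothetical example values, not taken from razorbase.
bound = triple_pattern("?s", "foaf:mbox", "<mailto:someone@example.org>")
unbound = triple_pattern("?s", "foaf:knows", var="?friend")
print(build_query([bound, unbound]))
```

In this sketch, adding a filter with the plus sign appends a pattern whose object is a fixed value, while navigating appends a pattern whose object is a fresh variable that becomes the new subject of the view.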

No text please: Don't want to have any text search in your query? For example, you're viewing Presidents connected to Marilyn Monroe, and you want to drop the criterion that they be connected to her, resulting in all Presidents. To do this, click the Your query link to summon the filters, then click the blue minus sign next to the text field. This sets your text to anything. Click the blue plus sign to add some text again.

Few or no results? Razorbase's novelty is enabled by the fact that OpenLink has figured out how to give SPARQL query access to a quad store of over 4.5 billion triples. In layman's terms, this means we have to ration how much time the server can spend on your query. The default is 2 seconds (i.e. the server returns all the results it can find in 2 seconds or less), so increasing the time can potentially increase the number of results you get. To increase the time, click the clock button when it appears. Each click increases the time by 2 seconds (up to 12 seconds for now).
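
The time rationing described above can be sketched in a few lines. The 2-second default, 2-second step, and 12-second cap come from this post; the function names and the idea of cutting off an iterator of results at a deadline are assumptions for illustration, not how the server is actually implemented.

```python
import time

DEFAULT_SECONDS = 2   # default budget, per the post
STEP_SECONDS = 2      # added per click of the clock button
MAX_SECONDS = 12      # current cap


def next_budget(current):
    """Budget after one more click of the clock button."""
    return min(current + STEP_SECONDS, MAX_SECONDS)


def collect_within(results, budget_seconds, clock=time.monotonic):
    """Hypothetical helper: keep taking results until the budget is spent.

    `results` is any iterable; `clock` is injectable so the cutoff can be
    exercised without real waiting.
    """
    deadline = clock() + budget_seconds
    found = []
    for item in results:
        if clock() >= deadline:
            break  # out of time: return the partial result set
        found.append(item)
    return found


budget = DEFAULT_SECONDS
for _ in range(6):          # six clicks
    budget = next_budget(budget)
# budget is now capped at MAX_SECONDS
```

The point of the sketch is that a short budget returns a partial result set rather than failing, which is why a larger budget can surface more results for the same query.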

Side notes: The power of this UI approach is two-fold. The first part is faceted browsing, which allows you to navigate large data sets by filtering the data as you go. The second is set-based browsing, which allows you to see information about multiple results simultaneously.
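
As a rough illustration of those two ideas, here is a small sketch over a made-up list of results. The item data, property names, and function names are all hypothetical; this is just the general technique, not razorbase's code.

```python
from collections import Counter

# Made-up result items for illustration only.
ITEMS = [
    {"name": "A", "type": "Company", "country": "US"},
    {"name": "B", "type": "Company", "country": "UK"},
    {"name": "C", "type": "Article", "country": "US"},
]


def facet_counts(items, prop):
    """Faceted browsing: how many results fall under each value of `prop`."""
    return Counter(item[prop] for item in items if prop in item)


def apply_filter(items, prop, value):
    """Narrow the result set to items whose `prop` equals `value`."""
    return [item for item in items if item.get(prop) == value]


def set_view(items, prop):
    """Set-based browsing: one property shown for every result at once."""
    return {item["name"]: item.get(prop) for item in items}


companies = apply_filter(ITEMS, "type", "Company")
countries = set_view(companies, "country")
```

Here `facet_counts(ITEMS, "type")` gives the categories you could zoom into, `apply_filter` is the zoom itself, and `set_view` shows the country of every remaining company side by side instead of one item at a time.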

Hope that helps :) Next, some strategies I've found in my interactions with razorbase.


Monday, May 18, 2009

Discover Connections between People, Places and other things

I present a new linked data browser called razorbase, for discovering and browsing connections between things. In the next few days and weeks, I'll be blogging guides that introduce some of the features of the browser, as well as some helpful hints I've discovered while interacting with it...

In the meantime, next time someone asks "Where can I see the Semantic Web?", you can finally reference something touchable :)


Wednesday, May 13, 2009

Welcome to the Game of the Decade!

A second provider of traditional search (see first) has now entered the Semantic Web, or at least has dipped a toe into the pool. This is encouraging news for the Semweb: it validates the merits of structured data. However, traditional search engines have historically been apprehensive about structured data, and can you blame them? After all, they're in an industry built on a major deficiency of the WWW.

Most data you see on a web page comes from a structured database. As an entry in a database, there is no mistake about the info connected to a Review, e.g. who wrote the review, what the subject of the review is, or whether the review is positive or negative. But back at the genesis of the WWW, document retrieval and sharing via HTTP was the only thing that had been... worked out. And the WWW grew so big so fast that there was no time to do it right. People wrote programs that queried the databases and exported the results into documents written in a standard created by Sir Tim Berners-Lee, so that your computer knew how to display them on the screen (e.g. make the title large, place these reviews in a table with author and score in each row). But this step destroyed the structure expressed by the rows and columns of the database. The meaning of the data in the web page had been lost. So search engines set out to divine structure from the web page post-mortem, and some approaches worked out better than others.

So, the WWW is broken, and as long as it's broken, traditional search engines have a place in the market. If the web had been connected at the level of the database to begin with (instead of at the document/web page level), then web page indexing methods would have absolutely no value today. If the WWW is ever fixed, then traditional search engines may have a non-trivial problem to face. When the W3C announced a new recommendation to create a World Wide Web of Databases, it gave the world a push in the direction of fixing the WWW's fundamental problem. As we inch closer to that reality, the emphasis on search will diminish and ultimately be replaced by the notion of lookup, which is a new game, and which brings fresh opportunities for both consumers and entrepreneurs.
