
The Ontological Challenge

The semantic web reaches dilemmas of postmodern proportions.
Ken Fromm: (Mr. Fromm is an independent consultant to web application and semantic-based startup companies.)  A couple of the challenges out there—I know that Scott is involved with some FCC reform issues and some efforts of the SEC in content and digital rights. You want to talk a bit about content regulation?

Scott Rafer: (Mr. Rafer is CEO and president of Feedster.) There's content overlap with user overlap. The first time we'll touch FOAF in a meaningful way is when you take everyone's RSS reading lists. Because of the stuff I've done in Wi-Fi, some people want to know what I read on wireless, people outside the Valley who aren't very technical.

My first degree of FOAF stringing, filtered by Feedster for my wireless feeds, is an application that is beginning to crop up. And what all this is predicated on is effectively the current definition of fair use, as it's implemented in the United States. The FCC is trying to crush fair use and actually get to the point where they are regulating software innovation. Under rules passed in November, there are actually kinds of software in the broadcast video world that are illegal to open source. Going down the slippery slope of the FCC saying what we can and cannot put under LGPL, for instance, is a real problem.
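As a rough illustration of the idea Rafer describes, the sketch below collects the feeds that a person's first-degree contacts read on a given topic. The names, the `KNOWS` graph, the `READING_LISTS` table, and the URLs are all hypothetical stand-ins; this is not Feedster's implementation or the FOAF vocabulary itself.

```python
# Hypothetical "knows" graph: person -> set of first-degree contacts.
KNOWS = {
    "reader": {"scott", "alice"},
    "scott": {"carol"},
}

# Hypothetical reading lists: person -> list of (feed URL, topic tag).
READING_LISTS = {
    "scott": [
        ("http://example.com/wifi-news", "wireless"),
        ("http://example.com/vc-notes", "finance"),
    ],
    "alice": [("http://example.com/mesh", "wireless")],
    "carol": [("http://example.com/antennas", "wireless")],
}


def first_degree_feeds(person, topic):
    """Return the feeds on `topic` read by `person`'s direct contacts only."""
    feeds = set()
    for contact in KNOWS.get(person, ()):
        for url, tag in READING_LISTS.get(contact, ()):
            if tag == topic:
                feeds.add(url)
    return sorted(feeds)


print(first_degree_feeds("reader", "wireless"))
```

Carol's feed is left out because she is a second-degree contact; widening the walk to more degrees would be a separate design choice.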

Fromm: Nova, do you want to talk about some of the challenges going forward with ontologies?

Nova Spivack: (Mr. Spivack is the CEO and president of Radar Networks.) There are several big missing pieces right now in making the semantic web. Certainly the lack of ontologies is a major issue. There are, I guess Deborah would say, thousands of ontologies. So there maybe isn't a lack; there may be too many from one perspective. When you start looking at these ontologies, what you find is that some of them are overly specialized; maybe they are focused, for example, on particular niches of interest to DARPA, not particularly of great use to consumers unless you live in New York (with the paranoia that we all experience there).

But anyway, there are a lot of ontologies about medicine, and then there are upper level ontologies that try to define different concepts related to abstract, philosophical sets. But if you're an end user, what you really need are ontology sets that help you work with the types of information and relationships that you deal with every day or when you're shopping, for example.

Currently, there is no good human-readable mid-level ontology that covers common-sense concepts. Cycorp has probably the most impressive ontology. The only problem is that it's so big and complex, and has such a steep learning curve before you can actually do anything with it, that it's not really targeted at the needs of normal developers and regular end users. The lack of a good, open ontology that covers common-sense concepts is a big problem. That's something we're working on, too. I think that ultimately there ought to be at least something like that which comes out of the W3C, or is handed to the W3C at some point, to at least provide a basis for describing certain types of entities and relationships that we all have to use in our applications.

Audience question: I was involved in the building of the in-flight medical language system for a company. The head of the company often said, "You got to take a top-down, bottom-up approach." Can you speak to any kind of bottom-up approach that you've used for building ontologies, like noun phrase extraction?

Spivack: Certainly approaches like that were attempted. If you try to build an ontology from the bottom up, you can get basically a lot of clusters of things, but somebody has to then go and figure out what they mean. There are lower-level systems like WordNet, for example, used a lot in the natural language processing community, and essentially they define words and their relationships to other words. That work is being done at a low level.

The next step is to take a corpus of information and somehow try to figure out a way to automate the connections from that information to some ontology. WordNet is relatively easy to do (you can match words), but if you want to use higher-level concepts—for example, if you have an ontology of different types of companies, and then you take SEC documents and try to figure out how to match different documents to the particular types of companies and industries—that is hard. Even if you have a good ontology, it turns out that you need some pretty clever algorithms to do a decent job of clustering things.
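To make the "matching words" end of this spectrum concrete, here is a minimal sketch that assigns a document to a company type by counting term overlap. The `ONTOLOGY` table and its terms are invented for illustration, and plain word matching like this is exactly the easy case Spivack mentions, not the clever clustering he says the harder problem needs.

```python
import re
from collections import Counter

# Hypothetical toy ontology: each company type is described by a few terms.
ONTOLOGY = {
    "Bank": {"deposits", "loans", "interest", "lending"},
    "Airline": {"aircraft", "passengers", "routes", "fuel"},
    "Software": {"licenses", "software", "subscriptions", "developers"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_concepts(text, ontology=ONTOLOGY):
    """Count, per concept, how many tokens in the text match its terms."""
    counts = Counter(tokenize(text))
    return {
        concept: sum(counts[term] for term in terms)
        for concept, terms in ontology.items()
    }


def best_concept(text, ontology=ONTOLOGY):
    """Return the highest-scoring concept, or None if nothing matches."""
    scores = score_concepts(text, ontology)
    concept, score = max(scores.items(), key=lambda item: item[1])
    return concept if score > 0 else None


filing = "Net interest income rose as loans and deposits grew."
print(best_concept(filing))
```

A filing from a company that finances aircraft purchases would score on both "Bank" and "Airline" terms, which hints at why word overlap alone does not settle the higher-level question.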

So associating data with ontologies is a problem. On building ontologies, I come from the top-down school of thought. I've never seen a bottom-up ontology that I liked. There aren't many. Having built many ontologies, I think that the amount of thinking that goes into it is just so intensive that to do it well, I just don't think that, at least without great AI, we'll be able to do it anytime in the next couple of decades.

Question: Deborah had a slide in her presentation that had a very thin red line, and there was a very wide chasm between formal ontologies and informal ontologies. And right now, the bulk of the web, the Google experience that the majority of people are using, is off to the left-hand side of that slide. So there's like two chasms to cross. Getting users across to the kind of information that would be on the second chasm, getting comfortable with what that means and how they can use it and supply it, that seems like a very big challenge.

Spivack: Two things are interesting. One is that the tools are actually encoding the data, so when you talk about describing something and you look at something like Movable Type when it actually creates the information, it's tagging things, it's telling you, it's allowing the user to just input data into a form that's quite easy to use. The other side of that is the information is just described enough to be useful for the application. Sometimes you only need to add a little bit of description to make it really a lot more useful.

If you think about a search engine, it's just another kind of agent. People are already using things like this now. Evolving the search engines and evolving the information that's encoded from the application could find you quite a bit of functionality.

Rafer: The problem the existing search engines face is that their crawlers can't support this, never mind their indexing. So we had to start from scratch on data. We had 49 feeds in March 2003.

The second gap is still up to developers. One of the things we do, which not many people use yet but more and more are starting to, is that every set of search results we provide, we also provide as XML. You get RSS out of our engine as well. So to the extent that you want to take advantage of all our Booleans, everything in our table, you can do that. And you can start creating just little links that provide feeds of information sucked out of our index, filtered however you want, which gets you at least toy-level applications over that second gap, too. People in corporate business intelligence departments are doing competitive research, at least prototyping, this way.
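A toy-level consumer of search results delivered as RSS might look like the sketch below. The sample feed, its titles, and the example.com links are made up, and the feed is parsed from an inline string rather than fetched, since the actual query URLs are not given here.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document standing in for a set of search results.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Search results: wireless</title>
    <item>
      <title>Mesh networks in the field</title>
      <link>http://example.com/mesh</link>
    </item>
    <item>
      <title>Wi-Fi pricing survey</title>
      <link>http://example.com/pricing</link>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


def filter_items(items, keyword):
    """Keep only the items whose title contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [(title, link) for title, link in items if needle in title.lower()]


items = parse_items(SAMPLE_RSS)
for title, link in filter_items(items, "pricing"):
    print(title, link)
```

Filtering on the client like this is only a stand-in; the point in the talk is that the engine's own Boolean queries can do the filtering before the feed is produced.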

This text is excerpted from SDForum's Semantic Technologies Seminar, cohosted by AlwaysOn, TopQuadrant, and Enterprise Architect. Part Three of three in Series Two of four.


Member Comments

A few quick remarks:
(a) A refreshing view on ontologies can be found in a recent interview with Tom Gruber; see http://www.sigsemis.org (SIGSEMIS bulletin, vol 1 (3), page 4).
(b) Here is a sample list of some openly available, not-so-small ontologies of varying quality, developed for a variety of purposes:
- TAP, which covers domains of possible interest to end users/consumers: http://tap.stanford.edu/tap/download.html
- SWETO, which is more useful for testing scalability and performance of SW tools (populated with a million+ entity and relationship instances of facts): http://lsdis.cs.uga.edu/Projects/SemDis/sweto/ [accessible under Creative Commons terms]
- GlycO, which shows how extensive a domain ontology can be; the first version has 767 classes, is 11 levels deep, and is developed by domain scientists: http://lsdis.cs.uga.edu/Projects/Glycomics/index.php?page=5 (note: a KB with 100s of thousands of assertions will be posted in the near future) [Open source]
(c) Here are examples of some real-world domain/application/task ontologies developed for commercial enterprises/customers (so not available publicly), with an average of 1 million entity instances (and even more relationship instances), and a couple with over 10m each: Financial Market, Terrorism, Pharma, Anti-money laundering, Equity Research, Repertoire Management.
(d) Tens of high-quality ontologies (mostly developed by extensive human/committee efforts, unlike commercial-use ontologies, which were primarily populated using high-quality knowledge sources) can be found in the biology and medical domains, which are at the forefront of using ontologies.
(e) Several well-known Internet businesses seem to be adopting more formal knowledge organization using "ontologies" for products, books, etc. and are considering coupling them with Web Services (i.e., Semantic Web Services).
Additional comments based on experiences from building real-world ontologies are on pages 46-53 of this KMWorld 2004 presentation:
http://www.kmworld.com/kmw04/presentations/Sheth.pdf

Amit Sheth http://lsdis.cs.uga.edu http://www.semagix.com

APS | POSTED: 12.11.04 @18:52

I think the world of Scott Rafer. He's a brilliant guy and if I had serious money to invest, Feedster would be near the top of my list. (Oops ... I only invest in companies in China. ;-)

However, the SDF needs to supplement their semantic web seminar with academic gurus. It's really the academicians who are leading the semantic technologies charge.

I did a quick search and there are over 40 publicly accessible links in my Furl archive, including many academic links. See http://www.furl.net/members/goldentriangle and search with: "semantic web" OR "semantic technologies" . I also have a folder specifically on semantic technologies called, "Semantic Web". It's searchable, too.

My blogs on enterprise software and emerging technologies cover this topic:
http://onenterprisesoftware.blogspot.com
http://onemergingtech.blogspot.com

David Scott Lewis | POSTED: 11.12.04 @23:32

For completeness, the deepest issues are both ontological AND epistemological.

But hey, who's keeping track, anyway?

:-)

WilliamLuciw | POSTED: 11.08.04 @10:31

Question: is the Landmark Forum viable?

3sectormav | POSTED: 11.08.04 @08:31





© AlwaysOn Network, LLC 2002. All rights reserved.