ConferenceCall 2011 12 01

= Ontolog Invited Speaker Presentation - Dr. Ramanathan V. Guha - Thu 2011.12.01 =


 * Session Chair: Dr. SteveRay (CMU)


 * Invited Speakers: Dr. R V Guha (Google, schema.org)


 * Presentation Title: "schema.org"


 * Archive:
 * [ Agenda & Proceedings ]
 * [ Abstract ]
 * there will be no slides for this talk
 * [ audio recording of the session ] [ 1:21:31 ; mp3 ; 9.33 MB  ]
 * [ Transcript of the online chat session ] during the panel discussion

Agenda & Proceedings:

 * Session Format and Agenda:
 * this will be a virtual session over a phone conference setting, augmented by in-session chat and shared computer screen support
 * 1) Introduction of the invited speakers - session chair: SteveRay
 * 2) Presentation by our invited speakers - RamanathanGuha (30~45 min.)
 * 3) Q&A and Open discussion (30~45 min.) [Kindly identify yourself before speaking.]


 * Presentation Title: "Schema.org"


 * http://ontolog.cim3.net/file/resource/presentation/Schema.org--RVGuha_20111201/RVGuha.jpg [ Dr. Ramanathan V. Guha ]


 * Abstract:


 * Schema.org provides a collection of schemas, i.e., HTML tags, that webmasters can use to mark up their pages in ways recognized by major search providers. Search engines including Bing, Google, Yahoo! and Yandex rely on this markup to improve the display of search results, making it easier for people to find the right web pages. Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results in order to make it easier for users to find relevant information on the web. Markup can also enable new tools and applications that make use of the structure. A shared markup vocabulary makes it easier for webmasters to decide on a markup schema and get the maximum benefit for their efforts. So, in the spirit of sitemaps.org, search engines have come together to provide a shared collection of schemas that webmasters can use.


 * This session will be structured as a Q&A session where Google Fellow Ramanathan Guha will provide a brief introduction to the Schema.org activity and then answer your questions regarding the relation between this work and the broader ontology world.
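
As an illustration of the abstract's point that on-page markup lets a program recover structured data from HTML, here is a minimal sketch in Python. It is not part of the talk. The page fragment and the film details are invented, and the parser class and function names (`FlatMicrodataParser`, `extract_item`) are hypothetical. It assumes the HTML microdata attributes (`itemscope`, `itemtype`, `itemprop`) and handles only a single, non-nested item.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the film details are made up for illustration.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">An Example Film</h1>
  <span itemprop="genre">Documentary</span>
  <span itemprop="datePublished">2011-12-01</span>
</div>
"""


class FlatMicrodataParser(HTMLParser):
    """Collects itemprop text values from a single, non-nested item."""

    def __init__(self):
        super().__init__()
        self.item_type = None      # value of the itemtype attribute, if any
        self.properties = {}       # itemprop name -> text content
        self._current_prop = None  # property whose text is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        if "itemprop" in attrs:
            self._current_prop = attrs["itemprop"]
            self.properties[self._current_prop] = ""

    def handle_data(self, data):
        # Only text inside an element carrying itemprop is recorded.
        if self._current_prop is not None:
            self.properties[self._current_prop] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current_prop = None


def extract_item(html):
    """Return (item type, {property: text}) for a flat microdata item."""
    parser = FlatMicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.item_type, {k: v.strip() for k, v in parser.properties.items()}


item_type, props = extract_item(SAMPLE_HTML)
print(item_type)
print(props)
```

Without the `itemprop` attributes, the same fragment would give a program only unlabeled text. A real consumer would also need to handle nested items and values carried in attributes such as `content` or `href`, which this sketch leaves out.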


 * About the Speakers:


 * Speaker Bio (with credit to Wikipedia) Ramanathan V. Guha (1965) is an Indian computer scientist. He graduated with B.Tech (Mechanical Engineering) from Indian Institute of Technology Madras, MS (Mechanical engineering) from University of California Berkeley and Ph.D (Computer science) from Stanford University. Since May 2005, he has been working at Google.


 * Guha was one of the early co-leaders of the Cyc Project where he worked from 1987 through 1994 at Microelectronics and Computer Technology Corporation. He was responsible for the design and implementation of key parts of the Cyc system, including the CycL knowledge representation language, the upper ontological layers of the Cyc Knowledge Base and some parts of the original Cyc Natural Language understanding system. Leaving what became Cycorp, Guha founded Q Technology, which created a database schema mapping tool called Babelfish. In 1994, he moved to work at Apple Computer, reporting to Alan Kay, where he developed the Meta Content Framework (MCF) format. In 1997 he joined Netscape Corporation where together with Tim Bray, he created a new version of MCF that used the XML language and which became the main technical precursor to W3C's Resource Description Framework (RDF) standard. Guha also contributed to the "smart browsing" features of Netscape 4.5 and was instrumental in Netscape's acquisition of the Open Directory Project. In March 1999, he created the first version of RSS as part of Netscape's personalized home page project. In 1999 he left Netscape and in May co-founded Epinions where he worked until 2000. Guha founded Alpiri in late 2000 which created TAP, a semantic web application and knowledge base. In 2002, he became a researcher at IBM Almaden Research Center. In 2005 Guha joined Google. He currently leads development of Google Custom Search and is one of the champions of the current Schema.org activity being promoted by Google, Microsoft Bing, Yahoo! and Yandex.

Transcript of the online chat during the session:
see raw transcript here.

(for better clarity, the version below is a re-organized and lightly edited chat-transcript.) Participants are welcome to make light edits to their own contributions as they see fit.

-- begin of chat session --

PeterYim: Welcome to the

= Ontolog Invited Speaker Presentation - Dr. Ramanathan V. Guha - Thu 2011.12.01 =

Session Chair: Dr. SteveRay (CMU)

Invited Speakers: Dr. R V Guha (Google, schema.org)

Session Topic: A conversation with R V Guha and Dan Brickley on "schema.org"

Session page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2011_12_01

Phone (US): (206) 402-0100 ... PIN: 141184# Skype: call - "joinconference" ... PIN: 141184# if you can't find the skype keypad, try the "Call" drop down menu, and select "Show Dial Pad"

Phone keypad controls: To un-mute, press "*7" ... To mute, press "*6"

Proceedings:
anonymous morphed into Guha

SteveRay: Welcome, Guha. Glad you made it!

Guha: thanks

Guha: DanBrickley will be joining me in talking

SteveRay: OK. Noted. I will start with an introduction, then hand things over to you.

danbri just joined

danbri thanks Guha

anonymous1 morphed into Roger Cutler

K Goodier: Hi y'all

anonymous2 morphed into PeterBenson

anonymous1 morphed into DougFoxvog

anonymous1 morphed into GeraldRadack

anonymous1 morphed into shensley

anonymous2 morphed into Kurt Kirkham

anonymous1 morphed into AndreasHarth

anonymous2 morphed into Kavitha Srinivas

anonymous1 morphed into KingsleyIdehen

anonymous4 morphed into Stefano Bocconi

anonymous3 morphed into Cirrus Shakeri

anonymous1 morphed into Mike Ward

anonymous2 morphed into Ted Bashor

anonymous morphed into BobbinTeegarden

anonymous morphed into ElizabethFlorescu

anonymous morphed into AdrianWalker

anonymous1 morphed into VladTanasescu

anonymous morphed into Lora Aroyo

anonymous morphed into Stefano Bortoli

PeterYim: -- session formally started 9:38am PST --

danbri: (re old Guha / Bray spec, see http://www.w3.org/Submission/1997/8/ )

danbri: -> http://www.w3.org/TR/WD-rdf-syntax-971002/

danbri: nitpick: "RDFa Lite" rather than "RDF Lite"; it's about the in-HTML notation

danbri: Working Draft out next week

anonymous1 morphed into FrankChum

anonymous1 morphed into GaryBergCross

danbri: discussion of http://en.wikipedia.org/wiki/ISO_8000 http://www.dataforge.com/wpblog/index.php/industry-news/iso-22745-standard-based-exchange-of-product-data/

SteveRay: PeterBenson: ISO 22745 is a set of standard tags with many entries already.

PeterYim: Guha: target audience for schema.org is the "webmasters"

danbri: example: http://schema.org/Movie

DougFoxvog: schema.org could use classification for PhysicalObject. A common superclass Agent of Person & Organization would be useful.

danbri: http://www.rssboard.org/rss-0-9-0

anonymous morphed into Alessander Botti Benevides

LeoObrst: S-expressions in Lisp.

PeterYim: SteveRay paraphrasing JohnSowa's questions for Guha - ref: http://ontolog.cim3.net/forum/ontolog-forum/2011-11/msg00141.html

danbri: so RDF '97 was PICS-NG, which used s-expressions: http://www.w3.org/TR/NOTE-pics-ng-metadata

danbri: (then XML happened)

KingsleyIdehen: John's actual post: http://ontolog.cim3.net/forum/ontolog-forum/2011-11/msg00141.html

JoelBender: (and then N3 happened)

KingsleyIdehen: Then Linked Data happened

danbri: (and then JSON happened...)

DougFoxvog: XML is not restricted to triples. Why was/is RDF so restricted?

KingsleyIdehen: Yes, Linked Data brings it back home to simplicity

JoelBender: (and now JSON-LD is happening ... maybe)

KingsleyIdehen: Yes, but Linked Data is agnostic re. EAV/SPO based 3-tuples

K Goodier: Keeping things simple and delivering value

KingsleyIdehen: and via HTTP we can negotiate representation

FrankChum: Doug, I like RDF for its simplicity and not as restricted

anonymous morphed into Arnaud J Le Hors

KingsleyIdehen: Good example of this all working, via Linked Data simplicity: http://wiki.goodrelations-vocabulary.org/Microdata

KingsleyIdehen: Yes, we have to "hold our noses" re. large scale adoption. +1

NicolaGuarino: usual problem with skype, sorry

KingsleyIdehen: Here is a link to a note showing how Schema.org mapped to DBpedia leads to network effects: https://plus.google.com/112399767740508618350/posts/ck2yhgTWxtD

KingsleyIdehen: A specific page showing LOD Cloud instance data based on Schema.org cross links: http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fschema.org%2FLandmarksOrHistoricalBuildings&urilookup=1

SteveRay: @Nicola: OK, I'll try you again after Ali is done with his second question, if you raise your hand again.

KingsleyIdehen: Final page showing links between Schema.org and DBpedia (and other vocabularies which appear as you follow-your-nose through the Linked Data): http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fschema.org%2FLandmarksOrHistoricalBuildings&p=1&lp=89&op=-1&last=&gp=1

danbri: on the 'do we need rdf' question, .... we see two trends: (1) people who use RDF, find frustration with the fiddly details of the spec (datatypes, etc.). Perhaps such things are just inherently annoying. There needs to be a rule, but the rule is arbitrary. (2) people who don't use RDF explicitly, often drift towards a data model that is very RDF-like, because RDF didn't appear from nowhere. Graph-shaped data is a very common pattern (cf. Kingsley on EAV). Hence all recent talk on 'social graph', 'interest graph', etc.

KingsleyIdehen: Methinks: Schema.org and Linked Data have a mutually beneficial relationship that in effect fans out to adding more semantic structure to links (actually relations) on the WWW. Schema.org delivers immediate and palpable value

NicolaGuarino: @Steve: sorry, I am not able to talk through skype, too bad

PeterYim: @Nicola: please type out your question on the chat

anonymous1 morphed into DuaneNickull

PeterYim: schema.org - as DanBrickley puts it - characterized by a small working group, consensus, ability to move and make decisions quickly

NicolaGuarino: Here is the comment I wanted to make: The reason why super-simple ontologies like FOAF work is that the words are simple to understand. But there are words which everybody understands, and words that are ambiguous and difficult to define or explain (e.g., service, unemployed person). It is a fact that people doing markup don't care about the deep semantics of their tags. So if the goal is to get billions of pages marked up, that's fine. But what about USING these marked up pages for information integration, services mashup and so on, instead of just for search? BOTTOM LINE: extensive tagging with little semantics may be very useful for search, but not for integration of information

NicolaGuarino: @Guha: but even for application-dependent vocabularies we sometimes need very crisp formal definitions....

danbri: (re starting points of Web: http://www.w3.org/History/1989/proposal-msw.html has seeds of RDF in there too)

NicolaGuarino: Deep semantics is needed (sometimes) also for application-dependent purposes, not just for universal purposes

AdrianWalker: To go beyond search applications, some degree of NLP is unavoidable?

DougFoxvog: I suggest that small ontologies can build on larger existing ones. Those who use them do not need to use everything from the larger ontologies. Deep ontologies would have rules and reasoning structures that are immaterial to small systems that use parts of them.

PeterBenson: our experience with ISO 8000 is that you need sufficient data to meet a defined requirement - nothing more. As requirements grow so does the depth of data.

NicolaGuarino: Besides schema.org, why not invest in a MINIMAL formal vocabulary, clarifying for instance the various notions of PART or DEPENDENCE?

Stefano Bortoli: being too narrow in the definition of schemas might end up in a higher cost of maintenance of the application after all. This is a lesson we should have learned from software engineering at least. So, deep thinking and generalization to some extent is necessary. Simple and easy is good in the short term, but we risk creating asbestos that will be very hard to handle in the future

NicolaGuarino: @Stefano Bortoli: +1

PeterYim: @James Sorace - you can click on the "Settings" button (at the top center of the window) that modify "anonymous" into your real name

DougFoxvog: Contexts can separate ontologies into subsets. Guha is talking about the problems of "an ontology of everything". Cyc developed the idea of Microtheories (but I'm not sure if it was after he left). By placing rules and relationships in such contexts (or microtheories) one can avoid many of the problems of an "ontology of everything". This becomes an issue on the Semantic Web, where triples make it hard to place statements within specific contexts.

VladTanasescu: Any pointers to this ACM article?

GaryBergCross: What consideration has schema.org given to controlled natural languages? Some efforts have tried to make OWL and Common Logic easier to express.

danbri: @DougFoxvog: Guha has 'Contexts: A Formalization and Some Applications'...

Stefano Bortoli: @Doug I don't think that anyone is really aiming at the "philosophical ontology", not in the Semantic Web at least. Indeed, the first efforts were spent in automatic ontology mappings, rather than producing semantically annotated data. Contexts are particularly complex to manage in a context-less environment such as the Web. The least we can do is to try to be formal in defining concepts to reduce the risk of misunderstanding.

GaryBergCross: One issue with Microtheories is when do you create a new one versus adapt an existing one.

PeterYim: Guha: currently adoption is in the order of thousands of sites and billions of pages now

SteveRay: Certainly some standards development efforts are importing existing external concepts or "ontologies" to a much greater degree today.

danbri: on re-use, one q is whether publishers/authors of instance data should bear the cost of that sharing/re-use. Mainstream RDF / SemWeb culture is to have instance data cite several different ontologies. Schema.org rather pre-packages things and offers the package as a single usable thing...

danbri: re rNews - see http://blog.schema.org/2011/09/extended-schemaorg-news-support.html for details

danbri: http://www.iptc.org/site/Home/Media_Releases/schema.org_adopts_IPTC's_rNews_for_news_markup

Roger Cutler: I don't think he said billions of pages. Thousands of sites & billions of pages means millions of pages per site, right?

danbri: (yup, we should make the various mappings to/from schema.org easier to find)

DougFoxvog: @Gary -- You can create a new microtheory when describing a narrower field, when using multiple existing contexts, or when presenting information about a specific event or other individual.

danbri: http://wiki.creativecommons.org/LRMI/Specification_v0.5

NicolaGuarino: A couple of problems I find in the current taxonomic structure of schema.org: 1. A governmentOffice is both a place and an organization

ChristopherSpottiswoode: What a privilege that was, to be able to listen in on that conversation, with all that experience! Thank you all.

DougFoxvog: @Gary -- adapt an existing context when providing more info @ same level

Stefano Bortoli: thanks

Stefano Bortoli: bye

PeterYim: Great session ... thank you Guha, Dan and everyone for coming!

Guha: Thank you everyone

danbri: Thanks all

PeterYim: -- session ended : 11:00am PST --

-- end of chat session --


 * ... More Questions
 * For those who have further questions or remarks on the topic, please [mailto:ontolog-forum@ontolog.cim3.net post them to the [ontolog-forum]] so that everyone in the community can benefit from the discourse.
 * if you are not a member of the Ontolog community (meaning to say you are not subscribed to the [ontolog-forum] list) yet, we cordially invite you to join us. See our "Membership" details at: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J

Audio Recording of this Session

 * To download the audio recording of the session, click here
 * the playback of the audio files requires the proper setup, and an MP3-compatible player on your computer.
 * Conference Date and Time:	1-Dec-2011 9:38 ~ 11:00 am Pacific Standard Time
 * Duration of Recording:	1 Hour 21.5 Minutes
 * Recording File Size:	       9.33 MB (in mp3 format)


 * suggestion: it's best that you listen to the session while having the presentation open in front of you. You'll be prompted to advance slides by the speaker.
 * Take a look, also, at the rich body of knowledge that this community has built together, over the years, by going through the archives of noteworthy past Ontolog events. (References on how to subscribe to our podcast can also be found there.)

For the record ...

How To Join (while the session is in progress)

 * 1. Dial in with a phone and connect through skype: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2011_12_01#nid2ZLR
 * 2. Open chat in a new browser window: http://webconf.soaphub.org/conf/room/ontolog_20111201
 * 3. Download the speaker's presentation (slides) here: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2011_12_01#nid2ZLJ
 * or, 3.1 access our shared-screen vnc server, if you are not behind a corporate firewall

Conference Call Details

 * Date: Thursday, 1-Dec-2011
 * Start Time: 9:30am PST / 12:30pm EST / 6:30pm CET / 17:30 UTC
 * ref: World Clock
 * Expected Call Duration: ~1.5 hours


 * Dial-in:
 * Phone (US): +1 (206) 402-0100 ... (long distance cost may apply)
 * ... [ backup nbr: (415) 671-4335 ]
 * Skype: joinconference ...  (generally free-of-charge, when connecting from your computer) 
 * when prompted enter PIN: 141184#


 * Shared-screen support (VNC session), if applicable, will be started 5 minutes before the call at: http://vnc2.cim3.net:5800/
 * view-only password: "ontolog"
 * if you plan to be logging into this shared-screen option (which the speaker may be navigating), and you are not familiar with the process, please try to call in 5 minutes before the start of the session so that we can work out the connection logistics. Help on this will generally not be available once the presentation starts.
 * people behind corporate firewalls may have difficulty accessing this. If that is the case, please download the slides above (where applicable) and run them locally. The speaker(s) will prompt you to advance the slides during the talk.


 * In-session chat-room url: http://webconf.soaphub.org/conf/room/ontolog_20111201
 * instructions: once you have access to the page, click on the "settings" button, and identify yourself (by modifying the Name field from "anonymous" to your real name, like "JaneDoe").
 * You can indicate that you want to ask a question verbally by clicking on the "hand" button, and wait for the moderator to call on you; or, type and send your question into the chat window at the bottom of the screen.
 * thanks to the soaphub.org folks, one can now use a jabber/xmpp client (e.g. gtalk) to join this chatroom. Just add the room as a buddy - (in our case here) ontolog_20111201@soaphub.org ... Handy for mobile devices!


 * Discussions and Q & A:
 * Nominally, when a presentation is in progress, the moderator will mute everyone, except for the speaker.
 * To un-mute, press "*7" ... To mute, press "*6" (please mute your phone, especially if you are in a noisy surrounding, or if you are introducing noise, echoes, etc. into the conference line.)
 * we will usually save all questions and discussions till after all presentations are through. You are encouraged to jot down questions onto the chat-area in the meantime (that way, they get documented; and you might even get some answers in the interim, through the chat.)
 * During the Q&A / discussion segment (when everyone is muted), if you want to speak or have questions or remarks to make, please raise your hand (virtually) by clicking on the "hand button" (lower right) on the chat session page. You may speak when acknowledged by the session moderator (again, press "*7" on your phone to un-mute). Test your voice and introduce yourself first before proceeding with your remarks, please. (Please remember to click on the "hand button" again (to lower your hand) and press "*6" on your phone to mute yourself after you are done speaking.)


 * Please review our Virtual Session Tips and Ground Rules - see: VirtualSpeakerSessionTips


 * RSVP to [mailto:peter.yim@cim3.com peter.yim@cim3.com] appreciated, ... or simply just by adding yourself to the "Expected Attendee" list below (if you are a member of the team.)


 * This session, like all other Ontolog events, is open to the public. Information relating to this session is shared on this wiki page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2011_12_01


 * Please note that this session may be recorded, and if so, the audio archive is expected to be made available as open content, along with the proceedings of the call to our community membership and the public at-large under our prevailing open IPR policy.

Attendees

 * Attended:
 * SteveRay (chair)
 * R V Guha "Guha" (invited speaker)
 * DanBrickley "danbri" (discussant)
 * PeterYim
 * RandyKerber
 * BobbinTeegarden
 * FrankChum
 * LeoObrst
 * ElizabethFlorescu
 * BobSchloss
 * DougFoxvog
 * DavidHau
 * BobSmith
 * GeraldRadack
 * StefanoBocconi
 * Vlad Tanasescu (The University of Edinburgh)
 * Kurt Kirkham (Sallie Mae)
 * KatherineGoodier
 * JoelBender
 * GaryBergCross
 * James Sorace (HHS)
 * NicolaGuarino
 * ChristopherSpottiswoode
 * PeterBenson
 * Melissa Hildebrand (Scheib) (ECCMA)
 * FrankAlvidrez
 * AdrianWalker
 * AndreasHarth
 * Lora Aroyo (VU, NL)
 * Roger Cutler (Chevron)
 * RamSriram
 * KingsleyIdehen
 * YefimZhuk
 * Alessander Botti Benevides
 * AliHashemi
 * Arnaud J Le Hors
 * BrianDavis
 * Cirrus Shakeri
 * DuaneNickull
 * Kavitha Srinivas
 * Mike Ward
 * MyCoyne
 * shenley
 * Stefano Bortoli
 * Ted Bashor
 * YuLin


 * Expecting:
 * (please add yourself to the list if you are a member of this community, or, rsvp to )


 * Regrets:
 * JohnSowa (cannot attend, but has questions that he will ask via the session chair)
 * ChristophLange (traveling)
 * ToddSchneider
 * MartinHepp
 * FrankOlken (time conflict)
 * ChrisWelty