OntologySummit2012 Applications Synthesis

= OntologySummit2012: (Track-4) "Large-scale domain applications" Synthesis =

Mission Statement:
'''This track will help to ground the discussions in the other tracks and bring key challenges to light by describing current large-scale systems and systems of systems that either use, or could use, ontologies in their deployment. "Large-scale" can mean very large data sets, very complex data sets, federated systems, highly distributed systems, or real-time, continuous data systems. Examples of large data sets might include scientific observations and studies; complex data sets could be technical data packages for manufactured products, or electronic health records; federated systems could include information sharing to combat terrorism; highly distributed systems include items such as the smart electrical grid (aka Smart Grid); and real-time systems include network management systems. Of course, some big systems might include all five aspects.'''

see also: OntologySummit2012_Applications_CommunityInput

In implemented systems, ontologies are...

 * Strong for:
   * Supporting change and aggregation
   * Enabling community aggregation and annotation
   * Automated data ingestion
   * Data validation
   * Ensuring consistency of terms across many data sets (Distributed systems)
   * Supporting reasoning
   * Self-describing systems
   * Systems with many complex constraints, rules, laws, with frequent changes (Dynamically changing systems)
   * Data mining / semantic signature extraction
   * Rapid system building
 * Weak for:
   * Being understandable by software engineers and customers
   * Query performance (compared to relational databases)
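
As an illustration of the "data validation" and "consistency of terms across many data sets" points above, the following is a minimal Python sketch. It assumes a hypothetical shared vocabulary that maps each preferred term to its known synonyms; the vocabulary, the field names, and the two sample data sets are invented for illustration and do not come from any system discussed in this track.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical shared vocabulary: preferred term -> known synonyms.
VOCABULARY: Dict[str, Set[str]] = {
    "temperature": {"temp", "air_temperature"},
    "pressure": {"press", "barometric_pressure"},
}


def build_lookup(vocabulary: Dict[str, Set[str]]) -> Dict[str, str]:
    """Map every known spelling (preferred or synonym) to its preferred term."""
    lookup: Dict[str, str] = {}
    for preferred, synonyms in vocabulary.items():
        lookup[preferred] = preferred
        for synonym in synonyms:
            lookup[synonym] = preferred
    return lookup


def normalize_fields(
    record: Dict[str, float], lookup: Dict[str, str]
) -> Tuple[Dict[str, float], List[str]]:
    """Rename fields to preferred terms; collect fields the vocabulary does not know."""
    normalized: Dict[str, float] = {}
    unknown: List[str] = []
    for field, value in record.items():
        key = field.strip().lower()
        if key in lookup:
            normalized[lookup[key]] = value
        else:
            unknown.append(field)
    return normalized, unknown


LOOKUP = build_lookup(VOCABULARY)

# Two invented data sets that name the same quantities differently.
dataset_a = {"temp": 21.5, "press": 1013.2}
dataset_b = {"Air_Temperature": 19.0, "humidity": 0.4}

norm_a, unknown_a = normalize_fields(dataset_a, LOOKUP)
norm_b, unknown_b = normalize_fields(dataset_b, LOOKUP)
```

Here `unknown_b` flags `humidity` as a term the vocabulary does not cover, which is the kind of signal that could be fed back to the vocabulary maintainers.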

Needs:

 * Need better standards for common elements:
   * Datatypes
   * Ontology patterns (e.g. whole/part patterns)
   * Collect ontological primitives from observation data
 * Need repositories
   * Repositories of ontological patterns could be more useful than repositories of ontologies
 * Need industrial-strength semantic services resident in the cloud
 * Need better visualization tools and approaches
 * Need better tools to help interpret legacy systems and transform them into semantic systems
 * Need to establish feedback mechanisms from end users to ontology designers, directly from the point of use

Recommendations:

 * Look for the 80-20 rule of semantic development
 * Use well-defined and narrow use cases to demonstrate benefits of semantic approaches
 * Having explicit vocabularies (classifiers) is a must in a distributed system
 * Community should be included in the development and evolution of vocabularies
 * It is critical to capture and evolve domain knowledge in a form that the community is comfortable with
 * Transition from implicit domain knowledge to explicit encoding requires community consensus - and an organization to manage the consensus
 * Some have recommended exposing users to SKOS semantics; use more complicated constructs only on the back end if necessary.
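
To make the SKOS recommendation concrete, here is a small Python sketch of what "SKOS-level" semantics can look like from a user's point of view: concepts with a preferred label, alternative labels, and a broader relation. The property names mirror `skos:prefLabel`, `skos:altLabel`, and `skos:broader`; the concept identifiers and the in-memory dictionary are hypothetical, and a real deployment would more likely use an RDF library or triple store.

```python
from typing import Dict, List, Set

# Hypothetical concept scheme; keys mirror skos:prefLabel, skos:altLabel, skos:broader.
CONCEPTS: Dict[str, dict] = {
    "ex:sensor": {
        "prefLabel": "Sensor",
        "altLabel": ["Detector"],
        "broader": [],
    },
    "ex:thermometer": {
        "prefLabel": "Thermometer",
        "altLabel": ["Temperature sensor"],
        "broader": ["ex:sensor"],
    },
    "ex:thermocouple": {
        "prefLabel": "Thermocouple",
        "altLabel": [],
        "broader": ["ex:thermometer"],
    },
}


def broader_transitive(concept_id: str, concepts: Dict[str, dict]) -> Set[str]:
    """Collect all ancestors reachable through the broader relation."""
    seen: Set[str] = set()
    stack: List[str] = list(concepts[concept_id]["broader"])
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles or shared ancestors
        seen.add(current)
        stack.extend(concepts[current]["broader"])
    return seen


def find_by_label(label: str, concepts: Dict[str, dict]) -> List[str]:
    """Return concept ids whose preferred or alternative label matches, ignoring case."""
    wanted = label.strip().lower()
    matches: List[str] = []
    for concept_id, concept in concepts.items():
        labels = [concept["prefLabel"]] + concept["altLabel"]
        if any(candidate.lower() == wanted for candidate in labels):
            matches.append(concept_id)
    return matches
```

For example, `find_by_label("detector", CONCEPTS)` resolves a user's term to `ex:sensor`, and `broader_transitive("ex:thermocouple", CONCEPTS)` walks up the hierarchy. Anything richer (cardinality constraints, property chains, and so on) would stay on the back end.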

Other Observations / Lessons learned:

 * UML to OWL is a common requirement for legacy systems
 * Starting from scratch is rare.
 * Ontology patterns are very helpful, and encourage model reuse
 * Semantic techniques work best when not compromised by implementation tradeoffs
 * Semantic methods are faster to implement and easier to maintain
 * Semantic approaches are particularly suited to systems with many complex constraints, rules, laws, with frequent changes
 * Incremental implementation is possible through federation of datastores
 * Ontologies are not always applied to enable reasoners; sometimes they serve simply as a more rigorous data modeling approach
 * Engineers turned ontologists often don't have the necessary background/skills
 * Existing infrastructure supports traditional software development far better than large-scale ontology development
 * There are many ontologies of dubious quality
 * Service-oriented architectures allow separation of code and ontology updates
 * Reasoner and query engine performance is highly dependent upon the exact formulation of rules and queries
 * No single technology/tool currently provides the best solution across all large system use cases
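
The observation about query formulation can be illustrated with a toy example. The Python sketch below assumes a deliberately naive evaluator: an unindexed list of triples, scanned linearly, with patterns evaluated in the order written. The data, the predicate names, and the cost measure (number of triples examined) are all invented for illustration; production query engines use indexes and optimizers, so the effect there depends on the engine.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[str, str]  # (predicate, object); the subject is the shared variable ?x


def build_triples(n_people: int) -> List[Triple]:
    """Invented data: many Person instances, only one of whom works for 'acme'."""
    triples: List[Triple] = [
        (f"person{i}", "type", "Person") for i in range(n_people)
    ]
    triples.append(("person0", "worksFor", "acme"))
    return triples


def scan(
    triples: List[Triple], pattern: Pattern, subject: Optional[str] = None
) -> Tuple[List[str], int]:
    """Linear scan; returns matching subjects and the number of triples examined."""
    predicate, obj = pattern
    matches = [
        s
        for (s, p, o) in triples
        if p == predicate and o == obj and (subject is None or s == subject)
    ]
    return matches, len(triples)


def run_query(
    triples: List[Triple], first: Pattern, second: Pattern
) -> Tuple[List[str], int]:
    """Evaluate two patterns that share ?x, strictly in the order given."""
    candidates, cost = scan(triples, first)
    results: List[str] = []
    for subject in candidates:
        matches, examined = scan(triples, second, subject=subject)
        cost += examined
        if matches:
            results.append(subject)
    return results, cost


TRIPLES = build_triples(100)
TYPE_PERSON: Pattern = ("type", "Person")
WORKS_FOR_ACME: Pattern = ("worksFor", "acme")

# Same logical query, two formulations.
broad_first, broad_cost = run_query(TRIPLES, TYPE_PERSON, WORKS_FOR_ACME)
narrow_first, narrow_cost = run_query(TRIPLES, WORKS_FOR_ACME, TYPE_PERSON)
```

Both formulations return the same answer, but putting the selective pattern first examines far fewer triples in this toy setting (`narrow_cost` versus `broad_cost`).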

-- maintained by the Track-4 champions: SteveRay & TrishWhetzel ... please do not edit