Investigating Semantic Primitives Using an OWL-based Ontology

The investigation focuses on identifying semantic primitives in English using an OWL-based ontology approach. By analyzing basic concepts and frequent words, the study aims to determine the minimal set of semantic relations essential for logical representation, emphasizing the importance of minimizing complexity for efficient knowledge representation.

  • Semantic Primitives
  • OWL Ontology
  • Logical Representation
  • English Language
  • Knowledge Sharing

Presentation Transcript


  1. Investigating Semantic Primitives Using an OWL-based Ontology: Preliminary Results. Sept 2022. Patrick Cassidy, cassidy@micra.com

  2. Background: The COSMO ontology was initiated to develop a Foundation Ontology that could serve to enable accurate data sharing between multiple independently developed databases or applications. It still retains that goal. As the project proceeded, it appeared that the goal would be assisted by finding some set of semantic primitives that could be used to logically specify the meanings of terms in an ontology or a natural language. This appeared feasible from the use of a defining vocabulary of about 2250 words used in definitions in the Longman Dictionary of Contemporary English.

  3. The OWL ontology language appeared particularly useful for this, to enable checking the ontology for logical consistency even as it grew larger. In addition to the Longman Dictionary defining vocabulary words, the set of words used in American Sign Language (AMESLAN) also appeared to be a useful source for the Semantic Primitives that might be needed as a logical/semantic defining vocabulary. At this point the project became an investigation of how many semantic primitives would be needed.

  4. Goals Of This Investigation: (1) Discover which of the elements required to logically specify the meanings of the most basic concepts and most frequently used words in English appear to function as primitives. (2) Discover which minimal set of semantic relations (owl:ObjectProperties) is essential to represent these concepts. (3) Determine how often it will be necessary to add new ObjectProperties for each owl:Class added.

  5. Why Minimize the Number of relations Used? The Ontology is a language for representing knowledge that must be learned to be usable, and approaches the complexity of a human language. Minimizing the complexity facilitates learning and use. Proper logical representation of the relations will require creating logical FOL implications. Unnecessary logical relations will increase computational complexity.

  6. What Are Semantic Primitives? In a logic-based language the non-primitive elements (Classes, Properties) are those whose intended meanings can be logically specified solely as a combination of other elements: omitting those defined elements will not change the logical properties of the primitives. Remove all non-primitive elements, and the primitives remain. This presents the possibility that a relatively small number of primitives will be sufficient to logically specify the intended meanings of a large number of concepts in multiple applications. How many are there? One goal of this investigation is to estimate that number.
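The definition on slide 6 can be illustrated with a minimal sketch. It assumes a toy vocabulary in which each defined term lists the elements its definition uses; the terms and the `definitions` mapping are hypothetical and are not taken from COSMO. Under that assumption, the primitives are the elements that are used in definitions but have no definition of their own, and every defined term can be rewritten in terms of primitives alone.

```python
# Hypothetical toy vocabulary: each defined term maps to the elements
# used in its definition. None of these entries come from COSMO.
definitions = {
    "Mother": {"Person", "Female", "hasChild"},
    "Student": {"Person", "Role", "attends"},
    "Grandmother": {"Mother", "hasChild"},
}

def find_primitives(definitions):
    """Return elements that are used in definitions but never defined."""
    used = set().union(*definitions.values())
    return used - set(definitions)

def expand(term, definitions):
    """Rewrite a term as the set of primitives it ultimately depends on.

    Assumes the definitions contain no cycles.
    """
    if term not in definitions:
        return {term}
    result = set()
    for part in definitions[term]:
        result |= expand(part, definitions)
    return result

primitives = find_primitives(definitions)
print(sorted(primitives))
print(sorted(expand("Grandmother", definitions)))
```

Removing the three defined terms leaves the primitive set unchanged, which is the property the slide describes.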

  7. Of What Use Are Semantic Primitives? For interoperability of databases and applications, using the primitives-based ontology as an interlingua allows any number of independently developed applications to communicate data accurately by translating to and from the common logical language. The logical representation avoids the ambiguity inherent in Natural Language Interlingua use.
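One practical consequence of the interlingua approach on slide 7 is a smaller number of translators. The sketch below is simple arithmetic, assuming one directed translator per ordered pair of applications in the pairwise case and one translator in each direction per application in the interlingua case; the function names are illustrative only.

```python
def pairwise_translators(n_apps):
    """Directed translators needed when every application maps to every other."""
    return n_apps * (n_apps - 1)

def interlingua_translators(n_apps):
    """Translators needed when each application maps to and from one common language."""
    return 2 * n_apps

for n in (3, 10, 50):
    print(n, pairwise_translators(n), interlingua_translators(n))
```

For 10 applications this gives 90 pairwise translators against 20 through the common language, and the gap widens as applications are added.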

  8. Which Elements Are the Primitives? The following appear to function as primitives (but have not yet been logically represented in an FOL form): the OWL logic primitives; Top-Level Object Properties; Top-Level Qualitative Attribute Values; Closed Class words; Top-Level OWL Classes.

  9. Adding ObjectProperties: When adding a restriction is desired, the goal of keeping the number of ObjectProperties to a minimum (slide 2) requires searching the existing list of properties (now 1368) to determine if the needed relation already exists. When it does not change the logical meaning of an existing relation, an ObjectProperty can be broadened in scope by expanding the domain or range with new Classes or Metaclasses. When a new ObjectProperty is necessary, it is added. The chart below, "Added Properties per 200 Added Classes", shows how many new properties were added as the number of owl:Classes in the COSMO ontology increased. The pattern observed, reaching a low asymptotic level of new required properties per added class after an initial phase of needing new properties frequently, is likely to be repeated for each additional specialized domain ontology linked to the COSMO.
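The measure charted for slide 9, new properties per 200 added classes, can be computed from a log of additions. The following is a minimal sketch, not the procedure used for COSMO: it assumes a hypothetical list with one entry per added class giving how many new ObjectProperties that class required, and the sample data are made up to mimic the falling pattern the slide describes.

```python
def added_properties_per_bin(additions, bin_size=200):
    """Count new properties introduced in each consecutive bin of added classes.

    `additions` has one entry per added class: the number of new
    ObjectProperties that class required (often 0).
    """
    counts = []
    for start in range(0, len(additions), bin_size):
        counts.append(sum(additions[start:start + bin_size]))
    return counts

# Made-up log: early classes need new properties often, later ones rarely.
additions = [1 if i % 10 == 0 else 0 for i in range(200)]     # 20 new properties
additions += [1 if i % 50 == 0 else 0 for i in range(200)]    # 4 new properties
additions += [1 if i % 100 == 0 else 0 for i in range(200)]   # 2 new properties

print(added_properties_per_bin(additions))
```

A sequence that settles at a small value in later bins would correspond to the low asymptotic level mentioned on the slide.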

  10. Ontological Choices (1): There are different ways to represent real-world and abstract things, but all can be translated into each other and retain logical consistency. Examples: substances vs. objects; events and processes; attributes as classes or instances; attributes as objects or disjoint from objects. The only assertions impossible to accommodate in a single ontology are assertions that something cannot exist.

  11. Ontological Choices (2): Roles. In COSMO an owl:Class can be a subclass of Role as well as a subclass of another Class: a Student is a subtype of Role and a subtype of Person. Anything can be (play) any number of Roles, each within some (possibly overlapping) interval of time. When specifying Roles, avoiding conflict can require precautions in some situations.
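The idea on slide 11 that one thing can play several Roles over possibly overlapping intervals can be sketched as a small data structure. This is a simplification for illustration, not COSMO's OWL representation: the class names `RoleAssignment` and `Individual`, the use of whole years as interval bounds, and the sample data are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class RoleAssignment:
    role: str    # e.g. "Student"
    start: int   # first year the role holds (inclusive)
    end: int     # last year the role holds (inclusive)

@dataclass
class Individual:
    name: str
    kind: str                                   # the non-Role class, e.g. "Person"
    roles: list = field(default_factory=list)   # RoleAssignment entries

    def roles_at(self, year):
        """Return the names of the roles this individual plays in the given year."""
        return sorted(r.role for r in self.roles if r.start <= year <= r.end)

alice = Individual("Alice", "Person")
alice.roles.append(RoleAssignment("Student", 2015, 2019))
alice.roles.append(RoleAssignment("Employee", 2018, 2022))

print(alice.roles_at(2016))
print(alice.roles_at(2018))
print(alice.roles_at(2021))
```

In 2018 both roles hold at once, which is the overlapping case the slide allows for.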

  12. Ontological Choices (3): hasValue vs. some(all)ValuesFrom. Using someValuesFrom in ObjectProperty relations requires that the specific value be named for each instance. This existential relation presents problems when the value is not known or irrelevant. E.g. every person has a Mother, but in referring to individual Persons it is usually unnecessary to specify the Mother, even if known. So hasValue is used as a pun for objects of relations, even if the instance necessarily has a value. allValuesFrom can be used, if all categories of things that can serve as objects are known and enumerable.

  13. Some Lessons Learned: Although most new owl:Classes added can be specified to represent the intended meaning with combinations of existing ontology elements, the meanings of the terms used to represent those concepts tend to be ambiguous until the relations that clarify the meanings are added. This raises the possibility that different groups intending to represent the same concept will use fewer or different relations, leading to possible reduced accuracy of interoperability. Thus, the bigger ontology with more detail is likely to serve the purpose better, giving more guidance for meanings.

  14. COSMO ontology Status September 2022 (1): Revision 1909. Element Statistics: owl:Classes: 32,000; owl:ObjectProperties: 1,368; top ObjectProperties: 820; SubObjectPropertyOf axioms: 646; InverseObjectProperties: 350; Restrictions: 30,098; Instance Properties; Subclass axioms: 79,901; owl:disjoint axioms 273; Top-level Qualitative Attributes 1260 12,895

  15. COSMO ontology Status September 2022 (2) Labels (for 32,000 owl:Classes) Total Unique WordNet labels: Single-Word WN Unique labels: owl:Classes with no WordNet labels: 6,071 Unique EN labels: Unique Single-word EN labels: 3,909 Unique Single-word WN+EN labels: 34,295 39,139 31,272 10,372

  16. WordNet synset usage: Total uses of WN synsets: 31,621; Unique synsets used: 28,222

  17. Added Properties per 200 Added Classes [Chart: y-axis, added properties per 200 added classes (0 to 30); x-axis, number of owl:Classes in ontology (8000 to 31604).]

  18. Most Frequently Used Properties (number of uses): HasQualitativeAttribute: 3811; isAnAttributeValueOf: 3318; pertainsTo: 2278; isTheOppositeOf: 1689; isPerformedBy: 1129; hasComponentElement: 1008; isTheAttributeOfSomethingThatExperienced: 978; isTheAdverbialFormOf: 724; Produces: 620; hasSubEvent: 571; objectHasAttributeAfter: 520; performsAction: 484; Causes: 427; isaPhysicalPartOf: 404

  19. Numbers of Uses for Object Properties [Chart: y-axis ticks from 1 to 10000 in powers of ten; x-axis ticks from 1 to 1051.]

  20. Usage Frequency Of 502 Properties With at least 10 uses [Chart: y-axis ticks from 1 to 10000 in powers of ten; x-axis ticks from 25 to 450.]

  21. Further Work and Applications: Test ObjectProperty relations as a dimension in vectors or tensors to represent word meanings. Use in relation extraction: recognize relations in text and relate them to object properties. Expansion: include additional frequently used words and specialized words from existing ontology-based applications. Expansion: integrate with Frames. Expansion: add most frequent WordNet tags in text corpora, with proper senses.
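The first item on slide 21 proposes treating ObjectProperty relations as vector dimensions. One possible reading, offered only as a sketch since the slides do not specify a method, is to give each class a vector of restriction counts per property and compare classes by cosine similarity. The property names below appear on slide 18; the classes, the counts, and the helper functions are hypothetical.

```python
import math

# Hypothetical restriction counts per class, keyed by property name.
# The classes and counts are made up for illustration.
restrictions = {
    "Baker": {"performsAction": 2, "Produces": 1},
    "Brewer": {"performsAction": 1, "Produces": 2},
    "Wheel": {"isaPhysicalPartOf": 1},
}

def to_vector(counts, dimensions):
    """Lay out per-property counts in a fixed dimension order."""
    return [counts.get(d, 0) for d in dimensions]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# One dimension per property that occurs in any restriction.
dimensions = sorted({p for counts in restrictions.values() for p in counts})
vectors = {name: to_vector(c, dimensions) for name, c in restrictions.items()}

print(dimensions)
print(round(cosine(vectors["Baker"], vectors["Brewer"]), 3))
print(round(cosine(vectors["Baker"], vectors["Wheel"]), 3))
```

Classes that share properties score above zero, while classes with no property in common score exactly zero; whether such counts capture word meaning well is the open question the slide raises.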

  22. COSMO Ontology Available at: http://micra.com/COSMO/COSMO.owl These slides: http://micra.com/COSMO/Statistics.pptx Contact: Patrick Cassidy cassidy@micra.com 1-908-561-3416
