Semantic Attributes

An iKnow attribute is a flag associated with one or more terms (words or short phrases) that affect the interpretation of a path or sentence. The part of the path or sentence affected by the attribute is flagged, so that it can be distinguished from similar parts of paths or sentences that do not have the attribute.

iKnow supports two attributes:

  • Negation: A negation attribute identifies a part of a sentence as having negation. For example, the words “no” and “not” negate the meaning of a section of their path or sentence. iKnow language models identify many common negation terms. You can specify additional negation terms using the iKnow UserDictionary.

  • Sentiment: A sentiment attribute flags a sentence as having either a positive or negative sentiment. For example, the words “avoid”, “harm”, “reject” typically convey a negative sentiment in many (but not all) contexts; the words “approve”, “accept”, “beneficial” convey a positive sentiment. Because sentiment terms are highly dependent on the kind of texts being analyzed, iKnow does not automatically identify sentiment terms. You can specify a list of positive and negative sentiment terms using the iKnow UserDictionary. These terms are then identified in the texts by the iKnow engine, which determines which parts of the sentence or path are affected by them.

Negation

Negation is the process that turns an affirmative sentence (or part of a sentence) into its opposite, a denial. For example, the sentence “I am a doctor.” can be negated as “I am not a doctor.” In analyzing text it is often important to separate affirmative statements about a topic from negative statements about that topic.

iKnow provides a means to determine if a sentence or path is negated. During source indexing iKnow associates the attribute “negation” with a sentence and indicates which part of the text is negated.

In its simplest form, negation associates “no” or “not” with an affirmative statement or phrase. In actual language usage, however, negation is a more complex and language-specific operation. There are two basic types of negation:

  • Formal, or grammatical, negation is always indicated by a specific morphological element in the text, such as “no”, “not”, “don’t”, or another specific negating term. These negating elements can be part of a concept (“He has no knowledge of”) or part of a relation (“He doesn’t know anything about”). Formal negation is always binary: a sentence (or part of a sentence) either contains a negating element (and is thus negated), or it is affirmative.

  • Semantic negation is a complex, context-dependent form of negation that is not indicated by any specific morphological element in the text. Semantic negation depends upon the specific meaning of a word or word group in a specific context, or results from a specific combination of meaning and tense (for example, conjunctive and subjunctive tenses in Romance languages). For example, “Fred would have been ready if he had stayed awake” and “Fred would have been ready if the need had arisen” say opposite things about Fred’s readiness. Semantic negation is not a binary principle; it is almost never absolute, but is subject to contextual and cultural insights.

The iKnow language models contain a variety of language-specific negation words and structures. Using these language models the iKnow analysis engine is able to automatically identify and flag for future use most instances of formal negation as part of the source loading operation. However, iKnow cannot identify instances of semantic negation.

The largest unit of negation in iKnow is a path; iKnow cannot identify negations in text units larger than a path. Many, but not all, sentences comprise a single path.

Properties of Formal Negation

Formal negation can be defined by three properties:

  • Negation markers: formal negation is always marked by one or more negation markers. These negation markers can be part of a concept or a relation. Some examples of negation markers in English are no, not, doesn’t, isn’t, hasn’t, neither, nor, never, nothing, none, nobody, nowhere. iKnow always identifies a negation marker as part of a concept or part of a relation.

  • Negation span: negation is always expressed in the broader context of a statement or a sentence. The effect of formal negation is that the statement or sentence (or some part of it) is negated. Therefore, it is important to determine the span of the negation, the part of the sentence that is made negative by the negation marker(s). The maximum span of a formal negation is a full sentence.

  • Negation stopper: in many cases the span of the negation is not a full sentence. The span of the negation is terminated by a negation stopper, such as the words “but” and “or”. iKnow identifies negation stoppers and uses this information to limit negation span.

iKnow uses these properties to identify negated units of text. Negation markers are tagged at the entity (concept or relation) level by assigning a negation attribute. Negation span is tagged at the path level with negation-begin and negation-end tags.

Japanese supports the negation attribute at the entity level, but because of the fundamentally different definition of paths in Japanese, path expansion is not supported. Therefore, negation for Japanese does not necessarily expand to all affected entities at the path level.

Using Negation Attributes

Negation analysis information can be used with the following methods:

You can specify the negation attribute ID using the $$$IKATTNEGATION macro, defined in the %IKPublic #Include file.
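
As a minimal sketch of how the macro can stand in for a hard-coded ID, the fragment below tests the first element of an attribute row against $$$IKATTNEGATION. The row is hand-built sample data (an assumption for illustration only), laid out like the rows processed in the first of the Negation Examples in this chapter.

```objectscript
#Include %IKPublic
  // Hypothetical row: type ID, type name, level, target ID, start, span
  SET row=$LISTBUILD($$$IKATTNEGATION,"negation",1,42,3,1)
  // Compare against the macro rather than the literal value
  IF $LISTGET(row,1)=$$$IKATTNEGATION {
    WRITE "negation starts at entity ",$LISTGET(row,5)," and spans ",$LISTGET(row,6),!
  }
```

Using the macro keeps the test readable and avoids depending on the numeric value of the attribute ID.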

Negation Attribute Structure

Negation is implemented in iKnow as an attribute. That is, sources, sentences, or paths that contain negation have the negation attribute. This attribute is a %List structure with the following elements:

  • Element 2 is the word “negation”

  • Element 3 is the entity position that contains the first negation marker. A negation marker can be part of a relation or a concept. For example, in “The White Rabbit usually hasn't any time.” the negation marker is in entity 3, the relation “usually hasn’t”. In “The White Rabbit usually has no time.” the negation marker is in entity 4, the concept “no time”. Note that for this position count non-relevant words (such as “the” and “a”) are counted as separate entities.

  • Element 4 is the scope of the negation as a count of entities. Negation scope is counted from the first entity containing a negation marker to the last entity containing a negation marker. For example, “The man is neither fat nor thin” has a negation scope of 3 entities: “is neither/fat/nor”.

  • Element 5 shows the position of the negation marker within the entity as a bit map. A “1” indicates a word that is a negation marker; a “0” indicates a word that is not a negation marker. A negation marker consisting of two adjacent words, such as “is not”, is indicated as “11”. Entity mapping stops when the negation marker has been indicated. For example, the relation “is often not” is “001”, while the relation “often is not” is “011”, and the relation “is not often” is “11”.
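
The elements described above can be read with $LISTGET. The following sketch uses a hand-built list for the sentence “The White Rabbit usually hasn't any time.” and does not call the iKnow engine. The values are taken from the descriptions above; Element 1 is not described in this section, so it is left empty as a placeholder.

```objectscript
  // Hand-built sample for "The White Rabbit usually hasn't any time."
  // Element 1 is not described above, so it is left empty here (assumption).
  SET att=$LISTBUILD("","negation",3,1,"01")
  WRITE "attribute: ",$LISTGET(att,2),!
  WRITE "first marker entity: ",$LISTGET(att,3),!
  WRITE "scope in entities: ",$LISTGET(att,4),!
  WRITE "bit map: ",$LISTGET(att,5),!
```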

Negation Bit Map

Element 5 is the negation bit map. It indicates where the negation markers are in the negation scope. When the negation scope is 1, this is a simple bit map. When the negation scope is greater than one, this is a series of bit maps separated by spaces, one bit map for each entity within the negation scope.

Within the negation scope, if an entity contains a negation marker, the negation marker and each word preceding it are each indicated by either a 1 (negation marker word) or a 0 (word preceding the negation marker). If an entity within the negation scope does not contain a negation marker, the whole entity is represented by a single 0. Note that non-relevant words, such as “a” and “the”, are considered to be separate entities. Some examples of negation bit mapping are shown in the following table:

Negation Bit Map   Sentence Text (with / entity dividers)
01 0 1             Bartleby / is neither / busy / nor idle.
01 0 1             Bartleby / is neither / sixty-five years old / nor retired.
1 0 1              Bartleby / is / no idler / and certainly is / no loafer.
11 0 0 01          Bartleby / is not / my / favorite fictional character / but neither is / he / my / least favorite.
1 0 0 01           Bartleby / isn’t / my / favorite fictional character / but neither is / he / my / least favorite.
001 0 0011         Bartleby / is either not / trying very hard / or he is not / succeeding.
11 0 0 0011        Bartleby / is not / a / wholly realistic character / and yet is not / wholly unbelievable.
1 0001             Bartleby / never works, / but he is never / wholly idle.
1 001              Bartleby / does / nothing / and yet never is / he / idle.
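
To make the mapping concrete, the following sketch walks a bit map string and picks out the marker words. It is an illustration only: the bit map and the entities within the negation scope are supplied by hand (taken from the first Bartleby sentence), and the variable names are hypothetical.

```objectscript
  // Bit maps and entities for the negation scope of
  // "Bartleby / is neither / busy / nor idle."
  SET bitmaps="01 0 1"
  SET entities=$LISTBUILD("is neither","busy","nor idle")
  FOR i=1:1:$LENGTH(bitmaps," ") {
    SET bits=$PIECE(bitmaps," ",i)       // bit map for entity i
    SET entity=$LISTGET(entities,i)
    // A bit map with no 1 means the entity holds no negation marker
    IF bits'["1" { WRITE entity,": no negation marker",!  CONTINUE }
    WRITE entity,": "
    FOR w=1:1:$LENGTH(bits) {
      // Each 1 marks the word at the same position within the entity
      IF $EXTRACT(bits,w)="1" { WRITE "[",$PIECE(entity," ",w),"] " }
    }
    WRITE !
  }
```

For this sample the sketch flags “neither” and “nor” as marker words and reports “busy” as containing no marker.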

The largest entity bit map is 8 bits. In rare cases a negation marker can be more than eight words from the beginning of its entity. If the negation marker is a two-word marker at positions 8 and 9, the second “1” is omitted (“00000001”); if the negation marker is at position 9 or greater, no bit map is returned. In the following examples the negation marker is in the second entity, a relation containing many words (due to the semantic ambiguity of the word “in”): “They start when you get in and are not finished when you leave.” maps as “00000011”; “They start when you get in and they are not finished when you leave.” maps as “00000001” (second word of the negation marker not mapped); “They often start when you get in and they are not finished when you leave.” returns no bit map.

You can determine whether a negation bit map has been omitted by comparing the Element 4 scope of negation entity count with the number of blank-separated bit maps in Element 5. If these two counts do not match, one or more negation entity bit maps are missing.
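
The comparison described above might be sketched as follows. The attribute list is hand-built sample data, not engine output, and the sketch counts only non-empty bit maps because this section does not say whether an omitted bit map leaves an empty placeholder (an assumption).

```objectscript
  // Hand-built sample: a scope of 2 entities with only one bit map present
  SET att=$LISTBUILD("","negation",2,2,"1")
  SET scope=$LISTGET(att,4)
  SET bitmaps=$LISTGET(att,5)
  // Count the non-empty, blank-separated bit maps
  SET count=0
  FOR i=1:1:$LENGTH(bitmaps," ") {
    IF $PIECE(bitmaps," ",i)'="" { SET count=count+1 }
  }
  IF count'=scope {
    WRITE scope-count," entity bit map(s) missing",!
  } ELSE {
    WRITE "all ",scope," entity bit maps present",!
  }
```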

Negation and Dictionary Matching

iKnow recognizes negated entities when matching against a dictionary. It calculates the number of entities that are part of a negation and stores this number as part of the match-level information (as returned by methods such as GetMatchesBySource() or as the NegatedEntityCount property of %iKnow.Objects.DictionaryMatch). This allows you to create code that interprets matching results by considering negation content, for example by comparing negated entities to the total number of entities matched.

For further details, refer to the Smart Matching: Using a Dictionary chapter of this manual.

Negation Examples

The following example uses %iKnow.Queries.SourceAPI.GetAttributes() to search each source in a domain for paths and sentences that have the negation attribute. It displays the PathId or SentenceId, the start position and the span of each negation. To limit %iKnow.Queries.SourceAPI.GetAttributes() to paths, specify $$$IKATTLVLPATH rather than $$$IKATTLVLANY:

#Include %IKPublic
  ZNSPACE "Samples"
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).NameIndexExists(dname))
     { WRITE "The ",dname," domain already exists",!
       SET domoref=##class(%iKnow.Domain).NameIndexOpen(dname)
       GOTO DeleteOldData }
  ELSE 
     { WRITE "The ",dname," domain does not exist",!
       SET domoref=##class(%iKnow.Domain).%New(dname)
       DO domoref.%Save()
       WRITE "Created the ",dname," domain with domain ID ",domoref.Id,!
       GOTO ListerAndLoader }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { WRITE "Deleted the data from the ",dname," domain",!!
            GOTO ListerAndLoader }
  ELSE    { WRITE "DropData error ",$System.Status.GetErrorText(stat),!
            QUIT }
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
QueryBuild
   SET myquery="SELECT TOP 100 ID AS UniqueVal,Type,NarrativeCause FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeCause")
UseLister
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.GetErrorText(stat),! QUIT }
UseLoader
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.GetErrorText(stat),! QUIT }
GetSourcesAndAttributes
   SET numSrcD=##class(%iKnow.Queries.SourceQAPI).GetCountByDomain(domId)
   DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.srcs,domId,1,numSrcD)
   SET i=1
   WHILE $DATA(srcs(i)) {
      SET srcId = $LISTGET(srcs(i),1)
      SET i=i+1
      DO ##class(%iKnow.Queries.SourceAPI).GetAttributes(.att,domId,srcId,1,10,"",$$$IKATTLVLANY)
      SET j=1
      WHILE $DATA(att(j)) {
          IF $LISTGET(att(j),1)=1 {
            SET type=$LISTGET(att(j),2)
            SET level=$LISTGET(att(j),3)
            SET targId=$LISTGET(att(j),4)
            SET start=$LISTGET(att(j),5)
            SET span=$LISTGET(att(j),6)
               IF level=1 {WRITE "source ",srcId," ",type," path ",targId," start at ",start," span ",span,!}
               ELSEIF level=2 {WRITE "source ",srcId," ",type," sentence ",targId," start at ",start," span ",span,!!}
               ELSE {WRITE "unexpected attribute level",! }
         }
     SET j=j+1
     }
    }

The following example uses %iKnow.Queries.SentenceAPI.GetAttributes() to find the sentences in each source in a domain that have the negation attribute. For each such sentence it displays the sentence ID and the entity position that contains the negation marker, and then displays the text of the sentence.

#Include %IKPublic
  ZNSPACE "Samples"
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).NameIndexExists(dname))
     { WRITE "The ",dname," domain already exists",!
       SET domoref=##class(%iKnow.Domain).NameIndexOpen(dname)
       GOTO DeleteOldData }
  ELSE 
     { WRITE "The ",dname," domain does not exist",!
       SET domoref=##class(%iKnow.Domain).%New(dname)
       DO domoref.%Save()
       WRITE "Created the ",dname," domain with domain ID ",domoref.Id,!
       GOTO ListerAndLoader }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { WRITE "Deleted the data from the ",dname," domain",!!
            GOTO ListerAndLoader }
  ELSE    { WRITE "DropData error ",$System.Status.GetErrorText(stat),!
            QUIT }
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
QueryBuild
   SET myquery="SELECT TOP 100 ID AS UniqueVal,Type,NarrativeCause FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeCause")
UseLister
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.GetErrorText(stat),! QUIT }
UseLoader
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.GetErrorText(stat),! QUIT }
GetSourcesAndSentences
   SET numSrcD=##class(%iKnow.Queries.SourceQAPI).GetCountByDomain(domId)
   DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.srcs,domId,1,numSrcD)
   SET i=1
   WHILE $DATA(srcs(i)) {
      SET srcId = $LISTGET(srcs(i),1)
      SET i=i+1
      SET st = ##class(%iKnow.Queries.SentenceAPI).GetBySource(.sent,domId,srcId)
      SET j=1
      WHILE $DATA(sent(j)) {
         SET sentId=$LISTGET(sent(j),1)
         SET text=$LISTGET(sent(j),2)
         SET j=j+1
CheckSentencesForNegation
         SET atstat=##class(%iKnow.Queries.SentenceAPI).GetAttributes(.att,domId,sentId)
         SET k=1
            WHILE $DATA(att(k)) {
             WRITE "sentence ",sentId," has attribute=",$LISTGET(att(k),2)
             WRITE ", marker at entity position=",$LISTGET(att(k),3),!
             /* Wrap the sentence text at 61 characters per line */
             WRITE sentId,": "
             SET x=1
             /* Round up so that a final partial line is also written */
             SET totlines=($LENGTH(text)+60)\61
               FOR L=1:1:totlines {
               WRITE $EXTRACT(text,x,x+60),!
               SET x=x+61 }
             WRITE "END OF SENTENCE ",sentId,!!
         SET k=k+1 }
    }
  }

Adding Negation Terms

You can specify negation for other specific words or phrases using the iKnow UserDictionary. Using the AddNegationTerm() method, you can add a list of negation terms to a UserDictionary. When source texts are loaded into a domain, all appearances of these terms are flagged with the negation marker.
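
A minimal sketch of adding a negation term follows. Only the AddNegationTerm() method name comes from this section. The dictionary name, the sample term, the single-string argument, and the %New() and %Save() calls are assumptions to verify against the %iKnow.UserDictionary class reference. Associating the UserDictionary with a configuration or domain before loading is not shown.

```objectscript
  // "MyNegDict" and "lacking" are illustrative values (assumptions)
  SET udict=##class(%iKnow.UserDictionary).%New("MyNegDict")
  DO udict.%Save()
  // Assumed signature: one term per call, returning a %Status
  SET stat=udict.AddNegationTerm("lacking")
  IF stat'=1 { WRITE "AddNegationTerm failed: ",$System.Status.GetErrorText(stat),! }
```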

Negation Special Cases

The following are a few peculiarities of negation in English:

  • No.: The word “No.” (with capital letter and period, quoted or not quoted) in English is treated as an abbreviation. It is not treated as negation and is not treated as the end of a sentence. Lowercase “no.” is treated as negation and as a sentence ending.

  • Nor: The word “Nor” at the beginning of a sentence is not marked as negation. Within the body of a sentence the word “nor” is marked as negation.

  • No-one: The hyphenated word “no-one” is treated as a negation marker. Other hyphenated forms (for example, “no-where”) are not.

  • False negatives: Because formal negation depends on words, not context, occasional false negatives inevitably arise. For example, the sentences “There was no answer” and “The answer was no” are both flagged as negation.

Because negation operates on sentence units, it is important to know what iKnow does (and does not) consider a sentence. For details on how iKnow identifies a sentence, refer to the Logical Text Units Identified by iKnow section of the “Conceptual Overview” chapter.

Sentiment

A sentiment attribute flags a sentence as having either a positive or negative sentiment. Sentiment terms are highly dependent on the kind of texts being analyzed. For example, in a customer perception survey context the following terms might be flagged with a sentiment attribute:

  • The words “avoid”, “terrible”, “difficult”, “hated” convey a negative sentiment.

  • The words “attractive”, “simple”, “self-evident”, “useful”, “improved” convey a positive sentiment.

Because sentiment terms are often specific to the nature of the source texts, iKnow does not automatically identify sentiment terms. You can flag individual words as having a positive sentiment or a negative sentiment attribute. By default, no words have a sentiment attribute. You can specify a sentiment attribute for specific words using the iKnow UserDictionary. Using the AddPositiveSentimentTerm() and AddNegativeSentimentTerm() methods, you can add a list of sentiment terms to a UserDictionary. When source texts are loaded into a domain, each appearance of these terms and the part of the sentence affected by it is flagged with the specified positive or negative sentiment marker.

For example, if “hated” is specified as having a negative sentiment attribute, and “amazing” is specified as having a positive sentiment attribute, when iKnow applies them to the sentence:

I hated the rain outside, but the running shoes were amazing.

negative sentiment would affect “rain” and positive sentiment would affect “running shoes”.
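
The two terms in this example could be registered along the following lines. This is a sketch under assumptions: the AddPositiveSentimentTerm() and AddNegativeSentimentTerm() method names come from this section, while the dictionary name, the single-string arguments, and the %New() and %Save() calls should be checked against the %iKnow.UserDictionary class reference. The UserDictionary must also be in effect when the sources are loaded, which is not shown here.

```objectscript
  // "MySentimentDict" is an illustrative name (assumption)
  SET udict=##class(%iKnow.UserDictionary).%New("MySentimentDict")
  DO udict.%Save()
  // Assumed signatures: one term per call, each returning a %Status
  SET stat=udict.AddNegativeSentimentTerm("hated")
  IF stat=1 { SET stat=udict.AddPositiveSentimentTerm("amazing") }
  IF stat'=1 { WRITE "Adding sentiment terms failed: ",$System.Status.GetErrorText(stat),! }
```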

Sentiment attributes are supported for the following languages: English, German, Portuguese, Russian, and Ukrainian. Sentiment attributes are not currently supported for Japanese.

Using Sentiment Attributes

Sentiment Analysis information can be used with the following methods:

You can specify a sentiment attribute ID using either the $$$IKATTSENPOSITIVE or $$$IKATTSENNEGATIVE macro, defined in the %IKPublic #Include file.
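
As an illustration, the macros can be used to turn an attribute type ID into a readable label. The sketch below assumes that the type ID is available as a plain value (for example, the first element of a returned attribute row); the variable names are hypothetical.

```objectscript
#Include %IKPublic
  // Hypothetical type ID, standing in for element 1 of an attribute row
  SET typeId=$$$IKATTSENNEGATIVE
  // Map the ID to a label; anything unrecognized falls through to the default
  SET label=$CASE(typeId,$$$IKATTNEGATION:"negation",$$$IKATTSENPOSITIVE:"positive sentiment",$$$IKATTSENNEGATIVE:"negative sentiment",:"other attribute")
  WRITE label,!
```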