Telecom Paris
Dep. Informatique & Réseaux

J-L. Dessalles

May 2023


Cognitive Approach to Natural Language Processing (SD213)


Processing Aspectual Relations


Aspect is one of the most remarkable characteristics of human semantic competence. Any attempt to build clever machines will, sooner or later, have to reverse-engineer Aspect processing. You may understand a sentence like:

"Last year, when the birds flew by, the alarm went off"

as either a single event or a repetitive event (the latter is the only interpretation in French with the imperfect tense: L’année dernière, quand les oiseaux passaient, l’alarme se déclenchait.). You might even infer a causal link between the presence of the birds and the triggering of the alarm. How can machines achieve this?

The difficulty of the problem should neither be overestimated nor underestimated. The purpose of this lab is not to offer a definitive solution. It is rather to show that aspectual competence corresponds to an algorithm that is waiting to be reverse-engineered, and that this algorithm might not be that complex after all.

Aspectual correctness

The following sentences are all syntactically correct, but some of them seem odd from a semantic point of view.

English:
  • Mary will drink the glass_of_wine
    (= glass of wine = the content of the glass)
  • Mary will drink the glass_of_wine in ten seconds
  • Mary will drink the glass_of_wine in 2024
  • Mary will drink the glass_of_wine for one minute
  • Mary will drink the glass_of_wine during the show
  • Mary will drink water
  • Mary will drink water in one minute
  • Mary will drink water in 2024
  • Mary will drink water for ten seconds
  • Mary will drink water during the show
  • Mary will eat
  • Mary will eat in one minute
  • Mary will eat in one hour
  • Mary will eat in 2024
  • Mary will eat during (the year) 2024
  • Mary will eat for one hour
  • Mary will eat for one year
  • Mary will eat during the show
  • Mary will like the wine
  • Mary snored
  • Mary snored in ten minutes
  • Mary snored in 2010
  • Mary snored for ten minutes
  • Mary snored during the show
  • Mary liked the wine
  • Mary liked wine
  • Mary liked the wine in 2010
  • Mary likes the wine
  • Mary eats
  • Mary eats in one hour

French:
  • Pierre a mangé le gâteau
  • Pierre a mangé le gâteau en une minute
  • Pierre a mangé le gâteau en 2010
  • Pierre a mangé le gâteau pendant une minute
  • Pierre a mangé le gâteau pendant le spectacle
  • Pierre a mangé du gâteau
  • Pierre a mangé du gâteau en une minute
  • Pierre a mangé du gâteau en 2010
  • Pierre a mangé du gâteau pendant une minute
  • Pierre a mangé du gâteau pendant le spectacle
  • Pierre a mangé
  • Pierre a mangé en une minute
  • Pierre a mangé en 2010
  • Pierre a mangé pendant une minute
  • Pierre a mangé pendant le spectacle
  • Pierre a ronflé
  • Pierre a ronflé en une minute
  • Pierre a ronflé en 2010
  • Pierre a ronflé pendant une minute
  • Pierre a ronflé pendant le spectacle
  • Pierre aimait le gâteau
  • Pierre aimait le gâteau en 2010

The examples in English are stored in
The examples in French are stored in

Note that tense is marked using symbols such as _PP for present perfect or _PRET for preterite.

Aspect Examples
Indicate below the sentences (if any) which, according to you, are semantically odd.
(select them in the language in which you feel more comfortable)


Aspectual switches

Reverse-engineering aspectual processing proves quite hard. Yet all native speakers, even young ones, deal with aspect effortlessly and without errors. Recent progress has been made in this area, thanks to the work of linguists and computational linguists (such as a former PhD student, Damien Munch). Fundamental concepts underlying the processing of Aspect are presented below. They take the form of aspectual switches.


You probably decided that a sentence such as "Mary will drink water in one minute" makes sense if she starts drinking one minute from now, but not if "one minute" measures the duration of the drinking activity. The problem seems to be topological in nature. Let’s use the classical open/closed distinction. Intuitively, ‘drinking water’ is like an open set, whereas ‘in one minute’ would correspond to a closed set. Similarly, the future tense (FUT) puts the event into a closed set. If we follow this intuition, we expect an incompatibility between "drink water" and "in one minute".

We introduce the aspectual switch viewpoint, which may take two values: f and g. These values correspond to the notions of closed and open sets respectively, or to the notions of figure and ground, or (for linguists) to the notions of telic and atelic. A figure is telic, i.e. perceived as a whole, whereas a ground is atelic, i.e. perceived from the inside. Another way of making the distinction is to say that grounds are self-similar (part of a ground is still the same ground), whereas part of a figure is no longer that figure.

Considering self-similarity, would you associate the following verbs rather to figures or to grounds?

snore     figure ground it depends
sneeze    figure ground it depends
remove    figure ground it depends
walk      figure ground it depends
send      figure ground it depends



Our second aspectual switch has to do with location in time. Time periods such as ‘next week’, ‘one week’, ‘2024’, ‘one year’ behave differently. Consider the sentences:
  1. the market will collapse next week (= at some moment of next week)
  2. the market will collapse in one week (= it will take one week / after one week)
  3. the market will collapse in 2024 (= at some moment in 2024)
  4. the market will collapse in one year (= it will take one year / after one year)
What is the difference, for instance, between ‘2024’ and ‘one year’?
Answer: the former is anchored in time, while the latter may be located anywhere.

We introduce the aspectual switch anchoring to capture this difference. It may take two values: anc:1 (anchored) and anc:0 (unanchored).

The interpretation of "in/during p" as "at some moment of/in/during p" is only possible with an anchored period.
The implicit interpretation of "in" as meaning "after" is only possible with an unanchored period.

Using these criteria, decide which of the following expressions should be regarded as anchored (try to prefix each phrase with "She died during").

the exhibition    anchored unanchored None
an exhibition     anchored unanchored None
a day             anchored unanchored None
that day          anchored unanchored None



The Occurrence switch may take two values: sing or mult. It indicates whether the event is considered to occur once or several times.


Much of aspectual processing is triggered by duration conflicts. For instance in: there is a mismatch between the duration of the event (to marry, to use a bike) and the duration of the time frame (a whole year). This triggers further processing, such as slicing in the first case (at some moment in 2024) or repetition in the second case (several times in 2024).

Typical durations are stored in the lexicon (we know that a typical meal lasts for about one hour). In the implementation, durations are stored in the feature Duration as log10(<duration in seconds>). For instance, dur:1 means that the event typically lasts for 10 seconds.
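As a quick check of this log-scale encoding (the helper names below are our own, not part of the lab code):

```python
import math

def to_log_duration(seconds):
    """Encode a duration in seconds as its base-10 logarithm."""
    return math.log10(seconds)

def to_seconds(log_dur):
    """Decode a log-encoded duration back into seconds."""
    return 10 ** log_dur

print(to_seconds(1))                    # dur:1 -> 10 seconds
print(round(to_seconds(3.5)))           # dur:3.5 -> ~3162 s, i.e. about 53 minutes
print(round(to_log_duration(600), 1))   # 'ten minutes' -> dur:2.8
```

This explains why feature structures shown later carry values such as dur:2.8 for ‘ten minutes’.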

The duration feature may behave like a genuine aspectual switch, as some operations (slicing and predication) generate events that lack duration. For instance, She drank alcohol! (in a context in which the fact was unexpected/wished/feared) loses its durativity. Similarly, at some point in 2024 has no duration either. These periods are marked using a nil duration marker, for instance: dur:nil(0.3). The duration aspectual switch prevents durative events from being matched with non-durative ones. However, in She drank alcohol in 2024, both nil durations can be matched.

Aspectual operators

Phenomena like aspectual coercion (see bibliography) reveal that aspect may be changed dynamically through at least four unary operators: zoom, repeat, slice, predication. Let’s consider them in turn.


The effect of zoom is to transform an f into a g.
dp   vwp:f occ:sing anc:0 dur:D   →   vwp:g

In the implementation, this operator applies only to determiner phrases (dp).


Consider sentences like:
  1. When Mary was building the wall, Peter was sick.
    Quand Marie construisait le mur, Pierre était malade.
  2. When Mary was building the wall, Peter was cooking.
    Quand Marie construisait le mur, Pierre faisait la cuisine.
In 2, we tend to consider that Peter cooked several times. This comes from the fact that there is a mismatch between the duration of wall-building (one week or more, typically) and the duration of cooking (about one hour).

We introduce the dynamic aspectual operator repetition. Its first effect is to change a figure f, once repeated, into a ground g. The second effect of repetition is to change the value of the occ switch from sing to mult.

vp/vpt   vwp:f anc:0 occ:sing dur:D   →   vwp:g occ:mult dur:min(D)

Note that in the implementation, repetition affects only vp (verb phrases) or vpt (i.e. a verb phrase possibly followed by a prepositional phrase (vp [+ pp])). The mention dur:min(D) is used to force the repetition to last longer than the repeated event.
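The rule above can be sketched as a function over feature structures represented as plain dicts (a minimal sketch of our own, not the lab's actual code):

```python
def repeat(cat, features):
    """Turn a singular figure into a repeated ground (the 'repeat' rule)."""
    if cat not in ('vp', 'vpt'):
        return None                        # only vp/vpt can be repeated
    if (features.get('vwp') != 'f' or features.get('anc') != 0
            or features.get('occ') != 'sing'):
        return None                        # rule preconditions not met
    new = dict(features)
    new['vwp'] = 'g'                       # a repetition is a ground
    new['occ'] = 'mult'                    # it occurs several times
    new['dur'] = ('min', features['dur'])  # it lasts longer than one occurrence
    return new

# 'cook' (about one hour, dur:3.5) repeated during wall-building
print(repeat('vp', {'vwp': 'f', 'anc': 0, 'occ': 'sing', 'dur': 3.5}))
# → {'vwp': 'g', 'anc': 0, 'occ': 'mult', 'dur': ('min', 3.5)}
```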
Which examples among the following may reasonably receive a repetitive interpretation?

    She looked at her watch during the meeting
    She will visit him before he leaves
    He will buy carrots while she is parking



A sentence like:
is interpreted as the fact that she will marry some day in 2024. The durative interpretation is blocked by the fact that a wedding cannot last one year (compare with "she will be sick in 2024") and that one usually does not marry several times in the same year (compare with "she will use/be using her bike in 2024"). The idea of "at some moment in 2024" or "some day in 2024" presupposes that 2024 has been transformed into a temporal slice of itself.

We introduce the dynamic operator slice that transforms an anchored durative period into a singular non-durative figure.

pp   vwp:f anc:1 occ:sing dur:D   →   vwp:f anc:1 dur:max(D)

In the implementation, only anchored temporal complements (pp) can be sliced. Note the use of max to indicate that a slice must last less long than the container.
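In the same dict-based style as before, the slice rule can be sketched as follows (our own toy version, not the lab code; the dur:7.5 value for ‘one year’ is our estimate on the log scale):

```python
def slice_period(cat, features):
    """'in 2024' -> 'at some moment in 2024' (the 'slice' rule)."""
    if cat != 'pp':
        return None                        # only temporal pp can be sliced
    if (features.get('vwp') != 'f' or features.get('anc') != 1
            or features.get('occ') != 'sing'):
        return None                        # rule preconditions not met
    new = dict(features)
    new['dur'] = ('max', features['dur'])  # the slice is shorter than the container
    return new

# '2024' is an anchored figure lasting about one year (dur:7.5)
print(slice_period('pp', {'vwp': 'f', 'anc': 1, 'occ': 'sing', 'dur': 7.5}))
# → {'vwp': 'f', 'anc': 1, 'occ': 'sing', 'dur': ('max', 7.5)}
```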
Which examples among the following are likely to involve a slice?

    Her phone rang during the show
    She was happy during these years
    She took dozens of photos during the walk


Some sentences have a strange behaviour. For instance: seems odd at first reading. ‘Eating soup’ is a ground; it cannot be assigned to a delimited period of time (figure).
Then, an admissible interpretation pops up: that she will start eating soup after two minutes and 30 seconds.
In the current implementation, this behaviour is generated by the slice operator. minutes, for instance, will correspond to two entries in the lexicon: the usual indication of duration, and another meaning which refers to "minutes from now". The latter is an anchored figure, while the former is unanchored.


Our fourth aspectual operator is predication. Predication lies ‘at the top’ of linguistic expression. It could be said that the purpose of most sentences is to produce a predicate that will receive an attitude (the point of the sentence). "Peter likes Mary" translates easily into like(Peter, Mary) (we assume here that a convenient predicate like is waiting in the semantic knowledge, ready to be associated with the word "like").

Consider the situation described by "drink water". One may imagine that the drinking action proceeds through time, lasting about, say, 4 seconds. If the relevance of the situation is due to the fact that it is opposed to "drink alcohol", then it loses its temporal nature. In such a context, "to drink water" may mean "not to drink alcohol". Predication comes with an implicit attitude and an implicit negation. If you say "she will dress in brown", the predication of "dressed in brown" comes with the presupposition that her not dressing in brown was (un)expected/wished/feared. Same thing for "She will be dressed!" (as opposed, for example, to the feared/wished/expected event "She will show up naked").

We introduce a dynamic aspectual operator predication that may apply to certain phrases. Here, predication will be indicated by an exclamation mark.
Consider the sentences:     

  1.     she will !(snore) during the show
  2.     she will !(snore for ten minutes)
In 1, predication concerns "snore", which must be seen as unexpected/wished/feared, to match a slice of "the show" (there is also a repetitive interpretation).
In 2, predication concerns the whole phrase "snore for ten minutes". In this case, it is rather the duration that must be seen as unexpected/wished/feared.

One effect of predication is to convert events into non-durative figures (f). If we say about a vegetarian person:

it would be odd to add "during ten minutes". Once the point has been made about eating meat (vs. not eating meat), duration can no longer be applied.
vp/vpt   anc:0 dur:D   →   vwp:f anc:1 dur:nil(D)

In the implementation, predication affects only vp (verb phrases) or vpt (i.e. a verb phrase possibly followed by a prepositional phrase (vp [+ pp])).
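Following the rule above, predication can be sketched in the same dict-based style (our own sketch, not the lab code; the dur:0.6 value for a 4-second drinking event is our estimate):

```python
def predication(cat, features):
    """Turn an event into a non-durative figure carrying an implicit
    attitude (the 'predication' rule)."""
    if cat not in ('vp', 'vpt'):
        return None                        # only vp/vpt can be predicated
    if features.get('anc') != 0:
        return None                        # rule precondition not met
    new = dict(features)
    new['vwp'] = 'f'                       # the predicate is a figure
    new['anc'] = 1
    new['dur'] = ('nil', features['dur'])  # durativity is cancelled
    return new

# 'drink water' (about 4 s, dur:0.6) once predicated: !(drink water)
print(predication('vp', {'vwp': 'g', 'anc': 0, 'dur': 0.6}))
# → {'vwp': 'f', 'anc': 1, 'dur': ('nil', 0.6)}
```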
Predication (1)
In the sentence:
  • she will sneeze for ten minutes
we understand that she will sneeze repeatedly. Which of the following are figures?

will sneeze
for ten minutes
sneeze for ten minutes
!(sneeze for ten minutes)


Processing Aspect

Two implementations are proposed here, one in Prolog and the other in Python. Both yield roughly the same results. The Python version reads two Prolog files: the lexicon file and the grammar file.

Discovering lexical structures

Observe the small lexicon in As you can see, ‘eat’ has several definitions that differ (in part) by their syntactic category. For instance,

lexicon(eat, vp, [vwp:f, im:eat_meal, dur:3.5]).

Here, vp means that the verb can be seen as a verb phrase on its own and does not expect any complement.

The feature vwp means viewpoint. Its value may be f (figure) or g (ground).

The feature im stands for ‘image’. It would ideally refer to some perceptive representation of the scene. Here, it will just consist of nested textual labels.

The feature dur represents the typical duration (when applicable) as a base-10 logarithm in second units. For instance, the above duration noted 3.5 represents 10^3.5 ≈ 3162 sec. ≈ 53 min.

Draw (1)
Introduce new lexical entries for the verb ‘draw’, as it is used in "draw a circle".
Don’t forget to indicate duration, viewpoint and image.


Draw (2)
Introduce another lexical entry for the verb ‘draw’, meant as an activity (without complement).
Don’t forget to indicate duration, viewpoint and image.


Discovering the grammar

Observe now the small grammar that is used to parse our examples. It is expressed using simple DCG rules. For instance:

vpt --> vp, pp.

This rule means that a vpt is a verb phrase vp followed by a prepositional phrase pp.
Note that the grammar is binary (no more than two items on the right-hand side of rules). This is meant to represent the action of the syntactic merge operation.
Note also the use of ip (inflection phrase), of tp (tense phrase) and of dp (determiner phrase), in accordance with modern linguistics.

Semantic merge

The most central component of the program is the "semantic merge", which is triggered whenever two phrases are syntactically merged. The basic semantic merge consists of matching the feature structures of the two merged phrases.
Consider for instance the prepositional phrase in ten minutes.
The preposition in corresponds to the feature structure [vwp:f].
The phrase ten minutes corresponds to the feature structure [vwp:f, anc:0, im:10_minutes, dur:2.8].
You can see that both structures match by executing the Prolog program or the Python program

Note: The Prolog version is currently disabled.
Only the Python version is up to date.

In prolog:
?- test(pp).
---> in ten minutes
The sentence is correct [pp([vwp:f, anc:0, im:10(minute),dur:2])]

To execute the Python version, you need:

In Python:

Input a sentence > [press Enter]
or maybe a phrase > in ten minutes

__ pp: (vwp:f, anc:0, dur:2.8)
__ pp: (vwp:f, anc:1, dur:2.8)

The merging of structures is found in (matchFS) or in (FeatureStructure.merge).
Viewpoint, anchoring and multiplicity are merged based on identity through unification. Duration is merged by checking duration compatibility.
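As a rough illustration of this matching step, here is a toy merge of our own (not the lab's actual matchFS / FeatureStructure.merge; the duration tolerance of one log unit is a made-up simplification):

```python
def merge(fs1, fs2, tol=1.0):
    """Match two feature structures (dicts): vwp/anc/occ must agree
    when both are specified; durations must be close enough."""
    merged = {}
    for key in ('vwp', 'anc', 'occ'):
        v1, v2 = fs1.get(key), fs2.get(key)
        if v1 is not None and v2 is not None and v1 != v2:
            return None                    # identity clash: the merge fails
        merged[key] = v1 if v1 is not None else v2
    d1, d2 = fs1.get('dur'), fs2.get('dur')
    if d1 is not None and d2 is not None and abs(d1 - d2) > tol:
        return None                        # durations incompatible (log scale)
    merged['dur'] = d1 if d1 is not None else d2
    return merged

# 'in' [vwp:f] merges with 'ten minutes' [vwp:f, anc:0, dur:2.8]
print(merge({'vwp': 'f'}, {'vwp': 'f', 'anc': 0, 'dur': 2.8}))
```

A merge of a figure with a ground, by contrast, fails: merge({'vwp': 'f'}, {'vwp': 'g'}) returns None.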


Semantic merge is not limited to mere feature matching. The "intelligent" part of aspectual processing lies in the set of aspectual operators (repeat, slice, predication...). These operators are implemented as "rescue" operations.

In Prolog, the rescue procedure is found in It is called repeatedly at each backtrack.
In Python, the rescue procedure is in as a method in the class WordEntry.
The point of rescue is to transform the aspectual representation of the current frame through slicing, repetition or predication when applicable. Note that the applicability of operators depends on the syntactic category (e.g. only pp can be sliced).
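The rescue idea can be sketched as follows (a minimal sketch of our own, with a toy operator; the real rescue procedures are richer):

```python
def rescue(cat, features, operators):
    """When a plain merge fails, try each aspectual operator in turn
    and collect the alternative readings it produces."""
    alternatives = []
    for op in operators:
        result = op(cat, features)
        if result is not None:             # the operator was applicable
            alternatives.append(result)
    return alternatives

# A toy operator: repetition only applies to singular vp/vpt
def repeat(cat, features):
    if cat in ('vp', 'vpt') and features.get('occ') == 'sing':
        return dict(features, vwp='g', occ='mult')
    return None

print(rescue('vp', {'vwp': 'f', 'occ': 'sing'}, [repeat]))
# → [{'vwp': 'g', 'occ': 'mult'}]
```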

Playing with the program

Execute the program by running or

To run the program on French sentences, open and comment/uncomment the first two lines to change the language.
In Python, comment/uncomment two lines in the main procedure of
This will load instead of and instead of

At this point, the program correctly interprets some sentences. The interpretation of examples is given as a kind of paraphrase.

For instance, in Prolog:
?- test.
Sentence ---> Mary will drink a glass_of_wine

== Ok. f.d [occ:sing, dur:0.9] ---> in the future Mary ingest this glass_of_wine

and in Python (no paraphrase in Python):
__ s: (vwp:f, anc:1, occ:sing, dur:0.9)
    ingest(Mary,1_glass_of_wine) loc(+)

Now test drink a glass_of_wine in ten seconds.
In Prolog, you have to call:
?- test(vpt).
Sentence ---> drink the glass_of_wine in ten seconds
and you have to press ; (semicolon) after true to get all interpretations.

In Python:
Input a sentence > [press Enter]
or maybe a phrase > drink a glass_of_wine in ten seconds

Will Drink
You should get two interpretations for drink a glass_of_wine in ten seconds.
Which aspectual operator is involved in this difference of interpretation?

zoom    slicing
predication    repetition


Consider the sentence (in Prolog, simply type test. to test a sentence):
Mary _PRES drink a glass_of_wine

You should get only one interpretation.
Which aspectual operator did apply here?

zoom    slicing
predication    repetition


Let’s replace 🍷 by 💧. Try the sentence Mary will drink water for ten seconds. You shouldn’t get any interpretation.
Why does this sentence get rejected? If we try the following:

Input a sentence > [press Enter]
or maybe a phrase > drink water for ten seconds
__ vpt: (vwp:g, anc:0, dur:0.9)
    ingest(_,water) dur(for_10_seconds)

(in Prolog, call ?- test(vpt). ; in Python, just type in the phrase).
Phrase of type vpt ---> drink water during one minute
== Ok. g.u [occ:sing,dur:0.9] ---> drink_some zoom 1 minute water

we get a correct interpretation of the phrase as a ground (g).
If we look at the definition of will in, we can see that it corresponds to a figure f:

lexicon(will, t, [anc:_, vwp:f, im:+]).

This explains the mismatch.
Yet, we feel that Mary will drink water for ten seconds should receive some kind of interpretation. If you think about it, you will observe that it is only true if "drink water for ten seconds" can be regarded as unexpected/wished/feared (e.g. if it is a feat, if Mary does not drink enough, or if there is not enough water for all). In other words, the sentence is acceptable if "drink water for ten seconds" translates into a predicate that receives an attitude. As such, it loses its temporality.

Open and uncomment the call to predication in the rescue0 clause.
In Python, augment the list of operators as asked in WordEntry.rescue in
Now, the program is able to assign an attitude to vpt through the predication operation. Predication is indicated with an exclamation point !... in the output. Now we get several interpretations for the phrase drink water for ten seconds, including

__ vpt: (vwp:g, anc:0, dur:0.9)
    ingest(_,water) dur(for_10_seconds)

__ vpt: (vwp:f, anc:0, dur:nil(0.9))
    !ingest(_,water) dur(for_10_seconds)

Drink water (1)
Test the sentence "Mary will drink water for ten seconds" and verify that it now receives some interpretations. Why is it so?

Because predication cancels durativity and matches the future tense
Because predication cancels durativity and allows matching with ‘ten seconds’
Because predication generates a figure that matches the future tense


Before enabling predication, a sentence like Mary will drink water was rejected. Now that we have predication, it is accepted as: __ s: (vwp:f, anc:1, dur:nil(0.9))    !ingest(Mary,water) loc(+).

Drink water (2)
Try the sentence Mary will drink water in 2028.
It is rejected. Why?

Because of a viewpoint mismatch
Because of an anchoring mismatch
Because of a viewpoint and an anchoring mismatch
Because of a duration mismatch
Because of a multiplicity mismatch
None of the above


To get the sentence Mary will drink water in 2028 accepted, visit the source again to enable the slice rescue operation. In Prolog, delete the fail, line in the clause of slice. In Python, add ‘slice’ to the list of rescue operators, as you did for predication.

Drink water (3)
Why is the sentence Mary will drink water in 2028 now accepted?
(you may analyze the pp phrase in 2028 and see that there is now a new alternative)
To answer this question, you may have to locate the compare procedure in the Duration class to see how a Nil duration (due to predication) compares with a max duration (due to slicing).

Because a sliced period matches any duration
Because a sliced period matches any shorter duration
Because a sliced period matches any longer duration
Because a sliced period matches any nil duration
Because a sliced period matches any shorter nil duration
Because a sliced period matches any longer nil duration


Going further

In order to exercise all the program’s features, you might test it on the full set of sentences by typing:
python Output.txt 1

and reading the content of Output.txt (the trailing 1 is to set the trace level to 1).

You may then try to introduce new lexical words (as you did with ‘draw’) and new aspectual words such as ‘after’, try sentences with the imperfect tense, explore sentences in French, implement the lexicon and grammar of another language, and so on. Note that you can adapt the trace level to see more detail.


This implementation of aspectual processing is provided as an illustration. It is not yet perfect and it is not complete. You may want to improve it on some aspects (no pun intended).

You might wish to update the Prolog version. Missing Prolog files and are accessible (prefixed by ‘___’ on the server).



