Natalia Levshina

Quantitative linguistics, corpora and language universals


Statistics and R Tutorials

Some of my statistical models

Constructive Competition

Semantically similar constructions and lexemes are a challenge for linguistic theory and practice. According to the Principle of No Synonymy (Goldberg 1995), which is widely accepted in Cognitive Linguistics, if two constructions differ in form, they should also differ in function. Usage-based linguists have been able to find functional distinctions between near-synonyms in such tricky cases as the Saxon vs. Norman Genitive (Szmrecsanyi 2010), the double object dative vs. the to-dative (Bresnan et al. 2007), and the position of phrasal verb particles (Gries 2003). Their experience has shown that advanced multivariable statistical techniques are a necessity in this task.

My puzzle at the moment is Dutch causative constructions with the auxiliaries doen 'do' and laten 'let'. I'm using multiple logistic regression with fixed and random effects (Baayen 2008). In the graph below you can see the influence of some semantic and syntactic factors on the odds of doen vs. laten in a given context. The further to the right a factor lies, the higher the odds of laten. The strongest pro-laten and anti-doen factor is animacy of the causer (CrSem=Anim). Article submitted for publication.
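
To make the notion of "odds of laten" concrete, here is a minimal sketch of how the odds and the odds ratio for a single binary factor can be computed from a 2x2 table. The counts are invented purely for illustration and are not the data behind the graph; the names (counts, odds) are hypothetical. In a logistic regression with only this one predictor, the coefficient equals the log odds ratio computed here; in the full model with several fixed effects and random effects, the estimates are adjusted for the other terms and will differ.

```python
import math

# Hypothetical counts, invented for illustration only (not the author's data).
# Rows: semantic class of the causer; columns: auxiliary chosen.
counts = {
    "Anim":   {"laten": 80, "doen": 20},
    "Inanim": {"laten": 30, "doen": 70},
}

def odds(row):
    """Odds of laten vs. doen within one context."""
    return row["laten"] / row["doen"]

odds_anim = odds(counts["Anim"])
odds_inanim = odds(counts["Inanim"])

# Odds ratio: how much higher the odds of laten are with animate causers.
odds_ratio = odds_anim / odds_inanim

# Logistic regression works on the log scale; a positive value favours laten.
log_odds_ratio = math.log(odds_ratio)

print(f"odds (Anim)   = {odds_anim:.2f}")
print(f"odds (Inanim) = {odds_inanim:.2f}")
print(f"odds ratio    = {odds_ratio:.2f}")
print(f"log odds ratio = {log_odds_ratio:.2f}")
```

A positive log odds ratio corresponds to a point on the right-hand side of a plot like the one described above, a negative one to a point on the left.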


From Lexical Fields into Constructional Spaces

In the Structuralist era, systematic relationships between linguistic units were the focus of attention, which is why lexical fields were very popular. Early Cognitive Linguistics was occupied with polysemy and the relationships between senses, but interest in related words and constructions has been growing recently (see Constructive Competition). My model, based on multiple correspondence analysis, can deal with several related items at different levels of schematicity. The items - lexemes or constructions - are projected onto one conceptual space, which is constructed on the basis of usage events. The example below is a conceptual space of the German categories Stuhl 'chair' and Sessel 'armchair', based on data from online German furniture catalogues. The size of the symbols reflects the frequency of usage events. The analysis showed that Sessel is associated with comfort (e.g. a reclining back) and relaxation, whereas Stuhl is a more functional piece of furniture.


The second example shows the Dutch causative constructions with doen and laten in a two-dimensional conceptual space formed by different semantic and syntactic features. The map shows, for example, that doen is semantically restricted to causation patterns with mental caused events and inanimate causers (e.g. 'It makes me feel sad'). Article in preparation.
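
As a rough illustration of the input to multiple correspondence analysis, the sketch below codes a handful of usage events as an indicator matrix (one row per event, one column per category) and derives the Burt table of category co-occurrences. The events, variable names and values are hypothetical, loosely modelled on the furniture example, and the code covers only this data-preparation step. The analysis proper, which decomposes such a table to obtain the dimensions of the conceptual space, is not shown.

```python
# Hypothetical usage events, invented for illustration (not the catalogue data).
events = [
    {"lexeme": "Stuhl", "back": "upright", "use": "work"},
    {"lexeme": "Stuhl", "back": "upright", "use": "dining"},
    {"lexeme": "Sessel", "back": "reclining", "use": "relax"},
    {"lexeme": "Sessel", "back": "upright", "use": "relax"},
]

# One column per "variable=value" category, in sorted order.
categories = sorted({f"{var}={val}" for e in events for var, val in e.items()})

# Indicator matrix: 1 if the category applies to the usage event, else 0.
indicator = [
    [1 if cat in {f"{var}={val}" for var, val in e.items()} else 0
     for cat in categories]
    for e in events
]

# Burt table: how often each pair of categories co-occurs across events.
n_cat = len(categories)
burt = [
    [sum(row[i] * row[j] for row in indicator) for j in range(n_cat)]
    for i in range(n_cat)
]

print(categories)
for row in indicator:
    print(row)
```

Each row of the indicator matrix contains exactly one 1 per variable, and the diagonal of the Burt table holds the category frequencies, which is the kind of information that symbol size conveys in the maps described above.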



This line of research studies relationships between the formal, conceptual and social dimensions of language variation. For example, a quantitative analysis of Dutch causatives with doen and laten shows that some of the conceptual features increase the variation between Dutch in the Netherlands and in Belgium. The figure below demonstrates an interaction between Transitivity and Country. Transitive verbs in the Netherlands are on average less likely to occur with doen than in Belgium. Seen from the other perspective, the geographic difference between the countries is more pronounced for transitive verbs (the NL-BE slope is steeper for transitives than for intransitives). Article submitted for publication.
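
The interaction described here can be read as a difference between two differences on the log-odds scale. The sketch below works through that arithmetic with invented proportions of doen; the numbers and the names (p_doen, logit) are hypothetical and chosen only to reproduce the qualitative pattern, not the reported results.

```python
import math

def logit(p):
    """Log-odds of a proportion p (0 < p < 1)."""
    return math.log(p / (1 - p))

# Hypothetical proportions of doen (vs. laten), invented for illustration.
p_doen = {
    ("NL", "Intr"): 0.30,
    ("BE", "Intr"): 0.35,
    ("NL", "Tr"): 0.05,
    ("BE", "Tr"): 0.20,
}

# NL-BE "slope" for each verb type: change in log-odds of doen from NL to BE.
slope_intr = logit(p_doen[("BE", "Intr")]) - logit(p_doen[("NL", "Intr")])
slope_tr = logit(p_doen[("BE", "Tr")]) - logit(p_doen[("NL", "Tr")])

# The interaction is the difference between the two slopes.
interaction = slope_tr - slope_intr

print(f"NL-BE slope, intransitives: {slope_intr:.2f}")
print(f"NL-BE slope, transitives:   {slope_tr:.2f}")
print(f"interaction (difference):   {interaction:.2f}")
```

With these made-up figures the slope for transitives is the steeper one, so the interaction is positive; if the two slopes were equal, the lines in an interaction plot would be parallel and the interaction would be zero.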


Causatives of the World, Unite!

Different languages construe causation and causal relations in different ways. The plot below shows a semantic map of English causative constructions with the Dutch ones plotted on it (see From Lexical Fields into Constructional Spaces). Interestingly, some semantic distinctions that English expresses through variation in the form of the verb are made in Dutch with the help of prepositional marking of the causee. For instance, compare the English have + Ved (Past Participle) and the Dutch laten + Vinf + causee marked with the preposition door 'by' (laten_door_V) in the top left sector. Like living organisms, languages can adjust to the same challenges in structurally different ways. Article submitted for publication.


Acknowledgment: the graphics were produced with the help of the R language and environment.
The script for the 3D-graph was written by Dirk Speelman.

Copyright 2013 - Natalia Levshina
This page was last modified on: 13.01.2018