Methods: We will make use of a combination of off-line and on-line dependent measures in pursuing the objectives outlined in (a)-(c). Regarding (a), the aim is to establish what facts are to be explained, in part through systematic and well-controlled large-scale surveys of informant judgements about key cases. The main methods used for this phase will include the inference task (van Tiel et al. 2016, Chemla 2009, Chemla & Spector 2011, Gotzner & Romoli 2017 a.o.). The inference task involves presenting an utterance and a possible inference from it, and asking participants whether, or to what extent, they would conclude the latter from the former. The design will involve manipulating the factors potentially at play, discussed above (e.g. complexity/cost, salience, informativeness, discourse inferential factors, etc.). Items will be normed using several novel tasks, including a connective cloze task to elicit intuitions of parallelism or contrast for items like (9-10). In addition to testing items with widely discussed scalar terms as in (1-4), our survey will cover less discussed adjectival scales (see below) and ad hoc scales (illustrated in (9-10) above). In adopting the inference task, we will take into account several methodological considerations that emerge from the recent literature on SI. In particular, it has been established that different scalar expressions give rise to different rates of scalar implicature (Scalar Diversity). The distinctiveness of alternatives (van Tiel et al. 2016), scale structure (Gotzner et al. 2018) and the prior likelihood of different lexical enrichments (Sun & Breheny, under review) have been shown to affect the likelihood that a participant will respond that a scalar inference is present. All of these factors will be considered in designing and developing the experiments related to (a). In particular, in investigating the case of gradable adjectives mentioned above, we aim to build on van Tiel et al. (2016) and Gotzner et al. (2018) and further explore the factors underlying variation across different adjectives. The goal is to establish clearly the role of scale structure vs. other factors in the derivation of their inferences, which will in turn be crucial for testing the predictions of the main approaches to alternatives.

Another question raised in the theoretical literature, mentioned in relation to (7) above, concerns whether conversational expectations of explicitness may play a role in the availability of an implicit inference. We have conducted initial pilot work manipulating whether, in filler items, the speaker is normally fully explicit in depicting the state of affairs or not. For example, a filler item might explicitly specify that A is the case, but not B, or it might only mention that A is the case, leaving it implicit that B is not. The initial results show that in the explicit condition, participants are less liable to derive the scalar inference. Our aim is to extend this initial pilot and further explore to what extent explicitness is really a factor in these critical data points. The main outcome of objective (a) will be a large body of data on the strength or weighting of different factors.

Regarding (b) and (c), our aim is to establish the best way to integrate the different factors into a single framework for computing alternatives. In particular, we ask what kind of adjustments to the Structural and the RSA accounts might provide a better framework for understanding Alternatives. In the course of theory evaluation, we will make the innovative step of using experimental paradigms that tap into implicit on-line processing in order to test hypotheses about the relative weightings of these factors. In one paradigm, we will use visual world eye-tracking, which has been shown to provide a measure of participants’ anticipatory inferences (Altmann & Kamide, 1999; Breheny, Ferguson & Katsos 2013). In particular, we aim to exploit the paradigm from Tian, Ferguson & Breheny (2016), where participants’ gaze is tracked while they view two contradictory states of affairs (e.g. a washed bowl and a dirty (not washed) bowl). This study reveals how participants’ gaze shifts between these states as they move from representing the context of a negative sentence (the positive state of affairs) to its intended meaning (the negative state of affairs). In pilot work on SI, with a context sentence like ‘John finished his drink and washed his bowl. What about Bill?’ we present four images: full glass, empty glass, clean bowl, dirty bowl. We found that when participants process the answer segment, ‘Bill finished his drink.’, they first attend to images consistent with the alternative (empty glass, clean bowl) and then their gaze shifts to the implicature-consistent state (dirty bowl). Using this effect, we can test models that combine factors contributing to the strength of the alternative by probing looks to potential alternatives, as well as looks to the SI-consistent state. The innovation here will be to use gaze toward the SI-state as a measure of the strength of implicature (linked to off-line results from the inference task). 
The gaze data towards the alternative would then give us a direct measure of the strength of the respective symmetric alternatives. The expectation is that high rates of SI in off-line studies will be reflected in a strong ultimate bias toward the SI state, while low rates of SI will be linked to a low SI-state bias. Also, a strong SI bias will be linked to clear prior looks to a single Alternative state, while a lower SI bias will be linked to gaze data reflecting a competition between symmetric alternatives. By manipulating factors such as direction of inference (8-9), cost and informativity via context stimuli, we expect to be able to link SI rates to a more direct measure of the strength of competing alternatives. These data, together with data from priming studies (see below), should provide better information on the relation between the factors being considered as contributing to the strength of Alternatives and rates of SI. In turn, these data will provide a test for competing models based on different theoretical assumptions.

Discourse factors and motivating experimental work

To take one example, discourse saliency is a factor which plays a key role in the structural theory. For example, Fox & Katzir (2011) argue that constituents made salient in the discourse can be treated as if they were no more complex than a constituent in the assertion. This can explain why (6), due to Matsumoto (1995), implies that yesterday it was not a little bit more than warm.
(6) Yesterday it was warm, today it was a little bit more than warm.
The appeal to salience leads to predictions that are important to verify, yet introspective judgements are reported to be delicate. In particular, Fox & Katzir (2011) argue that a salient constituent can create a situation where it competes with a lexical alternative and so creates symmetry, leading to no overall implicature. Thus the claim is that (7) does not imply that Alice did not do just some of her homework yesterday; the reason for the absence of this inference would be the presence of the lexically generated symmetric alternative (all), which cannot be ignored. (7) is compared to (8), for which there would be no symmetric alternatives, given the presence of the universal deontic modals, and which would instead involve an inference that yesterday Alice didn’t have to do all of her homework and she didn’t have to do just some of the homework.
(7) Yesterday, Alice did some of her homework. Today, she did just some of her homework.
(8) Yesterday, Alice had to do some of her homework. Today, she has to do just some of her homework.
Examples such as (7) versus (8) are important evidence for the structural theory’s argument that symmetry is determined by structural factors alone and cannot be broken by contextual salience. However, introspective judgments about them and related data points are very delicate, particularly as (7) is generally judged to be infelicitous compared with (8) (and none of the approaches to alternatives offers an explanation of its infelicity). 
It is possible that conversational expectations related to explicitness have an impact in these cases. It is also possible that this data point brings in other factors that have been unearthed in the Scalar Diversity literature, mentioned below. We therefore think it is crucial that data points like (7) vs (8) are investigated experimentally, so that the role of discourse saliency, in contrast with other factors, is brought out clearly and better understood. As another example, the direction of discourse inference also seems to play a role in a theory of alternatives. To illustrate one instance, consider the following example from Trinh & Haida (2016). Suppose someone says (9):
(9) John ran and didn’t smoke. Bill ran.
This utterance can have an implication that Bill smoked, presumably arising from the negation of the alternative Bill ran and didn’t smoke, made salient in the context. How to obtain this alternative without at the same time having to consider its less complex, symmetric counterpart (Bill ran and smoked) is entirely non-trivial and constitutes another important challenge for theories of alternatives (see Trinh & Haida 2016, Breheny et al. 2017 for discussion). Here, we also focus on another aspect of the problem. Consider a minimally different utterance:
(10) John ran but didn’t lift weights. Bill ran.
As observed by Trinh & Haida (2016), (10) does not seem to have an SI that Bill lifted weights. This pair of examples suggests that the direction of the contextual implications of the predicates “not smoke” and “not lift weights” matters for the selection of alternatives. That is, “not smoke” implies the same kind of health benefits as “run”, while “not lift weights” and “run” potentially contrast in health benefits. 
Regarding (9), Trinh & Haida (2016) show that simply positing an ad hoc scale (<run and not smoke, run>) for (9), building on a proposal by Klinedinst (2004), makes incorrect predictions when a competing alternative run and smoke is considerably less complex. While we can agree with Trinh & Haida that it is insufficient to suppose simply that an ad hoc scale is created by context for (9), the contrast with (10) remains unaccounted for in their proposal. Indeed, it seems natural that parallelism of discourse implication may add to the benefit (or reduce the cost) of alternatives, compared with contrasting implications. A factor of this kind may then trade off with others such as complexity and informativeness, and that could be the source of contrasts like (9) vs (10). An important aim of our project is to explore whether factors that encompass ‘parallelism of implications’ vs ‘contrast’ between alternatives indeed play a role in cases like the above. If so, this would give us a more complete picture of the factors underlying the different cases of the symmetry problem. In addition, as discussed in Breheny et al. (2017), Trinh & Haida’s proposal for a modified structural account of examples like (8) does not adequately cover similar cases of this type. For one, it does not handle minimally different examples without conjunction, suggesting that a different way of re-thinking the structural approach is necessary. Similarly, the RSA approach, as it stands, does not even handle the basic case in (9), and thus it also requires modifications to reach empirical adequacy with data points like these. In addition, as discussed in Spector (2017), this approach makes a series of specific predictions which also require experimental testing. Finally, another case discussed in Breheny, Klinedinst, Romoli & Sudo (2017) involves adjectives and their possible inferences. In particular, some negative adjectives appear to give rise to scalar inferences, (11), while others do not, (12). 
(Note that contrasts between cases like full and transparent show that scale structure does not fully predict the strength of the scalar inference (Gotzner et al. 2018), as both of these adjectives are associated with upper (and lower) bounded scales (Kennedy & McNally, 2005).)
(11) a. The cup is not full ~> The cup is not empty
b. A tie is not required ~> A tie is allowed
c. Mary’s promotion is not certain ~> Mary’s promotion is possible
(12) a. John is not tall ~/~> John is not small
b. This neighborhood is not safe ~/~> This neighborhood is not dangerous
c. The glass is not transparent ~/~> The glass is not opaque
These cases are problematic for all approaches to alternatives. They also, however, involve delicate comparative judgments about subtle inferences. Careful experimental investigation of this area is therefore necessary as a further step in understanding the empirical landscape related to the symmetry problem.

In sum, the symmetry problem remains an important challenge for theories of alternatives and, in turn, theories of scalar implicatures. As soon as we move away from the basic case of “some”, none of the approaches available in the literature appears able to account for all the various data points involving symmetry. Further, the divergent predictions of such approaches have for the most part not been experimentally tested. We aim to make progress in this area by bringing experimental evidence to bear on those predictions, and by comparing, developing and modifying the different theories on the basis of those results. In particular, the objectives of the project are outlined below:

Objectives:
(a) To establish a large-scale database of facts to be explained, so that theoretical discussion is less reliant on subtle introspective judgements.
(b) To identify which factors may be relevant to determining alternatives and their weighting, using off-line judgement and on-line methods.
(c) To evaluate different theoretical approaches to alternatives (such as the structural theory and Bayesian/probabilistic frameworks) in terms of their ability to accommodate several factors (cost/structural complexity, informativity, etc.) and appropriately modify/develop such approaches on the basis of the collected experimental data.


At the heart of the theory of SI is the idea of alternative expressions, or alternatives for short. That alternatives do play a role in SI computation is supported by the findings of several behavioural studies, which have focused on the differences between children and adults in their ability to compute alternatives (Guasti et al. 2005, Barner et al. 2011, Tieu, …Romoli et al. 2014, 2016, 2017, Hochstein et al. 2016), and by the empirical and predictive power of the theoretical proposals enumerated above. Yet many of these proposals rely on assumptions about the space of alternatives relative to a word or linguistic construction. They thus depend on a general theory of how alternatives are generated and selected for the computation of SIs. However, developing such a theory has proven to be entirely non-trivial (Breheny, Klinedinst, Romoli & Sudo 2017, Romoli 2012, Fox 2007, Fox & Katzir 2011, Katzir 2007, 2014, Kroch 1972). In particular, the central unsolved issue is the so-called symmetry problem. To illustrate the issue, consider the sentence (1) Some of the homework is difficult. This has an SI that not all of the homework is difficult, and under most theories, this SI is derived by negating the alternative sentence (2) All of the homework is difficult. The problem, however, is why (2) is an alternative for the generation of the SI while (3) Some but not all of the homework is difficult is not. After all, (3) states explicitly the meaning of (1) enriched by its SI. It is apparently as relevant as the attested Alternative, by any standard understanding of ‘relevance’. Notice that if (3) were an alternative to (1), an SI that all of the homework is difficult would be predicted, which is not actually observed. 
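The logic of the symmetry problem can be made concrete with a toy model. The sketch below is purely illustrative and is not drawn from any of the cited proposals: the three-world model, the names WORLDS, MEANINGS and strengthen, and the simple "negate every strictly stronger alternative" rule are all assumptions introduced here. It shows that negating (2) alone yields the attested SI, negating (3) alone yields the unattested one, and negating both leaves no world at all.

```python
# Toy model (an assumption for illustration): three ways the world could be
# with respect to how much of the homework is difficult.
WORLDS = frozenset({"none", "some_not_all", "all"})

# Literal meanings as sets of worlds in which each sentence is true.
MEANINGS = {
    "some": frozenset({"some_not_all", "all"}),          # sentence (1)
    "all": frozenset({"all"}),                           # sentence (2)
    "some but not all": frozenset({"some_not_all"}),     # sentence (3)
}


def strengthen(utterance, alternatives):
    """Return the worlds left after negating each strictly stronger alternative."""
    result = MEANINGS[utterance]
    for alt in alternatives:
        # A proper subset of worlds means the alternative is strictly stronger.
        if MEANINGS[alt] < MEANINGS[utterance]:
            result = result - MEANINGS[alt]
    return result


# Only (2) as an alternative: the attested "not all" reading survives.
print(strengthen("some", ["all"]))
# Only (3) as an alternative: the unattested "all" reading would be predicted.
print(strengthen("some", ["some but not all"]))
# Both symmetric alternatives: nothing survives, i.e. a contradiction.
print(strengthen("some", ["all", "some but not all"]))
```

On this simplified picture, a theory of alternatives has to say why only the first call corresponds to what speakers actually infer.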

Two Main Approaches

There are several recent approaches to this problem, the most developed of which are the structural theory of alternatives (Katzir, 2007, Fox & Katzir, 2011, Trinh & Haida 2016) and the Bayesian, Rational Speech Act (RSA) approach (see Frank & Goodman, 2016; Bergen et al. 2016). Our main theoretical focus will be on the predictions of these two approaches, though we will also consider recent proposals such as Buccola et al. (2018). These approaches have in common the idea that alternatives are selected on the basis of structural complexity or cost. For example, (3) is structurally more complex than (2), which is as complex as (1). Both structural and RSA approaches use the differences in complexity/cost of potential alternatives in computing which alternatives are used. As we discuss in detail in Breheny, Klinedinst, Romoli & Sudo (2017), however, a theory based on complexity alone, while it can take care of the simple case above, does not handle more complex cases of the symmetry problem. A simple illustration of the problem involves (4) Not all of the homework is difficult. This implies that some of the homework is difficult, presumably because (5) None of the homework is difficult is the alternative. However, neither approach can rule out (1) as an alternative to (4) on the basis of complexity alone. A solution offered by RSA (one which is also implementable within a structural account) would be to select alternatives also on the basis of a measure of ‘informativity’. However, in Breheny et al. (2017) we also showed that an approach based on complexity and informativity has problems with other cases (see for instance (8), which we discuss below). While we do believe that structural complexity/cost is a factor relevant for determining alternatives, we think that other factors are also involved in breaking symmetry between alternatives. 
Recent work recognises this point, and suggests that discourse saliency and frequency might also have roles to play (Katzir 2007, Fox & Katzir 2011, Katzir 2014, Swanson 2010, 2018, Trinh & Haida 2016; see also Russell 2012). However, it remains theoretically unclear to what extent these factors are relevant and whether and how they interact with each other. This is due in part to the fact that many of the data points discussed rest on subtle judgements and that, while SI in general has been investigated extensively from an experimental perspective, data points and predictions more specifically relevant for theories of alternatives have received much less attention.
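The way cost can break symmetry in a probabilistic setting can be illustrated with a minimal RSA-style sketch. This is a simplified textbook-style formulation written for illustration, not the specific models of the works cited above: the two-world state space, the uniform prior, the cost values, the rationality parameter alpha and all function names are assumptions made here. With equal costs, the pragmatic listener hearing “some” stays at chance between the two worlds; once “some but not all” is made costlier, the listener shifts toward the “not all” world.

```python
import math

# Assumed toy state space and uniform prior (illustrative only).
WORLDS = ["some_not_all", "all"]
PRIOR = {w: 0.5 for w in WORLDS}

# Literal meanings: the worlds in which each utterance is true.
MEANINGS = {
    "some": {"some_not_all", "all"},
    "all": {"all"},
    "some but not all": {"some_not_all"},
}


def normalize(scores):
    """Rescale non-negative scores so that they sum to one."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}


def literal_listener(utterance):
    """L0: the prior restricted to worlds where the utterance is true."""
    return normalize(
        {w: PRIOR[w] if w in MEANINGS[utterance] else 0.0 for w in WORLDS}
    )


def speaker(world, costs, alpha=1.0):
    """S1: prefers utterances that are informative about the world and cheap."""
    return normalize(
        {
            u: literal_listener(u)[world] ** alpha * math.exp(-alpha * costs[u])
            for u in MEANINGS
        }
    )


def pragmatic_listener(utterance, costs, alpha=1.0):
    """L1: reasons about which world would make the speaker say this."""
    return normalize(
        {w: PRIOR[w] * speaker(w, costs, alpha)[utterance] for w in WORLDS}
    )


# Hypothetical cost settings: all equal, or the complex alternative penalised.
equal_costs = {"some": 0.0, "all": 0.0, "some but not all": 0.0}
costly_complex = {"some": 0.0, "all": 0.0, "some but not all": 2.0}

print(pragmatic_listener("some", equal_costs))     # symmetry: no preference
print(pragmatic_listener("some", costly_complex))  # cost favours "not all"
```

The sketch only covers the simple case in (1)-(3); it is not meant to address the harder cases such as (4) or (9), where, as noted above, complexity and informativity alone do not suffice.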

Scalar Implicature

Scalar Implicature (SI) is a type of enrichment of literal meaning that is based on the meanings of alternative expressions. A leading example, from the seminal work of Paul Grice (1975, 1989), involves “or”. There is compelling evidence that its literal meaning is not “exclusive”, i.e. that ‘John ate cake or ice cream’, for example, is literally true if John ate both things. Yet in many linguistic and conversational contexts the sentence is perceived to convey exclusivity, namely that John ate only one of them. Following Grice, the latter fact is widely hypothesized to be due to the existence of a competitor or alternative expression, ‘and’: since ‘John ate (both) cake and ice cream’ is more informative, in typical conversational circumstances it would be chosen over the ‘or’ sentence in order to convey that John did eat both things.

Since Grice’s work, SI has attracted significant attention in pragmatics, semantics, and logic (Gazdar 1979, Horn 1989, Hirschberg 1991), and more recently, linguistics and psychology (Chierchia 2004, 2013, Chierchia, Fox & Spector 2012, Fox 2007, Geurts 2010, Klinedinst 2007, Noveck & Sperber, 2004, Magri 2009, Van Rooy & Schulz 2004, Van Rooij & Schulz 2006, Romoli 2015, Spector 2006, among many others). Moreover, recent studies have uncovered that SI is behind a wide variety of semantic phenomena, not just the interpretations of connectives like “or” and quantifiers like “some”, but also plural marking (Ivlieva 2013, Magri 2012, Mayr 2015, Spector 2007, Zweig 2009), questions (Cremers 2016, Cremers & Chemla 2014, Klinedinst & Rothschild 2011, Nicolae 2013, Uegaki 2015), free choice inferences (Fox 2007, Klinedinst 2007, Santorio & Romoli 2017), polarity sensitivity (Chierchia 2004, 2006, 2013, Nicolae 2013), presupposition (Romoli 2012, Mayr & Romoli 2016, 2017), neg-raising (Romoli 2013), temporal inferences (Musan 1995, Magri 2009, 2011, Thomas 2012, Sudo & Romoli 2017, Kane, …, Sudo, Folli & Romoli 2017), etc. A rapidly growing number of recent studies also address the question of how the capacity to understand this wide-ranging set of linguistic phenomena develops through childhood (Noveck, 2001, Chierchia et al. 2001, Gualmini et al. 2001, Barner & Bachrach 2010, Barner et al. 2011, Singh et al. 2016, Tieu, Romoli, et al. 2016, Tieu, …, Romoli, et al. 2017).