Methods: We will use a combination of off-line and on-line dependent measures in pursuing the objectives outlined in (a)-(c). Regarding (a), the aim is to establish what facts are to be explained, in part through systematic and well-controlled large-scale surveys of informant judgements about key cases. The main method used for this phase will be the inference task (van Tiel et al., 2016; Chemla, 2009; Chemla & Spector, 2011; Gotzner & Romoli, 2017, a.o.), in which participants are presented with an utterance and a possible inference and asked whether, or how strongly, they would conclude the latter from the former. The design will manipulate the factors potentially at play discussed above (e.g. complexity/cost, salience, informativeness, discourse inferential factors). Items will be normed using several novel tasks, including a connective cloze task to elicit intuitions of parallel or contrast for items like (9-10). In addition to testing items with widely discussed scalar terms, as in (1-4), our survey will include less discussed adjectival scales (see below) and ad hoc scales (illustrated in (9-10) above). In adopting the inference task, several methodological considerations emerge from the recent literature on SI, and these will bear on our project. In particular, it has been established that different scalar expressions give rise to different rates of scalar implicature (Scalar Diversity). The distinctiveness of alternatives (van Tiel et al., 2016), scale structure (Gotzner et al., 2018) and the prior likelihood of different lexical enrichments (Sun & Breheny, under review) have all been shown to affect the likelihood that a participant will respond positively that a scalar inference is present. All of these factors will be considered in designing and developing the experiments related to (a). In particular, in investigating the case of gradable adjectives mentioned above, we aim to build on van Tiel et al. (2016) and Gotzner et al. (2018) and further explore the factors underlying variation across different adjectives, with the goal of clearly establishing the role of scale structure vs. other factors in the derivation of their inferences, which will in turn be crucial for testing the predictions of the main approaches to alternatives.
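To make the basic dependent measure concrete: the inference task yields, for each scalar expression, a rate of positive responses, and the Scalar Diversity finding is that these rates differ substantially across scales. The following sketch is purely illustrative (it is not project code; the scale labels and responses are invented) and shows how per-scale SI rates would be computed from binary inference-task responses:

```python
# Illustrative sketch (not project code): computing per-scale scalar
# implicature (SI) endorsement rates from hypothetical binary
# inference-task responses, the measure behind "Scalar Diversity".
from collections import defaultdict

def si_rates(responses):
    """responses: list of (scale, endorsed) pairs, where `endorsed` is
    True if the participant drew the scalar inference on that trial.
    Returns the proportion of positive responses per scale."""
    counts = defaultdict(lambda: [0, 0])  # scale -> [endorsements, trials]
    for scale, endorsed in responses:
        counts[scale][0] += int(endorsed)
        counts[scale][1] += 1
    return {scale: yes / n for scale, (yes, n) in counts.items()}

# Hypothetical data: a quantifier scale vs. an adjectival scale.
data = [("some/all", True), ("some/all", True), ("some/all", False),
        ("warm/hot", True), ("warm/hot", False), ("warm/hot", False)]
rates = si_rates(data)
```

In the planned studies, rates of this kind would then be related to item-level predictors (distinctiveness, scale structure, prior likelihood of enrichment) in a regression analysis.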

Another question raised in the theoretical literature, mentioned in relation to (7) above, concerns whether conversational expectations of explicitness play a role in the availability of an implicit inference. We have conducted initial pilot work manipulating whether, in filler items, the speaker is normally fully explicit in depicting the state of affairs. For example, a filler item might explicitly specify that A is the case but not B, or it might mention only that A is the case, leaving it implicit that B is not. The initial results show that in the explicit condition, participants are less liable to derive the scalar inference. Our aim is to extend this initial pilot and further explore to what extent explicitness is really a factor in these critical data points. The main outcome of objective (a) will be a large body of data on the strength or weighting of the different factors.
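As a rough illustration of how such a pilot contrast could be quantified (a hedged sketch, not our planned analysis pipeline, which would use mixed-effects models over items and participants; all counts below are invented), one could compare SI endorsement rates across the explicit and implicit filler conditions with a two-proportion z-test:

```python
# Illustrative two-proportion z-test on hypothetical endorsement counts.
# All numbers are invented for the sketch.
import math

def two_prop_z(yes1, n1, yes2, n2):
    """z statistic for the difference between two proportions,
    using the pooled standard error."""
    p1, p2 = yes1 / n1, yes2 / n2
    p = (yes1 + yes2) / (n1 + n2)          # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical: 30/100 SI endorsements after explicit fillers,
# 50/100 after implicit fillers.
z = two_prop_z(30, 100, 50, 100)
# |z| > 1.96 would indicate a reliable difference at alpha = .05
```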

Regarding (b) and (c), our aim is to establish the best way to integrate the different factors into a single framework for computing alternatives, and in particular what kind of adjustments to the Structural and RSA accounts might provide a better framework for understanding Alternatives. In the course of theory evaluation, we will take the innovative step of using experimental paradigms that tap into implicit on-line processing in order to test hypotheses about the relative weightings of these factors. In one paradigm, we will use visual world eye-tracking, which has been shown to provide a measure of participants' anticipatory inferences (Altmann & Kamide, 1999; Breheny, Ferguson & Katsos, 2013). In particular, we aim to exploit the paradigm from Tian, Ferguson & Breheny (2016), in which participants' gaze is tracked while they view two contradictory states of affairs (e.g. a washed bowl and a dirty (unwashed) bowl). That study reveals how participants' gaze shifts between these states as they move from representing the context of a negative sentence (the positive state of affairs) to its intended meaning (the negative state of affairs). In pilot work on SI, with a context sentence like 'John finished his drink and washed his bowl. What about Bill?', we presented four images: a full glass, an empty glass, a clean bowl and a dirty bowl. We found that when participants process the answer segment, 'Bill finished his drink', they first attend to images consistent with the alternative (empty glass, clean bowl) and their gaze then shifts to the implicature-consistent state (the dirty bowl). Using this effect, we can test models that combine the factors contributing to the strength of the alternative by probing looks to potential alternatives, as well as looks to the SI-consistent state. The innovation here will be to use gaze toward the SI-consistent state as a measure of the strength of the implicature (linked to off-line results from the inference task).
The gaze data toward the alternative would then give us a direct measure of the strength of the respective symmetric alternatives. The expectation is that high rates of SI in off-line studies will be reflected in a strong ultimate bias toward the SI-consistent state, while low rates of SI will be linked to a weak SI-state bias. Similarly, a strong SI bias should be linked to clear prior looks to a single alternative state, while a weaker SI bias should be linked to gaze data reflecting competition between symmetric alternatives. By manipulating factors such as direction of inference (8-9), cost and informativity via context stimuli, we expect to be able to link SI rates to a more direct measure of the strength of competing alternatives. These data, together with data from priming studies (see below), should provide better information on the relation between the factors considered as contributing to the strength of Alternatives and rates of SI. In turn, these data will provide a test for competing models, based on theoretical assumptions.
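To give a sense of how the gaze-bias measure would be derived, the sketch below (illustrative only; the AOI names and sample data are invented, and real preprocessing additionally involves fixation detection and baseline correction) bins gaze samples into time windows and computes the proportion of looks to each area of interest (AOI):

```python
# Illustrative sketch (not project code): binning visual-world gaze
# samples into time windows and computing the proportion of looks to
# each area of interest (AOI). AOI names and samples are invented.
from collections import Counter

def fixation_proportions(samples, bin_ms=200):
    """samples: list of (time_ms, aoi) gaze samples.
    Returns {bin_start_ms: {aoi: proportion_of_samples_in_bin}}."""
    bins = {}
    for t, aoi in samples:
        bins.setdefault((t // bin_ms) * bin_ms, []).append(aoi)
    out = {}
    for start, aois in bins.items():
        counts = Counter(aois)
        total = sum(counts.values())
        out[start] = {a: c / total for a, c in counts.items()}
    return out

# Hypothetical trial: early looks to the alternative-consistent image
# ("empty_glass"), later looks to the SI-consistent image ("dirty_bowl").
trial = [(0, "empty_glass"), (50, "empty_glass"), (150, "clean_bowl"),
         (200, "dirty_bowl"), (250, "dirty_bowl"), (350, "dirty_bowl")]
props = fixation_proportions(trial)
```

Proportions of looks to the alternative-consistent vs. SI-consistent AOIs in successive windows would then serve as the on-line index of alternative strength and implicature strength, to be related to the off-line inference-task rates.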