2015 review

Here’s a selective round-up of my academic year.

Teaching: I taught my Cognitive Psychology course for the second time. It takes inspiration from MOOCs and ‘flipped classroom’ models, so I try to scaffold the lectures with a bunch of online resources and pre- and post-lecture activities. This year I added pre-lecture quizzes and personalised feedback for each student on their engagement. Based on thinking about my lecture discussions I wrote a short post on Medium, ‘Cheap tricks for starting discussions in lectures‘ (the truth is, lectures are a bad place for starting discussions, but sometimes that’s what you have to work with). I rewrote my first year course on emergent models of mind and brain. It uses interactive Jupyter notebooks, which I’m very happy with. The lectures themselves show off a simple neural network as an associative model of memory, and the interactive notebooks mean that students can train the neural network on their own photos if they want. I also held an ‘intergenerational tea party’ every Thursday afternoon of autumn semester, where I invited two students I supervise from every year of the undergraduate course (and my PG students and post-docs). If you came to one of these, thanks – I’ll be doing it again next semester.
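
For the curious, here is a minimal sketch of the kind of model involved – a Hopfield-style associative memory storing binary patterns. This is an illustration of the general idea only, not the course’s actual notebook code (which lets students use their own photos as the stored patterns):

```python
import numpy as np

def train(patterns):
    """Hebbian learning: store a set of +/-1 patterns in a weight matrix."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)        # no self-connections
    return W / len(patterns)

def recall(W, probe, steps=10):
    """Repeatedly update the probe state until it settles on a stored memory."""
    state = probe.copy()
    for _ in range(steps):
        state = np.sign(W @ state)
        state[state == 0] = 1
    return state

# Store two 8-'pixel' patterns, then recover the first from a corrupted copy.
patterns = np.array([[1, -1, 1, -1, 1, -1, 1, -1],
                     [1, 1, -1, -1, 1, 1, -1, -1]])
W = train(patterns)
noisy = patterns[0].copy()
noisy[:2] *= -1                   # flip two 'pixels'
print(recall(W, noisy))           # prints the first stored pattern
```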

Writing: I had a piece in the Guardian, ‘The science of learning: five classic studies’, as well as my regular BBC Future column, a few pieces for The Conversation, and some ad-hoc blogging as a minor player on the mindhacks.com blog. I self-published an e-book, ‘For argument’s sake: evidence that reason can change minds‘, which was very briefly the 8th most popular experimental psychology e-book on Amazon (one place behind ’50 sexting tips for women’).

Engagement: The year began with me on a sabbatical, which I spent at Folksy. Thanks to everyone there who made it such an enjoyable experience. I learnt more observing small business life in my home city than I think I would have in another Psychology department on the other side of the world. This year I was also lucky enough to do some work with 2CV related to a Transport for London brief on applying behavioural science to passenger behaviours, with Comparethemarket.com on understanding customer decisions and with McGraw-Hill Education on analysis of student learning. Our work on decision biases in court was also kindly mentioned on the UK parliament website, but I have to say that my getting-out-of-the-university highlight of the year was appearing in the promotional video for Folksy’s drone delivery project (released 1/4/2015).

Research: We rebooted interdisciplinary Cognitive Science activities at Sheffield with a workshop, several seminars and a mailing list for everyone to keep in touch. Kudos to Luca for helping to instigate these things.

Several existing grants kept me very busy:

Our Leverhulme grant on Bias & Blame continued with our investigation into the cognitive foundation and philosophical implications of implicit bias. The PI, Jules Holroyd was awarded a prestigious Vice Chancellor’s Fellowship at Sheffield, so she’ll be a colleague in the new year as well as a collaborator (well done Jules!).  As part of this project we pre-registered an experimental test of our core hypothesis and this December Robin Scaife finished a heroic effort in data collection, so expect results on this in the new year. Pre-registration was an immensely informative process, not least because it made me finally take power analysis seriously (previously I just sought to side-step the issue). As a result of this work on decision making and implicit bias I did training for employment tribunal judges on bias in decision making, during which I probably learnt more from them than they learnt from me.

We’ve been scanning at the York Neuroimaging Centre, as part of our project on ‘Neuroimaging as a marker of Attention Deficit Hyperactivity Disorder (ADHD)’. One of the inspirations for this project, Maria Panagiotidi, passed her PhD viva in November for her thesis titled ‘The Role of the Superior Colliculus in Attention Deficit Hyperactivity Disorder’. Congratulations to Maria, who goes on to work as a research psychologist for Arctic Shores in Manchester.

Funded by the Michael J Fox Foundation, we’ve continued testing in Sheffield and Madrid, using typing as a measure of the strength of habitual behaviour in Parkinson’s Disease. For this grant the heroic testing efforts were performed by Mariana Leriche. For the analysis we are combining timing information (my specialty) and an information-theoretic analysis based on language structure. Colin Bannard (University of Liverpool) is leading on this part of the analysis, and working with him has been a great pleasure and immensely informative on computational linguistics.

Our students in the Sheffield Neuroeconomics network are approaching their final years. Angelo Pirrone and I have been working with James Marshall in Computer Science on perceptual decision making, and on fitting models of decision making.

That’s not all, but that is all for now. The greatest pleasure of the year has been all the people I’ve had a chance to work with: students, colleagues and collaborators. Everything I have done this year has been teamwork. So apologies if you’re not mentioned above – it is only due to lack of space, not lack of appreciation – and my best wishes for 2016.


Individualised student feedback

My Cognitive Psychology course is structured around activities which occur before and after the lectures, many of them online. This year I wrote a Python script which emailed each student an analysis and personalised graph of their engagement with the course. Here’s what it looked like:


———- Forwarded message ———-
From: me
Date: 4 December 2015 at 10:53
Subject: engagement with PSY243
To: student@sheffield.ac.uk

This is an automatically generated email, containing feedback on your engagement with PSY243 course activities. Nobody but you (not even me) has seen these results, and they DO NOT AFFECT OR REFLECT your grade for this course. They have been prepared merely as feedback on how you have engaged with activities as part of PSY243.

Here is a record of your activities:
Weeks 1-9, concept-checking quizzes completed (out of 7):  4
Weeks 1-10, asked question via wiki or discussion group:  NO
Week 3, submitted practice answer:  NO
Week 7, submitted answer for peer review (compulsory):  YES
Week 8, number of peer reviews submitted (out of 3, compulsory):  3
Week 10, attended seminar discussion:  NO

We can combine these records to create a MODULE ACTIVITY ENGAGEMENT SCORE.

* * * Your score is 57% * * *

This puts you in the TOP half of the course. Obviously this score does not include activities for which I do not have records. This includes things like lecture attendance, asking questions in lectures, private study, etc.

If we plot the engagement scores for the whole year against the number of people who get that engagement score or lower we get a graph showing the spread of engagement across the course. This graph, and your position on it, are attached to this email. People who have done the least will be towards the left, people who have done the most will appear towards the right of the curve. You can see that there is a spread of engagement scores. Very few people have not done anything, very few have done everything.

I hope you find this feedback useful. PSY243 is designed as a course where the activities structure your private study, rather than as a course where a fixed set of knowledge is conveyed in lectures. This is why I put such emphasis on these extra activities, and provide feedback on your engagement with them. Next week you have the chance to give feedback on PSY243 as part of the course evaluation, so please do say if you can think of ways the course might be improved.

Yours,
Tom, PSY243 MO

[Attached: graph of engagement scores across the course, with this student’s position marked]

I designed this course to be structured around a single editable webpage, a wiki, which would provide all the information needed to understand the course from day one. My ambition was to use the lectures to focus on two things you can’t get from a textbook. The first is live exposure to a specialist explaining how they approach a problem or topic in their area. The second is an opportunity to discuss the material (a so-called ‘flipped classroom‘). This year I added pre-lecture quizzes to the range of activities available on the course (you can see these here). These were designed so students could test their understanding of the foundational material upon which each lecture drew, and they are part of this wider plan to provide a clear structure for students’ engagement with the course around the lectures.
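
The full script is the one linked below; as a rough sketch of the approach (hypothetical file and column names, a placeholder mail server, and a much cruder scoring scheme than the real one), it boils down to something like this:

```python
import smtplib
from email.message import EmailMessage

import matplotlib
matplotlib.use("Agg")             # render plots to file, no display needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# One row per student; columns are illustrative, not the actual PSY243 records.
records = pd.read_csv("psy243_activity.csv")
activity_cols = ["quizzes", "questions", "practice_answer",
                 "peer_answer", "peer_reviews", "seminar"]
records["engagement"] = (records[activity_cols] / records[activity_cols].max()).mean(axis=1) * 100

# Spread of engagement across the course: score vs number at or below that score.
scores = np.sort(records["engagement"].values)
count_at_or_below = np.arange(1, len(scores) + 1)

for _, student in records.iterrows():
    fig, ax = plt.subplots()
    ax.plot(scores, count_at_or_below)
    ax.axvline(student["engagement"], linestyle="--")   # this student's position
    ax.set_xlabel("Engagement score (%)")
    ax.set_ylabel("Number of students with this score or lower")
    fig.savefig("engagement.png")
    plt.close(fig)

    msg = EmailMessage()
    msg["Subject"] = "engagement with PSY243"
    msg["From"] = "me@example.ac.uk"                     # placeholder address
    msg["To"] = student["email"]
    msg.set_content(f"Your engagement score is {student['engagement']:.0f}% ...")
    with open("engagement.png", "rb") as f:
        msg.add_attachment(f.read(), maintype="image", subtype="png",
                           filename="engagement.png")
    with smtplib.SMTP("localhost") as server:            # assumed local mail relay
        server.send_message(msg)
```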

If you’re the sort of person who wants to see the code, it is here. At your own risk.


Crowdsourcing analysis, an alternative approach to scientific research

Crowdsourcing analysis, an alternative approach to scientific research: Many Hands make tight work

Guest Lecture by Raphael Silberzahn, IESE Business School, University of Navarra

11:00 – 12:00, 9th of December, 2015

Lecture Theatre 6, The Diamond (32 Leavygreave Rd, Sheffield S3 7RD)

Is soccer players’ skin colour associated with how often they are shown a red card? The answer depends on how the data is analysed. With access to a dataset capturing the player-referee interactions of top-division players from the 2012-13 season in the English, German, French and Spanish leagues, we organised a crowdsourced research project involving 29 different research teams and 61 individual researchers. Teams initially exchanged analytical approaches — but not results — and incorporated feedback from other teams into their analyses. Despite this, the teams came to a broad range of conclusions. The overall group consensus (that a correlation exists) was much more tentative than would be expected from a single-team analysis. Raphael Silberzahn will provide insights from his perspective as one of the project coordinators and Tom Stafford will speak about his experience as a participant in this project. We will discuss how smaller research projects can also benefit from bringing together teams of skilled researchers to work simultaneously on the same data, thereby balancing discussion and giving scientific findings greater validity.

Links to coverage of this research in Nature (‘Crowdsourced research: Many hands make tight work’), and on FiveThirtyEight (‘Science Isn’t Broken: It’s just a hell of a lot harder than we give it credit for’). Our group’s analysis was supported by some great data exploration and visualisation work led by Mat Evans. You can see an interactive notebook of this work here.

 


Bias mitigation

On Friday I gave a talk on cognitive and implicit biases to a group of employment tribunal judges. The judges were a great audience – far younger, more receptive and more diverse than my own prejudices had led me to expect – and I enjoyed the opportunity to think about the area of cognitive bias, and how some conclusions from that literature might usefully be carried over to the related area of implicit bias.

First off, let’s define cognitive bias versus implicit bias. Cognitive bias is a catch-all term for systematic flaws in thinking. The phrase is associated with the ‘judgement and decision making’ literature spearheaded by Daniel Kahneman and colleagues (work for which he received the Nobel Prize in 2002). Implicit bias, for our purposes, refers to a bias in judgements of other people which is unduly influenced by social categories such as sex or ethnicity, and where the person making the judgement is either unaware of the undue influence or unable to control it.

So from the cognitive bias literature we get a menagerie of biases such as ‘the overconfidence effect‘, ‘confirmation bias‘, ‘anchoring‘, ‘base rate neglect‘, and on and on. From implicit bias we get findings such as that maths exam papers are marked higher when they carry a male name at the top, that job applicants with stereotypically black American names have to send out twice as many CVs, on average, to get an interview, or that people sit further away from someone they believe has a mental health condition such as schizophrenia. Importantly, all these behaviours are observed in individuals who insist that they are not only not sexist/racist/prejudiced but are actively anti-sexism/racism/prejudice.

My argument to the judges boiled down to four key points, which I think build on one another:

1. Implicit biases are cognitive biases

There is slippage in how we identify cognitive biases compared to how we identify implicit biases. Cognitive biases are defined against a standard of rationality – either we know the correct answer (as in the Wason selection task, for example), or we feel able to define irrelevant factors which shouldn’t affect a decision (as in the framing effect found with the ‘Asian Disease problem‘). Implicit biases use the second, contrastive, standard. Additionally, it is unclear whether the thing being violated is a standard of rationality or a standard of equity. So, for example, it is unjust to allow the sex of a student to influence their exam score, but is it irrational? (If you think there is a clear answer to this, either way, then you are more confident about the definition of rationality than a century of scholars have managed to be.)

Despite these differences, implicit biases can usefully be thought of as a kind of cognitive bias. They are a habit of thought which produces systematic errors, and which we may be unaware we are deploying (although elsewhere I have argued that the evidence for the unconscious nature of these processes is over-egged). Once you start to think of implicit biases and cognitive biases as very similar, it buys you some important insights.

Specifically:

2. Biases are integral to thinking

Cognitive biases exist for a reason. They are not rogue processes which contaminate what would otherwise be intelligent thought. They are the foundation of intelligent thought. To grasp this, you need to appreciate just how hard principled, consistent thought is. In a world of limited time, information, certainty and intellectual energy, cognitive biases arise from necessary short-cuts and assumptions which keep our intellectual show on the road. Time and time again psychologists have looked at specific cognitive biases and found that there is a good reason for people to make that mistake. Sometimes they even find that animals make the same mistake, demonstrating that the error persists even without the human traits of pride, ideological confusion and general self-consciousness – suggesting that there are good evolutionary reasons for it to exist.

For an example, take confirmation bias. Although there are risks to preferring to seek information that confirms whatever you already believe, the strategy does provide a way of dealing with complex information, and a starting point (i.e. what you already suspect) which is as good as any other. It doesn’t require that you speculate endlessly about what might be true, and in many situations the world (or other people) is more than likely to put contradictory evidence in front of you without you having to expend effort seeking it out. Confirmation bias exists because it is an efficient information-seeking strategy – certainly more efficient than constantly trying to disprove every aspect of what you believe.

Implicit biases concern social judgement and socially significant behaviours, but they seem to share a common mechanism with cognitive biases. In cognitive terms, implicit biases arise from our tendency towards associative thought – we pick up on things which co-occur, and we tend to make judgements relying on these associations, even if strict logic does not justify it. How associations are created and strengthened in our minds is beyond the scope of this post.

For now, the point is that making judgements based on circumstantial evidence is unjustified but practical. An uncontentious example: you get sick after eating at a particular noodle bar. Maybe it was bad luck, maybe you were going to get sick anyway, or maybe it was the sandwich you ate at lunch, but the odds are good you’ll avoid the noodle bar in the future. Why chance it, when there are plenty of other restaurants? It would be impractical never to make such assumptions, and the assumption-laden (biased!) route offers a practical solution to the riddle of what you should conclude from your food poisoning.

3. There is no bias-free individual

Once you realise that our thinking is built on many fast, assumption-making processes which may not be perfect – indeed which have systematic tendencies that produce the errors we identify as cognitive biases – you realise that it would be impossible to have bias-free decision processes. If you want to make good choices today, rather than perfect choices in the distant future, you have to compromise and accept decisions which will have some biases in them. You cannot free yourself of bias, in this sense, and you shouldn’t expect to.

This realisation encourages some humility in the face of cognitive bias. We all have biases, and we shouldn’t pretend that we don’t or hope that we can free ourselves of them.

We can be aware of the biases we are exposed to and likely to harbour within ourselves. We can, with a collective effort, change the content of the biases we foster as a culture. We can try hard to identify situations where bias may play a larger role, or identify particular biases which are latent in our culture or thinking. We can direct our bias mitigation efforts at particularly important decisions, or decisions we think are particularly likely to be prone to bias. But bias-free thinking isn’t an option; bias is part of who we are.

4. Many effective mitigation strategies will be supra-personal

If humility in the face of bias is the first practical reaction to the science of cognitive bias, I’d argue that the second is to recognise that bias isn’t something you can solve on your own at a personal psychological level. Obviously you have to start by trying your honest best to be clear-headed and reasonable, but all the evidence suggests that biases will persist, that they cannot be cut out of thinking, and may even thrive when we think ourselves most objective.

The solution is to embed yourself in groups, procedures and institutions which help counteract bias. Obviously, to a large extent, the institutions of law have evolved to counter personal biases. It would be an interesting exercise to review how legal cases are conducted from a psychological perspective, interpreting different features in terms of how they work with or against our cognitive tendencies (so, for example, the adversarial system doesn’t get rid of confirmation bias, but it does mean that confirmation bias is given equal and opposite opportunity to work in the minds of the two advocates).

Amongst other kinds of ‘ecological control‘ we might count proper procedure (following the letter of the law, checklists, etc), control of (admissible) information and the systematic collection of feedback (without which you may not ever come to realise that you are making systematically biased decisions).

Slides from my talk here as Google docs slides and as PDF. Thanks to Robin Scaife for comments on a draft of this post. Cross-posted to the blog of our Leverhulme trust funded project on “Bias and Blame“.


Power analysis for a between-sample experiment

Understanding statistical power is essential if you want to avoid wasting your time in psychology. The power of an experiment is its sensitivity – the likelihood that, if the effect tested for is real, your experiment will be able to detect it.

Statistical power is determined by the type of statistical test you are doing, the number of people you test and the effect size. The effect size is, in turn, determined by the reliability of the thing you are measuring, and how much it is pushed around by whatever you are manipulating.

Since it is a common test, I’ve been doing a power analysis for a two-sample (two-sided) t-test, for small, medium and large effects (as conventionally defined). The results should worry you.

[Figure: participants needed per group for 80% power, plotted against effect size]

This graph shows you how many people you need in each group for your test to have 80% power (a standard desirable level of power – meaning that if your effect is real you’ve an 80% chance of detecting it).

Things to note:

  • even for a large (0.8) effect you need close to 30 people (total n = 60) to have 80% power
  • for a medium effect (0.5) this is more like 70 people (total n = 140)
  • the required sample size increases dramatically as effect size drops
  • for small effects, the sample required for 80% is around 400 in each group (total n = 800).
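
The original calculations were done in R (see the technical note below). A minimal Python equivalent, using statsmodels, should give essentially the same numbers:

```python
import math
from statsmodels.stats.power import TTestIndPower

# Required n per group for an independent-samples, two-sided t-test at alpha = .05.
analysis = TTestIndPower()
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.8,
                             ratio=1.0, alternative="two-sided")
    print(f"{label} effect (d = {d}): n per group = {math.ceil(n)}")
# -> 394 (small), 64 (medium) and 26 (large) per group.
```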

What this means is that if you don’t have a large effect, studies with a between-groups analysis and an n of less than 60 aren’t worth running. Even if you are studying a real phenomenon, you aren’t using a statistical lens with enough sensitivity to be able to tell. You’ll get to the end and won’t know if the phenomenon you are looking for isn’t real or if you just got unlucky with who you tested.

Implications for anyone planning an experiment:

  • Is your effect very strong? If so, you may rely on a smaller sample (for illustrative purposes, the effect size of the male-female height difference is ~1.7, so large enough to detect with a small sample. But if your effect is this obvious, why do you need an experiment?)
  • You really should prefer within-sample analysis whenever possible (power analysis of this is left as an exercise; a rough sketch is given after this list)
  • You can get away with smaller samples if you make your measure more reliable, or if you make your manipulation more impactful. Both of these will increase your effect size, the first by narrowing the variance within each group, the second by increasing the distance between them.
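
For the within-sample ‘exercise’, here is a rough sketch using statsmodels’ paired/one-sample power calculation. Note that the effect size here is dz (the mean difference divided by the SD of the differences), so the gain over the between-groups design depends on how correlated the two conditions are:

```python
import math
from statsmodels.stats.power import TTestPower

# Required total n for a paired (within-sample), two-sided t-test at alpha = .05.
paired = TTestPower()
for label, dz in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    n = paired.solve_power(effect_size=dz, alpha=0.05, power=0.8,
                           alternative="two-sided")
    print(f"{label} effect (dz = {dz}): total n = {math.ceil(n)}")
# Substantially fewer participants than the between-groups design needs
# for the same nominal effect size.
```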

Technical note: I did this cribbing code from Rob Kabacoff’s helpful page on power analysis. Code for the graph shown here is here. I use and recommend Rstudio.


New grant: ‘Neuroimaging as a marker of Attention Deficit Hyperactivity Disorder (ADHD)’

We have been awarded ~£11k by the White Rose Collaboration Fund. This will allow us to carry out a small neuroimaging study investigating brain activity associated with higher levels of ADHD traits. The collaboration combines expertise and facilities across the Universities of Sheffield, Leeds and York. Paul Overton has previously proposed that the subcortical area known as the superior colliculus may be crucial in ADHD. This is the focus of Maria’s PhD thesis (co-supervised by Paul and me). Jaclyn Billington from Leeds has experience imaging the colliculus, and Tony Morland is the deputy director of York’s neuroimaging facility (as well as having a wealth of experience imaging the areas associated with visual function). Alex Wade and Jeff Delvenne provide additional expertise in visual attention. I lead the project.

Here is the blurb:

We will create a unique network of expertise, personnel and facilities from across the WR network in order to establish a novel biomarker of Attention Deficit Hyperactivity Disorder (ADHD).

Despite a high prevalence (up to 10% of children by some estimates), ADHD remains controversial in terms of diagnosis and treatment. Using brain scanning, this network aims to establish a biological marker common to all ADHD sufferers. Such a biomarker could revolutionise our response to ADHD, allowing us to better understand the condition, diagnose it earlier, manage the symptoms and target pharmacological interventions. This could potentially alleviate suffering and improve function for millions.

Theoretical direction for this proposal arises from Overton’s recent proposal that a core dysfunction in ADHD is hypersensitivity of the Superior Colliculus (SC), a key subcortical brain region known to play a critical role in attention, spatial orientation and saccadic eye movements. The development of this ‘collicular hypersensitivity’ hypothesis was possible because of the tradition of research into the fundamental neuroscience of subcortical structures at Sheffield.

This hypothesis has been taken forward by Stafford (Sheffield) who, with Panagiotidi, has been developing behavioural tests of collicular sensitivity. Early results show that healthy adults who are high and low on ADHD traits differ in these behavioural measures. However, behavioural tests are limited in that they cannot provide definitive insight into the neural basis of function. Teams in York and Leeds provide expertise in functional brain imaging and the neural basis of attention which would allow the direct translation of the Sheffield research programme into a test of a biomarker for ADHD.

Our primary objective will be to test two groups, high and low in ADHD traits, for collicular responsiveness, using fMRI brain imaging. This testing will use behavioural measures which have been shown to discriminate the two groups, and analytic and imaging expertise from the Leeds and York based applicants, in order to determine collicular responsiveness.


Event: Crowdsourcing Psychology Data – Online, Mobile and Big Data approaches

Smart phones, social media and networked sensors in everything from trains to toasters – the spread of digital technology creates new opportunities for cognitive scientists. Collecting and analysing the resulting “big data” also poses its own special challenges. This afternoon of talks and discussion is suitable for anyone curious about novel data collection and analysis strategies and how they can be deployed in psychological and behavioural research.

Time: 1pm-5pm, 11th of November 2014

Venue: Department of Psychology, University of Sheffield

We have four speakers followed by a panel discussion. Our speakers:

Martin Thirkettle: “Taking cognitive psychology to the small screen: Making a research focussed mobile app”

Developing a mobile app involves balancing a number of parties – researchers, funders, ethics committees, app developers, not to mention the end users. As the Open University’s “Brainwave” app, our first research-focussed cognitive psychology app, nears launch, I will discuss some of the challenges we’ve faced during the development process.

Caspar Addyman: “Measuring drug use with smartphones: Some misadventures”

Everyday drug use and its effects are not easily captured by lab- or survey-based research. I developed the Boozerlyzer, an app that let people log their alcohol intake and mood, and play simple games that measured their cognitive and emotional responses. Although this had its flaws, it led to an NHS-funded collaboration to develop a simple smartphone tracker for Parkinson’s patients. Which was also problematic…

Robb Rutledge: “Crowdsourcing the cognitive science of decision making and well-being”

Some cognitive science questions can be particularly difficult to address in the lab. I will discuss results from The Great Brain Experiment, an app that allowed us to develop computational models for how decision making changes across the lifespan, and also how rewards and expectations relate to subjective well-being.

Andy Woods: “[C]lick your screen: probing the senses online”

We are at the cusp of some far-reaching technological advances that will be of tremendous benefit to research. Within a few short years we will be able to test thousands of people from any demographic with ‘connected’ technology every bit as good as we use in our labs today — indeed perhaps more so. Here I discuss on-web versus in-lab, predicted technological advances and issues with online research.

Tickets are free and available here.


New grant: Reduced habitual intrusions: an early marker for Parkinson’s Disease?

[Figure: surprisal density plot for a 4-character window]

I am very pleased to announce that the Michael J Fox Foundation have funded a project I lead titled ‘Reduced habitual intrusions: an early marker for Parkinson’s Disease?’. The project is for 1 year, and is a collaboration between a psychologist (myself), a neuroscientist (Pete Redgrave), a clinician specialising in Parkinson’s (Jose Obeso, in Spain) and a computational linguist (Colin Bannard, in Liverpool). Mariana Leriche will be joining us as a post-doc.

The idea of the project stems from the hypothesis that Parkinson’s Disease will be specifically characterised by a loss of habitual control in the motor system. This was proposed by Pete, Jose and others in 2010. Since my PhD I’ve been interested in automatic processes in behaviour. One phenomenon which seems to offer particular promise for exploring the interaction between habits and deliberate control is the ‘action slip’. This is an error where a habit intrudes into the normal stream of intentional action – for example, when you put the cereal into the fridge, or when someone greets you by asking “Isn’t it a nice day?” and you reply “I’m fine thank you”. An interesting prediction of the Redgrave et al. theory is that people with Parkinson’s should make fewer action slips (in contrast to all other types of movement errors, which you would expect to increase as the disease progresses).

We’re going to look at this in the domain of typing, which I’ve worked with before, and which – I’ve argued – is a great domain for looking at how skill, intention and habit combine in an everyday task that generates lots of easily coded data.
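
The information-theoretic side of the analysis is Colin’s; purely as an illustration of the general idea, here is a rough sketch of how you might estimate the surprisal of each keystroke from a simple character n-gram model. This is a sketch of the concept, not the project’s actual analysis:

```python
# Illustrative only: estimate how predictable each keystroke is from the
# preceding characters, using a simple character 4-gram model with
# add-one smoothing. The project's actual analysis may differ.
import math
from collections import Counter

def surprisal_per_char(text, context_len=3):
    """Return -log2 P(char | previous context_len chars) for each position."""
    context_counts = Counter()
    joint_counts = Counter()
    for i in range(context_len, len(text)):
        context = text[i - context_len:i]
        context_counts[context] += 1
        joint_counts[(context, text[i])] += 1
    vocab = len(set(text))
    surprisals = []
    for i in range(context_len, len(text)):
        context, char = text[i - context_len:i], text[i]
        p = (joint_counts[(context, char)] + 1) / (context_counts[context] + vocab)
        surprisals.append(-math.log2(p))
    return surprisals

sample = "the cat sat on the mat and the cat sat down"
print(surprisal_per_char(sample))   # low values = highly predictable keystrokes
```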

I feel the project reflects exactly the kind of work I aspire to do – cognitive science which uses precise behavioural measurement, informed by both neuroscientific and computational perspectives, and in the service of an ambitious but valuable goal. Now, of course, we actually have to get on and do it.


Teaching: what it means to be critical

We often ask students to ‘critically assess’ research, but we probably don’t explain what we mean by this as well as we could. Being ‘critical’ doesn’t mean merely criticising, just as skepticism isn’t the same as cynicism. A cynic thinks everything is worthless, regardless of the evidence; a skeptic wants to be persuaded of the value of things, but needs to understand the evidence first.

When we ask students to critically assess something we want them to do it as skeptics. You’re allowed to praise, as well as blame, a study, but it is important that you explain why.

As a rule of thumb, I distinguish three levels of criticism. These are the kinds of critical thinking that you might include at the end of a review or a final-year project, under a “flaws and limitations” type heading. Taking the least valuable first (and the one that will win you the least marks), let’s go through the three types one by one:

General criticisms: These are the sorts of flaws that we’re taught to look out for from the very first moment we start studying psychology. Things like too few participants, lack of ecological validity or the study being carried out on a selective population (such as university psychology students). The problem isn’t that these aren’t flaws of many studies, but rather that they are flaws of too many studies. Because these things are almost always true – we’d always like to have more people in our study, and we’re never certain if our results will generalise to other populations – it isn’t very interesting to point them out. Far better if you can make …

Specific criticisms: These are specific weaknesses of the study you are critiquing. Things which you might say as a general criticism become specific criticisms if you can show how they relate to particular weaknesses of a study. So, for example, almost all studies would benefit from more participants (a general criticism), but if you are looking at a study where the experimental and control groups differed on the dependent variable, but the result was non-significant (p=0.09 say), then you can make the specific criticism that the study is under-powered. The numbers tested, and the statistics used, mean that it isn’t possible to say either way whether there probably is or probably isn’t an effect. It’s simply uncertain. So, they need to try again with more people (or with less noise in their measures).

Finding specific criticisms means thinking hard about the logic of how the measures taken relate to psychological concepts (operationalisation) and what the comparisons made (control groups) really mean. A good specific criticism will be particular to the details of the study, showing that you’ve thought about the logic of how an experiment relates to the theoretical claims being considered (that’s why you get more credit for making this kind of criticism). Specific criticisms are good, but even better are…

Specific criticisms with crucial tests or suggestions: This means identifying a flaw in the experiment, or a potential alternative explanation, and simultaneously suggesting how the flaw can be remedied or how the alternative explanation can be assessed. This is the hardest to do, and also the most interesting. If you can do this well you can use existing information (the current study, and its results) to enhance our understanding of what is really true, and to guide our research so we can ask more effective questions next time. Exciting stuff!

Let me give an example. A few years ago I ran a course which used a wiki (reader-edited webpages) to help the students organise their study. At the end of the course I thought I’d compare the final exam scores of people who used the wiki against those who hadn’t. Surprise: people who used the wiki got better exam scores. An interesting result, I thought, which could suggest that using the wiki helped people understand the material. Next, I imagined I’d written this up as a study and then imagined the criticisms you could make of it. Obviously the major one is that it is observational rather than experimental (there is no control group), but why is this a problem? It’s a problem because there could be all sorts of differences between students which might mean they both score well on the exam and use the wiki more. One way this could manifest is that diligent students used the wiki more, but they also studied harder, and so got better marks because of that. But this criticism can be tested using the existing data. We can look and see whether only high-scoring students use the wiki. They don’t – there is a spread of students who score well and who score badly, independently of whether they use the wiki or not. In both groups, the ones who use the wiki more score better. This doesn’t settle the matter (we still need to run a randomised controlled study), but it allows us to finesse our assessment of one criticism (that only good students used the wiki). There are other criticisms (and other checks); you can read about them in the paper we eventually published on the topic.

Overall, you get credit in a critical assessment for showing that you are able to assess the plausibility of the various flaws a study has. You don’t get marks just for identifying as many flaws as possible without balancing them against the merits of the study. All studies have flaws, the interesting thing is to make positive suggestions about what can be confidently learnt from a study, whilst noting the most important flaws, and – if possible – suggesting how they could be dismissed or corrected.


New paper: wiki users get higher exam scores

Just out in Research in Learning Technology is our paper Students’ engagement with a collaborative wiki tool predicts enhanced written exam performance. This is an observational study which tries to answer the question of how students on my undergraduate cognitive psychology course can improve their grades.

One of the great misconceptions about studying is that you just need to learn the material. Courses and exams which encourage regurgitation don’t help. In fact, as well as memorising content, you also need to understand it and reflect that understanding in writing. That is what the exam tests (and what an undergraduate education should test, in my opinion). A few years ago I realised, marking exams, that many students weren’t fulfilling their potential to understand and explain, and were relying too much on simply recalling the lecture and textbook content.

To address this, I got rid of the textbook for my course and introduced a wiki – an editable set of webpages, using which the students would write their own textbook. An inspiration for this was a quote from Francis Bacon:

Reading maketh a full man,
conference a ready man,
and writing an exact man.

(the reviewers asked that I remove this quote from the paper, so it has to go here!)

Each year I cleared the wiki and encouraged the people who took the course to read, write and edit using the wiki. I also kept a record of who edited the wiki, and their final exam scores.

The paper uses this data to show that people who made more edits to the wiki scored more highly on the exam. The obvious confound is that people who score more highly on exams will also be the ones who edit the wiki more. We tried to account for this statistically by including students’ scores on their other psychology exams in our analysis. This has the effect – we argue – of removing the general effect of students’ propensity to enjoy psychology and study hard, and isolating the additional effect of using the wiki on my particular course.
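
As a very rough sketch of that kind of analysis – with made-up file and variable names, not the code from the paper – the core of it is a regression of exam score on wiki edits, controlling for performance on other exams:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per student, with their wiki edit count,
# their mark on this exam, and their mean mark on other psychology exams.
data = pd.read_csv("psy243_grades.csv")
model = smf.ols("exam_score ~ wiki_edits + other_exam_mean", data=data).fit()
print(model.summary())    # the wiki_edits coefficient is the effect of interest
```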

The result, pleasingly, is that students who used the wiki more scored better on the final exam, even accounting for their general tendency to score well on exams (as measured by grades for other courses). This means that even among people who generally do badly in exams, and did badly on my exam, those who used the wiki more did better. This is evidence that the wiki is beneficial for everyone, not just people who are good at exams and/or highly motivated to study.

Here’s the graph, Figure 1 from our paper:

[Figure 1: final written exam scores plotted against wiki activity]

This is a large effect – the benefit is around 5 percentage points, easily enough to lift you from a mid 2:2 to a 2:1, or a mid 2:1 to a first.

Fans of wiki research should check out this recent paper Wikipedia Classroom Experiment: bidirectional benefits of students’ engagement in online production communities, which explores potential wider benefits of using wiki editing in the classroom. Our paper is unique for focussing on the bottom line of final course grades, and for trying to address the confound that students who work harder at psychology are likely to both get higher exam scores and use the wiki more.

The true test of the benefit of the wiki would be an experimental intervention where one group of students used a wiki and another did something else. For a discussion of this, and discussion of why we believe editing a wiki is so useful for learning, you’ll have to read the paper.

Thanks go to my collaborators. Harriet reviewed the literature and Herman installed the wiki for me, and did the analysis. Together we discussed the research and wrote the paper.

Full citation:
Stafford, T., Elgueta, H., Cameron, H. (2014). Students’ engagement with a collaborative wiki tool predicts enhanced written exam performance. Research in Learning Technology, 22, 22797. doi:10.3402/rlt.v22.22797
