In April 2018 I was struck by Ramsay-Hunt syndrome, and half my face was paralyzed. For about two months I had to somehow struggle through a new job with my face sliding off and my entire body completely exhausted and stricken with pain. I recovered over the following year until my face was about (in my estimation) 90-95% better, and probably no long term consequences. Then two weeks ago this awful condition hit me again, though this time I felt it coming, got the treatment early, and avoided any serious trouble. After this last 18 months of face-eating hell, I feel like I’m an experienced Ramsay-Hunter, but when I was trying to understand this disease last year I found precious little information on the internet about it. So, I have decided to use this blog for what blogs are good for, and to give my experience of Ramsay-Hunt Syndrome, as well as some suppositions and general suggestions for dealing with it based on what I experienced, my own hazy research and discussions with different people. Ramsay-Hunt Syndrome (hereafter referred to as RHS) has a very wide range of effects, if the internet is to be trusted, and a lot of them are pretty subtle and unpleasant. So I’d like to outline here what I experienced, some things I think about the disease based on my experience, and some stuff I picked up around the internet. To be clear if you read on: I am not a doctor, I have no medical advice for you, and if you’re coming to me for medical advice you’re in a dire place. This is just my experience, and you should not use it as anything except supportive anecdotal knowledge. Nonetheless, I hope it will help you. If you have experienced RHS yourself and want to add your own experiences in the comments, or are experiencing it and have questions (or want reassurance) then please also comment.

What is this godawful disease?

Ramsay-Hunt Syndrome is basically shingles inside your face. It is caused by the varicella-zoster virus, the same virus behind herpes zoster (shingles), which stays in your body after you are infected with chickenpox as a child. Basically the chickenpox virus reactivates, but instead of coming back as an intensely painful rash on your skin (as happens with most people) it comes back as a vicious, cruel, and completely godless infection of your facial nerve. Once it gets its hooks in it does the following things:

  • It causes intense pain in the back of your neck/head/jaw, that is like no other pain you have experienced
  • It causes a rash in one of your ears and/or your tongue
  • It paralyzes half of your face so that nothing moves. Nothing.

This facial paralysis is the worst part of the disease, because it completely disables half of your face, which makes speaking and eating difficult, and also stops you closing your eye[1].

There is no cure for this disease, because it’s one of the herpes family, a cluster of diseases that were designed by satan to annoy human beings. It is, however, easily treated into remission with acyclovir, an anti-viral drug. If you’ve had cold sores or genital herpes then you’ll probably be familiar with this family of stupid little viruses and their treatments.

Chickenpox is very common, since the vaccine only became available in 1984 and isn’t on the mandatory vaccination schedule of many countries. So if you’re older than about 38 years old chances are you had it, and if you are younger than 38 but from one of the many countries that don’t (or didn’t) have the vaccine in their schedule you may well have had it. If you’re like me you carry the scars of that idiot little disease on your face, but if you don’t have the scars you may not remember if you ever had it, in which case check with your parents. You need to know what’s coming for you.

The common view seems to be that RHS is triggered by stress, just as shingles is. So if you had chickenpox as a kid there’s basically only one way to prevent it: don’t get stressed. Hrmph!

Also RHS is not the same as Bell’s Palsy. Bell’s palsy is a sudden paralysis of the facial nerve, but it doesn’t come with the rash and intense, unrelenting pain, and it doesn’t do the other dodgy shit that RHS prides itself on (see below). I had Bell’s Palsy about 20 years ago, probably as a result of stress in combination with some stupid infection. Bell’s Palsy is a walk in the park compared to RHS.

What happened to me?

So let’s describe my experience. I was just finishing an extremely stressful job where I had been bullied for years by the most vicious pig of a man you can conceive of, and had secured a new job. I was taking a few weeks off and exercising daily, doing two hour morning kickboxing sessions. One Friday in mid-March I visited my new employer to fill in some forms and was informed that my job was guaranteed and I would definitely be starting on 1st April. When I left the workplace I could feel the stress falling off of me like water, and my spirits uplifted, really uplifted, for the first time in a long time. Since I had been training all week I was tired and I had muscle pain in my left shoulder but I didn’t think much of it.

On Saturday morning I woke up relatively early to go to role-playing, and noticed in the bathroom mirror that my eye and face were a bit weird, but I again didn’t think much of it. It was a bit weird but I’d gone to bed late and I think I’d been having celebratory drinks, so I just figured whatever and headed off to role-playing. By the time role-playing started two hours later I was in great pain that intensified over the day. At first I assumed it was some strain from kickboxing, but by mid-afternoon my face was beginning to fail and my speech was noticeably slurred. The pain by then was intense so I was icing the spot and trying to keep my shit together (fortunately I was playing, not GMing). My friends started suggesting the possibility that I was having a stroke (I was 45), but as my face slid off I realized what was happening, and assumed I was just having a bad bout of Bell’s Palsy, brought on by the relief of stress on the Friday[2]. Since I’d experienced Bell’s Palsy before I knew what needed to be done: I had to go to a doctor to get some eye drops, buy an eye patch, and wait a few months. A pretty depressing start to a new job but whatever. So I finished the game, went home, slept as best I could, and the next morning I went to a doctor.

So Sunday morning my face was wrecked, and I felt like an operation was being conducted on my jaw. My eye was also now open permanently so things were touch and go, but I got to a doctor by lunchtime. The doctor was a standard internal medicine specialist (in Japan this is basically what you go to when you don’t know what’s up) with a nice surgery who I trusted, and he was very sure it was not Bell’s Palsy. He made me sit in the waiting room while he booked some urgent tests at the local hospital, to rule out a stroke, but then came out after ten minutes or so to check my forehead. He made me raise my brow like a reverse frown (what do you call that?) and upon seeing that my left forehead was completely static – not moving even a millimetre – he decided it must be RHS, canceled the tests, and gave me the medicine I needed. He gave me acyclovir to kill the herpes, pain killers, steroids to help my face recover, and eye drops for my eye. I went to a local pharmacist, hit the drugs, and crashed.

Acyclovir is a miracle drug, it works on the virus fast and within maybe two days the pain was gone, but my face was done for. I had to go into my new job the next week to begin preparing classes, setting up my work space, transferring grants (which takes sooo many forms!) and so on, but I couldn’t work my face at all and also I was exhausted. I could only work perhaps 3-4 hours a day before I had to struggle home and crash. But the worst was yet to come. After 5-6 days the acyclovir finished, and the disease came back within a day – worse than before. The pain was even worse, and it was hellish. This was when the other symptoms began (see below). Fortunately my new work has a very good hospital attached, so I saw a doctor there and they told me that I had been given an older, weaker version of acyclovir, and the steroid dose I’d been given was way too low to help my face. This doctor gave me valacyclovir, which is I guess the incredible hulk of acyclovirs, and nearly doubled my steroid dose. The pain subsided pretty quickly and over the next two weeks things calmed down. By the time April finished the secondary symptoms had gone and my face was beginning to move. In May the doctor shifted me to a rehabilitation plan, and I set about the long path to recovery.

What are the secondary symptoms?

If you google around you’ll hear all sorts of horror stories about this nasty little bug. I read people saying they lost their sense of balance, that they were always dizzy, that they nearly went blind, and that their ability to think or calculate was messed up. I found this out because in that first week I noticed I was doing things that are really unusual for me, including:

  • Taking the wrong train home
  • Getting confused about where in the train platform to go to get to my work
  • Forgetting names, words and basic facts
  • Confusing chats and sending the wrong messages to the wrong people

I went to hanami at my former work near the end of March and met a PhD student who I had known for three years, who had completed a master’s degree in my department and gone on to finish her first year of her PhD: I asked her when she was starting her PhD. I sent messages for my role-playing group to non-roleplaying friends, and vice versa. Also I was getting tired very quickly, and putting on weight (which may have been the steroids I guess). I went back to kickboxing after maybe a month, and that was okay, but for the first two weeks my whole body was a mess. I also discovered, once my eye could close again, that I had become photophobic. I didn’t notice this until mid May, which is when the sun really comes out in Tokyo, and it made my eyes tear up as soon as I went outside.

I’m also sure that this disease fucked my eyesight. I am longsighted and wear reading glasses but between March and May my eyesight suddenly deteriorated so I had to get new glasses. I also thought I was seeing double, but couldn’t get anyone at the eye doctor to believe me or confirm it.

I also had small pings of pain in the back of my jaw and neck for months after the main source of horror had gone away. It was there, reminding me that I was its bitch.

In preparing this post I did some searching and discovered this review article which describes the peripheral nervous system consequences of RHS. It can do a wicked and wondrous array of nasty little things to you, many of which resolve with rehabilitation and treatment, but some of which I think are permanent.

Rehabilitation experience

Rehabilitation for RHS is primarily the task of recovering facial movement, since this is the main physical consequence of it. For this I was given facial exercises (gurning, basically) and massages to do to try and regain facial function. The recovery rates for RHS are apparently not very good – less than 70% of people get full facial recovery, and the chance declines with age of course. I did my exercises reasonably assiduously, and the facial massages, and after a year I think I got back to about 90% function. I have two remaining problems with my face:

  • If I read while I’m eating my left eye gets strained and sometimes lets a few tears out (it can hurt a bit)
  • If I purse my lips my left eye closes slightly

I can also feel a bit of plasticity in the cheek around my mouth on the left side, and I can see a little pocket of muscle above the tip of my mouth on the left side that is dead and just kind of sits there like a lump of uselessness whenever I smile. That’s not a killer – I’ve never thought much of my smile, and whatever charm I have for the ladies is built on something else I’m sure. Most people don’t notice my face is lopsided, I haven’t lost any speech or anything, so I’m mostly good.

In fact, during rehabilitation I learnt finally how to wink with my left eye, something I never used to be able to do. A career of comedy awaits …

Rehabilitation for this disease isn’t hard. I noticed that my face hurt to touch, all over the left side, which the doctors told me was because the nerves are waking up and getting aggravated, and some of the rehabilitation exercises would make my face hurt as I strained to move shit around. Just like exercising your body, the muscles were weak and underworked, and they got worn down by practice. I also noticed some parts recovered quicker than others, and sadly the fine motor control around my eyes is the slowest to recover.

The doctors also warned me against starting rehabilitation before my viral symptoms were fully gone. They told me that if you begin rehabilitation too soon you can develop bad habits, like for example closing your eye every time you bite, because the nerves learn new pathways (like how I got my new left-eye wink superpower). In fact I think I have this when I yawn – my left eye shuts involuntarily.

The doctors also told me – and I also saw through google sensei – that getting the anti viral medication in early is important. Basically, if you don’t start the miracle acyclovir within 72 hours you’re done for, and the earlier you start the better. I waited a day and then started the weaker old one, so I guess that made my experience worse than if I had scuttled straight down to the best hospital in town, begged my way in on the claim that I was having a stroke, and got myself on valacyclovir from the morning it started. I won’t make that mistake again! But it’s also possible the doctors wouldn’t have recognized the problem and would have sent me in for a series of pointless and expensive stroke checks, and started me late on the anti-virals. The anti-virals really are key.

Actually when I went to the doctor at my university hospital after the pain returned (and got the stronger acyclovir) he wanted to hospitalize me, and put me on a drip for the medicines. He confessed to me that he didn’t think I needed IV acyclovir especially, but he wanted to force me into a bed away from my work so that the stress would stop and my face would recover. He thought stress was the real problem here, driving the whole thing, and was worried the medicine wouldn’t work until I got my work under control. But the thing is I had just started a new job, and he wanted to hospitalize me on the day of my first lecture. It’s not a good look! And in truth I couldn’t stand to spend a week in bed with nothing to do, so I begged off of that. Maybe my recovery would have been better if I’d agreed to that.

So if you want a good recovery:

  • Get on the antivirals as soon as possible (and if your doctor offers bog-standard acyclovir tell him to go jump – go straight for the strong stuff)
  • Get the stress out of your life, including by hospitalization if necessary
  • Don’t start rehabilitation until the awfulness is settled down a bit
  • Do your gurning exercises ruthlessly, and keep an eye out for weird new facial behaviors

Then bingo, a year later you’ll be able to (mostly) get your face back.

And trust me: you don’t realize how important your face is until it falls off. Life without a face sucks!

The second bout and the prodrome

So this year I went on a series of business trips and had quite a bit of stress, and a week ago I could feel this bastard disease creeping in again. I could feel my face getting a bit tired, and when I took a selfie on Monday night last week I could see my smile had retrogressed. Bastards! I could also feel a twinge in the back of my jaw, and when I went to work on Wednesday I was getting confused about train doors and having strange emotions. So I went to the hospital again, explained the whole thing to an otolaryngologist and got the miracle valacyclovir into me before the disease was fully up and running. My face sagged a bit but I’m already doing rehabilitation a week later, because the virus never got started. This time I caught the stupid thing as it was sneaking in the door, and slammed it shut. This time also the doctors were worried it was something else and so put me through some tests: MRI and some blood tests. The MRI came up completely clean and pure, even confirmed I have a brain (who knew!), and after a long and exhausting conversation with the neurologist in which he refused to believe any of the symptoms I just exhaustively described here, I was free to get out and begin the rehabilitation. My next appointment to track facial progress is in two weeks.

This tells me two things about this disease. First of all, it tells me that stress is really bad once you’re at risk of this disease, and you need to keep it well under control. No one warned me that this little shit would come crawling around scratching at my door a second time, but it did. So if you have RHS, and there seems to be a good chance it was triggered by stress, then you need to get that stress out of your life. I would say this means doing whatever you have to do – change jobs, meditate, murder your boss (don’t get caught obviously), whatever it takes. My new job is relatively low stress and all the stress I experienced was from a cataclysmic series of tightly timed overseas trips, and I think I can control that easily by never again making such a series of business trips in such a short time. Compared to the stress that triggered the first bout of RHS what I’m going through now is trivial, and I didn’t even notice I was stressed until this disease hit. I guess I’m weaker than I used to be.

The second thing this tells me – and this is not medical science here – is that this disease has a prodrome. It has early symptoms that warn you it’s coming, and if you notice them you might be able to sense its presence. Looking back at my first experience of this neuropathic party, the neck pain and the slight tiredness in my face were there before the evil little bastard stuck the shank in behind my jaw, and had I known I might have been able to react more quickly[3]. Those same symptoms came this time around, so I went to the doctor early and started the valacyclovir before it could take hold. This theory makes sense to me because it is well known that other herpes viruses have a prodrome: Herpes 1 and 2 both have a kind of itchy weirdness in the area where the sores are going to arise, and if you hit the acyclovir then you may be able to prevent or lessen the resulting outbreak. So I guess chickenpox – which is a herpes virus – could have a similar course. I couldn’t find anything on this on the internet, but it’s my feeling that this is what happens.

A brief note on UHC

Japan has Universal Health Coverage. I don’t recall how much this disease set me back last year but this time the tests, drugs and bothering the hospital doctors without a referral cost me a total of about 30,000 yen, so it would have set me back 100,000 yen (about $US800) if I didn’t have insurance. I’m sure that it would cost a lot more in America’s weird-arsed system, since Japan has strict price controls, but I think it’s safe to say that 100,000 yen is tough for a lot of people to fork out, and the prospect of not being able to get treatment for this because you can’t afford it, and having to live your life with this intense, unbearable pain and the slow degradation of your face for what I can only assume would be weeks before the virus gave up and left – that’s awful. UHC is an absolutely fundamental part of a civilized society, and every political party should be 100% about getting it if you don’t have it, or protecting it if you do. Never let that wonderful part of modern social democracy slide away or be weakened by the vicious jackals who control our conservative parties. Or your face will fall off.

Preventing this disease

The best way to prevent this hairy bastard from coming and fucking your face through your ear is to get vaccinated against chickenpox. Sadly though the varicella vaccine is not in most countries’ mandatory schedules, so you won’t have received it even if you were born after 1984 unless you’re in one of the few that does cover it. Therefore, if you’re a parent in a country without this vaccine on the schedule, and you’re reading this, my advice is: pay the extra amount to get this vaccine for your kids. They will never thank you, partly because they’re ungrateful bastards but also because they’ll never know the fun they’re missing, but trust me it’s worth it. If you’re a policy-maker in a country that doesn’t have this vaccine on the schedule, hurry up and add it.

If you’re an adult who had chickenpox as a child then the first line of defense against this nasty thing is to avoid stress, make a life for yourself that has manageable stress and don’t let whatever stress you do experience last for too long. I went through years of intense stress before the first bout was triggered, but once it was there my next bout required a much lower threshold. So be careful with stress, and get control of your work as much as you can (I appreciate that this is useless advice for a lot of people, whose industry or career options are top-heavy with unpaid work, bullying superiors, and shitty conditions, but it’s the only advice that I have, sorry).

There is some evidence that the varicella vaccine, given to adults who had chickenpox, may reduce the risk of this disease. I’m thinking of getting it once this shit has died down, but it’s also possible that the same people whose low-paid high stress jobs put them at risk of RHS are also unable to afford the out-of-pocket costs for this vaccine. If you’re reading this I’m sorry, I’m out of options. Kill your boss, or find a way to move to a country with a better health system. Or vote Democrat and get that shit fixed[4].

Conclusion

The most important lesson for this is that you need to reduce the stress in your life to avoid this disease, and that as you get older the risk will increase so you need to purge that stress as you age. It might also help to get a vaccine against varicella even if you’re an adult who had chickenpox in childhood, just to get that extra bit of protection, but your doctor may not like that idea.

If you go to a doctor with the first symptoms of this and he/she offers you mere acyclovir, tell him/her you’ll pay the extra for valacyclovir. Wave this blog post at them, and explain the issue. What do they care?! Trust me you don’t want this thing hanging around, so push for it. Then take your rehabilitation seriously, and you may be able to get to a fully functional face once the shitshower passes on. Another thing I think I should have done but didn’t was demand a second course of valacyclovir, to really curbstomp this ugly fucker. Once those drugs are done though, you’re going to be looking at an unpleasant couple of months regardless, so good luck.

If you had other experiences of RHS, or want to rant about this nasty little hitchhiker, or are having it now and need reassurance or have questions, put them in the comments. I’d love to hear how other people got through this virus, and I really hope that this blog post can help someone to deal with the horrors of this disease. You are going to get better and you will get your face back, I promise you!


fn1: I don’t know what kind of person designed human beings but requiring a muscle to activate to close your eye, rather than open it, is phenomenally stupid. You don’t realize how stupid that design flaw is until you can’t use that muscle, and suddenly you’re staring at everyone like a psychopathic cyclops.

fn2: I have this weird thing, that has existed since my teenage years, where I handle stress well but then when the stress disappears my body completely breaks. Used to happen with migraines, seems to happen with RHS. Others get sick during their stress but my response appears to be delayed.

fn3: I wouldn’t have, because I’d have thought it was Bell’s Palsy and just gone and bought an eyepatch.

fn4: I’m not American, but I’m aware that most people who read blogs like mine are, for some reason, and I have to remain aware of your society’s … shortcomings … when I write medical-related things.

No doubt many of my readers are aware that there is a stream of feminism, which calls itself “gender critical”, that rejects the idea that transgender women are women and aims to “protect” cis women from having these women in women-only spaces. In the 1970s and 1990s this manifested as an internecine feminist turf war over whether trans women should be allowed into women’s spaces. This battle appeared to die down in the 2000s but a new generation of gender critical feminists are now attempting to defend women’s bathrooms, sports and changing rooms from transgender women. They seem to be particularly active in the UK, where feminists like Professor Kathleen Stock are attempting to fight changes to the Gender Recognition Act that would lead to people being able to use the bathroom of the gender they identify with, rather than the gender they were born as.

A core demand of these feminists is that only women who were born female should be able to use women’s bathrooms. In this post I am going to use Bayes’ Rule to show that the inevitable consequence of this political position, if it were to be enforced, would be the widespread harassment of natal women. I will also present anecdotal evidence that this is exactly what is happening now as their ideas gain traction, and discuss the inevitable hypocrisy and contradiction in the gender critical position in light of their responses to the concerns that some (primarily butch-presenting) lesbians have raised about their cause.

Content warning: This post will use a lot of language associated with the “woke” American left, like the prefix “cis”, and also the language of these gender critical feminists, like “natal woman” and the weird distinction they insist on between “woman” and “female.” I will explain my choice of language at the end. Bear with me!

Applying Bayes’ Theorem to Bathroom Exclusion

So how does bathroom exclusion work? The goal of gender critical feminists is that women who were not born female – that is transgender women – are not allowed to use women’s bathrooms, and that this exclusion should be enforced through codification in the Gender Recognition Act. They don’t specify exactly what follows from this but the obvious implication is that if a natal woman in a bathroom fears that another woman in the bathroom is actually a natal man she should be able to confront that person and demand they leave and go to the men’s bathroom, fully supported by the force of the law, public opprobrium and if necessary state force (represented in the US, let us remember, by an armed and trigger-happy police force that has little regard for people who do not follow strict white middle-class standards of dress and behavior).

In practice what this will mean is that a natal woman will need to judge whether another woman in the bathroom is a “real” woman or not by her face, clothes and manner. She certainly won’t be able to demand a genital check[1], so her entire means of discrimination will be by a visual check. Now, discrimination of this kind is a statistical process on which a large amount of theoretical work has been done since the 18th century, and in particular in public health we use Bayes’ Rule to determine the effectiveness of a discrimination process. Bayes’ Rule provides us with a formula that links the sensitivity and specificity of a test to the probability of correctly discriminating between two groups. It depends on three essential quantities:

  • The sensitivity of the test, which is the probability that the test will correctly identify a person with a condition as having the condition
  • The specificity of the test, which is the probability that the test will correctly identify a person without the condition as not having the condition
  • The prevalence of the condition being tested

Wikipedia offers an example based on drug testing, but the rule is universal: it applies to any attempt to discriminate between two classes of a thing with a test that is imperfect, and it has some alarming and counter-intuitive results in the case that the condition being tested is rare.
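For readers who want to see the rule written out, here is a minimal sketch in Python. The function name and the 1% / 95% example figures are my own illustrative assumptions, not numbers from any study. The positive predictive value is the share of positive results that are correct: true positives divided by all positives.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that someone flagged by the test truly has the condition.

    Bayes' Rule: P(condition | positive) =
        sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Illustrative assumption: a rare condition (1% prevalence) and a test that is
# 95% sensitive and 95% specific.
ppv = positive_predictive_value(sensitivity=0.95, specificity=0.95, prevalence=0.01)
print(f"Share of positive results that are correct: {ppv:.1%}")
```

Even with a test that sounds very accurate, most positives are wrong when the condition is rare, because the small error rate is applied to the very large group of people who do not have the condition.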

In the case of bathroom exclusion, we want to know the following three things:

  • What is the probability a normal person[2] will correctly identify that a transgender woman is a transgender woman?
  • What is the probability a normal person will correctly identify a non-transgender woman as non-transgender?
  • What is the prevalence of transgender women in the population?

How good do you think you are at the first two of these things? I’m not aware of any tests of ordinary bathroom users, but facial recognition software has reached high levels of accuracy above 90%. So, let’s suppose that we were to put a facial recognition device on a bathroom door that had 90% sensitivity and specificity, and assume that 5% of women are transgender. Bayes’ Rule tells us that we would have a positive predictive value of 32%. That is, only 32% of the women refused access to the bathroom would be transgender women: 68% of women rejected (2 in every 3) would be natal women who had been misclassified as transgender.

Now, I think that 5% is way too high an estimate of the prevalence of transgender women. At 3%, with the same specificity and sensitivity, only 22% of rejections are correct – 78% of women refused admission to the bathroom (nearly four in five) are natal women. Figure 1 shows the relationship between specificity and this proportion at a prevalence of 3% for three different levels of sensitivity.
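If the percentages feel abstract, it can help to count heads. The sketch below imagines 10,000 women passing a door that is 90% sensitive and 90% specific, and counts who gets turned away at the two prevalence levels discussed above. The population size and the function name are illustrative assumptions of mine.

```python
def rejections(sensitivity, specificity, prevalence, population=10_000):
    """Return (trans women rejected, natal women rejected) in a population."""
    trans_women = population * prevalence
    natal_women = population - trans_women
    correctly_rejected = sensitivity * trans_women          # true positives
    wrongly_rejected = (1.0 - specificity) * natal_women    # false positives
    return correctly_rejected, wrongly_rejected


share_natal = {}
for prevalence in (0.05, 0.03):
    correct, mistaken = rejections(0.90, 0.90, prevalence)
    share_natal[prevalence] = mistaken / (correct + mistaken)
    print(
        f"prevalence {prevalence:.0%}: {correct:.0f} trans women and "
        f"{mistaken:.0f} natal women rejected, so "
        f"{share_natal[prevalence]:.0%} of rejections hit natal women"
    )
```

At 5% prevalence this counts 450 trans women and 950 natal women turned away; at 3% it is 270 and 970. The natal women dominate simply because there are so many more of them for the 10% error rate to act on.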

Figure 1: Proportion of women harassed in a bathroom who are natal women, for three different levels of sensitivity. Prevalence of transgender women is set at 3%. The x-axis shows specificity (percentage chance of correctly identifying a non-transgender woman as not being a transgender woman)

As should be quite clear from this figure, even at very high specificity – for example where you are 95% likely to correctly identify natal women as natal women, and 99% likely to identify transgender women as transgender women, more than half of all women rejected from the bathroom will be natal women, not transgender.
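The numbers behind a figure like this are easy to tabulate yourself. The sketch below prints the share of rejected women who are natal women at a prevalence of 3%. The three sensitivity levels (80%, 90%, 99%) and the grid of specificities are my assumptions for illustration; they are not necessarily the ones plotted in Figure 1.

```python
def share_natal_among_rejected(sensitivity, specificity, prevalence):
    """Fraction of rejected women who are natal women (1 minus the PPV)."""
    correct = sensitivity * prevalence
    mistaken = (1.0 - specificity) * (1.0 - prevalence)
    return mistaken / (correct + mistaken)


PREVALENCE = 0.03
SENSITIVITIES = (0.80, 0.90, 0.99)                 # assumed levels
SPECIFICITIES = (0.80, 0.90, 0.95, 0.98, 0.99)     # assumed grid

table = {
    spec: [share_natal_among_rejected(sens, spec, PREVALENCE) for sens in SENSITIVITIES]
    for spec in SPECIFICITIES
}

print("specificity  " + "  ".join(f"sens={s:.2f}" for s in SENSITIVITIES))
for spec, shares in table.items():
    print(f"{spec:>11.0%}  " + "  ".join(f"{x:>9.0%}" for x in shares))
```

Reading down any column shows that specificity does most of the work: moving sensitivity from 80% to 99% changes the result far less than moving specificity from 90% to 99%.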

Do you think that most women using bathrooms are more than 95% accurate at determining other women’s birth gender based solely on their appearance? Do you think they are better than 98%? If not, then you are basically setting up a system of harassment of natal women. I have prepared figure 1 in terms of specificity because it is specificity that determines how many natal women you harass in your project to exclude trans women from bathrooms. By way of comparison, the specificity of commonly-used HIV tests is better than 99.998%: less than 1 mistake in 50,000. Can you be that good?
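You can also turn the question around and ask how good the judgment would need to be. The sketch below rearranges the same formula to give the minimum specificity needed to keep natal women below a chosen share of all rejections. The function name and the target shares (50% and 10%) are assumptions I picked for illustration.

```python
def required_specificity(max_share_natal, sensitivity, prevalence):
    """Minimum specificity so that at most `max_share_natal` of rejections
    are natal women.

    Derived from: mistaken / (correct + mistaken) <= t, where
    correct = sens * prev and mistaken = (1 - spec) * (1 - prev).
    """
    allowed_false_positive_rate = (
        max_share_natal * sensitivity * prevalence
        / ((1.0 - max_share_natal) * (1.0 - prevalence))
    )
    # A negative result would mean any specificity is good enough.
    return max(0.0, 1.0 - allowed_false_positive_rate)


for target in (0.5, 0.1):
    spec = required_specificity(target, sensitivity=0.99, prevalence=0.03)
    print(f"To keep natal women under {target:.0%} of rejections: "
          f"specificity of at least {spec:.2%}")
```

With 99% sensitivity and 3% prevalence, just getting natal women below half of all rejections takes a specificity of roughly 97%, and getting them below one in ten takes well over 99%.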

How will discrimination work in practice?

Bayes’ rule is an absolute law of discrimination tests, not some weird philosophical notion. If you set out to discriminate between two groups of people your results are determined by Bayes’ Law, without exception. It applies equally to HIV tests, gender selection, screening terrorists at airports, or picking penis size from nose length. Those three numbers – prevalence, specificity and sensitivity – determine how well you discriminate, without any exception in any cases. When you stride across the bathroom to grab that girl and tell her she’s not a “real” woman and should go to the urinal, you make yourself subject to Bayes’ Law.

Of course in practice your sensitivity and specificity depend on something: you don’t discriminate randomly, but on the basis of certain visual cues. What are those cues? Of course they will be markers of femininity: breasts, long hair, feminine facial features, make up, feminine style clothes. This will be especially the case if the Gender Recognition Act is not changed and a narrative of exclusionary behavior is established that encourages ordinary cis women with no experience of trans issues to begin singling out women for exclusion. These women will have no idea what trans women look like, how much they can “pass” as natal, or what kind of styles and manners butch-presenting lesbians use. The result of this will be what we always see when we establish discriminatory systems: non-conforming people, poor people, non-white people and people with disabilities will be singled out for discrimination. The gender critical feminists will achieve a strange paradox in which in order to be protected from trans women in the bathroom, natal women will have to act extra feminine and hew more clearly to gender stereotypes. We see this being reported now as the bathroom exclusion principle begins to apply. Consider this tweet from a queer Scottish woman:

This woman has had to begin wearing a badge that specifies her birth gender, because feminists keep mistaking her for a transgender woman. This problem will also affect any other women who do not look sufficiently womanly: women with a little bit of facial hair (which is more prevalent in women with certain sorts of illness), sportswomen who don’t dress femme, non-white women who confuse the white majority’s facial recognition, butch-presenting lesbians, and (particularly ironically) feminist women who reject standard stereotypes of feminine dress and behavior. That 80% of women excluded from bathrooms who are actually natal women and not transgender are more likely to be non-white, disabled, or non-gender conforming. They’re also more likely to be lesbians.

This is what gender critical feminism’s completely uncritical approach to bathroom exclusion will do. Here is another example of this, tweeted by a butch-presenting lesbian:

How have gender critical feminists responded?

The first thing to note about gender critical feminists is they seem to be very ignorant of the history of this debate. Holly Lawford-Smith seems to think the whole thing became a feminist issue in the 2010s, and appears to be completely ignorant of the history of transgender wars in women’s spaces in the 1970s and 1990s. Kathleen Stock, one of the major proponents of bathroom exclusion in the UK, responded to the above tweet with this:

“Worse things happen at sea.” Clearly, Stock is willing to throw her lesbian comrades under the bus in order to attack transgender women, and has given no thought to the relative balance of probabilities. She and her colleagues in the gender critical world know nothing about how this discrimination will actually work, haven’t bothered to consider who will be the real victims of their exclusionary practice, and don’t think it will affect many natal women. As I have shown, quite the opposite is the case: the majority of people affected by this exclusionary approach will be natal women.

But the gender critical feminists have become increasingly radical as they have been challenged more on this. Not only do they not take the risk of exclusion seriously, they have also begun to make their definitions stricter and more exclusionary. We see this particularly in response to the controversy around Caster Semenya, where a lot of gender critical feminists appear to have decided that she is “male” on the basis of having a difference in sexual development. See, for example, this Reddit thread in which they debate whether she is a man or not. So in response to criticism of their original exclusionary position they have extended their definition from “born with female genitals” to “born with female genitals AND normal testosterone.” I don’t think it’s a coincidence that these white feminists from a rich European country have decided to define a black woman who beat feminine-presenting white women as actually a man: this is another example of how their discriminatory practices will play out in practice, as a series of overlapping forms of prejudice work to punish the poor, the dark-skinned and the disabled far more effectively than the wealthy, white, feminine-presenting heterosexual women who make up the majority of the female population. Their concern with “protecting” women is really only about middle class cis white women.

The inevitable hypocrisy of trans exclusion

The underlying principle of gender critical feminism appears to be that sex and gender are different, and that gender differences need to be eliminated. Somehow this has been twisted to mean that trans women who choose to present as feminine – wearing dresses, make up and long hair – are simply “acting” female and aren’t really women at all. Before she was banned from twitter Holly Lawford-Smith liked to criticize transgender women who expressed happiness at successfully passing as women, deriding them for thinking that their appearance and their gender had any connection. Yet when it comes to pushing trans women out of women’s spaces these feminists will necessarily have to judge on the basis of how women present, not who they are. Sure, if they successfully get transgender women excluded from women’s prisons and women’s sport they may be able to do it on the basis of checking genitals (though see my footnote 1 below), but when it comes to bathrooms, women’s spaces on campus or at work, prayer rooms, women’s swimming pools and women’s beaches, they’ll have to do it entirely on the basis of how these women present. And like all human beings everywhere, they will be most likely to believe a woman is a woman if she looks femme. The more strongly they push this transgender exclusionary principle the more they will be forced to judge by feminine presentation.

Worse still, once they release their prejudices into the wild with the backing of the state, ordinary non-feminist women with no experience of trans issues will be the ones doing the judging and excluding. And you can bet that when those women decide to exclude a girl from the bathroom they won’t do it by themselves: they’ll call their masculine-presenting cis white boyfriends, or the police, to help them do it. This will lead to women with facial hair, manly physiques, and non-femme aesthetics being harassed, beaten up and potentially imprisoned (in male-focused custodial settings!) for the simple crime of not looking girly enough. Once it is released in the wild, gender critical feminism will become a feminism of harassing women who do not conform with patriarchal expectations of their physique, clothing and manners.

And that is not feminism.

Conclusion

Gender critical feminists need to drop this bathroom exclusion stuff and their opposition to the changes to the Gender Recognition Act. It is leading to the harassment of lesbian and transgender women now, and if their campaign is successful it will lead to much more harassment of non-conforming women. Rather than protecting natal women from men, it will lead to natal women being harassed by cis women, their natal male boyfriends, and the violent agents of the state. They also need to recognize that there is a fundamental hypocrisy at the heart of their exclusionary policies, and the only way that they can be put into practice is by accepting and reinforcing the worst patriarchal norms of gendered behavior and appearance. Their feminism is very bad for transgender women but it is also bad for all women who do not conform to gender stereotypes. It is toxic, dangerous, hypocritical and confused, and they need to rethink their whole approach to gender.

A note on language

I want to target this post at people who support gender critical feminists’ approach to exclusion of transgender women from women’s spaces, and so in the title and much of the text I have used their terms for things: I have used their name for themselves, and their language of “natal women” and “born women”. I also haven’t touched on the issue of trans men, an issue that gender critical feminists are extremely uncomfortable talking about because it completely ruins their ideological certainty. However, I think that the language they use is wrong and also highly unpleasant. They aren’t “gender critical”: it has been made clear by their feminist critics that they haven’t read the literature on this, and don’t understand the history of or long-standing theoretical debates about transgender issues within feminism. They also haven’t bothered to be very critical of the potential consequences of their beliefs. I think they are far better described by the acronym their opponents give them: TERF. They want to exclude trans women from women’s spaces, which makes them trans exclusionary, and their feminism is certainly radical, though not in the sense they want to believe. So they should be called TERFs, and we should not subscribe to their false dichotomies of “natal” versus trans women. We also should not adopt the horrible American practice of calling women “females” as if they were animals. So although I have used their language in this post, I don’t like it at all and I think it’s another part of their philosophy that needs to be kicked to the curb.


fn1: It’s worth bearing in mind that even a genital check is possibly not sufficient if the trans woman in question has had gender confirming surgery, since most cishet women have very little experience of or exposure to other women’s genitals in any detail, and might not be able to identify the difference between “real” genitals[2] and surgically designed genitals. We’ll come back to this issue later in the piece.

fn2: Here I use the word “normal” to indicate that the person is a member of the population with a standard education, upbringing and level of political awareness, not to suggest that natal women are “normal” and transgender women are not normal.

Could you lie to this nice lady?

On 18th May 2019 Australia held a federal election, and the ruling Liberal/National Party (LNP) Coalition scored a victory over the Australian Labor Party (ALP) that was billed by most observers as an “upset” because opinion polls had in general been predicting a narrow ALP victory. The opinion polls predicted that the ALP would get a two-party preferred vote of 51.5% over 48.5% for the LNP, and would cruise to victory on the back of this; in fact, with 76% of the vote counted the Coalition is on 50.9% two-party preferred, and the ALP on 49.1%. So it certainly seems like the opinion polls got it wrong. But did they, and why?

Did opinion polls get it wrong?

The best site for detailed data on opinion polls is the Poll Bludger, whose list of polls (scroll to the bottom) shows a persistent estimate of 51-52% two-party preferred vote in favour of the ALP. But there is a slightly more complicated story here, which needs to be considered before we go too far in saying they got it wrong. First of all you’ll note that the party-specific estimates put the ALP at between 33% and 37% primary vote, with the Greens running between 9% and 14%, while the Coalition is consistently listed as between 36% and 39%. Estimates for Pauline Hanson’s One Nation Party put her between 4% and 9%. This is important for two reasons: the way that pollsters estimate the two-party preferred vote, and the margin of error of each poll.

The first thing to note is that the final estimates of the different primary votes weren’t so wildly off. Wikipedia has the current vote tally at 41% to the Coalition, 34% to ALP and 10% to Greens. The LNP vote is higher than any poll put it at, but the other two parties’ tallies are well within the range of predicted values. The big outlier is One Nation, which polled at 3%, well below predictions – and far enough below to think that the extra 2% primary vote to the Coalition could reflect this underperformance. This has big implications for the two-party preferred vote estimates from the opinion poll companies, because the two-party preferred vote is not a thing that is sampled – it is inferred from past preference distributions, from simple questions about where respondents will put their second choice, or from additional questions in the poll. So uncertainty in primary votes of the minor parties will flow through to larger uncertainty in two-party preferred vote tallies, since these votes have to flow on. By way of example, a 1% difference in the primary vote estimate for the Greens (e.g. 9% vs. 10%) will manifest as a difference of roughly 10% in the number of Greens preference votes flowing to the major parties. If the assumed proportion of those votes that go to the Liberals is wrong, then you can expect to see this multiplied through in the final two-party preferred vote. In the case of One Nation, some polls (e.g. Essential Research) consistently gave them 6-7% of the primary vote, when they actually got 3%. So that’s a 50% miscalculation in the number of preference votes that flow to someone from this party. This is a unique problem for opinion polling in a nation like Australia and it raises the question: Have opinion poll companies learnt to deal with preferencing in the era of minor parties?
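As a rough illustration of how the inference works, the sketch below builds a two-party preferred figure from primary votes plus assumed preference flows. The flow rates, the “Other” category and the rounded primaries are all hypothetical choices made for the example, not the pollsters’ actual method or numbers.

```python
def coalition_two_party_preferred(primaries, flow_to_coalition):
    """Infer the Coalition's two-party preferred share (in %) from primary votes.

    primaries:         party -> primary vote in %
    flow_to_coalition: minor party -> assumed share of its preferences
                       that ends up with the Coalition (0 to 1)
    """
    coalition = primaries["Coalition"]
    for party, share in flow_to_coalition.items():
        coalition += primaries[party] * share
    return 100 * coalition / sum(primaries.values())


# Hypothetical preference flows, chosen for illustration only.
flows = {"Greens": 0.20, "One Nation": 0.65, "Other": 0.50}

# A poll-like set of primaries versus a result-like set (One Nation on 3%, not 6%).
polled = {"Coalition": 38, "ALP": 35, "Greens": 10, "One Nation": 6, "Other": 11}
counted = {"Coalition": 41, "ALP": 34, "Greens": 10, "One Nation": 3, "Other": 12}

print(round(coalition_two_party_preferred(polled, flows), 2))
print(round(coalition_two_party_preferred(counted, flows), 2))
```

With the same assumed flows, a few points of movement in the primaries is enough to carry the Coalition from just under 50% to just over it, and any error in the assumed flows would be added on top.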

The second thing to note is the margin of error of these polls. Margin of error is used to show what the range of plausible “true” values for the polled proportion might be. For example, if a poll estimates 40% of people will vote Liberal with a 2% margin of error that means that the “real” proportion of people who will vote Liberal is most likely between 38% and 42% (conventionally, with 95% confidence). For a binary question, the method for calculating the margin of error can be found here, but polls in Australian politics are no longer a binary question: we need to know the margin of error for four proportions, and this margin of error grows as a proportion of the estimate when the estimate is smaller. For example the most recent Ipsos poll lists its margin of error as 2.3%, which suggests that the true primary vote for the Coalition (estimated at 39%) could lie anywhere between 36.7% and 41.3%. The estimated primary vote for the ALP will have a slightly wider margin of error relative to its size (since it’s smaller) and the Greens even more so. Given this, it’s safe to say that the observed primary vote totals currently recorded lie within the margins of error for the Ipsos poll. This poll did not get any estimates wrong! But it is being reported as wrong.
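For readers who want to check the numbers, here is a small sketch of the standard margin of error formula for a single proportion. It assumes simple random sampling and a sample size of 1800, which is my guess at roughly what a 2.3% margin implies and not a figure published by Ipsos.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p from a simple random sample of n."""
    return z * math.sqrt(p * (1 - p) / n)


n = 1800  # assumed sample size; roughly what a 2.3% margin at p = 0.5 implies

for party, p in [("Coalition", 0.39), ("ALP", 0.34), ("Greens", 0.10)]:
    moe = margin_of_error(p, n)
    print(f"{party}: {p:.0%} +/- {moe:.1%} ({moe / p:.0%} of the estimate)")
```

Under this formula the absolute margin shrinks a little as the proportion gets smaller, but it grows when expressed as a share of the estimate, and that relative margin is what matters when a minor party’s preferences are redistributed.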

The reason the poll is reported as wrong is the combination of these two problems: the margin of error on the primary votes of all these parties should magnify the margin of error on the two-party preferred vote so that in the end it is larger than 2.3%, so we should be saying that the two-party preferred vote for the Coalition that is inferred from this poll is probably wider than the range 47 – 51%. That’s easily wide enough for the Coalition to win the election. But newspapers never report the margin of error or its implications.

When you look at the actual data from the polls, take into account the margin of error and consider the uncertainty in preferences, the polls did not get it wrong at all – the media did in their reporting of the polls. But we can ask a second question about these polls: can opinion polls have any meaning in a close race?

What do opinion polls mean in a close race?

In most elections in Australia most seats don’t come into play, and only a couple of swing seats change, because most are safe. This election has definitely followed this pattern, with 7 seats changing hands and 5 in doubt – only 12 seats mattered in this election. Amongst those 12 seats it appears (based on the current snapshot of data) that the Coalition gained 8 and lost 4, for a net gain of 4. Of those 12 seats 9 were held by non-Coalition parties before the election, and 3 by the Coalition. Under a purely random outcome – that is, if there was nothing determining whether these seats changed hands and it was purely random, the equivalent of a coin toss – then the chance of this outcome is not particularly low. Indeed, even if the ALP had a 60% chance of retaining their own seats and a 40% chance of winning Coalition seats, it’s still fairly likely that you would observe an outcome like this. A lot of these seats were on razor thin margins, so that literally they could be vulnerable to upset if there was something like bad weather or a few grumpy people or a change in the proportion of donkey votes.
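The coin-toss claim can be checked with a short binomial calculation. This is only a sketch: it treats every seat as independent, takes the 9 and 3 seat split from the paragraph above, treats all 9 non-Coalition seats as ALP seats in the second scenario, and defines “an outcome like this” as a Coalition net gain of at least 4. All of these are simplifying assumptions of mine.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_net_gain_at_least(target, n_other, n_coalition, p_gain, p_loss):
    """Chance that the Coalition's net seat gain is at least `target`.

    Each of n_other seats is gained with probability p_gain;
    each of n_coalition seats is lost with probability p_loss.
    """
    total = 0.0
    for gains in range(n_other + 1):
        for losses in range(n_coalition + 1):
            if gains - losses >= target:
                total += binomial_pmf(gains, n_other, p_gain) * binomial_pmf(
                    losses, n_coalition, p_loss
                )
    return total


# 9 non-Coalition and 3 Coalition seats in play, every seat a fair coin toss.
coin_toss = prob_net_gain_at_least(4, 9, 3, p_gain=0.5, p_loss=0.5)
# ALP keeps its own seats 60% of the time and wins Coalition seats 40% of the time.
alp_favoured = prob_net_gain_at_least(4, 9, 3, p_gain=0.4, p_loss=0.4)
print(f"coin toss: {coin_toss:.1%}, ALP-favoured: {alp_favoured:.1%}")
```

On these assumptions the chance is a bit under 40% for pure coin tosses and about one in four when the ALP is favoured, so neither scenario makes the result surprising.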

I don’t think polls conducted at the national level can be expected to tell us much about the results of a series of coin tosses. If those 12 seats were mostly determined by chance, not by any structural drivers of change, how is a poll that predicts a 51% two-party preferred vote, with 2% margin of error, going to determine that they’re going to flip? It simply can’t, because you can’t predict random variation with a structural model. Basically, the outcome of this election was well within the boundaries one would expect based purely on the non-systematic random error at the population level.

When a party is heading for a drubbing you can expect the polls to pick it up, but when a minor change to the status quo is going to happen due to either luck or unobserved local factors, you can’t expect polls to offer a better prediction than coin flips.

The importance of minor parties to the result

One thing I did notice in the coverage of this election was that there were a lot of seats where the Coalition was garnering the biggest primary vote but then the ALP and the Greens’ primary vote combined was almost as large or a little larger, followed by two fairly chunky independent parties. I think in a lot of seats this means that Greens and independents’ preferences were crucial to the outcome. As the Greens’ vote grows I expect it encompasses more and more disaffected Liberal and National voters, and not just ALP voters with a concern about the environment. For example in Parkes, NSW the National Party and the ALP experienced major swings against them, but the National candidate won with a two-party preferred vote swing towards him. This suggests that preferences from minor parties were super important. This may not seem important at the national level but at the local level it can be crucial. In Herbert, which the Coalition gained, two minor parties got over 10% of the vote. In Bass the combined ALP/Green primary vote is bigger than the Coalition’s, but the Liberal member is ahead on preferences, which suggests that the Greens are not giving strong preference flows to the ALP. This variation in flows is highly seat-specific and extremely hard to model or predict – and I don’t think that the opinion polling companies have any way of handling this.

Sample and selection bias in modern polling

It can be noted from the Poll Bludger list of surveys that they consistently overestimated the ALP’s two-party preferred vote, which shouldn’t happen if they were just randomly getting it wrong – there appears to be some form of systematic bias in the survey results. Surveys like opinion polls are prone to two big sources of bias: sampling bias and selection bias. Sampling bias happens when the company’s random phone dialing produces a sample that is demographically incorrect, for example by sampling too many baby boomers or too many men. It is often said that sampling companies only call landlines, which should lead to an over-representation of old people so that the sample is 50% elderly people even though the population is only 20% elderly. This problem can be fixed by weighting, in which the proportions are calculated with a weight to reflect the relative rarity of young people. This method increases the margin of error but should handle the sample bias problem. However, there is a deeper problem that weighting cannot fix, which is selection bias. Selection bias occurs when your sample is not representative of the population, even if demographically they appear to be. It doesn’t matter if 10% of your sample are aged 15-24, and 10% of the population is aged 15-24, if the 15-24 year olds you sampled are fundamentally different to the 15-24 year olds in the population. Some people will tell you weighting fixes these kinds of problems but it doesn’t: there is no statistical solution to sampling the wrong people.
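Here is a minimal sketch of the weighting idea, using the 50% elderly sample and 20% elderly population mentioned above. The sample size and the support figures within each age group are invented for illustration, and the function is a simple post-stratification, not any polling company’s actual procedure.

```python
def poststratified_estimate(sample_counts, population_shares, support_by_group):
    """Weight each demographic group back to its population share.

    Returns (weighted estimate, weight per respondent by group, Kish effective sample size).
    """
    n = sum(sample_counts.values())
    weights = {g: population_shares[g] / (sample_counts[g] / n) for g in sample_counts}
    weighted_total = sum(
        sample_counts[g] * weights[g] * support_by_group[g] for g in sample_counts
    )
    weight_sum = sum(sample_counts[g] * weights[g] for g in sample_counts)
    weight_sq_sum = sum(sample_counts[g] * weights[g] ** 2 for g in sample_counts)
    effective_n = weight_sum**2 / weight_sq_sum
    return weighted_total / weight_sum, weights, effective_n


# Sample is 50% elderly, population is 20% elderly (as in the text);
# the sample size and support figures are made up for illustration.
sample_counts = {"elderly": 500, "younger": 500}
population_shares = {"elderly": 0.20, "younger": 0.80}
support_by_group = {"elderly": 0.30, "younger": 0.50}

estimate, weights, effective_n = poststratified_estimate(
    sample_counts, population_shares, support_by_group
)
print(f"weighted support: {estimate:.1%}, effective sample size: {effective_n:.0f}")
```

The weighted estimate moves from the raw 40% to 46%, and the effective sample size drops well below the 1000 people actually interviewed, which is why weighting widens the margin of error. Nothing in this calculation can detect whether the younger people who answered resemble the younger people who did not.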

I often hear that this problem arises because polling companies only call landlines, and people with landlines are weirdos, but I checked and this isn’t the case: Ipsos for example samples mobile phones and 40-50% of its sample is drawn from mobile phones. This sample is still heavily biased though, because people who answer their phones to strangers are a bit weird, and people who agree to do surveys are even weirder. The most likely respondent to a phone survey is someone who is very bored and very politically engaged; and as time goes by, I think the people who answer polls are getting weirder and weirder. If your sample is a mixture of politically super-engaged young people and the bored elderly, then you are likely to get a heavy selection bias. One possible consequence of this could be a pro-ALP bias in the results: the young people who answer their mobile are super politically engaged, which in that age group means pro-ALP or pro-Green, and their responses are being given a high weight because young people are under sampled. It’s also possible that the weighting has been applied incorrectly, though that seems unlikely to be a problem across the entire range of polling companies.

I don’t think this is the main problem for these polls. There is a 2% over-estimate of the ALP two-party preferred vote but this could easily arise from misapplication of preferences. The slight under-estimate of the LNP primary vote could come from inaccuracies in the National Party estimate, for example from people saying they’re going to vote One Nation on the phone, but reverting to National or Liberal in the booth. Although there could be a selection bias in the sampling process, I don’t think this selection bias has been historically pro-ALP. I think the problem in this election has been that the fragmentation of the major party votes on both the left (to Green/Indies) and on the right (to One Nation, UAP, Hinch and others) has made small errors in sampling and small errors in assignment of preferences snowball into larger errors in the two-party preferred estimate. In any case, this was a close election and it’s hard for polls to be right when the election comes down to toss-ups in a few local electorates.

What does this mean for political feedback processes in democracies?

Although I think the problem is exaggerated in this election, I do think this is going to be a bigger problem in future as the major parties continue to lose support to minor parties. One Nation may come and go but the Greens have been on a 10% national vote share for a decade now and aren’t going anywhere, and as they start to get closer to more lower house seats their influence on election surprises will likely grow – and not necessarily in the ALP’s favour. This means that the major parties are not going to be able to rely on opinion polls as a source of feedback from the electorate about the raw political consequences of their actions and that, I think, is a big problem for the way our democracy works.

Outside of their membership – and in the case of the ALP, the unions – political parties have no particular mechanism for receiving feedback from the general public except elections. Over the last 20 years opinion polls have formed one major component of the way in which political leaders learn about the reception their policies have in the general community. Sure, they can ask their membership for an opinion, and they’ll get feedback through other segments of the community (such as the environmental movement for the Greens, or the unions for the ALP), but in the absence of opinion polls they won’t learn much about how the politically disengaged think of their policies. But in Australia under compulsory voting the politically disengaged still vote, and they still get angry about politicians, and they still have political ideals. If this broader community withdraws completely so that their opinion can no longer be gauged – or worse still, politicians learn to believe that the opinions of those who are polled are representative of community sentiment in general – then politicians will instead learn about the reception their policies receive only through the biased filter of stakeholders, the media, and their own party organs. I don’t see any of the major parties working to make themselves more accessible to community feedback and more amenable to public discussion and engagement, and I don’t think they will be able to find a way to do that even if they tried. Over the past 20 years instead politicians have gauged the popularity of their platform from polls, and used it to modify and often to moderate their policies in between elections. Everyone hates the political leader who simply shapes their policies to match the polls, but everyone hates a politician who ignores public opinion just as much. We do expect our politicians to pay attention to what we think in between elections, and to take it into account when making policy. If it becomes impossible for them to do this, then an important mode of communication between those who make the laws and those who don’t will be broken or worse still become deceptive.

It does not seem that this problem is going to go away or get better. This means that the major political parties are going to have to start finding new mechanisms to receive feedback from the general public – and we the public are going to have to find new ways to get through to them. Until then, expect more and nastier surprises in the future, and more weird political contortions as the major parties realize they haven’t just lost control of the narrative – they aren’t even sure what the narrative is. And since we the public learn what the rest of the public think from opinion polls as well, we too will lose our sense of what our own country wants, leaving us dependent on our crazy Aunt’s Facebook posts as our only vox populi.

As people retreat from engagement with pollsters, the era of the opinion poll will begin to close. We need to build a new form of participatory democracy to replace it. But what, and how? And until we do, how confused will we become in the democracy we have? The strange dynamics of modern information systems are wreaking havoc in our democratic systems, and it is becoming increasingly urgent that we understand how, and what we can do to secure our democracies in this strange new world of fragmented information.

But as Scott Morrison stands up in the hottest, driest era in the history of the continent and talks about building more coal mines on the back of his mandate, I don’t hold out much hope that there will be any change.

 

And let me tell you something
Before you go taking a walk in my world,
…you better take a look at the real world
Cause this ain’t no Mr. Rogers Neighborhood
Can you say “feel like shit?”
Yea maybe sometimes I do feel like shit
I ain’t happy about it, but I’d rather feel like shit
…than be full of shit!

 

There are times in life when it’s necessary to turn to the original gurus of self-righteous self-inspiration, Suicidal Tendencies. Life getting you down, you feel you can’t keep going? Crank up ST and when the boys ask you “Are you feelin’ suicidal?” yell back “I’m suicidal!” and you’ll be back on track in no time. Been meandering through some shit, making mistakes you know are your own dumb fault, and need to kick yourself back onto the straight and narrow? Gotta kill Captain Stupid is what you need. Getting played by conmen who play on your better nature, maybe take you for a ride using your religious impulses? Then you can crank up Send Me Your Money and be reminded that “Here comes another con hiding behind a collar / His only God is the almighty dollar / He ain’t no prophet, he ain’t no healer / He’s just a two bit goddamn money stealer.” That’ll get your cynical radar working again! But the Suicidals’ most useful refrain, the one that applies most often and most powerfully in this shit-stained and terrible world, is the imprecation at the beginning of the second half of their skate power classic, You Can’t Bring Me Down:

Just cause you don’t understand what’s going on
…don’t mean it don’t make no sense
And just cause you don’t like it,
…don’t mean it ain’t no good

This pure reminder of the power of bullshit over mortal men came to me today when I began to delve into the background of the latest Sokal Hoax that has been visited on the social sciences. I’d like to explore this hoax, consider how it would have panned out in other disciplines, make a few criticisms, and discuss the implications of some of their supposedly preposterous papers. So as Mikey would say – bring it on home, brother doc!

The Latest Hoax

The latest hoax comes with its own report, a massive online screed that describes what they did, why they did it, how they did it and what happened. Basically they spent a year preparing a bunch of papers that they submitted to a wide range of social studies journals in a field they refer to as “grievance studies”, which they define by saying

we have come to call these fields “grievance studies” in shorthand because of their common goal of problematizing aspects of culture in minute detail in order to attempt diagnoses of power imbalances and oppression rooted in identity.

This definition of the field is easily the vaguest and most hand-wavey way to select a broad set of targets I have ever seen, and it’s also obviously intended to be pejorative. In fact their whole project could perhaps be described as having the “common goal of problematizing aspects of culture in minute detail” – starting with their definition of the culture.

The authors admit that they’re not experts in the field, but they spent a year studying the content, methods and style of the field, then wrote papers that they submitted to journals under fake names (one real professor gave them permission to use his name) from fake institutions. They submitted 20 papers over the year, writing one every 9 days, and got 7 published, one with a commendation; the other 13 were repeatedly rejected or still under review when somehow their cover was blown and they had to reveal the hoax.

The basic problem with the hoax

The papers they submitted are listed at the website and are pretty hilarious, and some of the papers that were published were obviously terrible (though they may have been interesting reading). Two of the papers they submitted – one on dog parks and one on immersive pornography – used fake data, i.e. academic misconduct, and two were plagiarized parts of Mein Kampf, with some words replaced to reverse them into a feminist meaning of some kind (I guess by replacing “Jew” with “men” or something).

Submitting an article based on fraudulent data is, let’s be clear, academic misconduct, and it is also extremely difficult for peer reviewers to catch. Sure it’s easy in retrospect to say “that data was fake” but when peer reviewers get an article they don’t get the raw data, they have to judge based on the summaries in the paper. This is how the Wakefield paper that led to the collapse in MMR vaccination got published in the Lancet – Wakefield made up his data, and it was impossible for the peer reviewers to know that. The STAP controversy in Japan – which led to several scientists being disgraced and one suicide – involved doctored images that were only discovered when a research assistant blew the whistle. Medicine is full of these controversies in which data is faked or manipulated and only discovered after a huge amount of detective work, or after a junior staff member destroys their career blowing the whistle. Submitting fraudulent work to peer review – a process which at heart depends on good faith assumptions all around – is guaranteed to be successful. It’s not an indictment of anyone to do this.

Submitting a word-replaced Mein Kampf is incredibly tacky, tasteless and juvenile. Most academics don’t read Mein Kampf, and it’s not a necessary text for most sociological disciplines. If the journal doesn’t use plagiarism software or the peer reviewers don’t, then this is undoubtedly going to slide through, and while much of Mein Kampf is pernicious nonsense a lot of it is actually pretty straightforward descriptions of political strategies and contemporary events. Indeed the chapter they used (chapter 12 of volume 1) is really about organizing and political vision[1], with only passing references to Jewish perfidy – it’s the kind of thing that could be rendered pretty bland with a word replace. But from the description in their report one might think they had successfully published an exterminationist screed. I’m sure the hoaxers thought they were being super clever doing this, but they weren’t. Detecting plagiarism is a journal’s responsibility more than a peer reviewer’s, and not all journals can. It’s not even clear if the plagiarized text would have been easily detected by google searches of fragments if there was a suitable level of word replacement.

So several of their hoax papers were highlighting problems with the peer review process in general, not with anything to do with social studies. Of the remainder, some were substantially rewritten during review, and a lot were rejected or sent back for major revision. While people on twitter are claiming that “many papers” were accepted, in fact the most obviously problematic ones were rejected. For example the paper that recommended mistreating white students, ignoring their work and dismissing their efforts, to teach them about white privilege, was rejected three times, but people on twitter are claiming that the treatment of this paper shows some kind of problematic morality by the peer reviewers.

The next problem with the hoax is that the authors have misrepresented good-spirited, kind-hearted attempts at taking their work seriously as uncritical acceptance of their work. Consider this peer review that they report[2] on a paper on whether men commit sexual violence by masturbating to fantasies of real women (more on this below):

I was also trying to think through examples of how this theoretical argument has implications in romantic consensual relationships. Through the paper, I was thinking about the rise of sexting and consensual pornographic selfies between couples, and how to situate it in your argument. I think this is interesting because you could argue that even if these pictures are shared and contained within a consensual private relationship, the pictures themselves are a reaction to the idea that the man may be thinking about another woman while masturbating. The entire industry of boudoir photography, where women sometimes have erotic pictures taken for their significant other before deploying overseas in the military for example, is implicitly a way of saying, “if you’re going to masturbate, it might as well be to me.” Essentially, even in consensual monogamous relationships, masturbatory fantasies might create some level of coercion for women. You mention this theme on page 21 in terms of the consumption of non-consensual digital media as metasexual-rape, but I think it is interesting to think through these potentially more subtle consensual but coercive elements as well

This is a genuine, good-faith effort to engage with the authors’ argument, and to work out its implications. But this peer reviewer, who has clearly devoted considerable time to engaging with and attempting to improve this paper, now discovers that he or she was being punked the whole time, and the authors were laughing at his or her naivete for thinking their idea should be taken seriously. He or she did this work for free, as part of an industry where we all give freely of our time to help each other improve our ideas, but actually this good faith effort was just being manipulated and used as part of a cheap publicity stunt by some people who have an axe to grind with an entire, vaguely defined branch of academia. And note also that after all this peer reviewer’s work, this paper was still rejected – but the hoaxers are using it as ammunition for their claim that “grievance studies” takes preposterous ideas seriously. Is that fair, or reasonable? And is it ethical to conduct experiments on other academics without consent?

I would be interested to know, incidentally, if their little prank was submitted to institutional review before they did it. If I tried to pull this shitty little move in my field, without putting it through an IRB, I think my career would be toast.

But there is another problem with this hoax, which I want to dwell on in a little more detail: some of the papers actually covered interesting topics of relevance in their field, and the fact that the hoaxers think their theories were preposterous doesn’t mean they were actually preposterous. It’s at this point that the Suicidals’ most powerful rule applies: Just because you don’t understand what’s going on, don’t mean it don’t make sense.

The theoretical value of some of the hoax papers

Why don’t men use dildos for masturbation?

Let us consider first the paper the authors refer to as “Dildos”, actual title Going in Through the Back Door: Challenging Straight Male Homohysteria and Transphobia through Receptive Penetrative Sex Toy Use. In this paper the hoaxers ask why men don’t use dildos for masturbation, and suggest it is out of a fear of homosexuality, and transphobia. The hoaxers say that they wrote this paper

To see if journals will accept ludicrous arguments if they support (unfalsifiable) claims that common (and harmless) sexual choices made by straight men are actually homophobic, transphobic, and anti-feminist

But is this argument ludicrous? Why don’t men use dildos more? After all, we know that men can obtain sexual pleasure from anal insertion, through prostate stimulation. There is a genre of porn in which this happens (for both cismen and transgender women), and it is a specialty service provided by sex workers, but it is not generally commonly practiced in heterosexual intercourse or male masturbation. Why? Men can be pretty bloody-minded about sexual pleasure, so why don’t they do this more? There could be many reasons, such as that it’s impractical, or it’s dirty, or (for couple sex) that women have a problem with penetrating men, or because men see sex toys as fundamentally feminized objects – but it could also be out of a residual homophobia, right? This seems prima facie an interesting theory that could be explored. For example, the only mainstream movie I can think of where a woman penetrates a man is Deadpool, and so it should be fairly easy to study reactions to that movie and analyze them for homophobia (reddit should be pretty good for this, or MRA websites). Understanding the reasons for this might offer new ways for men to enjoy sex, and a new diversity of sex roles for women, which one presumes is a good thing. So why is this argument ludicrous?

Why do men visit Hooters?

Another article that was published was referred to by the hoaxers as “Hooters”, actual title An Ethnography of Breastaurant Masculinity: Themes of Objectification, Sexual Conquest, Male Control, and Masculine Toughness in a Sexually Objectifying Restaurant. The article argues that men visit “breastaurants” to assert male dominance and enjoy a particular form of “authentic masculinity,” presumably in contrast to the simpler motive of wanting to be able to look at tits. The authors say they did this article to

see if journals will publish papers that seek to problematize heterosexual men’s attraction to women and will accept very shoddy qualitative methodology and ideologically-motivated interpretations which support this

But again, this is basically an interesting question. Why do men go to restaurants with scantily-clad women? They could eat at a normal restaurant and then watch porn, or just read Playboy while they eat. Or they could eat and then go to a strip club. So why do they need to be served in restaurants by breasty girls? And why are some men completely uninterested in these environments, even though they’re seriously into tits? The answer that this is something about performing a type of masculinity, and needing women as props for some kind of expression of dominance, makes sense intuitively (which doesn’t mean it’s right). It’s particularly interesting that this article is being presented as preposterous by the hoaxers now just as debate is raging about why Brett Kavanaugh insisted on sharing his non-consensual sexual encounters with other men, while Bill Cosby did his on the down-low. It’s almost as if Bill and Brett had different forms of masculine dominance to express! Forms of masculine dominance that need to be explored and understood! By academics in social studies, for example!

Also note here that the tone of the hoaxers’ explanation suggests that the idea that visiting breasty restaurants is problematic is obviously wrong and everyone believes them about this. In fact, many Americans of good faith from many different backgrounds don’t consider visiting Hooters to be a particularly savoury activity, and you probably won’t convince your girlfriend you’re not an arsehole by telling her she’s wrong to “problematize heterosexual men’s attraction to women” in the context of your having blown your weekly entertainment budget on a trip to Hooters. Understanding why she has problematized this behavior might help you to get laid the following week!

Do men do violence to women when they fantasize about them?

The hoaxers wrote an article that they refer to as “Masturbation”, real title Rubbing One Out: Defining Metasexual Violence of Objectification Through Nonconsensual Masturbation, which was ultimately rejected from Sociological Theory after peer review. I think this was the most interesting of their fake articles, covering a really interesting topic, with real ethical implications. The basic idea here is that when men fantasize about women without women’s consent (for example when masturbating) they’re committing a kind of sexual violence, even though the woman in question doesn’t know about this. They wrote this article to test

To see if the definition of sexual violence can be expanded into thought crimes

But this way of presenting their argument (“Thought crimes”) and the idea that the definition of sexual violence hasn’t already been expanded to thought crimes, is deeply dangerous and stupid. To deal with the second point first, in many jurisdictions anime or manga that depicts sex with children is banned. But in these comics nobody has been harmed. So yes, sexual violence has been extended to include thought crimes. But if we don’t expand the definition of sexual violence into thought crimes we run into some very serious legal and ethical problems. Consider the crime of upskirting, in which men take secret videos up women’s skirts and put them onto porn sites for other men to masturbate to. In general the upskirted woman has no clue she’s been filmed, and the video usually doesn’t show her face so it’s not possible for her to be identified. It is, essentially, a victimless crime. Yet we treat upskirting as a far more serious crime than just surreptitiously taking photos of people, which we consider to be rude but not criminal. This is because we consider upskirting to be a kind of sexual violence exactly equivalent to the topic of this article! This is also true for revenge porn, which is often public shaming of a woman that destroys her career, but doesn’t have to be. If you share videos of your ex-girlfriend naked with some other men, and she never finds out about it and your friends don’t publicize those pictures, so she is not affected in any way, everyone would agree that you have still done a terrible thing to her, and that this constitutes sexual violence of some kind. I’ve no doubt that in many jurisdictions this revenge porn is a crime even though the woman targeted has not suffered in any way. Indeed, even if a man just shows his friend a video of a one night stand, and the friend doesn’t know the woman, will never meet her, and has no way to harm her, this is still considered to be a disgusting act. 
So the fundamental principle involved here is completely sound. This is why porn is made – because the women are being paid to allow strangers to watch them have sex. When people sext each other they are obviously clearly giving explicit permission to the recipient to use the photo for sexual gratification (this is why it is called sexting). Couples usually don’t sext each other until they trust each other precisely because they don’t want the pictures shared so that people they don’t know can masturbate to them without their consent. We also typically treat men who steal women’s underwear differently to men who steal other men’s socks at the coin laundry – I think the reason for this is obvious! So the basic principle at the heart of this paper is solid. Yet the hoaxers treat the idea underlying much of our modern understanding of revenge porn and illicit sexual photography as a joke.

I think the basic problem here is that while the hoaxers have mimicked the style of the field, and understand which theoretical questions to target and write about, they fundamentally don’t understand the field, and so things they consider to be ludicrous are actually important and real questions in the topic, with important and real consequences. They don’t understand it, but it actually makes sense. And now they’ve created this circus of people sneering at how bad the papers were, when actually they were addressing decent topics and real questions.

How would this have happened in other fields?

So if we treat these three papers as serious, recognizing that two were published, and then discount the paper with fraudulent data (dog park) and the paper that was plagiarized (feminist mein kampf) we are left with just three papers that were published that might be genuinely bullshit, out of 20. That’s 15%, or about 17% if you drop the plagiarized and fraudulent papers from the denominator. Sounds bad, right? But this brings us to our next big problem with this hoax: there was no control group. If I submitted 20 papers with dodgy methods and shonky reasoning to public health journals, I think I could get 15% published. Just a week or two ago I reported on a major paper in the Lancet that I think has shonky methods and reasoning, as well as poorly-gathered data, but it got major publicity and will probably adversely affect alcohol policy in future. I have repeatedly on this blog attacked papers published in the National Bureau of Economic Research (NBER) archives, which use terrible methods, poor quality data, bad reasoning and poor scientific design. Are 15% of NBER papers bullshit? I would suggest the figure is likely much higher. But we can’t compare because the authors didn’t try to hoax these fields, and as far as I know no one has ever tried to hoax them. This despite the clear and certain knowledge that the R&R paper in economics was based on a flawed model and bad reasoning, but was used to inform fiscal policy in several countries, and the basic conclusions are still believed even though it has been roundly debunked.

The absence of hoaxes (or even proper critical commentary) on other fields means that they can maintain an air of unassailability while social studies and feminist theory are repeatedly criticized for their methods and the quality of their research and peer review. This is a political project, not a scientific project, and these hoaxers have gone to great lengths to produce a salable, PR-ready attack on a field they don’t like, using a method that is itself poorly reasoned, with shonky methodology, and a lack of detailed understanding of the academic goals of the field they’re punking. They also, it should be remembered, have acted very unethically. I think the beam is in their own eye, or as the Suicidals would say:

Ah, damn, we got a lot of stupid people
Doing a lot of stupid things
Thinking a lot of stupid thoughts
And if you want to see one
Just look in the mirror

Conclusion

This hoax shouldn’t be taken seriously, and it doesn’t say anything much about the quality of research or academic editing in the field they’re criticizing. Certainly on the face of it some of the papers that were published seem pretty damning, but some of them covered real topics of genuine interest, and the hoaxers’ interpretation of the theoretical value of the work is deeply flawed. This is a PR stunt, nothing more, and it does nothing to address whatever real issues sociology and women’s studies face. Until people start genuinely developing a model for properly assessing the quality of academic work in multiple fields, with control groups and proper adjustment for confounders, in a cross-disciplinary team that fully understands the fields being critiqued, these kinds of hoaxes will remain just stupid stunts that play on the goodwill of peer reviewers and academics for the short-term political and public benefit of the hoaxers, with no longer-term benefit to the community being punked, and at the risk of considerable harm. Until a proper assessment of the quality of all disciplines is conducted, we should not waste our time punking others, but think harder about how we can improve our own.

 


fn1: I won’t link, because a lot of online texts of Mein Kampf are on super dubious websites – look it up yourself if you wish to see what the punking text was.

fn2: Revealing peer reviews is generally considered unethical, btw

Uhtred son of Uhtred, regular ale drinker, who I predict will die of injury (but will go to Valhalla, unlike you you ale-sodden wretch)

There has been some fuss in the media recently about a new study showing no level of alcohol use is safe. It received a lot of media attention (for example here), reversed a generally held belief that moderate consumption of alcohol improves health (this is even enshrined in the Greek food pyramid, which has a separate category for wine and olive oil[1]), and led to angsty editorials about “what is to be done” about alcohol. Although there are definitely things that need to be done about alcohol, prohibition is an incredibly stupid and dangerous policy, and so are some of its less odious cousins, so before we go full Leeroy Jenkins on alcohol policy it might be a good idea to ask if this study is really the bee’s knees, and whether it really shows what it says it does.

This study is a product of the Global Burden of Disease (GBD) project, at the Institute for Health Metrics and Evaluation (IHME). I’m intimately acquainted with this group because I made the mistake of getting involved with them a few years ago (I’m not now) so I saw how their sausage is made, and I learnt about a few of their key techniques. In fact I supervised a student who, to the best of my knowledge, remains the only person on earth (i.e. the only person in a population of 7 billion people, outside of two people at IHME) who was able to install a fundamental software package they use. So I think I know something about how this institution does its analyses. I think it’s safe to say that they aren’t all they’re cracked up to be, and I want to explain in this post how their paper is a disaster for public health.

The way that the IHME works in these papers is always pretty similar, and this paper is no exception. First they identify a set of diseases and health conditions related to their chosen risk (in this case the chosen risk is alcohol). Then they run through a bunch of previously published studies to identify the numerical magnitude of increased risk of these diseases associated with exposure to the risk. Then they estimate the level of exposure in every country on earth (this is a very difficult task which they use dodgy methods to complete). Then they calculate the number of deaths due to the conditions associated with this risk (this is also an incredibly difficult task to which they apply a set of poorly-accredited methods). Finally they use a method called comparative risk assessment (CRA) to calculate the proportion of deaths due to the exposure. CRA is in principle an excellent technique but there are certain aspects of their application of it that are particularly shonky, but which we probably don’t need to touch on here.
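For readers who haven't met this kind of calculation, here is a minimal sketch of the textbook attributable-fraction formula for a single yes/no exposure (Levin's formula). To be clear, this is not the IHME's implementation, which works with whole exposure distributions; the function name and the numbers are hypothetical and chosen only for illustration.

```python
def population_attributable_fraction(prevalence, relative_risk):
    """Levin's formula for a binary exposure.

    prevalence    -- proportion of the population exposed (0 to 1)
    relative_risk -- risk of death in the exposed relative to the unexposed
    """
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs, invented for illustration only
paf = population_attributable_fraction(prevalence=0.30, relative_risk=1.5)
attributable_deaths = paf * 10_000  # share of 10,000 deaths blamed on the exposure
print(f"PAF = {paf:.3f}, attributable deaths = {attributable_deaths:.0f}")
```

The point of the sketch is that the answer depends entirely on the two inputs: if the relative risk or the exposure estimate is wrong, the attributable deaths are wrong in direct proportion.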

So in assessing this paper we need to consider three main issues: how they assess risk, how they assess exposure, and how they assess deaths. We will look at these three parts of their method and see that they are fundamentally flawed.

Problems with risk assessment

To assess the risk associated with alcohol consumption the IHME used a standard technique called meta-analysis. In essence a meta-analysis collects all the studies that relate an exposure (such as alcohol consumption) to an outcome (any health condition, but death is common), and then combines them to obtain a single final estimate of what the numerical risk is. Typically a meta-analysis will weight all the risks from all the studies according to the sample size of the study, so that for example a small study that finds banging your head on a wall reduces your risk of brain damage is given less weight in the meta-analysis than a very large study of banging your head on a wall. Meta-analysis isn’t easy for a lot of reasons to do with the practical details of studies (for example if two groups study banging your head on a wall do they use the same definition of brain damage and the same definition of banging?), but once you iron out all the issues it’s the only method we have for coming to comprehensive decisions about all the studies available. It’s important because the research literature on any issue typically includes a bunch of small shitty studies, and a few high quality studies, and we need to balance them all out when we assess the outcome. As an example, consider football and concussion. A good study would follow NFL players for several seasons, taking into account their position, the number of games they played, and the team they were in, and compare them against a concussion free sport like tennis, but matching them to players of similar age, race, socioeconomic background etc. Many studies might not do this – for example a study might take 20 NFL players who died of brain injuries and compare them with 40 non-NFL players who died of a heart attack. A good meta-analysis handles these issues of quality and combines multiple studies together to calculate a final estimate of risk.
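As a rough illustration of the weighting idea, here is a minimal sketch that pools relative risks weighted by sample size, as described above. Real meta-analyses more commonly weight by the inverse of each study's variance and also adjust for study quality; the function name and the two studies below are hypothetical.

```python
import math


def pooled_relative_risk(studies):
    """Pool relative risks, weighting each study by its sample size.

    `studies` is a list of (relative_risk, sample_size) pairs. Pooling is
    done on the log scale, which is conventional for ratio measures.
    """
    total_weight = sum(n for _, n in studies)
    weighted_log_rr = sum(n * math.log(rr) for rr, n in studies)
    return math.exp(weighted_log_rr / total_weight)


# Hypothetical head-banging studies, invented for illustration only
studies = [
    (0.50, 40),      # small study: apparent protective effect
    (2.00, 10_000),  # very large study: clear harm
]
pooled = pooled_relative_risk(studies)
print(f"Pooled relative risk: {pooled:.2f}")
```

The small study barely moves the pooled estimate, which stays close to the large study's relative risk of 2.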

The IHME study provides a meta-analysis of all the relationships between alcohol consumption and disease outcomes, described as follows[2]:

we performed a systematic review of literature published between January 1st, 1950 and Dec 31st 2016 using Pubmed and the GHDx. Studies were included if the following conditions were met. Studies were excluded if any of the following conditions were met:

1. The study did not report on the association between alcohol use and one of the included outcomes.

2. The study design was not either a cohort, case-control, or case-crossover.

3. The study did not report a relative measure of risk (either relative risk, risk ratio, odds-ratio, or hazard ratio) and did not report cases and non-cases among those exposed and un-exposed.

4. The study did not report dose-response amounts on alcohol use.

5. The study endpoint did not meet the case definition used in GBD 2016.

There are many, many problems with this description of the meta-analysis. First of all they seem not to have described the inclusion criteria (they say “Studies were included if the following conditions were met” but don’t say what those conditions were). But more importantly their conditions for exclusion are very weak. We do not, usually, include case-control and case-crossover studies in a meta-analysis because these studies are, frankly, terrible. The standard method for including a study in a meta-analysis is to assess it according to the Risk of Bias Tool and dump it if it is highly biased. For example, should we include a study that is not a randomized controlled trial? Should we include studies where subjects know their assignment? The meta-analysis community have developed a set of tools for deciding which studies to include, and the IHME crew haven’t used them.

This got me thinking that perhaps the IHME crew have been, shall we say, a little sloppy in how they include studies, so I had a bit of a look. On pages 53-55 of the appendix they report the results of their meta-analysis of the relationship between atrial fibrillation and alcohol consumption, and the results are telling. They found 9 studies to include in their meta-analysis but there are many problems with these studies. One (Cohen 1988) is a cross-sectional study and should not be included, according to the IHME’s own exclusion criteria. 6 of the remaining studies assess fibrillation only, while 2 assess fibrillation and atrial flutter, a precursor of fibrillation. However, most tellingly, all of these studies find no relationship between alcohol consumption and fibrillation at almost all levels of consumption, but their chart on page 54 shows that their meta-analysis found an almost exponential relationship between alcohol consumption and fibrillation. This finding is simply impossible given the observed studies. All 9 studies found no relationship between moderate alcohol consumption and fibrillation, and several found no relationship even for extreme levels of consumption, but somehow the IHME found a clear relationship. How is this possible?

Problems with exposure assessment

This problem happened because they applied a tool called DISMOD to the data to estimate the relationship between alcohol exposure and fibrillation. DISMOD is an interesting tool but it has many flaws. Its main benefit is that it enables the user to incorporate exposures that have many different categories of exposure definition that don’t match, and turn them into a single risk curve. So for example if one study group has recorded the relative risk of death for 2-5 drinks, and another group has recorded the risk for 1-12 drinks, DISMOD offers a method to turn this into a single curve that will represent the risk relationship per additional drink. This is nice, and it produces the curve on page 54 (and all the subsequent curves). It’s also bullshit. I have worked with DISMOD and it has many, many problems. It is incomprehensible to everyone except the two guys who programmed it, who are nice guys but can’t give decent support or explanations of what it does. It has a very strange response distribution and doesn’t appear to apply other distributions well, and it has some really kooky Bayesian applications built in. It is also completely inscrutable to 99.99% of people who use it, including the people at IHME. It should not be used until it is peer reviewed and exposed to a proper independent assessment. It is the application of DISMOD to data that obviously shows no relationship between alcohol consumption and fibrillation that led to the bullshit curve on page 54 of the appendix, which bears no relationship to the observed data in the collected studies.

This also applies to the assessment of exposure to alcohol. The study used DISMOD to calculate each country’s level of individual alcohol consumption, which means that the same dodgy technique was applied to national alcohol consumption data. But let’s not get hung up on DISMOD. What data were they using? The maps in the Lancet paper show estimates of risk for every African and south east Asian country, which suggests that they have data on these countries, but do you think they do? Do you think Niger has accurate estimates of alcohol consumption in its borders? No, it doesn’t. A few countries in Africa do and the IHME crew used some spatial smoothing techniques (never clearly explained) to estimate the consumption rates in other countries. This is a massive dodge that the IHME apply, which they call “borrowing strength.” At its most egregious this is close to simply inventing data – in an earlier paper (perhaps in 2012) they were able to estimate rates of depression and depression-related conditions for 183 (I think) countries using data from 97 countries. No prizes to you, my astute reader, if you guess that all the missing data was in Africa. The same applies to the risk exposure estimates in this paper – they’re a complete fiction. Sure for the UK and Australia, where alcohol is basically a controlled drug, they are super accurate. But in the rest of the world, not so much.

Problems with mortality assessment

The IHME has a particularly nasty and tricky method for calculating the burden of disease, based around a thing called the year of life lost (YLL). Basically instead of measuring deaths they measure the years of your life that you lost when you died, compared to an objective global standard of life you could achieve. Basically they get the age you died, subtract it from the life expectancy of an Icelandic or Japanese woman, and that’s the number of YLLs you suffered. Add that up for every death and you have your burden of disease. It’s a nice idea except that there are two huge problems:

  • It weights death at young ages massively
  • They never incorporate uncertainty in the ideal life expectancy of an Icelandic or Japanese woman
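To make the first of these problems concrete, here is a minimal sketch of the YLL calculation as described above: reference life expectancy minus age at death, summed over deaths. The reference value of 86 and the function name are hypothetical, picked only for illustration.

```python
def years_of_life_lost(ages_at_death, reference_life_expectancy):
    """Sum the years lost across deaths relative to a fixed reference age.

    Deaths at or above the reference age contribute zero.
    """
    return sum(max(reference_life_expectancy - age, 0) for age in ages_at_death)


REFERENCE = 86  # hypothetical ideal life expectancy, not an IHME figure

infant_yll = years_of_life_lost([1], REFERENCE)         # one death at age 1
elderly_yll = years_of_life_lost([80] * 10, REFERENCE)  # ten deaths at age 80
print(infant_yll, elderly_yll)
```

Under these assumptions a single infant death counts for more than ten deaths at age 80, and because the reference age is entered as a fixed number, none of its uncertainty carries through to the total.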

There is an additional problem in the assessment of mortality, which the IHME crew always gloss over, which is called “garbage code redistribution.” Basically, about 30% of every country’s death records are bullshit, and don’t correspond with any meaningful cause of death. The IHME has a complicated, proprietary system that they cannot and will not explain that redistributes these garbage codes into other meaningful categories. What they should do is treat these redistributed deaths as a source of error (e.g. we have 100,000 deaths due to cancer and 5,000 redistributed deaths, so we actually have 102,500 plus/minus 2,500 deaths), but they don’t, they just add them on. So when they calculate burden of disease they use the following four steps:

  • Calculate the raw number of deaths, with an estimate of error
  • Reassign dodgy deaths in an arbitrary way, without counting these deaths as any form of uncertainty
  • Estimate an ideal life expectancy without applying any measure of error or uncertainty to it
  • Calculate the years of life lost relative to this ideal life expectancy and add them up

So here there are three sources of uncertainty (deaths, redistribution, ideal life expectancy) and only one is counted; and then all these uncertain deaths are multiplied by the number of years lost relative to the ideal life expectancy.
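The cancer example above can be sketched in a few lines. This is only my suggested treatment of the redistributed deaths as a range, set against simply adding them on; the function name is hypothetical and the numbers are the ones from the example.

```python
def deaths_with_redistribution(recorded, redistributed):
    """Return (naive_total, midpoint, half_width).

    naive_total adds the redistributed deaths on as if they were certain.
    midpoint +/- half_width treats them as uncertain: anywhere from none
    to all of them might truly belong to this cause.
    """
    naive_total = recorded + redistributed
    midpoint = recorded + redistributed / 2
    half_width = redistributed / 2
    return naive_total, midpoint, half_width


naive, mid, half = deaths_with_redistribution(100_000, 5_000)
print(f"Added on: {naive}")
print(f"As a range: {mid:.0f} plus/minus {half:.0f}")
```

The first figure is a single number with no extra error attached; the second carries the redistribution forward as uncertainty, which then gets multiplied through when the years of life lost are added up.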

The result is a dog’s breakfast of mortality estimates, that don’t come even close to representing the truth about the burden of disease in any country due to any condition.

Also, the IHME apply the same dodgy modeling methods to deaths (using a method that they (used to?) call CoDMoD) before they calculate YLLs, so there’s another form of arbitrary model decisions and error in their assessments.

Putting all these errors together

This means that the IHME process works like this:

  • An incredibly dodgy form of meta-analysis that includes dodgy studies and miscalculates levels of risk
  • Applied to a really shonky estimate of the level of exposure to alcohol, that uses a computer program no one understands applied to a substandard data set
  • Applied to a dodgy death model that doesn’t include a lot of measures of uncertainty, and is thus spuriously accurate

The result is that at every stage of the process the IHME is unreasonably confident about the quality of their estimates, produces excessive estimates of risk and inaccurate measures of exposure, and is too precise in its calculations of how many people died. This means that all their conclusions about the actual risk of alcohol, the level of exposure, and the magnitude of disease burden due to the conditions they describe cannot be trusted. As a result, neither can their estimates of the proportion of mortality due to alcohol.

Conclusion

There is still no evidence that moderate alcohol consumption is bad for you, and solid meta-analyses of available studies support the conclusion that moderate alcohol consumption is not harmful. This study should not be believed and although the IHME has good press contacts, you should ignore all the media on this. As a former insider in the GBD process I can also suggest that in future you ignore all work from the Global Burden of Disease project. They have a preferential publishing deal with the Lancet, which means they aren’t properly peer reviewed, and their work is so massive that it’s hard for most academics to provide adequate peer review. Their methods haven’t been subjected to proper external assessment and my judgement, based on having visited them and worked with their statisticians and their software, is that their methods are not assessable. Their data is certainly dubious at times but most importantly their analysis approach is not correct and the Lancet doesn’t subject it to proper peer review. This is going to have long term consequences for global health, and at some point the people who continue to associate with the IHME’s papers (they have hundreds or even thousands of co-authors) will regret that association. I stopped collaborating with this project, and so should you. If you aren’t sure why, this paper on alcohol is a good example.

So chill, have another drink, and worry about whether it’s making you fat.


fn1: There are no reasons not to love Greek food, no wonder these people conquered the Mediterranean and developed philosophy and democracy!

fn2: This is in the appendix to their study

No this really is not “the healthy one”

Today’s Guardian has a column by George Monbiot discussing the issue of obesity in modern England that I think fundamentally misunderstands the causes of obesity and paints a dangerously rosy picture of Britain’s dietary situation. The column was spurred by a picture of Brighton beach in 1976, in which everyone was thin, and a subsequent debate on social media about the causes of the changes in British rates of overweight and obesity in the succeeding four decades. Monbiot’s column dismisses the possibility that the growth in obesity could be caused by an increase in the amount we eat, by a reduction in the amount of physical activity, or by a change in rates of manual labour. He seems to finish the column by suggesting it is all the food industry’s fault, but having dismissed the idea that the food industry has convinced us to eat more, he is left with the idea that the real cause of obesity is changes in the patterns of what we eat – from complex carbohydrates and proteins to sugar. This is a bugbear of certain anti-obesity campaigners, and it’s wrong, as is the idea that obesity is all about willpower, which Monbiot also attacks. The problem here though is that Monbiot misunderstands the statistics badly, and as a result dismisses the obvious possibility that British people eat too much. He commits two mistakes in his article: first he misunderstands the statistics on British food consumption, and secondly he misunderstands the difference between a rate and a budget, which is ironic given he understands these things perfectly well when he comments on global warming. Let’s consider each of these issues in turn.

Misreading the statistics

Admirably, Monbiot digs up some stats from 1976 and compares them with statistics from 2018, and comments:

So here’s the first big surprise: we ate more in 1976. According to government figures, we currently consume an average of 2,130 kilocalories a day, a figure that appears to include sweets and alcohol. But in 1976, we consumed 2,280 kcal excluding alcohol and sweets, or 2,590 kcal when they’re included. I have found no reason to disbelieve the figures.

This is wrong. Using the 1976 data, Monbiot appears to be referring to Table 20 on page 77, which indicates a yearly average of 2280 kCal. But this is the average per household member, and does not account for whether or not a household member is a child. If we refer to Table 24 on page 87, we find that a single adult in 1976 ate an average of 2670 kCal; similar figures apply for two adult households with no children (2610 kCal). Using the more recent data Monbiot links to, we can see that he got his 2,130 kCal from the file of “Household and Eating Out Nutrient Intakes”. But if we use the file “HC – Household nutrient intakes” and look at 2016/17 for households with one adult and no children, we find 2291 kCal, and about 2400 as recently as 10 years ago. These are large differences when they accrue over years.

This is further compounded by the age issue. When we look at individual intake we need to consider how old the family members are. If an average individual intake is 2590 kCal in 1976 including alcohol and sweets, as Monbiot suggests, we need to rebalance it for adults and children. In a household with three people we have about 7770 kCal, which if the child is eating 1500 kCal means that the adults are eating a little over 3100 kCal each. That’s too much food for everyone in the house, even using the ridiculously excessive nutrient standards provided by the ONS. It’s also worth remembering that the age of adults in 1976 was on average much younger than now, and an intake of 2590 might be okay for a young adult but it’s not okay for a 40-plus adult, of which there are many more now than there were then. This affects obesity statistics.
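The rebalancing above is simple arithmetic, and it can be checked with a few lines of Python. This is only a sketch using the figures quoted in this post (2590 kCal per person, a three-person household, one child eating 1500 kCal); the function name and its parameters are mine, not anything from the government tables.

```python
def adult_intake(mean_per_person, household_size, n_children, child_intake):
    """Back out the mean adult intake (kcal/day) from a per-person household mean."""
    n_adults = household_size - n_children
    if n_adults <= 0:
        raise ValueError("household must contain at least one adult")
    # Total household intake implied by the per-person average
    total = mean_per_person * household_size
    # Remove what the children eat and share the rest among the adults
    return (total - n_children * child_intake) / n_adults


household_total = 2590 * 3
adults = adult_intake(2590, 3, 1, 1500)
print(household_total)  # 7770
print(adults)           # 3135.0
```

Changing the assumed child intake moves the adult figure quite a lot, which is exactly why a per-person household average is a poor guide to what adults were eating.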

Finally it’s also worth remembering that obesity is not evenly distributed, and an average intake of 2100 kCal could correspond to an average of 2500 in the poorest 20% of the population (where obesity is common) and 1700 kCal in the richest, which is older and thinner. An evenly distributed 2100 kCal will lead to zero obesity over the whole population, but an unevenly distributed 2100 kCal will not. It’s important to look carefully at the variation in the datasets before deciding the average is okay.
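To make the point about averages concrete, here is a small hypothetical illustration in Python. The 2500 and 1700 kCal figures are the ones I used above for the poorest and richest 20%; the 2100 kCal for the middle 60% and the single 2100 kCal requirement applied to everyone are assumptions I have added purely so the numbers balance, not values taken from any dataset.

```python
def summarize(groups, requirement):
    """groups: list of (percent_of_population, mean_kcal_per_day) pairs.

    Returns the population mean intake and the percentage of the
    population sitting in groups whose mean exceeds the requirement.
    """
    total_pct = sum(pct for pct, _ in groups)
    mean = sum(pct * kcal for pct, kcal in groups) / total_pct
    pct_over = sum(pct for pct, kcal in groups if kcal > requirement)
    return mean, pct_over


REQUIREMENT = 2100  # kcal/day, assumed to be the same for everyone

even = [(100, 2100)]                           # everyone eats the average
uneven = [(20, 2500), (60, 2100), (20, 1700)]  # same average, spread out

print(summarize(even, REQUIREMENT))    # (2100.0, 0)
print(summarize(uneven, REQUIREMENT))  # (2100.0, 20)
```

Both populations have the same mean, but in the second one a fifth of the population is eating well above the requirement, and the average alone gives no hint of it.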

Misunderstanding budgets and rates

Let’s consider the 2590 kCal that Monbiot finds as the average intake of adults in 1976, including alcohol and sweets. This is likely wrong, and the average is probably more like 3000 kCal including alcohol and sweets, but let’s go with it for now. Monbiot is looking to see what has changed in our diet over the past 40 years to lead to current rates of obesity, because he is looking for a change in the rate of consumption. But he doesn’t consider that all humans have a budget, and that a small excess of that budget over a long period is what drives obesity. The reality is that today’s obesity rates do not reflect today’s consumption rates, but the steady pattern of consumption over the past 40 years. What made a 55 year old obese today is what they ate in 1976 – when they were 15 – not what the average person eats today. So rather than saying “we eat less today than we did 40 years ago so that can’t be the cause of obesity”, what really matters is what people have been eating for the past 40 years. And the stats Monbiot uses suggest that women, at least, have been eating too much – a healthy adult woman should eat about 2100 kCal, and if the average is 2590 then a woman in 1976 has been at or above her energy requirement every year for the past 40 years. It doesn’t matter that a woman’s intake declined to 2100 kCal in 2016, because she has been eating too much for the past 35 years anyway. It’s this budget, not changes over time, which determines the obesity rate now, and Monbiot is wrong to argue that it’s not overeating that has caused the obesity epidemic. Unless he accepts that a woman can eat 2590 kCal every year for 40 years and stay thin, he needs to accept that the problem of obesity is one of British food culture over half a century.
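The difference between a rate and a budget can be sketched numerically. In the Python below, the 2590 and 2100 kCal endpoints and the 2100 kCal requirement are the figures from this post; the straight-line decline between 1976 and 2016 is purely an illustrative assumption of mine (the real path is unknown to me), and I make no attempt to convert the total into body weight.

```python
REQUIREMENT = 2100       # kcal/day, the post's figure for a healthy adult woman
START, END = 2590, 2100  # kcal/day in 1976 and in 2016, as quoted in the post
YEARS = 40               # 1976 to 2016


def intake_in_year(i):
    """Assumed straight-line decline from START (i = 0) to END (i = YEARS)."""
    return START + (END - START) * i / YEARS


def yearly_excess(i):
    """Excess over the requirement accumulated in year i, in kcal."""
    return (intake_in_year(i) - REQUIREMENT) * 365


# The "rate" view: how far above requirement is intake today?
current_rate_excess = intake_in_year(YEARS) - REQUIREMENT

# The "budget" view: total excess built up over every year from 1976 to 2016
cumulative_excess = sum(yearly_excess(i) for i in range(YEARS + 1))

print(current_rate_excess)  # 0.0
print(cumulative_excess)    # 3666425.0
```

Under these assumptions the current rate shows no excess at all, while the accumulated budget is in the millions of kCal. Looking only at today's intake hides everything that happened in between.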

What this means for obesity policy

Somewhat disappointingly and unusually for a Monbiot article, there are no sensible policy prescriptions at the end except “stop shaming fat people.” This isn’t very helpful, and neither is it helpful to dismiss overeating as a cause, since everyone in public health knows that overeating is the cause of obesity. For example, Public Health England wants to reduce British calorie intake, and the figures on why make disturbing reading. Reducing calorie intake doesn’t require shaming fat people but it does require acknowledgement that British people eat too much. This comes down not to individual willpower but to the food environment in which we all make choices about what to eat. The simplest way, for example, to reduce the amount that people eat is not to give them too much food. But there is simply no way in Britain that you can eat out or buy packaged food products without buying too much food. It is patently obvious that British restaurants serve too much food, that British supermarkets sell food in packages that are too large, and that as a result the only way for British people not to eat too much is through constant acts of will – leaving half the food you paid for, buying only fresh food in small amounts every day (which is only possible in certain wealthy inner city suburbs), and carefully controlling where, when and how you eat. This is possible but it requires either that you move in a very wealthy cultural circle where the environment supports this kind of thing, or that you personally exert constant control over your life. And that latter choice will inevitably end in failure, because constantly controlling every aspect of your food intake in opposition to the environment where you purchase, prepare and consume food is very very difficult.

When you live in Japan you live in a different food environment, which encourages small serving sizes, fresh and raw foods, and low fat and low sugar foods. In Japan you live in a food environment where you are always close to a small local supermarket with convenient opening hours and fresh foods, and where convenience stores sell healthy food in small serving sizes. This means that you can choose to buy small amounts of fresh food as and when you need them, and avoid buying in bulk in a pattern that encourages overconsumption. When your food choices fail (for example you have to eat out, or buy junk food) you will have access to a small, healthy serving. If you are a woman you will likely have access to a “woman’s size” or “princess size” that means you can eat the smaller portions that your smaller calorific requirements suggest is wisest. It is easy to be thin in Japan, and so most people are thin. Overeating in Japan really genuinely is a choice that you have to choose to make, rather than the default setting. This difference in food environment is simple, obvious and especially noticeable when (as I just did) you hop on a plane to the UK and suddenly find yourself confronted with double helpings of everything, and supermarkets where everything is “family sized”. The change of food environment forces you to eat more. It’s as simple as that.

What Britain needs is a change in the food environment. And achieving a change in food environment requires first of all recognizing that British people eat too much, and have been eating too much for way too long. Monbiot’s article is an exercise in denialism of that simple fact, and he should change it or retract it.

The journal Molecular Autism this week published an article about the links between Hans Asperger and the Nazis in World War 2 Vienna, Austria. Hans Asperger is the paediatric psychiatrist on whose work Asperger’s syndrome is based, and after whom the syndrome is named. Until recently Asperger was believed to have been an anti-Nazi, someone who resisted the Nazis and risked his own career to protect some of his developmentally delayed patients from the Nazi “euthanasia” program, which killed or sterilized people with certain developmental disabilities for eugenics reasons.

The article, entitled Hans Asperger, National Socialism, and “race hygiene” in Nazi-era Vienna, is a thorough, well-researched and extensively documented piece of work, which I think is based on several years of detailed examination of primary sources, often in their original German. It uses these sources – often previously untouched – to explore and rebut several claims Asperger made about himself, and also to examine the nature of his diagnostic work during the Nazi era to see whether he was resisting or aiding the Nazis in their racial hygiene goals. In this post I want to talk a little about the background of the paper, and ask a few questions about the implications of these findings for our understanding of autism, and also for our practice as public health workers in the modern era. I want to make clear that I do not know much if anything about Asperger’s syndrome or autism, so my questions are questions, not statements of opinion disguised as questions.

What was known about Asperger

Most of Asperger’s history under the Nazis was not known in the English language press, and when his name was attached to the condition of Asperger’s syndrome he was presented as a valiant defender of his patients against Nazi racial hygiene, and as a conscientious objector to Nazi ideology. This view of his life was based on some speeches and written articles translated into English during the post war years, in particular a 1974 interview in which he claims to have defended his patients and had to be saved from being arrested by the Gestapo twice by his boss, Dr. Hamburger. Although some German language publications were more critical, in general Asperger’s statements about his own life’s work were taken at face value, and seminal works in 1981 and 1991 that introduced him to the medical fraternity did not include any particular reference to his activities in the Nazi era.

What Asperger actually did

Investigation of the original documents shows a different picture, however. Before Anschluss (the German occupation of Austria in 1938), Asperger was a member of several far right Catholic political organizations that were known to be anti-semitic and anti-democratic. After Anschluss he joined several organizations affiliated with the Nazi party. His boss at the clinic where he worked was Dr. Hamburger, who, he claimed, saved him twice from the Gestapo. In fact Hamburger was an avowed Nazi, probably an entryist to these Catholic social movements during the period when Nazism was outlawed in Vienna, and a virulent anti-semite. He drove Jews out of the clinic even before Anschluss, and after 1938 all Jews were purged from the clinic, leaving openings that enabled Asperger to get promoted. It is almost impossible given the power structures at the time that Asperger could have been promoted if he disagreed strongly with Hamburger’s politics, but we have more than circumstantial evidence that they agreed: the author of the article, Herwig Czech, uncovered the annual political reports submitted concerning Asperger by the Gestapo, and they consistently agreed that he was either neutral or positive towards Nazism. Over time these reports became more positive and confident. Also during the war era Asperger gained new roles in organizations outside his clinic, taking on greater responsibility for public health in Vienna, which would have been impossible if he were politically suspect, and his 1944 PhD thesis was approved by the Nazis.

A review of Asperger’s notes also finds that he did send at least some of his patients to the “euthanasia” program, and in at least one case records a conversation with a parent in which the child’s fate is pretty much accepted by both of them. The head of the institution that did the “euthanasia” killings was a former colleague of Asperger’s, and the author presents pretty damning evidence that Asperger must have known what would happen to the children he referred to the clinic. It is clear from his speeches and writings in the Nazi era that Asperger was not a rabid killer of children with developmental disabilities: he believed in rehabilitating children and finding ways to make them productive members of society, only sending the most “ineducable” children to institutional care and not always to the institution that killed them. But it is also clear that he accepted the importance of “euthanasia” in some instances. In one particularly compelling situation, he was put in charge – along with a group of his peers – of deciding the fate of some 200 “ineducable” children in an institution for the severely mentally disabled, and 35 of those ended up being murdered. It seems unlikely that he did not participate in this process.

The author also notes that in some cases Asperger’s prognoses for some children were more severe than those of the doctors at the institute that ran the “euthanasia” program, suggesting that he wasn’t just a fairweather friend of these racial hygiene ideals, and the author also makes the point that because Asperger remained in charge of the clinic in the post-war years he was in a very good position to sanitize his case notes of any connection with Nazis and especially with the murder of Jews. Certainly, the author does not credit Asperger’s claims that he was saved from the Gestapo by Hamburger, and suggests that these are straight-up fabrications intended to sanitize Asperger’s role in the wartime public health field.

Was Asperger’s treatment and research ethical in any way?

Reading the article, one question that occurred to me immediately was whether any of his treatments could be ethical, given the context, and also whether his research could possibly have been unbiased. The “euthanasia” program was actually well known in Austria at the time – so well known in fact that at one point allied bombers dropped leaflets about it on the town, and there were demonstrations against it at public buildings. So put yourself in the shoes of a parent of a child with a developmental disability, bringing your child to the clinic for an assessment. You know that if your child gets an unfavourable assessment there is a good chance that he or she will be sterilized or taken away and murdered. Asperger offers you a treatment that may rehabilitate the child. Obviously, with the threat of “euthanasia” hanging over your child, you will say yes to this treatment. But in modern medicine there is no way that we could consider that to be willing consent. The parent might actually not care about “rehabilitating” their child, and is perfectly happy for the child to grow up and be loved within the bounds of what their developmental disability allows them; it may be that rehabilitation is difficult and challenging for the child, and not in the child’s best emotional interests. But faced with that threat of a racial hygiene-based intervention, as a parent you have to say yes. Which means that in a great many cases I suspect that Asperger’s treatments were not ethical from any post-war perspective.

In addition, I also suspect that the research he conducted for his 1944 PhD thesis, in addition to being unethical, was highly biased, because the parents of these children were lying through their teeth to him. Again, consider yourself as the parent of such a child, under threat of sterilization or murder. You “consent” to your child’s treatment regardless of what might be in the child’s best developmental and emotional interests, and also allow the child to be enrolled in Asperger’s study[1]. Then your child will be subjected to various rehabilitation strategies, what Asperger called pedagogical therapy. You will bring your child into the clinic every week or every day for assessments and tests. Presumably the doctor or his staff will ask you questions about the child’s progress: does he or she engage with strangers? How is his or her behavior in this or that situation? In every situation where you can, you will lie and tell them whatever you think is most likely to make them think that your child is progressing. Once you know what the tests at the clinic involve, you will coach your child to make sure he or she performs well in them. You will game every test, lie at every assessment, and scam your way into a rehabilitation even if your child is gaining nothing from the program. So all the results on rehabilitation and the nature of the condition that Asperger documents in his 1944 PhD thesis must be based on extremely dubious research data. You simply cannot believe that the research data you obtained from your subjects is accurate when some of them know that their responses decide whether their child lives or dies. 
Note that this problem with his research exists regardless of whether Asperger was an active Nazi – it’s a consequence of the times, not the doctor – but it is partially ameliorated if Asperger actually was an active resister to Nazi ideology, since it’s conceivable in that case that the first thing he did was give the parent an assurance that he wasn’t going to ship their kid off to die no matter what his diagnosis was. But since we now know he did ship kids off to die, that possibility is off the table. Asperger’s research subjects were consenting to a research study and providing subjective data on the assumption that the study investigator was a murderer with the power to kill their child. This means Asperger’s 1944 work probably needs to be ditched from the medical canon, simply on the basis of the poor quality of the data. It also has implications, I think, for some of his conclusions and their influence on how we view Asperger’s syndrome.

What does this mean for the concept of the autism spectrum?

Asperger introduced the idea of a spectrum of autism, with some of the children he called “autistic psychopaths” being high functioning, and some being low functioning, with a spectrum of disorder. This idea seems to be an important part of modern discussion of autism as well. But from my reading of the paper [again I stress I am not an expert] it seems that this definition was at least partly informed by the child’s response to therapy. That is, if a child responded to therapy and was able to be “rehabilitated”, they were deemed high functioning, while those who did not were considered low functioning. We have seen that it is likely that some of the parents of these children were lying about their children’s functional level, so probably his research results on this topic are unreliable, but there is a deeper problem with this definition, I think. The author implies that Asperger was quite an arrogant and overbearing character, and it seems possible to me that he was deeply flawed in assuming that his therapy would always work, and that if it failed the problem was with the child’s level of function. What if his treatment only worked 50% of the time, randomly? Then the 50% of children who failed are not “low-functioning”, they’re just unlucky. If we compare with a pharmaceutical treatment, it simply is not the case that when your drugs fail your doctor deems this to be because you are “low functioning”, and ships you off to the “euthanasia” clinic. They assume the drugs didn’t work and give you better, stronger, or more experimental drugs. Only when all the possible treatments have failed do they finally deem your condition to be incurable. But there is no evidence that Asperger considered the possibility that his treatment was the problem, and because the treatment was entirely subjective – the parameters decided on a case-by-case basis – there is no way to know whether the problem was the children or the treatment.
So to the extent that this concept of a spectrum is determined by Asperger’s judgment of how the child responded to his entirely subjective treatment, maybe the spectrum doesn’t exist?

This is particularly a problem because the concept of “functioning” was deeply important to the Nazis and had a large connection to who got selected for murder. In the Nazi era, to quote Negan, “people were a resource”, and everyone was expected to be functioning. Asperger’s interest in this spectrum and the diagnosis of children along it wasn’t just, or even mainly, driven by a desire to understand the condition of “autistic psychopathy”; it was integral to his racial hygiene conception of what to do with these children. In determining where on the spectrum they lay he was providing a social and public health diagnosis, not a personal diagnosis. His concern here was not with the child’s health or wellbeing or even an accurate assessment of the depth and nature of their disability – he and his colleagues were interested in deciding whether to kill them or not. Given the likely biases in his research, the dubious link between the definition of the spectrum and his own highly subjective treatment strategy, and the real reasons for defining this spectrum, is it a good idea to keep it as a concept in the handling of autism in the modern medical world? Should we revisit this concept, if not to throw it away at least to reconsider how we define the spectrum and why we define it? Is it in the best interests of the child and/or their family to apply this concept?

How much did Asperger’s racial hygiene influence ideas about autism’s heritability?

Again, I want to stress that I know little about autism and it is not my goal here to dissect the details of this condition. However, from what I have seen of the autism advocacy movement, there does seem to be a strong desire to find some deep biological cause of the condition. I think parents want – rightly – to believe that it is not their fault that their child is autistic, and that the condition is not caused by environmental factors that might somehow be associated with their pre- or post-natal behaviors. Although the causes of autism are not clear, there seems to be a strong desire among some in the autism community to see it as biological or inherited. I think this is part of the reason that Andrew Wakefield’s scam linking autism to MMR vaccines remains successful despite his being struck off the medical register in the UK and his exile to America. Parents want to think that they did not cause this condition, and blaming a pharmaceutical company is an easy alternative to this possibility. Heritability is another alternative explanation to behavioral or environmental causes. Asperger of course thought that autism was entirely inherited, blaming it – and its severity – on the child’s “constitution”, which was his phrase for their genetic inheritance. This is natural for a Nazi, of course – Nazis believe everything is inherited. Asperger also believed that sexual abuse was due to genetic causes (some children had a genetic property that led them to “seduce” adults!). Given Asperger’s influence on the definition of autism, I think it would be a good idea to assess how much his ideas also influence the idea that autism is inherited or biologically determined, and to question the extent to which this is just received knowledge from the original researcher. On a broader level, I wonder how many conditions identified during the war era and immediately afterwards were influenced by racial hygiene ideals, and how much the Nazi medical establishment left a taint on European medical research generally.

What lessons can we learn about public health practice from this case?

It seems pretty clear that some mistakes were made in the decision to assign Asperger’s name to this condition, given what we now know about his past. It also seems clear that Asperger was able to whitewash his reputation and bury his responsibilities for many years, including potentially avoiding being held accountable as an accessory to murder. How many other medical doctors, social scientists and public health workers from this time were also able to launder their history and reinvent themselves in the post-war era as good Germans who resisted the Nazis, rather than active accomplices of a murderous and cruel regime? What is the impact of their rehabilitation on the ethics and practice of medicine or public health in the post-war era? If someone was a Nazi, who believed that murdering the sick, disabled and certain races for the good of the race was a good thing, then when they launder their history there is no reason to think they actually laundered their beliefs as well. Instead they carried these beliefs into the post war era, and presumably quietly continued acting on them in the institutions they now occupied and corrupted. How much of European public health practice still bears the taint of these people? It’s worth bearing in mind that in the post war era many European countries continued to run a variety of programs that we now consider to have been rife with human rights abuse, in particular the way institutions for the mentally ill were run, the treatment of the Roma people (which often maintained racial-hygiene elements even decades after the war), treatment of “promiscuous” women and single mothers, and management of orphanages. How much of this is due to the ideas of people like Asperger, propagating slyly through the post-war public health institutional framework and carefully hidden from view by people like Asperger, who were assiduously purging past evidence of their criminal actions and building a public reputation for purity and good ethics? 
I hope that medical historians like Czech will in future investigate these questions.

This is not just a historical matter, either. I have colleagues and collaborators who work in countries experiencing various degrees of authoritarianism and/or racism – countries like China, Vietnam, Singapore, the USA – who are presumably vulnerable to the same kinds of institutional pressures at work in Nazi Germany. There have been cases, for example, of studies published from China that were likely done using organs harvested from prisoners. Presumably the authors of those studies thought this practice was okay? If China goes down a racial hygiene path, will public health workers who are currently doing good, solid work on improving the public health of the population start shifting their ideals towards murderous extermination? Again, this is not an academic question: after 9/11, the USA’s despicable regime of torture was developed by two psychologists, who presumably were well aware of the ethical standards their discipline is supposed to maintain, and just ignored them. The American Psychological Association had to amend its code in 2016 to include an explicit statement about avoiding harm, but I can’t find any evidence of disciplinary proceedings, by either the APA or the psychologists’ graduating universities, over the psychologists’ involvement in this shocking scheme. So it is not just in dictatorships that public policy pressure can lead to doctors taking on highly unethical standards. Medical, psychological and public health communities need to take much stronger action to make sure that our members aren’t allowed to give in to their worst impulses when political and social pressure comes to bear on them.

These ideas are still with us

As a final point, I want to note that the ideas that motivated Asperger are not all dead, and the battle against the pernicious influence of racial hygiene was not won in 1945. Here is Asperger in 1952, talking about “feeblemindedness”:

Multiple studies, above all in Germany, have shown that these families procreate in numbers clearly above the average, especially in the cities. [They] live without inhibitions, and rely without scruples on public welfare to raise or help raise their children. It is clear that this fact presents a very serious eugenic problem, a solution to which is far off—all the more, since the eugenic policies of the recent past have turned out to be unacceptable from a human standpoint

And here is Charles Murray in 1994:

We are silent partly because we are as apprehensive as most other people about what might happen when a government decides to social-engineer who has babies and who doesn’t. We can imagine no recommendation for using the government to manipulate fertility that does not have dangers. But this highlights the problem: The United States already has policies that inadvertently social-engineer who has babies, and it is encouraging the wrong women. If the United States did as much to encourage high-IQ women to have babies as it now does to encourage low-IQ women, it would rightly be described as engaging in aggressive manipulation of fertility. The technically precise description of America’s fertility policy is that it subsidizes births among poor women, who are also disproportionately at the low end of the intelligence distribution. We urge generally that these policies, represented by the extensive network of cash and services for low-income women who have babies, be ended. [Emphasis in the Vox original]

There is an effort in Trump’s America to rehabilitate Murray’s reputation, long after his policy prescriptions were enacted during the 1990s. There isn’t any real difference between Murray in 1994, Murray’s defenders in 2018, or Asperger in 1952. We now know what the basis for Asperger’s beliefs was. Sixty years later they’re still there in polite society, almost getting to broadcast themselves through the opinion pages of a major centrist magazine. Racial hygiene didn’t die with the Nazis, and we need to redouble our efforts now to get this pernicious ideology out of public health, medicine, and public policy. I expect that in the next few months this will include some uncomfortable discussions about Asperger’s legacy, and I hope a reassessment of the entire definition of autism, Asperger’s syndrome and its management. But we should all be aware that in these troubled times, the ideals that motivated Asperger did not die with him, and our fields are still vulnerable to their evil influence.

 


fn1: Note that you consent to this study regardless of your actual views on its merits, whether it will cause harm to your child, etc. because this doctor is going to decide whether your child “rehabilitates” or slides out of view and into the T4 program where they will die of “pneumonia” within 6 months, and so you are going to do everything this doctor asks. This is not consent.
