The Fan That Knew Too Much

A friend of mine, who runs a small architecture practice out of a converted bungalow in Indiranagar, called me a few weeks ago about a ceiling fan. The fan had been installed in the upstairs studio in late February, replacing an older one that had developed a wobble. It was, he said, a perfectly ordinary fan, except that it was a smart fan, and except that he was not entirely sure what "smart" was, in this context, doing.

He had bought it because it was on sale. The salesman at the showroom had pointed out that the new model came with an app, which would let him control the fan from his phone, set schedules, and (in a phrase the salesman had repeated three times) "learn the preferences of the household." My friend, who is sixty-one and does not own a smart watch, had thought this last feature charming in the abstract and probably useless in practice. He had bought the fan, installed it, paired it with the app, and then forgotten about it.

What had brought him back to it, three months later, was a small thing. He had been showing a client around the studio. The client had remarked, in passing, that the fan was running at a very particular speed — slower than he would have expected, given the warmth of the afternoon. My friend had looked up. The fan was, in fact, running at a speed he had not personally set. He had not used the app in weeks. The wall switch was the only control he interacted with, and the wall switch had four positions: off, low, medium, high. The fan was running at something that was, by his eye, between low and medium.

He called me because he is a careful man, and because he has known me long enough to know that I find this kind of small inconsistency interesting.

The shape of a question

The first thing I did, when I went over to the studio the following Saturday, was put the fan on the wattmeter. The fan was drawing 18 watts. The datasheet, which I had downloaded from the manufacturer's website that morning, listed three speeds: 12 watts, 22 watts, and 38 watts. 18 watts was not on the menu.

The second thing I did was pull out a laptop, mirror the fan's traffic to it through a SPAN port on the router, and watch what the fan was talking to. Over the next four hours, the fan opened TCP connections to two servers. One was an MQTT broker hosted, to judge by the certificate chain, in a data centre operated by a large Indian cloud provider in Hyderabad. The other was an HTTPS endpoint at a subdomain of the manufacturer's primary commerce site. The fan was extremely talkative. It opened a new MQTT connection every ninety seconds, sent a small payload, received an acknowledgement, and closed the connection. Every fifteen minutes, it POSTed a payload of about 480 bytes to the HTTPS endpoint.
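For a sense of scale, that cadence can be tallied in a few lines of Python. This is only a back-of-envelope sketch: the three input figures are the ones observed above, the names are illustrative, and the MQTT payload size is left out because it was never measured.

```python
# Back-of-envelope tally of the fan's daily chatter.
# Only the three figures below come from the observed traffic;
# the constant and function names are illustrative.
SECONDS_PER_DAY = 24 * 60 * 60

MQTT_INTERVAL_S = 90          # one MQTT connection every ninety seconds
HTTPS_INTERVAL_S = 15 * 60    # one HTTPS POST every fifteen minutes
HTTPS_PAYLOAD_BYTES = 480     # approximate size of each POST payload


def daily_traffic():
    """Return connection counts and HTTPS payload volume for one day."""
    mqtt_connections = SECONDS_PER_DAY // MQTT_INTERVAL_S
    https_posts = SECONDS_PER_DAY // HTTPS_INTERVAL_S
    return {
        "mqtt_connections": mqtt_connections,
        "https_posts": https_posts,
        "https_payload_bytes": https_posts * HTTPS_PAYLOAD_BYTES,
    }


if __name__ == "__main__":
    for name, value in daily_traffic().items():
        print(f"{name}: {value}")
```

On those figures the fan makes 960 MQTT connections and 96 HTTPS POSTs a day, the latter adding up to roughly 46 kilobytes of payload before TLS and HTTP overhead.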

The MQTT payload was opaque. The HTTPS payload was opaque. I could see, from the packet timing and sizes, that the fan was sending something, but I could not, without breaking the TLS, see what.

I put down the laptop and looked at the fan.

A small confession

I did not, in the end, break the TLS. I have done it before, on devices I owned outright and intended to take to pieces, but the fan was not mine, and my friend wanted, more than anything, to keep using it. What I did instead was put the fan into pairing mode, watch what credentials it asked the phone for, and look up the manufacturer's developer documentation, which was, to their credit, posted on a public-facing developer portal with a register-and-read login.

The documentation was the most interesting document I had read all month. It described, in seventy-eight pages of careful technical English, exactly what the fan was sending to the cloud and why. The MQTT payloads were keepalives, encrypted with a per-device key, carrying telemetry about the fan's current speed, its temperature sensor reading, its motor current, and a small bitfield of operational flags. The HTTPS payloads were richer. They carried, every fifteen minutes, a summary of the fan's behaviour over the preceding interval: the speeds it had been set to, by whom (the wall switch was logged separately from the app), the ambient temperature, the inferred occupancy of the room (derived, the documentation said, from "subtle variations in motor load that correspond to door movements and human presence"), and a small, anonymous identifier that the documentation referred to as the "household fingerprint."

The household fingerprint, the documentation explained, was generated by hashing together the fan's network identifiers, the times it was switched on and off, the typical speed at which it was run, and a small set of derived behavioural features. It was, the documentation said, used to "tailor the user experience" and "improve the product roadmap." It was not, the documentation took care to specify, personally identifiable. It was, however, unique to the household, and persisted across firmware updates.
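The documentation's wording is enough to sketch how such a fingerprint might be built, though not to reproduce it. What follows is a minimal illustration in Python, assuming a plain SHA-256 over a canonical JSON encoding. The function name, the field names, and the choice of hash are all my assumptions; the documentation, as described, names only the categories of input.

```python
import hashlib
import json


def household_fingerprint(mac_address, switch_on_hours,
                          typical_speed_watts, behavioural_features):
    """Hash a household's usage pattern into a stable identifier.

    Hypothetical sketch: every field name here is an assumption.
    """
    record = {
        "mac": mac_address.lower(),
        "on_hours": sorted(switch_on_hours),
        "typical_watts": typical_speed_watts,
        "features": sorted(behavioural_features.items()),
    }
    # Canonical encoding so that the same inputs always hash identically.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    fingerprint = household_fingerprint(
        "AA:BB:CC:DD:EE:FF",                  # placeholder network identifier
        [9, 13, 18],                          # hours at which the fan is switched on
        18,                                   # typical draw in watts
        {"weekend_use": True, "door_events_per_hour": 4},
    )
    print(fingerprint)
```

Nothing in the inputs is a name or an address, which is the sense in which such a value is not personally identifiable. It is nonetheless the same value every time the same household behaves the same way, and a different value for a household that does not.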

The 18 watts the fan was drawing, I now understood, was a speed it had decided on for itself. It had observed that on Saturday afternoons, when the studio was occupied but quiet, the fan had typically been set to a low-medium speed. It had inferred, from the temperature and the time of day, that an intermediate setting between the catalogued speeds would be appropriate. It had silently moved to that setting, awaiting either an instruction from the app to override its choice or a flick of the wall switch that would re-anchor its preferences.

The fan was, in the precise sense, learning. It was also, in a sense that I find harder to name, watching.

The economics of a telemetry stream

I want to be careful, here, about not slipping into the register of moral indignation, because the fan is in many ways less interesting as a privacy story than as an economic one. Somebody, at some point in the design of this product, made a series of decisions. Somebody decided that the fan should be connected to the internet. Somebody decided that the fan should send, every fifteen minutes, a summary of its observations to a cloud service. Somebody decided that the cloud service should derive, from the aggregated observations, a model of the typical household, and that the model should be used to push back updates that would cause individual fans to adjust their behaviour. Somebody decided that the development cost of all of this — the firmware, the cloud infrastructure, the certificates, the maintenance, the data scientists — was worth it.

It is not obvious, to me, that any of these decisions were wrong from the manufacturer's point of view. The data, in aggregate, is genuinely useful. A manufacturer that knows how its fans are actually used can build better fans. A manufacturer that can ship a fan that learns the household and adjusts itself can credibly claim to be ahead of its competitors. The cost of the cloud infrastructure, distributed across the several million fans this manufacturer ships every year, is, I would guess, somewhere in the low single digits of cents per fan per year. The marginal cost of sending an extra packet is essentially nothing. Once you have built the pipeline, the temptation to send more rather than less is overwhelming.

What is less obvious is what the homeowner is, in any meaningful sense, agreeing to. My friend did not, at any point, sit down and read a privacy policy. He clicked through a paired-device confirmation in the app. The app told him, in cheerful copy, that the fan would "learn from your usage." He took this to mean that the app would remember his schedules. He did not take it to mean that a server in Hyderabad would, every fifteen minutes, receive an inferred occupancy estimate for his studio.

What the fan does now

We did not, in the end, return the fan. My friend is not the sort of person to return a thing that works, and the fan, in its uncomplicated mechanical functions, works very well. What we did instead was put the fan on a VLAN by itself, on the studio router, with an outbound firewall rule that allows it to talk only to the local subnet. The fan now does not phone home. It also no longer learns from the household, which my friend tells me, with some satisfaction, he does not miss.
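The logic of that rule is simple enough to model in Python. This is a sketch of the decision the firewall makes, not the router's actual configuration: the subnet 192.168.50.0/24 and the function name are placeholders of mine, and a real router would express the same thing in its own rule syntax.

```python
import ipaddress

# Hypothetical subnet for the fan's VLAN; the real one is whatever
# the studio router assigns.
FAN_VLAN_SUBNET = ipaddress.ip_network("192.168.50.0/24")


def outbound_allowed(destination, local_subnet=FAN_VLAN_SUBNET):
    """Allow traffic only when the destination is on the local subnet."""
    return ipaddress.ip_address(destination) in local_subnet


if __name__ == "__main__":
    # A neighbour on the same VLAN is reachable.
    print(outbound_allowed("192.168.50.10"))
    # Anything beyond the subnet, such as a cloud broker, is dropped.
    # 203.0.113.7 is a reserved documentation address, used as a stand-in.
    print(outbound_allowed("203.0.113.7"))
```

The first check returns True and the second False, which is the whole of the policy: the fan may talk to its own subnet and to nothing else.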

The fan still works. The wall switch still has four positions. The fan no longer thinks for itself, which is, I suspect, what the original ceiling fan was good at all along.

I have been thinking, since, about what to recommend to my friend's other clients. The honest answer is that almost every appliance sold in this category, by almost every major manufacturer, does something similar. The variation is in the granularity of what they send and the language they use to describe it. The pattern — connect, observe, summarise, learn — is now the default. The fan that does not phone home is increasingly unusual, and getting harder to find.

What I cannot quite shake, after all of this, is the small mental image of the fan at 18 watts, on a Saturday afternoon, sitting between low and medium, having reasoned its way to a setting nobody had asked for. It was, in its modest way, a competent decision. It is also the kind of decision I would rather a fan did not, on my behalf, make.
