As always, I’m making the raw data and more details on the testing method publicly available, including humidity data.
Should we subtract the 10 micron particles?
One reasonable question is whether we should take the 10 micron particles into account. Remember that the standard Dylos gives readings for 1 micron and 10 microns. In the formula that Dylos gives for the Pro model, they subtract the particles above 2.5 microns:
Pro formula: PM2.5 ug/m3 = (0.5 micron particles – 2.5 micron particles)/.01
The theory is that the 0.5 micron reading counts all particles 0.5 microns and above, including particles above 2.5 microns. Because the official PM2.5 readings cover only particles 2.5 microns and below, subtracting the larger particles should make the estimate more accurate.
However, when I analyzed the Pro data from my other tests, subtracting the larger particles made a negligible difference. The same holds for the standard Dylos data: the 1 micron count correlated with the Consulate readings at r = .587, and the 1 micron count minus the 10 micron count correlated at r = .580. Thus, the subtraction doesn’t seem to make much difference.
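For readers who want to run the same kind of comparison on their own readings, here is a minimal Python sketch. It assumes a plain Pearson correlation, and the numbers in it are made up for illustration; they are not the test data from this post. The names pearson_r, small, large, and consulate are my own placeholders.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up illustrative readings, NOT the data from this test.
small = [1200, 2500, 1800, 3100, 900]   # Dylos counts, 1 micron and above
large = [40, 90, 55, 120, 30]           # Dylos counts, 10 microns and above
consulate = [35, 80, 60, 95, 30]        # reference PM2.5 readings

# Subtract the large-particle count from the small-particle count.
small_minus_large = [s - l for s, l in zip(small, large)]

r_raw = pearson_r(small, consulate)
r_subtracted = pearson_r(small_minus_large, consulate)
print(round(r_raw, 3), round(r_subtracted, 3))
```

If the two correlations come out nearly identical, as they did in my data, the subtraction is probably not worth the extra step.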
Does humidity make a difference?
Should humidity matter? Engineering students at Drexel University argued that it should for the Dylos. The argument is that particles absorb water in the air, which “inflates” them.
In the second round of testing, we have humidity data for every datapoint. And importantly, the data has a wide range of humidity values, from a low of 49% to a high of 98%. Thus, we can test whether taking humidity into account improves accuracy.
To test this, I ran a regression predicting the Consulate PM2.5 from humidity and an interaction term between humidity and the 1 micron count. The interaction term was significant (p < .001, B = -0.001). The regression had no constant, so the fitted line passes through the origin (0, 0).
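As a rough illustration of this kind of model, here is a Python sketch of a no-constant least-squares fit. It is not the analysis I ran: it assumes the model also includes the 1 micron count on its own (PM2.5 = b1 × count + b2 × humidity + b3 × count × humidity), it uses synthetic data generated from placeholder coefficients, and it does not compute p-values. The function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_no_intercept(dylos, humidity, pm25):
    """Least-squares fit of pm25 = b1*dylos + b2*humidity + b3*dylos*humidity.

    There is no constant term, so the fit is forced through the origin.
    """
    rows = [[d, h, d * h] for d, h in zip(dylos, humidity)]
    k = 3
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, pm25)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data built from placeholder coefficients, NOT the real readings.
# Only the interaction value (-0.001) echoes the number reported in the post.
TRUE_COEFS = (0.1, 0.2, -0.001)
dylos = [500, 3000, 1000, 2500, 1500, 2000, 800, 2800]
humidity = [49, 55, 98, 90, 60, 75, 85, 50]
pm25 = [
    TRUE_COEFS[0] * d + TRUE_COEFS[1] * h + TRUE_COEFS[2] * d * h
    for d, h in zip(dylos, humidity)
]

b_dylos, b_humidity, b_interaction = fit_no_intercept(dylos, humidity, pm25)
print(b_dylos, b_humidity, b_interaction)
```

Because the synthetic data has no noise, the fit should recover the placeholder coefficients almost exactly. Real readings would be noisier, and a statistics package would be the better tool for significance tests.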
What does the interaction mean? I graphed it out in the program R. For high humidity levels, the relationship between the Dylos 1 micron (X axis) and Consulate PM2.5 (Y axis) is relatively flat:
But at low humidity levels, the relationship is much stronger:
OK, OK, so what does all that mean? It suggests that when humidity is high, the Dylos is “overcounting.” So the interaction term is discounting the Dylos numbers when humidity is high.
And then when humidity is low, the Dylos is NOT overcounting. Thus, we should count those numbers more strongly.
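One way to see the discounting is to compute how much each extra Dylos count adds to the estimate at a given humidity. The Python sketch below assumes the model has the form PM2.5 = b1 × count + b2 × humidity + b3 × count × humidity, so the weight on the count is b1 + b3 × humidity. The interaction value of -0.001 is the one reported above; the value of 0.1 for b1 is a hypothetical placeholder I picked for illustration, not a fitted result.

```python
B_INTERACTION = -0.001  # interaction coefficient reported in the post
B_DYLOS = 0.1           # hypothetical placeholder, NOT a fitted value


def effective_slope(humidity_pct, b_dylos=B_DYLOS, b_interaction=B_INTERACTION):
    """Change in estimated PM2.5 per extra Dylos count at a given humidity (%)."""
    return b_dylos + b_interaction * humidity_pct


# Lowest and highest humidity values in the dataset.
slope_dry = effective_slope(49)
slope_humid = effective_slope(98)
print(slope_dry, slope_humid)
```

With these placeholder values, each count is weighted far less at 98% humidity than at 49%, which matches the flat versus steep lines in the graphs.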
How much better does the formula do if we add in humidity? Here’s how well the simple formula tracked the US Consulate:
And here’s how well the formula with humidity did:
The simple formula had an average error of 11 micrograms per cubic meter. The formula with humidity had an average error of 6.8 micrograms per cubic meter. Thus, taking humidity into account seems to help.
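For anyone repeating this comparison, here is a small Python sketch of an average-error calculation. It assumes "average error" means the mean absolute difference between each formula's estimate and the Consulate reading. The readings in it are made up for illustration and will not reproduce the 11 and 6.8 figures.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap between estimates and reference readings."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up illustrative values, NOT the data from this test.
consulate = [40, 75, 120, 60]          # reference PM2.5
simple_formula = [52, 60, 135, 70]     # estimates without humidity
humidity_formula = [45, 70, 128, 57]   # estimates with humidity

mae_simple = mean_absolute_error(simple_formula, consulate)
mae_humidity = mean_absolute_error(humidity_formula, consulate)
print(mae_simple, mae_humidity)
```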
But keep in mind that this is a relatively small dataset. We also don’t have any humidity data below 49%. Humidity can go a lot lower. For example, Las Vegas is currently at 7% humidity.