Okay, this book has been on my to-read list for a long while. I bought it I don't know when, with the intent of reading it, but it just lingered. When Suzanne said the session of the book club she and Bob are in was reading it, and that the session would meet the Monday I'd be there, well, I pulled out the book and started reading.
It was not what I was expecting, which is fine, really; most books aren't quite what I'm expecting. This one, however, caught me more off-guard than most. I thought this was a book describing the mathematics used in the different ways big data affects society. Instead, it is a book describing the ways mathematics is used in big data to disadvantage the already disadvantaged.
It is, at its core, a book about the growing unfairness of big data in our lives. It is about the ways the poor are kept poor, the rich can stay rich, the powerful abuse their power, and society continues to stratify, all with the help of numbers and math and statistics and data.
The first session of the book club summed up the book as, "It reads like a novel, and is mostly about the unfairness of big data; it's a social justice book." Bob commented, "Yep, we're done, I don't think there's anything else to talk about." I agree. The book was example after example of the ways big data is problematic. The examples are important to know. I recommend the book.
Math provided a neat refuge from the messiness of the real world.
I can understand this desire to leave the messiness for the beauty of mathematics.
There would always be mistakes, however, because models are, by their very nature, simplifications. No model can include all of the real world’s complexity or the nuance of human communication.
A model’s blind spots reflect the judgments and priorities of its creators.
Here we see that models, despite their reputation for impartiality, reflect goals and ideology.
Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics.
Racism, at the individual level, can be seen as a predictive model whirring away in billions of human minds around the world. It is built from faulty, incomplete, or generalized data.
Whether it comes from experience or hearsay, the data indicates that certain types of people have behaved badly. That generates a binary prediction that all people of that race will behave that same way. Needless to say, racists don’t spend a lot of time hunting down reliable data to train their twisted models. And once their model morphs into a belief, it becomes hardwired. It generates poisonous assumptions, yet rarely tests them, settling instead for data that seems to confirm and fortify them.
Consequently, racism is the most slovenly of predictive models. It is powered by haphazard data gathering and spurious correlations, reinforced by institutional inequities, and polluted by confirmation bias.
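O'Neil's "slovenly model" can be made concrete with a few lines of code. This toy contrast is entirely my own illustration, not from the book: a fair estimator counts every observation, while the confirmation-bias version keeps only the cases that confirm the belief, so its predicted rate is always 100 percent no matter how rare the behavior actually is.

```python
# Toy contrast between a fair estimator and the "slovenly" predictive
# model O'Neil describes. (My own illustration, not from the book.)
# Both try to estimate how often some group "behaves badly"; the biased
# one records confirming cases and throws away everything else.
def fair_rate(observations):
    """Rate of bad behavior, counting every observation."""
    return sum(observations) / len(observations)

def biased_rate(observations):
    """Rate computed the confirmation-bias way: keep only the hits."""
    hits = [o for o in observations if o]
    return sum(hits) / max(len(hits), 1)

data = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]  # bad behavior is actually rare
print(fair_rate(data))    # 0.1
print(biased_rate(data))  # 1.0, because every kept observation "confirms"
```

Ten observations, one hit, and the biased estimator still reports certainty. That's the whole trick: the data gathering, not the arithmetic, is where the model goes wrong.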
The second false assumption was that not many people would default at the same time. This was based on the theory, soon to be disproven, that defaults were largely random and unrelated events. This led to a belief that solid mortgages would offset the losers in each tranche. The risk models were assuming that the future would be no different from the past.
I was already blogging as I worked in data science, and I was also getting more involved with the Occupy movement. More and more, I worried about the separation between technical models and real people, and about the moral repercussions of that separation. In fact, I saw the same pattern emerging that I’d witnessed in finance: a false sense of security was leading to widespread use of imperfect models, self-serving definitions of success, and growing feedback loops. Those who objected were regarded as nostalgic Luddites.
And this is the a-ha moment, the inception, for this book.
In a system in which cheating is the norm, following the rules amounts to a handicap. The only way to win in such a scenario is to gain an advantage and to make sure that others aren’t getting a bigger one.
Emphasis mine. Welcome to human nature.
the University of Phoenix targeted poor people with the bait of upward mobility. Its come-on carried the underlying criticism that the struggling classes weren’t doing enough to improve their lives. And it worked.
If it was true during the early dot-com days that “nobody knows you’re a dog,” it’s the exact opposite today. We are ranked, categorized, and scored in hundreds of models, on the basis of our revealed preferences and patterns. This establishes a powerful basis for legitimate ad campaigns, but it also fuels their predatory cousins: ads that pinpoint people in great need and sell them false or overpriced promises. They find inequality and feast on it. The result is that they perpetuate our existing social stratification, with all of its injustices.
A 2012 Senate committee report on for-profit colleges described Vatterott's recruiting manual, which sounds diabolical. It directs recruiters to target "Welfare Mom w/ Kids. Pregnant Ladies. Recent Divorce. Low Self-Esteem. Low Income Jobs. Experienced a Recent Death. Physically/Mentally Abused. Recent Incarceration. Drug Rehabilitation. Dead-End Jobs—No Future."
Why, specifically, were they targeting these folks? Vulnerability is worth gold. It always has been.
Once the ignorance is established, the key for the recruiter, just as for the snake-oil merchant, is to locate the most vulnerable people and then use their private information against them. This involves finding where they suffer the most, which is known as the “pain point.”
Because there is always someone willing to exploit the less-well-off.
But zero tolerance actually had very little to do with Kelling and Wilson’s “broken-windows” thesis. Their case study focused on what appeared to be a successful policing initiative in Newark, New Jersey. Cops who walked the beat there, according to the program, were supposed to be highly tolerant. Their job was to adjust to the neighborhood’s own standards of order and to help uphold them.
Standards varied from one part of the city to another. In one neighborhood, it might mean that drunks had to keep their bottles in bags and avoid major streets but that side streets were okay. Addicts could sit on stoops but not lie down. The idea was only to make sure the standards didn’t fall.
The cops, in this scheme, were helping a neighborhood maintain its own order but not imposing their own.
In this sense, PredPol, even with the best of intentions, empowers police departments to zero in on the poor, stopping more of them, arresting a portion of those, and sending a subgroup to prison. And the police chiefs, in many cases, if not most, think that they’re taking the only sensible route to combating crime. That’s where it is, they say, pointing to the highlighted ghetto on the map. And now they have cutting-edge technology (powered by Big Data) reinforcing their position there, while adding precision and “science” to the process. The result is that we criminalize poverty, believing all the while that our tools are not only scientific but fair.
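The feedback loop described here is simple enough to simulate. What follows is my own toy sketch, nothing like PredPol's actual model: patrols follow past arrest data, recorded arrests follow patrols, and a slightly super-linear allocation rule (the exponent is an invented stand-in for the model's confidence in its hot spots) makes a small initial disparity between two identical neighborhoods snowball.

```python
# Toy model of a predictive-policing feedback loop. (My own sketch,
# not from the book.) Two neighborhoods with the same true crime rate;
# one starts with slightly more recorded arrests. Patrols are steered
# super-linearly toward the apparent hot spot, and recorded arrests
# track patrol presence, not true crime: you find crime where you look.
def simulate(arrests, rounds=10, bias=1.2, patrols_total=100.0):
    for _ in range(rounds):
        weights = [a ** bias for a in arrests]
        total_w = sum(weights)
        patrols = [patrols_total * w / total_w for w in weights]
        # recorded arrests are proportional to patrols, not to crime
        arrests = [0.5 * p for p in patrols]
    total = sum(arrests)
    return [a / total for a in arrests]

shares = simulate([55.0, 45.0])
print([round(s, 2) for s in shares])  # [0.78, 0.22] after ten rounds
```

A 55/45 split in the historical data becomes a 78/22 split in where the police go, and the map now "proves" the model was right all along. That is the self-reinforcing loop the quote describes.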
While looking at WMDs, we’re often faced with a choice between fairness and efficacy.
Our legal traditions lean strongly toward fairness.
The Constitution, for example, presumes innocence and is engineered to value it. From a modeler’s perspective, the presumption of innocence is a constraint, and the result is that some guilty people go free, especially those who can afford good lawyers. Even those found guilty have the right to appeal their verdict, which chews up time and resources. So the system sacrifices enormous efficiencies for the promise of fairness.
The Constitution’s implicit judgment is that freeing someone who may well have committed a crime, for lack of evidence, poses less of a danger to our society than jailing or executing an innocent person.
Back when arguing meant discourse, not bullying or outright lies. Go fig.
They try in vain to measure “friendship” by counting likes and connections on Facebook. And the concept of fairness utterly escapes them. Programmers don’t know how to code for it, and few of their bosses ask them to.
The question is whether we as a society are willing to sacrifice a bit of efficiency in the interest of fairness. Should we handicap the models, leaving certain data out?
But a crucial part of justice is equality. And that means, among many other things, experiencing criminal justice equally. People who favor policies like stop and frisk should experience it themselves. Justice cannot just be something that one part of society inflicts upon the other.
What’s more, for supposedly scientific systems, the recidivism models are logically flawed. The unquestioned assumption is that locking away “high-risk” prisoners for more time makes society safer. It is true, of course, that prisoners don’t commit crimes against society while behind bars. But is it possible that their time in prison has an effect on their behavior once they step out? Is there a chance that years in a brutal environment surrounded by felons might make them more likely, and not less, to commit another crime? Such a finding would undermine the very basis of the recidivism sentencing guidelines. But prison systems, which are awash in data, do not carry out this highly important research. All too often they use data to justify the workings of the system but not to question or improve the system.
This was one of the pillars of the original “broken-windows” study. The cops were on foot, talking to people, trying to help them uphold their own community standards. But that objective, in many cases, has been lost, steamrollered by models that equate arrests with safety.
Even putting aside the issues of fairness and legality, research suggests that personality tests are poor predictors of job performance.
Frank Schmidt, a business professor at the University of Iowa, analyzed a century of workplace productivity data to measure the predictive value of various selection processes. Personality tests ranked low on the scale—they were only one-third as predictive as cognitive exams, and also far below reference checks.
This is particularly galling because certain personality tests, research shows, can actually help employees gain insight into themselves. They can also be used for team building and for enhancing communication. After all, they create a situation in which people think explicitly about how to work together.
That intention alone might end up creating a better working environment. In other words, if we define the goal as a happier worker, personality tests might end up being a useful tool.
But instead they’re being used as a filter to weed out applicants. “The primary purpose of the test,” said Roland Behm, “is not to find the best employee. It’s to exclude as many people as possible as cheaply as possible.”
The practice of using credit scores in hirings and promotions creates a dangerous poverty cycle. After all, if you can’t get a job because of your credit record, that record will likely get worse, making it even harder to land work. It’s not unlike the problem young people face when they look for their first job—and are disqualified for lack of experience. Or the plight of the longtime unemployed, who find that few will hire them because they’ve been without a job for too long.
It’s a spiraling and defeating feedback loop for the unlucky people caught up in it.
Employers, naturally, have little sympathy for this argument. Good credit, they argue, is an attribute of a responsible person, the kind they want to hire. But framing debt as a moral issue is a mistake.
Plenty of hardworking and trustworthy people lose jobs every day as companies fail, cut costs, or move jobs offshore. These numbers climb during recessions. And many of the newly unemployed find themselves without health insurance. At that point, all it takes is an accident or an illness for them to miss a payment on a loan.
Even with the Affordable Care Act, which reduced the ranks of the uninsured, medical expenses remain the single biggest cause of bankruptcies in America.
(Wealthy travelers, by contrast, are often able to pay to acquire “trusted traveler” status, which permits them to waltz through security. In effect, they’re spending money to shield themselves from a WMD.)
To which, as someone who has been sexually assaulted by a TSA officer, yes, I will pay my government $20 a year not to sexually assault me.
THAT said, I STILL get the "random" checks when I go through the TSA Pre line. "Random" really means, "we will prove we are not racially profiling travelers by always targeting the slender, big-boobed, conservatively dressed woman in cords, because hey, not a Muslim dude."
Not all WMD backlashes are accurate either.
As we saw in recidivism sentencing models and predatory loan algorithms, the poor are expected to remain poor forever and are treated accordingly—denied opportunities, jailed more often, and gouged for services and loans. It’s inexorable, often hidden and beyond appeal, and unfair.
According to a report in Forbes, institutional money now accounts for more than 80 percent of all the activity on peer-to-peer platforms.
For big banks, the new platforms provide a convenient alternative to the tightly regulated banking economy. Working through peer-to-peer systems, a lender can analyze nearly any data it chooses and develop its own e-scores. It can develop risk correlations for neighborhoods, zip codes, and the stores customers shop at — all without having to send them embarrassing letters explaining why.
Of course it is. Banks go where the money is.
Quoting this book is becoming tiring, tbh. Might be easier to read the book than all my extracted quotes.
Hoffman’s analysis, like many of the WMDs we’ve been discussing, was statistically flawed. He confused causation with correlation, so that the voluminous data he gathered served only to confirm his thesis: that race was a powerful predictor of life expectancy. Racism was so ingrained in his thinking that he apparently never stopped to consider whether poverty and injustice might have something to do with the death rate of African Americans, whether the lack of decent schools, modern plumbing, safe workplaces, and access to health care might kill them at a younger age.
Nearly a half century later, however, redlining is still with us, though in far more subtle forms. It’s coded into the latest generation of WMDs. Like Hoffman, the creators of these new models confuse correlation with causation. They punish the poor, and especially racial and ethnic minorities.
How can that be?
Mathematicians didn’t pretend to foresee the fate of each individual. That was unknowable. But they could predict the prevalence of accidents, fires, and deaths within large groups of people.
... for the first time, the chance to pool their collective risk, protecting individuals when misfortune struck.
The move toward the individual, as we’ll see, is embryonic. But already insurers are using data to divide us into smaller tribes, to offer us different products and services at varying prices. Some might call this customized service. The trouble is, it’s not individual. The models place us into groups we cannot see, whose behavior appears to resemble ours. Regardless of the quality of the analysis, its opacity can lead to gouging.
In other words, how you manage money can matter more than how you drive a car.
And in Florida, adults with clean driving records and poor credit scores paid an average of $1,552 more than the same drivers with excellent credit and a drunk driving conviction.
Emphasis not mine.
But consider the price optimization algorithm at Allstate, the insurer self-branded as “the Good Hands People.” According to a watchdog group, the Consumer Federation of America, Allstate analyzes consumer and demographic data to determine the likelihood that customers will shop for lower prices. If they aren’t likely to, it makes sense to charge them more.
And that’s just what Allstate does. It gets worse. In a filing to the Wisconsin Department of Insurance, the CFA listed one hundred thousand microsegments in Allstate’s pricing schemes. These pricing tiers are based on how much each group can be expected to pay.
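What the CFA describes is easy to caricature in a few lines. This sketch is entirely my own invention (Allstate's real pricing model is proprietary and vastly more complicated); the point is only that the markup depends on predicted shopping behavior, not on risk.

```python
# Bare-bones sketch of "price optimization" as the CFA describes it.
# (My own illustration; the function and its parameters are invented.)
# The padding on the premium depends on how unlikely the customer is
# to comparison-shop. Driving risk never enters into the markup.
def optimized_premium(base_premium, shop_probability, max_markup=0.25):
    """Charge the most this microsegment is predicted to tolerate."""
    markup = max_markup * (1.0 - shop_probability)
    return round(base_premium * (1.0 + markup), 2)

# Two drivers with identical risk, different predicted loyalty:
print(optimized_premium(1000, shop_probability=0.9))  # 1025.0, likely shopper
print(optimized_premium(1000, shop_probability=0.1))  # 1225.0, captive customer
```

Same car, same record, a 20 percent price gap, and neither customer can see the tier they've been sorted into. Multiply the two-segment example by a hundred thousand and you have the filing.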
The stated goal of this surveillance is to reduce accidents. About seven hundred truckers die on American roads every year. And their crashes also claim the lives of many in other vehicles. In addition to the personal tragedy, this costs lots of money. The average cost of a fatal crash, according to the Federal Motor Carrier Safety Administration, is $3.5 million.
When I talk to most people about black boxes in cars, it’s not the analysis they object to as much as the surveillance itself. People insist to me that they won’t give in to monitors. They don’t want to be tracked or have their information sold to advertisers or handed over to the National Security Agency. Some of these people might succeed in resisting this surveillance. But privacy, increasingly, will come at a cost.
At some point, the trackers will likely become the norm. And consumers who want to handle insurance the old-fashioned way, withholding all but the essential from their insurers, will have to pay a premium, and probably a steep one. In the world of WMDs, privacy is increasingly a luxury that only the wealthy can afford.
Insurance is an industry, traditionally, that draws on the majority of the community to respond to the needs of an unfortunate minority.
In the villages we lived in centuries ago, families, religious groups, and neighbors helped look after each other when fire, accident, or illness struck. In the market economy, we outsource this care to insurance companies, which keep a portion of the money for themselves and call it profit.
As insurance companies learn more about us, they’ll be able to pinpoint those who appear to be the riskiest customers and then either drive their rates to the stratosphere or, where legal, deny them coverage. This is a far cry from insurance’s original purpose, which is to help society balance its risk. In a targeted world, we no longer pay the average. Instead, we’re saddled with anticipated costs. Instead of smoothing out life’s bumps, insurance companies will demand payment for those bumps in advance.
Once companies amass troves of data on employees’ health, what will stop them from developing health scores and wielding them to sift through job candidates? Much of the proxy data collected, whether step counts or sleeping patterns, is not protected by law, so it would theoretically be perfectly legal.
As we’ve seen, they routinely reject applicants on the basis of credit scores and personality tests. Health scores represent a natural—and frightening—next step.
The national drugstore chain CVS announced in 2013 that it would require employees to report their levels of body fat, blood sugar, blood pressure, and cholesterol—or pay $600 a year.
It gives companies an excuse to punish people they don’t like to look at—and to remove money from their pockets at the same time.
In fact, the greatest savings from wellness programs come from the penalties assessed on the workers. In other words, like scheduling algorithms, they provide corporations with yet another tool to raid their employees’ paychecks.
By sprinkling people’s news feeds with “I voted” updates, Facebook was encouraging Americans — more than sixty-one million of them — to carry out their civic duty and make their voices heard.
Studies have shown that the quiet satisfaction of carrying out a civic duty is less likely to move people than the possible judgment of friends and neighbors.
Facebook is more like the Wizard of Oz: we do not see the human beings involved. When we visit the site, we scroll through updates from our friends. The machine appears to be only a neutral go-between. Many people still believe it is.
Using linguistic software, Facebook sorted positive (stoked!) and negative (bummed!) updates. They then reduced the volume of downbeat postings in half of the news feeds, while reducing the cheerful quotient in the others. When they studied the users' subsequent posting behavior, they found evidence that the doctored news feeds had indeed altered their moods.
Their conclusion: “Emotional states can be transferred to others…, leading people to experience the same emotions without their awareness.” In other words, Facebook’s algorithms can affect how millions of people feel, and those people won’t know that it’s happening.
The engines they used were programmed to skew the search results, favoring one party over another. Those results, they said, shifted voting preferences by 20 percent. This effect was powerful, in part, because people widely trust search engines. Some 73 percent of Americans, according to a Pew Research report, believe that search results are both accurate and impartial.
Trying to please everyone is one reason most political speeches are boring (and Romney’s, even his supporters groused, were especially so).
Basking in the company of people he believed to be supportive and like-minded, Romney let loose with his observation that 47 percent of the population were “takers,” living off the largesse of big government. These people would never vote for him, the governor said—which made it especially important to reach out to the other 53 percent.
I find it interesting that this incident is in this book, as it is in Dark Money, too. Different spins happening, though.
Modern consumer marketing, however, provides politicians with new pathways to specific voters so that they can tell them what they know they want to hear. Once they do, those voters are likely to accept the information at face value because it confirms their previous beliefs, a phenomenon psychologists call confirmation bias. It is one reason that none of the invited donors at the Romney event questioned his assertion that nearly half of voters were hungry for government handouts. It only bolstered their existing beliefs.
In late 2015, the Guardian reported that a political data firm, Cambridge Analytica, had paid academics in the United Kingdom to amass Facebook profiles of US voters, with demographic details and records of each user’s “likes.” They used this information to develop psychographic analyses of more than forty million voters, ranking each on the scale of the “big five” personality traits: openness, conscientiousness, extroversion, agreeableness, and neuroticism.
The scoring of individual voters also undermines democracy, making a minority of voters important and the rest little more than a supporting cast. Indeed, looking at the models used in presidential elections, we seem to inhabit a shrunken country. As I write this, the entire voting population that matters lives in a handful of counties in Florida, Ohio, Nevada, and a few other swing states. Within those counties is a small number of voters whose opinions weigh in the balance.
Instead of targeting people in order to manipulate them, it could line them up for help. In a mayoral race, for example, a microtargeting campaign might tag certain voters for angry messages about unaffordable rents. But if the candidate knows these voters are angry about rent, how about using the same technology to identify the ones who will most benefit from affordable housing and then help them find it?
Because, you know, money.
Change that objective from leeching off people to helping them, and a WMD is disarmed—and can even become a force for good.
At the federal level, this problem could be greatly alleviated by abolishing the Electoral College system. It’s the winner-take-all mathematics from state to state that delivers so much power to a relative handful of voters. It’s as if in politics, as in economics, we have a privileged 1 percent. And the money from the financial 1 percent underwrites the microtargeting to secure the votes of the political 1 percent. Without the Electoral College, by contrast, every vote would be worth exactly the same. That would be a step toward democracy.
This might need to be my next mission.
Along the way, we’ve witnessed the destruction caused by WMDs. Promising efficiency and fairness, they distort higher education, drive up debt, spur mass incarceration, pummel the poor at nearly every juncture, and undermine democracy. It might seem like the logical response is to disarm these weapons, one by one. The problem is that they’re feeding on each other. Poor people are more likely to have bad credit and live in high-crime neighborhoods, surrounded by other poor people.
Our national motto, E Pluribus Unum, means “Out of Many, One.” But WMDs reverse the equation. Working in darkness, they carve one into many, while hiding us from the harms they inflict upon our neighbors near and far. And those harms are legion.
We cannot count on the free market itself to right these wrongs.
Indeed, all too often the poor are blamed for their poverty, their bad schools, and the crime that afflicts their neighborhoods.
Big Data processes codify the past. They do not invent the future. Doing that requires moral imagination, and that’s something only humans can provide. We have to explicitly embed better values into our algorithms, creating Big Data models that follow our ethical lead. Sometimes that will mean putting fairness ahead of profit.
Clearly, the free market could not control its excesses. So after journalists like Ida Tarbell and Upton Sinclair exposed these and other problems, the government stepped in. It established safety protocols and health inspections for food, and it outlawed child labor.
These new standards protected companies that didn’t want to exploit workers or sell tainted foods, because their competitors had to follow the same rules. And while they no doubt raised the costs of doing business, they also benefited society as a whole.
The government, using tax dollars, attempts to compensate for it, with the hope that food stamp recipients will eventually be able to fully support themselves. But the lead aggregators push them toward needless transactions, leaving a good number of them with larger deficits, and even more dependent on public assistance.
And the same is often true of fairness and the common good in mathematical models. They’re concepts that reside only in the human mind, and they resist quantification.
And since humans are in charge of making the models, they rarely go the extra mile or two to even try. It’s just considered too difficult. But we need to impose human values on these systems, even at the cost of efficiency.
To disarm WMDs, we also need to measure their impact and conduct algorithmic audits. The first step, before digging into the software code, is to carry out research. We’d begin by treating the WMD as a black box that takes in data and spits out conclusions.
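That kind of black-box audit can be sketched in a few lines. This is my own minimal illustration (the `toy_model` and its zip-code penalty are invented for demonstration): probe the opaque scorer with paired applicants who are identical except for one attribute, and measure the gap in what comes out.

```python
# Minimal sketch of a black-box algorithmic audit. (My own
# illustration; the model being audited is a deliberately biased toy.)
# We never look inside the scorer. We only compare its outputs on
# paired inputs that differ in a single attribute.
def audit(model, applicants, attribute, values):
    gaps = []
    for person in applicants:
        scores = []
        for v in values:
            probe = dict(person)       # copy, then flip one field only
            probe[attribute] = v
            scores.append(model(probe))
        gaps.append(max(scores) - min(scores))
    return sum(gaps) / len(gaps)       # average disparate treatment

def toy_model(person):
    """Invented scorer that uses zip code as a proxy for race/class."""
    score = 600 + 2 * person["income"]
    if person["zip_code"] == "poor_zip":
        score -= 80
    return score

applicants = [{"income": 40, "zip_code": "rich_zip"},
              {"income": 55, "zip_code": "rich_zip"}]
gap = audit(toy_model, applicants, "zip_code", ["rich_zip", "poor_zip"])
print(gap)  # 80.0: identical applicants, different zip, different score
```

A nonzero gap doesn't tell you *why* the model discriminates, but it establishes *that* it does, which is exactly what you need before demanding access to the code.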
There’s no fixing a backward model like the value-added model. The only solution in such a case is to ditch the unfair system.
In this case, it’s simply a matter of asking teachers and students alike if the evaluations make sense for them, if they understand and accept the premises behind them. If not, how could they be enhanced? Only when we have an ecosystem with positive feedback loops can we expect to improve teaching using data. Until then it’s just punitive.
They predict an individual’s behavior on the basis of the people he knows, his job, and his credit rating—details that would be inadmissible in court. The fairness fix is to throw out that data. But wait, many would say. Are we going to sacrifice the accuracy of the model for fairness? Do we have to dumb down our algorithms? In some cases, yes. If we’re going to be equal before the law, or be treated equally as voters, we cannot stand for systems that drop us into different castes and treat us differently.
Movements toward auditing algorithms are already afoot. At Princeton, for example, researchers have launched the Web Transparency and Accountability Project. They create software robots that masquerade online as people of all stripes—rich, poor, male, female, or suffering from mental health issues. By studying the treatment these robots receive, the academics can detect biases in automated systems from search engines to job placement sites.
Academic support for these initiatives is crucial. After all, to police the WMDs we need people with the skills to build them. Their research tools can replicate the immense scale of the WMDs and retrieve data sets large enough to reveal the imbalances and injustice embedded in the models. They can also build crowdsourcing campaigns, so that people across society can provide details on the messaging they’re receiving from advertisers or politicians. This could illuminate the practices and strategies of microtargeting campaigns.
Auditors face resistance, however, often from the web giants, which are the closest thing we have to information utilities. Google, for example, has prohibited researchers from creating scores of fake profiles in order to map the biases of the search engine.
Facebook, too. The social network’s rigorous policy to tie users to their real names severely limits the research outsiders can carry out there.
Of course they do. Again, money.
These regulations are not perfect, and they desperately need updating. Consumer complaints are often ignored, and there’s nothing explicitly keeping credit-scoring companies from using zip codes as proxies for race. Still, they offer a good starting point. First, we need to demand transparency. Each of us should have the right to receive an alert when a credit score is being used to judge or vet us. And each of us should have access to the information being used to compute that score. If it is incorrect, we should have the right to challenge and correct it.
Next, the regulations should expand to cover new types of credit companies, like Lending Club, which use newfangled e-scores to predict the risk that we’ll default on loans. They should not be allowed to operate in the shadows.
If we want to bring out the big guns, we might consider moving toward the European model, which stipulates that any data collected must be approved by the user, as an opt-in. It also prohibits the reuse of data for other purposes. The opt-in condition is all too often bypassed by having a user click on an inscrutable legal box.
But the “not reusable” clause is very strong: it makes it illegal to sell user data. This keeps it from the data brokers whose dossiers feed toxic e-scores and microtargeting campaigns. Thanks to this “not reusable” clause, the data brokers in Europe are much more restricted, assuming they follow the law.
I WOULD LOVE THIS.
Finally, models that have a significant impact on our lives, including credit scores and e-scores, should be open and available to the public. Ideally, we could navigate them at the level of an app on our phones. In a tight month, for example, a consumer could use such an app to compare the impact of unpaid phone and electricity bills on her credit score and see how much a lower score would affect her plans to buy a car. The technology already exists. It’s only the will we’re lacking.