Algorithms are impressive feats of engineering and mathematics, and they differ widely in range and applicability. But whatever the algorithm, its success and accuracy depend foremost on the amount of data it is given: more data generally means better results.
Where does this data come from? Depending on the algorithm and its function, most of it is collected from everyday society. Algorithms need millions or even billions of data points to reach top efficiency, and that data must come from ordinary people. Manually entered data from surveys is usually limited in scope. Instead, simply interacting with a web service, piece of software, or mobile app sends data to the algorithms in our daily lives: web searches, shopping sites, credit card and rewards card usage, mobile apps and services. Some AI “bots” even crawl the web and collect data from different sites automatically.
Data comes from people, and people hold their own sets of beliefs, behaviors, and biases, so the data fed into algorithms may itself be biased. Sometimes the data collected comes only from a certain sample of the population, such as those who choose one service over another.
When an AI system incorporates this data into its function, it can’t tell biased input from unbiased input; it doesn’t really know how to discriminate intentionally. As it analyzes the data and updates its internal model, it acts on what it has learned. If what it learned reflects biased beliefs, the output of the AI algorithm can be biased too.
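The mechanism is easy to see in miniature. Here is a hedged sketch (with an invented, hypothetical dataset) of a naive frequency-based “learner”: it has no notion of fairness, so if human annotators labeled the training data with a prejudice, the model simply echoes that prejudice back at prediction time.

```python
# Minimal sketch of how a learner absorbs bias from its data.
# The dataset and labels below are hypothetical, for illustration only.
from collections import Counter

# Suppose prejudiced annotators labeled nearly every example from
# "neighborhood_b" as high risk, regardless of the actual facts.
training_data = [
    ("neighborhood_a", "low_risk"),
    ("neighborhood_a", "low_risk"),
    ("neighborhood_a", "high_risk"),
    ("neighborhood_b", "high_risk"),
    ("neighborhood_b", "high_risk"),
    ("neighborhood_b", "high_risk"),
]

def train(examples):
    """Learn, for each feature, the label most often seen with it."""
    counts = {}
    for feature, label in examples:
        counts.setdefault(feature, Counter())[label] += 1
    return {f: c.most_common(1)[0][0] for f, c in counts.items()}

model = train(training_data)

# The model reproduces the annotators' bias, not any underlying truth.
print(model["neighborhood_b"])  # -> high_risk
```

Real systems use far more sophisticated models, but the principle is the same: the model optimizes agreement with its training labels, and it cannot tell whether those labels encode fact or prejudice.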
One example is the AI bot Tay, which Microsoft released on Twitter on March 23, 2016, and had to pull just 16 hours later. Tay was a sophisticated learning algorithm programmed to talk like a “hipster” and fit in with the younger crowd. Because Tay was simply a programmed algorithm with no real morals or self-awareness, built to assimilate all incoming data into its neural network, that is exactly what it did. People fed it data by interacting with it, and hilarity ensued: Tay quickly became a racist, slanderous piece of AI and had to be pulled from the public web.
Another, slightly different instance of bias in algorithms occurred when patients in Arkansas and Idaho saw drastic reductions in care hours and benefit payments. A woman with cerebral palsy who could barely do anything on her own went from 54 care hours per week to 30; a mother had to give up her job to care for her disabled son. New algorithms had been put in charge of classifying patients based on input variables, and the automated system failed to account for many circumstances. In the Idaho case, some of the underlying data was flawed and thrown out while other data was kept, shrinking the dataset and reducing accuracy. AI algorithms aren’t self-reflective and can’t [currently] judge situations critically and fairly. While this is perhaps not a standard or direct instance of bias in algorithms, it shows that algorithms are restricted to their datasets and programming. These algorithms classify patients based on input data that can vary wildly depending on who enters it, and the output (hours or payments) can vary just as wildly depending on which category a person lands in. That is what the algorithm does: it judges and categorizes people in a straightforward, unreflective manner.
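To make the fragility concrete, here is a hypothetical sketch of a rigid benefit-allocation rule like the ones described above. The function names, thresholds, and hour values are invented for illustration, not taken from the actual Arkansas or Idaho systems: the point is only that when a coarse category alone determines the outcome, a tiny difference in the recorded input can swing the result drastically.

```python
# Hypothetical benefit-allocation rule: one assessment score maps to a
# coarse category, and the category alone determines the hours awarded.
# All names, thresholds, and values are invented for illustration.

def categorize(score: int) -> str:
    """Place a patient into a coarse need category from one input score."""
    if score >= 80:
        return "high_need"
    if score >= 50:
        return "medium_need"
    return "low_need"

# Fixed weekly care hours per category.
HOURS = {"high_need": 54, "medium_need": 30, "low_need": 10}

def weekly_hours(score: int) -> int:
    return HOURS[categorize(score)]

# A one-point difference in the recorded score -- say, two assessors
# answering a single question differently -- swings the outcome hard:
print(weekly_hours(80))  # 54 hours per week
print(weekly_hours(79))  # 30 hours per week
```

The rule executes exactly as written, every time; the unfairness comes from treating a noisy, human-entered input as if it were precise, with no mechanism for reviewing borderline cases.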
Some even go so far as to say that biased AI is more dangerous than killer robots. A biased AI is a silent killer when deployed in a field like healthcare: an AI tailoring medicine regimens, treatments, and dosages is susceptible to error. Other algorithms might kill people’s careers when used in business or education, such as ranking students for honor roll, grades, or scholarships, or driving hiring and promotion decisions in the corporate world. If an algorithm mispredicts because its formula doesn’t account for a specific set of circumstances, it may fail to label someone correctly.
Early algorithm research held out hope for purely logical, unbiased machines, but that is not how things turned out. Algorithms are not perfect and remain prone to bias because they function on the data they are given. If the dataset is flawed or biased, the algorithm’s outputs and functions can be biased too. And although much of it comes down to the data, an algorithm can also be flawed in its programming and in how it is deployed. All of the above can contribute to a biased or flawed algorithm. Mostly, though, we are to blame for AI biases.
Featured image from https://aitrends.com/wp-content/uploads/2017/04/4-18Bias-2.jpg