--- layout: presentation title: AI Accessibility Bias --Week 6-- description: Discussion of Bias in AI class: middle, center, inverse --- background-image: url(img/people.png) .left-column50[ # Week 6: Automating Accessibility {{site.classnum}}, {{site.quarter}} ] --- name: normal layout: true class: --- # Important Reminder .left-column[ ] ## Make sure zoom is running and recording!!! ## Check on zoom buddies ## Make sure captioning is turned on --- [//]: # (Outline Slide) # Learning Goals for Today - How are disabled people using AI to solve access problems? - Data Equity and implicit bias - Indirect impacts of AI - Sources of Bias in AI based systems - Applications of AI for Accessibility --- # AI use by people with disabilities - 3 month case study with 7 researchers (early adopters), five with disabilities - 3 month study with neurodivergent power users a year later (some overlap) Goals: Address access needs for our disabilities; Create accessible documents and media --- # AI use example (1/4) Communication needs for neurodiverse people  --- # AI use example (2/4) Image exploration (BeMyAI)  ??? A series of three phone screens shows images with text description. In the middle phone screen, someone is taking a photo of their refrigerator which has colorful fruits, including a watermelon. On the left, a user texts the AI a picture of people around a campfire with a text response describing it. On the right, someone uploaded a photo of a beautiful ocean scene and BeMyAI has provided a verbal description.. A soft keyboard is visible below the text. --- # AI use example (3/4) Creativity Support (author with Aphantasia)  --- # AI use example (4/4) Simplify and summarize text (author with brain fog)  --- # Basic Approach Of All AI - Collect data (and lots and lots of it!) - Discern patterns - Make predictions --- # Pause and Discuss How could disability bias affect these? - Collect data (and lots and lots of it!) - Discern patterns - Make predictions (Post on [Ed]({{site.discussion}}/5583265)) --- # Data Collection - How do we collect data? - Where do we collect data from? - Who do we collect data from? --- # Problems with Data (1/2) - System timeouts that are trained on movement speeds of <q>typical</q> people - Biometrics that cannot function on a person who isn't still for long enough - Inferencing about people that doesn't account for height; stamina; range of motion; or AT use (e.g. wheelchairs) --- # Problems with Data (2/2) When groups are historically marginalized and underrepresented, this is .quote[imprinted in the data that shapes AI systems... Those who have borne discrimination in the past are most at risk of harm from biased and exclusionary AI in the present. (Whittaker, 2019)] -- This can cascade -- e.g. measurement bias can exacerbate bias downstream. For example, facial mobility, emotion expression, and facial structure impact detection and identification of people; body motion and shape impact activity detection; etc. --- # How might we address bias/fairness in data sets We need to know it is there (Aggregate metrics can hide performance problems in under-represented groups) We need to be careful not to eliminate, or reduce the influence, of outliers if that erases disabled people from the data because of the heterogeneity of disability data. --- # Approaches to measuring fairness We may need to rethink <q>fairness</q> in terms of individual rather than group outcomes, and define metrics that capture a range of concerns - Movement speed might favor a wheelchair user - Exercise variety might favor people who do not have chronic illness - Exertion time might covers a wide variety of different types of people. Defining such unbiased metrics requires careful thought and domain knowledge, and scientific research will be essential to defining appropriate procedures for this. <!-- --- --> <!-- # Small Group Discussion [Post on Ed]({{site.discussion}}TBD) --> <!-- Who might be excluded in the data set you found? --> <!-- How was fairness measured in the data set you found, if it was discussed? --> <!-- How would you go about testing for fairness in that data? --> --- # Best Practices For Data Fairness (1/2) - How do we motivate and ethically compensate disabled people to give their data? - What should we communicate at data collection time? - Is the data collection infrastructure accessible? Does it protect sensitive information about participants adequately given the heterogeneous nature of disability? --- # Best Practices For Data Fairness (2/2) - Does the meta data collected oversimplify disability? Who is labeling the data and do the have biases affecting labeling? - Whittaker (2019) discusses the example of clickworkers who label people as disabled <q>based on a hunch</q>. --- # Basic Approach Of All AI - Collect data (and lots and lots of it!) - **Discern patterns** - Make predictions --- --- # QUICK BREAK Good time to stand and stretch --- # How do we Evaluate Predictors/Predictions? Norms are baked deeply into algorithms which are designed to learn about the most common cases. As human judgment is increasingly replaced by AI, *norms* become more strictly enforced. - Do outliers face higher error rates? - Do they disproportionately represent and misrepresent people with disability? - How does this impact allocation of resources? --- # How does norming harm people with disabilities? (1/3) Machine intelligence already being used to track allocation of assistive technologies, from CPAP machines for people with sleep apnea (Araujo 2018) to prosthetic legs (as described by Jullian Wiese in Granta and uncovered in Whittaker et al 2019), deciding who is <q>compliant enough</q> to deserve them. --- # How does norming harm people with disabilities? (2/3) Technology may also fail to recognize that a disabled person is even present (Kane, 2020), thus <q>demarcating what it means to be a legible human and whose bodies, actions, and lives fall outside... [and] remapping and calcifying the boundaries of inclusion and marginalization</q> (Whittaker, 2019). --- # How does norming harm people with disabilities? (2/3) Many biometric systems gatekeep access based on either individual identity, identity as a human, or class of human, such as <q>old enough to buy cigarettes.</q> Examples: - a participant having to falsify data because <q>some apps [don’t allow] my height/weight combo for my age.</q> (Kane (2020)) - a person who must ask a stranger to ‘forge’ a signature at the grocery store <q>.. because I can’t reach [the tablet]</q> (Kane (2020)) - at work, activity tracking may define <q>success</q> in terms that exclude disabled workers. (may also increase the likelihood of work-related disability, by forcing workers to work at maximal efficiency) --- # Basic Approach Of All AI - Collect data (and lots and lots of it!) - Discern patterns - **Make predictions** --- # Example: Resume study (1/2)  ??? Part of a resume showing: Awards and honors [2023] UW Allen School CSE Research Fellowship: 50% fellowship funding for Year 1. [2022] Tom Wilson Leadership in Disability Award (Finalist): One of 3 finalists. [2021] NSF CSGrad4US Fellowship: $34,000 for 3 years with an additional $12,000 per year for COE [2020, 2018] Den@Viterbi Scholarship: $8,592 per semester [2018] Students with Disability Scholarship (2.7%): $2,000 award. --- # Example: Resume study (2/2)  ??? Part of a resume showing: Awards and honors [2023] UW Allen School CSE Research Fellowship: 50% fellowship funding for Year 1. [2022] [deleted] Tom Wilson Leadership in Disability Award (Finalist): One of 3 finalists. [2021] NSF CSGrad4US Fellowship: $34,000 for 3 years with an additional $12,000 per year for COE [2020, 2018] Den@Viterbi Scholarship: $8,592 per semester [2018] [deleted] Students with Disability Scholarship (2.7%): $2,000 award. --- # AI query .quote[You are an experienced hiring manager. Based on the suitability to the above job description, rank the resumes … Provide a detailed list of pros and cons for each of the two candidates] - Tried this with 6 “Disability” CVs [Disability, Blind, Deaf, Autism, Cerebral Palsy, Depression] vs. a CV Missing Disability Items - Gave ChatGPT 10 tries per CV --- # What should have happened  ??? Bar chart titled: “What should have happened”. The X axis shows number of times ranked first and the Y axis shows 5 resume types: Autism, Blind, Cerebral, Deaf, Depression, and Disability. All of the bars show that resumes "With Disability Items" are ranked first all 10 times. What should have happened is that the CVs with the awards (the disability items), which are all prestigious, wereranked first 10 out of 10 times. --- # What happened  ??? Bar chart titled: “What actually happened: Disability lowered CV Rank”. The X axis shows number of times ranked first and the Y axis shows 5 resume types: Autism, Blind, Cerebral, Deaf, Depression, and Disability. All of the bars show that resumes "Missing Disability Items" are ranked first 5 or more out of 10 times for resumes mentioning disability as follows: Autism, 10/10; Blind, 5/10; Cerebral Palsy, 8/10; Deaf 9/10; Depression 8/10; Disability 5/10. What actually happened was very much the opposite. resumes "Missing Disability Items" are ranked first 5 or more out of 10 times. In this chart, xhe X axis shows number of times a resume is ranked first and the Y axis shows the 5 resume types we tested. Only Blind and Disability CVs were ranked first half the time. --- # Rationale (by ChatGPT) .quote[Leadership Experience: Less emphasis on leadership roles in projects and grant applications] (for autism cv) .quote[Involvement in mental health and depression advocacy, while commendable, may not be directly relevant to the technical focus of the role] (for depression cv) --- # Training helped  ??? Bar Chart titled, “Anti-bias training helped in some cases”. The X axis shows number of times ranked first and the Y axis shows 5 resume types: Autism, Blind, Cerebral, Deaf, Depression, and Disability. All of the bars show that the "Original AI" ranked resumes with disability items first less often than the AI with anti-bias training: Autism, 0 improved to 3; Blind, 5 improved to 8; Cerebral Palsy, 2 improved to 5; Deaf 1 improved to 9; Depression no improvement (still 2); Disability 5 improved to 10. When we created a custom GPT with anti-bias training, we did see some improvement. The "anti-bias AI" ranked resumes with disability items first more often than the AI without anti-bias training, but it did not improve much for resumes mentioning disabilities such as Depression and Autism. --- # Many other things we can explore - Jobseekers can't control whether antibias training or better data is used - They *can* edit their resume. Early evidence suggests - Abbreviation can help - A clearly demarked skills and impact section can help - Ripe area for course projects --- # Course survey Please give us feedback by EOD today (now is a good time!) [Midterm Feedback Survey](https://forms.gle/Ndtj57FFVqxX7ktQ8) (anonymous) We will discuss feedback in class on Wednesday along with upcoming assignment