Why Humans Cannot Test

No, this is not a “click-bait” article or a science-fiction essay, nor an AI-written manifesto. Think of this as a fast-approaching horizon, a wind of structural change in IT, development, and testing. A driverless Uber car picking you up on your next trip. A warning sticker we are all familiar with: “Doors open automatically”. Leave your worries behind, and step into the AI-controlled world!

You and I are humans: we have flaws, and while we are certainly good at what we do, we are simply not as good as the machines. To excuse our weaknesses, the term “human error” arrived long before computers or the rogue spaceship AI of Arthur C. Clarke’s “2001: A Space Odyssey”. Computers and algorithms fail too, mainly because of the imperfections designed and coded into them by humans. Hackers, dolts, or politicians get the blame, and deservedly so. Our logic is imperfect; our choices are based on weak, limited human imagination and on subjective experience, or a lack of it. Between 60 and 80% of all air accidents (*1) happen when pilots or air-traffic controllers trust their instincts, disengaging the autopilot or taking over the navigational controls. The horrific Chernobyl nuclear disaster was the result of a routine “safety test” (*2) gone wrong, with a recovery scenario nowhere to be found by the control-room engineers.


Our brains, with their celebrated neural firepower, are just standalone, fragile, primitive computing systems. Some of us can remember more chess moves than others, but, as a popular myth has it (*3), most humans unlock only 10% of their brain’s capacity, while dolphins outpace us in ultrasonic communication and even the bloody mosquitoes hunt in infrared. We, humans, delegate difficult tasks to machines. We teach the machines. We create algorithms. Meanwhile, we trust our senses to slow down at that curve, or hope that there is no police car ahead. We try our best. We override the rules at will. We add “exceptions”, “exemptions”, “exclusions”, and “earmarks”. Every human success, be it a distinguished career, remarkable wealth, or popular recognition, is often a result of bending the rules, finding that “competitive advantage”, or outright breaking the law. Often, machines come to help: night goggles and radars to see the enemy, supercomputers to outpace everyone at trading securities, GPS satellites and chips to know where everyone is, this browser’s cookies to track who is reading this article.


The fact is, without the machines we are blind, moody, unruly, stinky, filthy animals [yes, Merry Christmas to you too 🙂]. Used for good or for evil, machines are our walking sticks, our hearing aids, our prosthetic limbs, our vocal cords and, most importantly, our shepherds and Big Brothers. They are smarter, faster, and stronger than us.


So, why can’t humans test? Testing the machines’ software and hardware assumes that humans can understand, see, and predict a machine’s logic on par with the code and the circuitry that code runs on. Yes, we may objectively argue that the machines are still inferior to humans in the ways we feel, dream, or create, but the “human brain vs. the machine” competition in processing power was undeniably won by the machines a long time ago. Integrating AI into humans’ vital activities and delegating it the power to decide has relieved human brains of their duties. AI has taken over mining, analyzing, and assembling the data, not only to deliver weather forecasts, car navigation, and health diagnostics, but also to influence consumer behavior, stock-market moves, or election results. Let’s take a look at how this has affected the testing industry and SQA.


Traditionally, test teams were composed of engineers with two types of backgrounds. Manual testers, QA or UAT, were often former business analysts or subject-matter specialists, not savvy enough for deep technical tasks and not slick enough to swim in corporate politics and bureaucracy; their main task was to answer management’s question: “What went wrong?” Automation testers played a somewhat supporting role to the main QA group, with specific technical tasks and a question to answer for the developers: “Where did it go wrong?” For test managers, the ultimate goal was to replace manual testers with automation engineers, to get shorter test cycles and better quality in technical testing. The problem with that approach was not just finding the right staff or absorbing the cost of the automation tools; the main challenge was integrating test automation into the SDLC by demanding that Business and Development produce requirements and code integrated with data at the inputs and outputs of automated tests. There were some “out-of-the-box” bundled solutions, like HP Quality Center, but not only were they closed-source and prohibitively expensive for most companies, they also required a steep learning curve from all business units and a complex conversion from existing practices. As an example: a dozen existing requirements for a login page had to be broken into a hundred precise, type-based (UI, database, functional, etc.) requirements, re-prioritized and re-linked to the corresponding code and test cases to form a valid, dynamically updated traceability matrix, sketched below. So, what does it take to build a test?
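Before answering, here is a minimal Python sketch of that traceability idea; the requirement IDs, test-case names, and the login-page slice itself are invented for illustration, not taken from any real tool:

```python
# Hypothetical login-page slice of a traceability matrix: each business
# requirement is split into type-based requirements and linked to test cases.
matrix = {
    "REQ-LOGIN-01 / UI: username field is present":        ["TC-101", "TC-102"],
    "REQ-LOGIN-02 / Functional: lockout after 3 failures": ["TC-201"],
    "REQ-LOGIN-03 / Database: password stored hashed":     [],  # a coverage gap
}

covered = sum(1 for tests in matrix.values() if tests)
print(f"requirements coverage: {covered}/{len(matrix)}")
for requirement, tests in matrix.items():
    if not tests:
        print("UNCOVERED:", requirement)
```

Even this toy matrix shows why the conversion was painful: every requirement, test case, and link has to be kept current, or the “coverage” number becomes fiction.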


The Software Testing Life Cycle (STLC) has multiple phases (*4); to stay within the scope of this article, let’s group those phases into three major challenges for a human tester: designing a test, executing that test, and analyzing and reporting the results. Let’s start with the design.


Are we intelligent enough to design a test? Let’s see: how many test cases do you need for a simple 5-button lock? The answer is 1082! (*5) As a tester, are you going to write all 1082 of those test cases? The very first question business users ask testers is: “What is our test coverage?” The more important question [which they rarely ask] is: “How do you define test coverage?” In that 5-button-lock example, if you had 40 use cases and 108 test cases covering all of those use cases and then some extra scenarios, would you describe your test coverage as 100% or as 10% (since you are testing only 10% of all 1082 possible combinations)? In software test development, the term “test coverage” is often confused with “requirements coverage”. By implementing requirements in code, developers extend the functionality and open a Pandora’s box of test scenarios. A skilled tester would not just verify that all requirements have been reflected in the code, but would focus on testing code coverage (*6). In other words, as testers, we should test ALL of what we get. Requirements coverage is a necessary testing condition; code coverage is a sufficient one.
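Back to that 5-button lock: here is a minimal Python sketch that reproduces the 1082, assuming the classic Simplex-style rules behind the puzzle (each button used at most once, presses happen as an ordered sequence of simultaneous groups, and not every button has to be used):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def ordered_bell(n: int) -> int:
    # Fubini number: ways to arrange n distinct buttons into an
    # ordered sequence of non-empty, simultaneously-pressed groups.
    if n == 0:
        return 1
    return sum(comb(n, k) * ordered_bell(n - k) for k in range(1, n + 1))

def lock_combinations(buttons: int = 5) -> int:
    # Not every button has to be used: sum over how many buttons
    # the combination actually involves.
    return sum(comb(buttons, k) * ordered_bell(k) for k in range(buttons + 1))

print(lock_combinations(5))  # 1082, counting the "press nothing" combination
```

A dozen lines of code enumerate in milliseconds what no human team would ever write out as 1082 individual test cases.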


One human trick to reduce test combinations is bundling. Think of those buttons as controls on a typical payment webpage: last name, first name, credit card info, and so on. How many test combinations do we have there? That familiar Amazon “one-click” checkout, a smartwatch instant payment, or a fingerprint touch: those are not just security, convenience, or marketing features; they allow companies to streamline their processes by bundling billions of combinations into a functional design. Additionally, using unique, standard properties of the objects helps machine-based test automation parse and execute the code. Bundling, standardizing, simplifying, and so on, as sketched below. Humans can build a very basic test, but can we execute it?
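Here is what that bundling looks like in practice: a minimal sketch with hypothetical field values, where each field’s effectively infinite input space is reduced to a few representative equivalence classes:

```python
from itertools import product

# Hypothetical checkout-form fields, each "bundled" into a few
# representative equivalence classes instead of its raw input space.
fields = {
    "first_name":  ["Ann", "", "x" * 300],            # typical / empty / oversized
    "last_name":   ["Lee", "O'Brien", ""],            # typical / apostrophe / empty
    "card_number": ["4111111111111111", "1234", ""],  # valid / too short / empty
    "expiry":      ["12/30", "01/20", "13/25"],       # future / past / impossible
}

test_cases = list(product(*fields.values()))
print(len(test_cases))  # 3 * 3 * 3 * 3 = 81 cases instead of billions of raw inputs
```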


What you see is what you get? Humans get trapped easily: we buy those fake Rolexes, swipe our credit cards through a scammer’s reader, click on that phishing email link. Our logic is primitive. Beyond visual and functional verification, we really don’t know that there are new controls or scripts on a page, or what has changed in the objects’ behavior; we can’t tell whether a customer’s “I am not a robot” is real. Not only can we not execute any performance, stress, load, or security tests without the machines, we can neither test any significant code change nor assess the impact of the functionality added with that code. A perfect test is a fluid algorithm that penetrates all possible paths, poured over the code by the AI. Humans, by the nature of our curious minds, are all testers, or “hackers”. When we get trapped by clicking a “50 best places to live” link, and then going one by one through those 50+ pages of ads, we realize that changing the page number in the address line would bypass all 47 sequential pages and get us straight to the top 3. We notice a simple pattern, and we exploit it. From the advertiser’s point of view, that is lost business, so they implement a fix to hide the address line from our eyes; however, any decent automation would have noticed the code’s logic behind generating that link. That is why, for example, Google Search can show you news or data from sites while bypassing their paid APIs, or their customer traps, to the extent that many sites use Google Search to index and search their own data! The combinatorics of a human mind are simple and slow; we think and communicate in human languages, while machines and AI exchange code. When executing a test, human testers see only the tip of the iceberg. We are not Neo from The Matrix: we see concrete walls, not the equations, graphs, objects, and fluid lines of code shaping our reality.
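To make the point concrete, here is a minimal sketch, using only Python’s standard html.parser, of the kind of check that spots what our eyes cannot; the page snippets and the set of tracked tags are invented for illustration:

```python
from html.parser import HTMLParser

class ControlCollector(HTMLParser):
    """Records every interactive control or script tag on a page."""
    TRACKED = {"input", "button", "select", "a", "form", "script"}

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            attr_map = dict(attrs)
            self.found.add((tag, attr_map.get("id") or attr_map.get("src") or ""))

def new_controls(old_html: str, new_html: str) -> set:
    old, new = ControlCollector(), ControlCollector()
    old.feed(old_html)
    new.feed(new_html)
    return new.found - old.found  # everything added since the last snapshot

# A tracking script appears; a purely visual check sees an identical page.
print(new_controls(
    '<form><input id="user"></form>',
    '<form><input id="user"><script src="track.js"></script></form>',
))  # {('script', 'track.js')}
```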


“OK AI, show me the results!” From test execution to test analysis: expected results are dynamic! They are not “expected” anymore. While the railroad switchman’s job still has vacancies, Waze, or any navigation app, is a great example of how results are fed into the machine’s self-learning, along with data from user-driven (make a wrong turn) or event-driven (accident ahead) scenarios. The AI learns, recalculates, and instantly re-routes you. At the same time, it is collecting your own data to adjust the stats for future users. The ironic part is when the AI asks a human for feedback: “How would you rate your transaction? Did you enjoy your trip?” Will the AI make adjustments based on your feedback, or based on the data it has collected? Or maybe it is just a lie-detector test of us? In many cases, humans intervening in AI logic backfires. Several years ago, on my trip to LA, Google navigation sent me literally nowhere when I tried to get to the nearest spot to take photos of the Hollywood sign; later I read that, because of complaints from residents, Google Maps was not showing that closest spot on purpose (*7).
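How does a test cope with expected results that keep moving? One hedged sketch: assert invariants instead of frozen values. The route names and the toy road graph here are invented for illustration:

```python
def assert_valid_route(route, origin, destination, road_graph):
    """The exact route may differ on every run; these invariants may not."""
    assert route[0] == origin and route[-1] == destination   # endpoints hold
    assert len(route) == len(set(route))                     # no loops
    for a, b in zip(route, route[1:]):
        assert (a, b) in road_graph or (b, a) in road_graph  # every hop is a real road

# Two different answers, both "correct": a live re-route does not break the test.
roads = {("home", "A"), ("A", "office"), ("home", "B"), ("B", "office")}
assert_valid_route(["home", "A", "office"], "home", "office", roads)
assert_valid_route(["home", "B", "office"], "home", "office", roads)
print("both dynamic results pass")
```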


To sum up: all three phases, designing, executing, and analyzing test results, are handled better by machines than by humans. We are still in control; the AI still needs us to build the automation that pushes those buttons, applies the brakes in a car, flips the power switch on appliances, updates our apps, tames our sarcasm, and asks:

“Hey AI, why can’t humans test?”


Links to materials:


(*1) www.faa.gov
(*2) Chernobyl disaster
(*3) www.scientificamerican.com, the 10% brain-usage myth
(*4) Software testing life cycle phases
(*5) math.bu.edu, 5-button lock combinations
(*6) Code coverage
(*7) “No Hollywood sign for ya”