By Alvin Lee
SMU Office of Research & Tech Transfer – Imagine this: You are a paramedic attending to an emergency call from a patient with infrequently seen symptoms. Because of the rare manifestation, precious time is lost while determining a potentially life-threatening malady.
Pradeep Varakantham, Associate Professor of Computer Science at SMU’s School of Computing and Information Systems (SCIS), is currently working on a project to help train those in “safety-critical applications” to handle both expected and unexpected situations.
“If let's say someone has an illness related to the kidney, the way it manifests and the kind of symptoms you see might differ from person to person,” he explains. “What the system [I am working on] will do… is analyse all the past occurrences of kidney-related illness, and then it will try to mix and match them so that a feasible scenario is generated during training.”
What is an ATP?
Professor Varakantham is the Principal Investigator for the four-year project “Trust to Train and Train to Trust: Agent Training Programs for Safety-Critical Environments” funded by the National Research Foundation under its AI Singapore Programme. The theme of this project is “AI systems help by training non-experts”.
In partnership with the Singapore Civil Defence Force (SCDF), the project seeks to develop an Explainable and tRustworthy (ExpeRt) AI or Agent Training Program (ATP) that will be used in training simulators. Existing training simulators, noted Professor Varakantham:
“…employs few fixed (physical or virtual) scenarios or idealised simulators (consider expected scenarios). Such training can introduce bias in trainees (e.g., due to only encountering those few scenarios in training), does not guarantee learning of safe behaviours etc., resulting in trust deficit among organisations employing training simulators as well as trainees.”
With the end product, whose format is yet to be decided, Professor Varakantham aims to cross-train existing SCDF staff who might not be performing frontline paramedic duties all the time, as well as paramedics who do. Reinforcement learning will be a central feature of the ATP.
“The Agent Training Program employs reinforcement learning basically to generate scenarios,” Professor Varakantham elaborates. “Each scenario that it generates is like one action it takes. If it generates a scenario and the trainee has learned a skill for that scenario, then that's plus one for the ATP. But [if] it generates a scenario and the trainee did not learn, then it's zero.
“If the person unlearned the skill in the sense that they were unable to execute the skill back again at the same level that was learned before, then it's a negative. In that sense, what this ATP will do is generate scenarios in such a way that the learning is increased. The number of skills that the person is learning keeps going up.”
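The scoring scheme Professor Varakantham describes can be illustrated with a minimal Python sketch. It is not the project's implementation: the names `Outcome`, `scenario_reward` and `total_reward` are hypothetical, and the value of -1 for an unlearned skill is an assumption, since the interview only says that outcome is "a negative".

```python
from enum import Enum


class Outcome(Enum):
    """What happened to the trainee's skill after one generated scenario."""
    LEARNED = "learned"        # trainee picked up the skill
    NO_CHANGE = "no_change"    # trainee did not learn
    UNLEARNED = "unlearned"    # trainee can no longer perform at the earlier level


# Reward the ATP receives per scenario. +1 and 0 follow the interview;
# -1 is an assumed value for the outcome described only as "a negative".
REWARD = {
    Outcome.LEARNED: 1,
    Outcome.NO_CHANGE: 0,
    Outcome.UNLEARNED: -1,
}


def scenario_reward(outcome):
    """Return the ATP's reward for a single generated scenario."""
    return REWARD[outcome]


def total_reward(outcomes):
    """Sum the ATP's rewards over a sequence of generated scenarios."""
    return sum(scenario_reward(o) for o in outcomes)
```

Under these assumptions, a sequence of scenarios in which the trainee learns twice, shows no change once and unlearns once gives the ATP a total of 1. A reinforcement learning agent would then favour scenario sequences that push this total up.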
Describing each training session as an ‘episode’, Professor Varakantham explains that trainees score points for each successful completion of an episode, which requires demonstration of mastery of skills. He expects trainees who use the ATP to complete the training in fewer episodes than those who do not.
“We want to build simulators… automatically generating different scenarios based on a log of all the data of how different scenarios have happened in the past,” he tells the Office of Research and Tech Transfer. “The ATP will generate scenarios in such a way that the person playing it is able to learn faster in a safe, fair, and robust way.
“We want to compare humans who play this game without the help of our Agent Training Program, versus those who play with the Agent Training Program. The aim is for the learning to be faster and safe with ATPs.”
Explainable and tRustworthy
The ability to virtually recreate scenarios that cannot be recreated in real life is a big improvement over existing training programmes. While that is achieved by the ‘ATP’ part of the project, there are also the ‘ExpeRt’, or Explainable and tRustworthy, aspects that must be considered.
“Current AI systems are [built on] neural networks – they take something as input, there's a lot of processing and then it generates an output,” Professor Varakantham explains. “The accuracy of predicting something, classifying something, is amazingly good. However, when you want it to explain, ‘Why did you identify an image as being that of a dog?’ there isn't much to go by. That's the explainability part: how do you explain the decisions that the system has taken? That's explainable AI.”
Trustworthiness, on the other hand, concerns biased data or simulations that can affect the output. Professor Varakantham cited the famous example of U.S. autonomous vehicles’ collision avoidance systems being less able to detect individuals with darker skin tones.
“Developing [trustworthy] systems implies that we somehow account for these factors present in the data and we make sure that the algorithmic methods that we are using are able to counteract the impact of whatever bias or unfairness that is there in the data. That's essentially the trustworthiness part,” he says.
In addition, the project involves the Maritime and Port Authority of Singapore (MPA), which has been working with Co-PI Akshat Kumar, Associate Professor at SCIS. The ATP in this case will help ship pilots “get acquainted with some of the tricky [situations] they might face, such as when they enter Singapore territorial waters [at a certain speed] there's a chance of collision”.
Professor Varakantham is also the Lab Director of the “Collaborative, Robust and Explainable AI-based Decision-making Lab” (CARE.AI Lab) which was created in July 2021. The lab has also secured an AISG grant with the National University of Singapore (NUS) on how AI systems help by providing situational assistance.
“It's like if you're trying to assemble IKEA furniture, then there's usually a lot of instructions,” he says, adding that the project aims “to coordinate multiple AI agents to help multiple humans that are working together on a complex task”. “Most of the time we forget which one should be placed before which, but let's say we had these glasses which are observing what you're doing and then provide assistance on the fly.
“For humans to trust what the agents are saying, we have to trust what they are saying. How can we achieve that trustworthiness and how can humans understand what the agents are saying? All that is part of what we aim to find out through the lab’s research.”