Metabase is made up of thousands of medical concepts, which are like chapters that we never close. Because these concepts are linked by a complex network of interactions built using AI algorithms, the medical knowledge database is in constant motion. That's why we retest it with every update, using automated regression tests.
The Medical Knowledge Base (Metabase) is like a complex library of medical concepts representing various conditions, symptoms, and their relationships. With over 680 conditions, almost twice as many symptoms, and a complex network of risk factors, we need to apply the highest diligence to the quality of its inputs. We make every effort to avoid even the smallest mistakes, carefully creating a network of medical information that will guide patients to the best possible care.
That’s why we repeatedly conduct automated regression tests (as well as manual tests) that help us to spot all the interactions between the symptoms and conditions in our medical knowledge base and perfect them before they reach our end users.
What is automated regression testing?
Regression testing is a type of software testing that re-runs a series of existing tests to check whether recent changes to an application or system have broken anything that previously worked. It takes its name from a regression: a case in which a new change introduces mistakes or slowdowns into behavior that used to be correct. In IT development, regression tests are carried out after bug fixes, software enhancements, configuration changes, or even the addition of new hardware components. The larger the application is, the more time it takes to verify its correctness, so the process is often automated.
Regression tests are usually used to test the correctness of an application, but they can also be used to track the quality of its outputs. This is how we use them with the medical content we store in Metabase.
Automated regression tests in healthcare solutions
Automated regression testing is a common and effective technique in IT development, so we decided to transfer this good practice to our medical database. The key difference is that instead of a series of functional tests, we use a series of diagnostic case vignettes prepared by our medical team. These diagnostic riddles, known as medical test cases, represent every condition stored in our Medical Knowledge Base. Each one is created as an integral part of a new condition that our doctors add to Metabase, and each new condition expands the capabilities of our pre-diagnosis and triage system. The test cases describe the detailed symptoms of real (but anonymous) patients and help us to verify that these conditions and symptoms are modeled without errors. Each condition has at least a few test cases. We construct patient cases, which later become test cases, on the basis of recognized and respected literature, such as the British Medical Journal, the New England Journal of Medicine, and Mayo Clinic Proceedings, as well as the first-hand clinical experience of our physicians.
"Whenever a doctor creates a new medical concept, for either a symptom or a condition, he creates at least two test cases associated with this symptom or condition. The test case must originate with a proven source so that we are sure it is trustworthy. Then we look carefully to see whether the newly created concept passes those test cases and how we should improve it," says Mateusz Palczewski, Clinical Validation Officer at Infermedica.
There is no room for compromise when patients’ safety is at stake. That is why before we publish any medical content, including test cases, it is reviewed in peer-to-peer consultations and evaluated by fellow editors. Whenever we find room for improvement, we take a step back, analyze it again, and perfect it. In addition to checking whether our medical concepts are in agreement with test cases, we also examine whether they yield appropriate triage recommendations.
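To make the idea of a test case more concrete, here is a minimal Python sketch of how a single case vignette could be represented and checked. It is an illustration only, not the Metabase implementation: the field names, the `top_n` threshold, the triage labels, and the `toy_engine` function are all assumptions made for this example, and the symptoms and conditions are placeholders rather than medical content.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple


@dataclass(frozen=True)
class CaseVignette:
    """One diagnostic test case (all field names are hypothetical)."""
    case_id: str
    source: str                   # where the case comes from, e.g. a journal article
    symptoms: FrozenSet[str]      # symptoms reported in the vignette
    expected_condition: str       # diagnosis given in the source
    expected_triage: str          # triage level the reviewers consider appropriate


@dataclass(frozen=True)
class EngineOutput:
    ranked_conditions: Tuple[str, ...]   # most likely condition first
    triage: str


Engine = Callable[[FrozenSet[str]], EngineOutput]


def check_case(case: CaseVignette, engine: Engine, top_n: int = 3) -> bool:
    """Pass only if the expected condition is in the top_n and the triage matches."""
    output = engine(case.symptoms)
    condition_ok = case.expected_condition in output.ranked_conditions[:top_n]
    triage_ok = output.triage == case.expected_triage
    return condition_ok and triage_ok


def toy_engine(symptoms: FrozenSet[str]) -> EngineOutput:
    """Placeholder for a real reasoning engine; the rule below is made up."""
    if {"symptom_1", "symptom_2"} <= symptoms:
        return EngineOutput(("condition_a", "condition_b"), "consultation")
    return EngineOutput(("condition_b",), "self_care")


case = CaseVignette(
    case_id="case-001",
    source="placeholder citation",
    symptoms=frozenset({"symptom_1", "symptom_2", "symptom_3"}),
    expected_condition="condition_a",
    expected_triage="consultation",
)
print(check_case(case, toy_engine))
```

Checking the condition and the triage level in the same function mirrors the two questions described above: does the concept agree with the test case, and does it lead to an appropriate recommendation?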
Once a new medical concept (along with corresponding test cases) passes expert and technical reviews in Metabase, we add it to our products and applications (you can learn more about our release cycle in the content development process description). This is when our medical content validation cycle closes. Automated regression testing is used to verify whether the new concepts are consistent with all conditions and test cases already published in Metabase. Once they pass, the new concepts and their test cases become part of the regression testing process themselves, so every medical concept added in the future will be tested for consistency with the concepts and test cases added in the past.
Automated regression tests are run with every single update of Metabase, sometimes as often as five to ten times a day. This testing process was created by our medical and QA teams, and it is now automated within the Metabase system.
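The loop described above, in which every update is checked against all previously published test cases, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Metabase works internally: the knowledge base is reduced to a dictionary of placeholder conditions and symptoms, `make_engine` uses a naive overlap count instead of a real reasoning algorithm, and all names are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, NamedTuple, Set


class Case(NamedTuple):
    case_id: str
    symptoms: FrozenSet[str]
    expected_condition: str


def make_engine(kb: Dict[str, Set[str]]) -> Callable[[FrozenSet[str]], str]:
    """Toy engine: pick the condition sharing the most symptoms with the case."""
    def engine(symptoms: FrozenSet[str]) -> str:
        # Ties are broken alphabetically because max() keeps the first maximum.
        return max(sorted(kb), key=lambda name: len(kb[name] & symptoms))
    return engine


def run_regression(cases: List[Case], engine: Callable[[FrozenSet[str]], str]) -> List[str]:
    """Return the ids of all cases whose expected condition is no longer returned."""
    return [c.case_id for c in cases if engine(c.symptoms) != c.expected_condition]


def can_publish(cases: List[Case], engine: Callable[[FrozenSet[str]], str]) -> bool:
    """An update is acceptable only if no published case fails."""
    return not run_regression(cases, engine)


# Version 1 of the toy knowledge base and its published test cases.
kb_v1 = {"condition_a": {"s1", "s2"}, "condition_b": {"s3"}}
cases = [
    Case("case-1", frozenset({"s1", "s2"}), "condition_a"),
    Case("case-2", frozenset({"s3"}), "condition_b"),
]

# Version 2 adds condition_c with its own test case, and narrows condition_a.
kb_v2 = {"condition_a": {"s1"}, "condition_b": {"s3"}, "condition_c": {"s1", "s2", "s4"}}
cases_v2 = cases + [Case("case-3", frozenset({"s1", "s2", "s4"}), "condition_c")]

print(run_regression(cases, make_engine(kb_v1)))      # no failures in version 1
print(run_regression(cases_v2, make_engine(kb_v2)))   # case-1 now fails
print(can_publish(cases_v2, make_engine(kb_v2)))
```

In this toy run the new concept passes its own test case, yet the update breaks an older one. That is exactly the kind of side effect a regression run is meant to surface before the content is published.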
Regression tests as part of the quality assurance process
Over time, regression testing proved to be the way to test every single element of the Infermedica database in a well-organized and time-optimized manner. It also became an integral part of our monitoring and quality assurance processes. We know that our medical concepts work seamlessly because we test them using real-patient clinical cases. In our experience, this is the most objective approach to measuring the accuracy of pre-diagnosis or triage solutions.
What is more, automated regression testing has proven to be a very powerful framework to check whether the system is improving over time. Now we use it before every release or update of the medical content base and make sure that the tests yield the highest possible results.
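One simple way to see whether results improve over time is to compare the outcome of the current regression run with the previous one. The Python sketch below assumes that each run is stored as a mapping from a test case id to a pass/fail flag; the function names and the report format are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Union


def pass_rate(results: Dict[str, bool]) -> float:
    """Share of test cases that passed in one run (0.0 for an empty run)."""
    return sum(results.values()) / len(results) if results else 0.0


def compare_runs(
    previous: Dict[str, bool], current: Dict[str, bool]
) -> Dict[str, Union[List[str], float]]:
    """List the cases that changed status between two runs and the rate change."""
    shared = previous.keys() & current.keys()
    regressed = sorted(c for c in shared if previous[c] and not current[c])
    fixed = sorted(c for c in shared if not previous[c] and current[c])
    return {
        "regressed": regressed,   # passed before, fail now
        "fixed": fixed,           # failed before, pass now
        "rate_change": pass_rate(current) - pass_rate(previous),
    }


# Made-up results for two consecutive runs.
previous_run = {"case-1": True, "case-2": False, "case-3": True, "case-4": True}
current_run = {"case-1": True, "case-2": True, "case-3": True, "case-4": True}

print(pass_rate(previous_run), pass_rate(current_run))
print(compare_runs(previous_run, current_run))
```

Reporting the regressed cases separately from the overall rate matters: a release can raise the average while still breaking an individual case that used to pass.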
Complex testing methods are one of several elements, alongside trusted medical knowledge sources and content adapted to patients, that make Metabase trustworthy and reliable. Nevertheless, we do not stop there. Even when our medical concepts or test cases have already been published in Metabase, we continue observing them, as do our partners, and keep looking for areas to improve, whether in construction or phrasing, as these concepts must be understandable to patients with no medical knowledge.
With automated regression testing, we can repeatedly analyze our medical content and ensure its outstanding quality. It is also an efficient way to ensure the stability of the system.
We continually test our Medical Knowledge Base and reasoning algorithm against new cases reported in journals, including complex Clinicopathological Conference (CPC) cases prepared by Massachusetts General Hospital and published in the New England Journal of Medicine, to check that it performs well across the widest possible range of clinical presentations.