AI in healthcare: navigating the noise

Demystifying AI in healthcare: the jargon buster

Please note that not all definitions have an associated example because, in many cases, an example would simply repeat the definition.

The basics

Automation/robotic process automation (RPA)

The use of specialised software and technology to carry out repetitive tasks, following a set of instructions and workflows set out by humans. These tasks typically remain consistent over time and include actions like sending appointment reminders, missed appointment notifications, or even receipts after online purchases. If a task is not explicitly outlined in the instructions, the machine cannot perform it.

Example: In healthcare, automation extends to patient monitoring, medication management and administrative tasks in hospitals and clinics.
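As a minimal sketch of the rule-following behaviour described above, the Python snippet below sends reminders only for appointments that match one fixed rule. The function name `reminders_due`, the two-day rule and the appointment list are all hypothetical, chosen purely for illustration.

```python
from datetime import date, timedelta

def reminders_due(appointments, today, days_ahead=2):
    """Return the patients whose appointment is exactly `days_ahead` days away."""
    target = today + timedelta(days=days_ahead)
    return [name for name, when in appointments if when == target]

# Made-up appointment list (not real patient data).
appointments = [
    ("Patient A", date(2025, 3, 12)),
    ("Patient B", date(2025, 3, 14)),
    ("Patient C", date(2025, 3, 12)),
]

print(reminders_due(appointments, today=date(2025, 3, 10)))  # ['Patient A', 'Patient C']
```

The program does nothing beyond the rule it was given: it cannot, for instance, decide to rebook Patient B, because no instruction tells it to.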

Algorithm

A set of well-defined rules or processes used by an AI system to conduct tasks such as discovering new insights, identifying patterns, predicting outcomes and solving problems.

Artificial intelligence (AI)

The capability of a computer system to mimic human cognitive functions such as learning, problem-solving, interpreting visual information, understanding, and responding to spoken or written language. AI uses maths, logic and patterns learned from data to simulate human reasoning and make decisions and recommendations.

Example: In healthcare, AI can be used to enhance diagnostic processes, personalise treatment plans and manage healthcare data efficiently.

Data

Any information that can be processed or analysed to gain insights. Data can take the form of numbers and statistics, text, symbols, or multimedia such as images, videos, sounds and maps.

Example: In the context of healthcare, data can encompass patient records, clinical studies and real-time health monitoring outputs.

Machine learning (ML)

A subset of AI that enables machines to automatically learn and improve from experience without explicit programming. By using set processes to analyse large amounts of data, ML systems can identify patterns, help make decisions, and improve their performance with little to no human intervention.

Example: In healthcare, ML applications include predicting disease progression, analysing medical images and optimising clinical workflows.

Model

A simple representation of an aspect of the real world. It is a programme that has been trained on a set of data to recognise certain patterns or make certain decisions without further human intervention.

Prompt (engineering)

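A prompt is a question, command or statement input into an AI model to initiate a response or action, facilitating interaction between a human and the AI to generate the intended output. Prompt engineering is the practice of crafting and refining prompts so that the model is more likely to produce that output.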

Interpreting models

Accuracy

Accuracy is a metric that measures how often a classification ML model correctly predicts the outcome: it is the fraction of all predictions that the model got right. Accuracy is useful when the classes are balanced (ie the number of instances in each class is roughly the same), but it can be misleading when classes are imbalanced. Ideally, accuracy should be as close to 100 per cent as possible, although 70-90 per cent is often cited as an acceptable range. It is important to remember that, for a balanced two-class problem, 50 per cent accuracy is essentially the same as random classification.
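To illustrate why accuracy can mislead on imbalanced classes, here is a minimal Python sketch. The labels are made up (not real patient data) and the function name `accuracy` is illustrative only.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical screening data: 95 people without the condition (0), 5 with it (1).
actual = [0] * 95 + [1] * 5

# A "model" that always predicts 0 never finds a single case...
always_negative = [0] * 100

# ...yet it still scores 95 per cent, because the classes are imbalanced.
print(accuracy(actual, always_negative))  # 0.95
```

This is why accuracy is usually read alongside measures such as precision, sensitivity and specificity.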

Bias

Bias occurs when an AI system produces results that are systematically prejudiced due to flawed assumptions in the machine learning process. This bias can reflect and perpetuate human biases and social inequalities present in the initial training data, the algorithm itself, or the predictions it generates.

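Example: Pulse oximeters can be less accurate for people with darker skin tones, so AI that relies on their readings can inherit that inaccuracy. Similarly, AI trained mainly on images of lighter skin can under-detect skin cancer in people with darker skin because there is less data for those skin tones.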

Explainability

A measure of how understandable, or explainable, the decisions of an AI system are to humans.

Example: An AI system may predict which patients are most in need of surgery but should be able to explain why it has prioritised patients in a certain way.

Explainable AI (XAI)

Where humans can understand how the results of an AI model were obtained.

Model drift

This is the degradation of a machine learning model's predictive accuracy over time, caused by changes in real-world environments or new input data differing from the data used during training.

Example: A new bus route to a hospital opens, making a model used to predict did-not-attends (missed appointments) less accurate, because attendance patterns now differ from those seen when the model was trained (before the new bus route).
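One simple way to watch for drift is to compare a model's accuracy on recent data with its accuracy when it was first deployed. The sketch below assumes a hypothetical tolerance of five percentage points and uses made-up attendance data; `has_drifted` is an illustrative name, not a standard function.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the actual outcomes."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def has_drifted(baseline_accuracy, recent_accuracy, tolerance=0.05):
    """Flag drift when recent accuracy falls more than `tolerance` below baseline."""
    return (baseline_accuracy - recent_accuracy) > tolerance

# Made-up outcomes (1 = attended, 0 = did not attend) and model predictions.
before_actual    = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
before_predicted = [1, 0, 1, 1, 0, 1, 1, 0, 1, 0]
after_actual     = [1, 1, 1, 1, 0, 1, 1, 1, 1, 1]   # more people now attend
after_predicted  = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # model still expects old pattern

baseline = accuracy(before_actual, before_predicted)
recent = accuracy(after_actual, after_predicted)
print(baseline, recent, has_drifted(baseline, recent))  # 0.9 0.6 True
```

In practice the tolerance, the monitoring window and the response (for example, retraining) are choices for the team running the model.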

Precision

Precision is the proportion of positive class predictions that were actually correct. For instance, if the model predicts 100 instances as positive and 70 of them are truly positive, the precision is 70 per cent. Precision shows how often an ML model is correct when predicting the target class.
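The worked figures in this definition can be reproduced with a few lines of Python. This is a sketch only; the function and parameter names are illustrative.

```python
def precision(true_positives, false_positives):
    """Share of positive predictions that were actually positive."""
    predicted_positive = true_positives + false_positives
    return true_positives / predicted_positive

# 100 instances predicted positive, of which 70 are truly positive.
print(precision(true_positives=70, false_positives=30))  # 0.7
```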

Scalability

ML scalability refers to the capability of a machine learning system to handle increasing amounts of data and computational resources without compromising performance or precision. It involves the ability to process large datasets while still producing accurate results in a reasonable amount of time.

Sensitivity (recall)

Sensitivity is the proportion of actual positive class instances that the model correctly identified. For example, if a dataset has 100 positive instances and the model correctly identifies 60 of them, the recall is 60 per cent. Recall measures the ability of the machine learning model to identify all objects of the target class.

Specificity

Specificity is the proportion of actual negative instances that the model correctly identified as negative (true negatives). In other words, specificity assesses how well the model correctly identifies instances that do not belong to the target class.
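Sensitivity and specificity are usually read together, and both can be derived from the same four counts. The Python sketch below uses made-up labels for ten patients; the helper name `confusion_counts` is an assumption for illustration.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)  # correctly flagged
    tn = sum(a == 0 and p == 0 for a, p in pairs)  # correctly cleared
    fp = sum(a == 0 and p == 1 for a, p in pairs)  # flagged in error
    fn = sum(a == 1 and p == 0 for a, p in pairs)  # missed
    return tp, tn, fp, fn

# Made-up labels for ten patients (1 = has the condition).
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(actual, predicted)
sensitivity = tp / (tp + fn)  # actual positives correctly identified
specificity = tn / (tn + fp)  # actual negatives correctly identified

print(sensitivity)            # 0.75
print(round(specificity, 2))  # 0.67
```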

Training a model

Training a model in machine learning is the process of teaching a machine learning algorithm to make predictions or decisions based on data.

Example: In healthcare, this often involves training with clinical data to improve accuracy in diagnosis and treatment efficacy.

Types of data

Big data

Extremely large and rapidly growing collections of diverse data types, both structured and unstructured, which are so complex that traditional data processing software cannot handle them.

Example: In healthcare, big data can include processing multiple structured and unstructured data sources, such as genetic data, medical history, and lifestyle factors to support personalised medicine.

Structured data

Data that is organised and formatted in a specific way, making it easily readable and understandable by both humans and machines, allowing viewers to immediately recognise the type of data they are looking at.

Example: A patient's electronic health record (EHR) that includes fields for name, age, blood pressure and diagnosis codes is structured data.

Synthetic data

This is artificially generated data produced by computer algorithms or simulations, designed to mimic the patterns and characteristics of real-world data, and often used as an alternative to actual data.

Test data

An unseen dataset used as a final check to confirm that the ML algorithm was trained effectively and to validate that the model can make accurate predictions.

Training data

The data used to train machine learning models. Curated training datasets are fed to machine learning algorithms to teach them how to make predictions or perform a desired task.

Unstructured data

Data that does not have predefined structure or organisation. Unlike structured data, which is organised into neat rows and columns in a database, unstructured data is an unsorted and vast information collection.

Example: In healthcare, unstructured data often includes medical notes, audio recordings of patient interactions and images from various diagnostic procedures.

Validation data

Data not included in the training set of the model, which data scientists use to evaluate (using metrics like accuracy, precision, sensitivity and specificity) how well the model makes predictions on new data that it has not seen while being trained.
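Training, validation and test data are often carved out of a single dataset. The sketch below shows one possible split; the 70/15/15 proportions, the function name `split_dataset` and the placeholder records are assumptions for illustration, not a recommended standard.

```python
import random

def split_dataset(records, train_pct=70, validation_pct=15, seed=0):
    """Shuffle records, then split them into training, validation and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_train = len(shuffled) * train_pct // 100
    n_validation = len(shuffled) * validation_pct // 100
    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return training, validation, test

records = list(range(100))  # stand-ins for 100 anonymised records
training, validation, test = split_dataset(records)
print(len(training), len(validation), len(test))  # 70 15 15
```

Each record lands in exactly one set, so the model is never evaluated on data it was trained on.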

Types of machine learning

(Artificial) neural network

A neural network is a type of machine learning programme that makes decisions similarly to the human brain. It processes data using interconnected units called neurons, which work together to identify patterns, weigh options, arrive at conclusions and learn and improve over time. This method, inspired by how biological neurons function, teaches computers to handle complex problems by mimicking the brain's layered structure.
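A single artificial neuron can be sketched in a few lines of Python: it multiplies each input by a weight, adds the results to a bias, and passes the total through an activation function. The sigmoid activation and the weights below are illustrative choices; a real network connects many such units in layers and learns its weights from data.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of inputs passed through a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # squashes the total into the range 0 to 1

# With all inputs at zero and no bias, the weighted sum is 0 and the output is 0.5.
print(neuron([0.0, 0.0], [0.5, -0.3], 0.0))  # 0.5
```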

Reinforcement machine learning

A subset of machine learning that allows an AI-driven system to learn through trial and error, using feedback from its actions.

Example: This is particularly useful in personalised medicine, where systems learn to optimise treatments based on individual patient responses.
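Trial-and-error learning can be illustrated with a toy simulation. The sketch below uses an "epsilon-greedy" strategy, one simple reinforcement learning technique among many: most of the time it picks the option that has worked best so far, and occasionally it tries another at random. The success rates are invented for the simulation and the function name `run_bandit` is hypothetical; this is not how treatments are or should be chosen for real patients.

```python
import random

def run_bandit(success_rates, trials=2000, epsilon=0.1, seed=0):
    """Learn by trial and error which option succeeds most often."""
    rng = random.Random(seed)
    counts = [0] * len(success_rates)        # times each option was tried
    estimates = [0.0] * len(success_rates)   # average reward seen per option
    for _ in range(trials):
        if rng.random() < epsilon:
            choice = rng.randrange(len(success_rates))  # explore at random
        else:
            choice = estimates.index(max(estimates))    # exploit best so far
        # Feedback from the simulated environment: 1 for success, 0 for failure.
        reward = 1 if rng.random() < success_rates[choice] else 0
        counts[choice] += 1
        # Update the running average for the chosen option.
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
    return estimates, counts

# Two simulated options; the learner is never told these rates directly.
estimates, counts = run_bandit([0.3, 0.8])
best = estimates.index(max(estimates))
print(best)
```

The learner only ever sees the feedback from its own choices, yet its estimates come to favour the option with the higher simulated success rate.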

Semi-supervised machine learning

A type of machine learning that falls in between supervised and unsupervised learning. It is a method that uses a small amount of labelled data and a large amount of unlabelled data to train a model.

Example: This approach is beneficial for patient data where obtaining fully labelled datasets can be costly or impractical.

Supervised machine learning

A category of machine learning where labelled datasets (each input has a known output) are used to train algorithms to predict outcomes or recognise patterns. By studying these datasets, the computer learns to predict the output given new input data. It is like teaching a computer by showing it many examples and letting it figure out how to do things correctly.

Example: In healthcare, this method is extensively used for diagnostics, such as identifying diseases from medical imaging data.
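A very small supervised learner can be sketched in Python using the "nearest neighbour" idea: to label a new input, find the most similar labelled example and copy its label. The readings, labels and function name below are made up for illustration and carry no clinical meaning.

```python
def predict_nearest(training_examples, new_input):
    """Return the label of the training example closest to `new_input`."""
    closest = min(training_examples, key=lambda example: abs(example[0] - new_input))
    return closest[1]

# Made-up labelled data: each input (a reading) has a known output (a label).
training_examples = [
    (4.8, "normal"), (5.1, "normal"), (5.4, "normal"),
    (7.2, "raised"), (7.9, "raised"), (8.5, "raised"),
]

print(predict_nearest(training_examples, 5.0))  # normal
print(predict_nearest(training_examples, 8.0))  # raised
```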

Unsupervised machine learning

A type of machine learning that does not need labelled data or human guidance. It works with unlabelled data to discover patterns and insights within the dataset. The algorithms explore the dataset without explicit instructions to find unknown relationships or insights independently. It is like letting the computer explore the dataset without a teacher, allowing it to uncover patterns and structures by itself.
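As a minimal sketch of pattern discovery without labels, the Python snippet below groups a list of numbers into two clusters using a simplified version of the k-means method. The values are made up, the function name `two_means` is illustrative, and the sketch assumes the list contains at least two different values.

```python
def two_means(values, iterations=10):
    """Group numbers into two clusters by repeatedly updating two centres."""
    low, high = min(values), max(values)  # start the centres at the extremes
    for _ in range(iterations):
        # Assign each value to whichever centre it is closer to.
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each centre to the average of its group.
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high

# Made-up, unlabelled values: nobody has told the program there are two groups.
values = [21, 23, 22, 67, 70, 69, 24, 71]
group_low, group_high = two_means(values)
print(group_low)   # [21, 23, 22, 24]
print(group_high)  # [67, 70, 69, 71]
```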

Types of model

Deep learning model

A form of machine learning that employs artificial neural networks, inspired by the human brain, to learn from vast amounts of data (including labelled and unlabelled, structured and unstructured data). These networks enable the digital systems to learn and make decisions automatically and independently without human intervention.

Example: These models are increasingly used in areas such as pathology, radiology, and genomics.

Foundation model

A machine learning model trained on a vast amount of data so that it can be easily adapted for a wide range of applications. A common type of foundation model is the large language model, which powers chatbots such as ChatGPT.

Human-in-the-loop

A system comprising a human and an AI component, in which the human can intervene in some significant way, such as by training, tuning or testing the system’s algorithm, so that it produces more useful results. It is a way of combining human and machine intelligence, helping to make up for the shortcomings of both.
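One common way to put a human in the loop is to let the model act only when it is confident and to pass uncertain cases to a person. The sketch below assumes a hypothetical confidence score between 0 and 1 and an arbitrary threshold of 0.9; the function name and labels are illustrative only.

```python
def route_prediction(label, confidence, threshold=0.9):
    """Accept confident predictions; send uncertain ones to a human reviewer."""
    if confidence >= threshold:
        return {"label": label, "decided_by": "model"}
    # Below the threshold, the model's output is only a suggestion for a person.
    return {"label": None, "decided_by": "human review", "model_suggestion": label}

print(route_prediction("urgent", 0.97))   # {'label': 'urgent', 'decided_by': 'model'}
print(route_prediction("routine", 0.62))
```

The reviewer's decisions on the uncertain cases can also be fed back as new labelled data, which is one way the human helps to train and tune the system.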

Large language model (LLM)

A machine learning model capable of performing various natural language processing tasks. These tasks include generating and classifying texts and images, answering questions conversationally, translating between languages, predicting, and summarising content. It uses deep learning algorithms and a vast dataset to achieve these capabilities.

Example: In healthcare, these models assist in clinical decision support and patient interaction.

Multimodal model

This is a machine learning model that processes and combines different types of data, such as images, videos and text, to make more accurate determinations, draw insightful conclusions, or make precise predictions about real-world problems.

Example: Multimodal models can combine data from an electronic health record, an image captured by an X-ray and a radiologist's written description of that X-ray to derive conclusions about diagnoses.

Applications of AI

AI hallucination

An AI hallucination occurs when an AI, such as a large language model, produces false or misleading information that seems factual but is actually inaccurate or nonsensical. This can happen when the model identifies patterns that do not exist in real life.

Example: An AI model suggesting the wrong medication for a patient based on hallucinated data.

Ambient AI

Ambient AI is a type of AI that blends into the environment to improve human interaction without being noticeable. It works quietly in the background, continuously collecting data from devices such as sensors to understand and predict human behaviours and to make real-time decisions.

Example: In healthcare, ambient AI can be used to monitor patient conditions in real-time, optimise hospital operations and deliver personalised healthcare services, all while minimising the need for direct human command or intervention. It enhances patient care by predicting needs and intervening proactively, thereby improving patient outcomes and operational efficiency.

Computer vision

A field of AI that trains computers to interpret and understand the visual world. Machines can accurately identify and locate objects and then react to what they ‘see’ using digital images from cameras, videos and deep learning models.

Example: Computer vision is used in tools that automatically screen for diabetic retinopathy from retinal images.

Decision support system

A computer-based system that helps users make decisions by analysing large amounts of data, providing insights and suggesting possible courses of action. It combines data, analytical models and user-friendly software to support problem-solving and decision-making.

Example: In healthcare, this could include a machine learning algorithm that analyses radiology images to provide a diagnosis to support physician decision-making.

Digital twin

A computer model that simulates an object in the real world, such as a biological system. Analysing the model’s output can tell researchers how the physical object will behave, helping them to improve its real-world design and/or functioning.

Generative AI (Gen AI)

Algorithms capable of creating new original content, including text, images, audio, simulations and software code, in response to user prompts or requests.

Example: In the medical field, generative AI is used to simulate patient data, develop virtual models for training, and generate synthetic biological data for research.

Natural language processing (NLP)

A branch of AI focused on enabling computers to comprehend, interpret and manipulate human language, aiming to make them understand and communicate in a similar way to humans.

Example: In healthcare, NLP can be used for extracting relevant structured information from the free text of clinical records.
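The idea of turning free text into structured fields can be sketched with a simple pattern match. Real NLP systems are far more sophisticated than this; the Python snippet below only looks for two numbers separated by a slash, so it would also pick up other text in that form, such as a date written as 12/05. The clinical note is invented and the function name is hypothetical.

```python
import re

def extract_blood_pressure(note):
    """Find a reading written like '142/91' and return it as structured data."""
    match = re.search(r"\b(\d{2,3})\s*/\s*(\d{2,3})\b", note)
    if match is None:
        return None  # no reading found in this note
    return {"systolic": int(match.group(1)), "diastolic": int(match.group(2))}

# Invented free-text note (not a real clinical record).
note = "Patient seen in clinic today. BP 142/91, feels well, no chest pain."
print(extract_blood_pressure(note))  # {'systolic': 142, 'diastolic': 91}
```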

Predictive analytics/predictive modelling

Predictive analytics is the process of using data to forecast future outcomes. The process uses data analysis, machine learning, AI, and statistical models to find patterns that might predict future behaviour.
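As a minimal sketch of forecasting from past data, the Python snippet below fits a straight line to five weeks of made-up figures and extends it to week six. A straight-line (least squares) fit is just one simple statistical model among many, and the function name `fit_line` is illustrative.

```python
def fit_line(xs, ys):
    """Least-squares straight line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up weekly counts that rise by 10 each week.
weeks = [1, 2, 3, 4, 5]
counts = [100, 110, 120, 130, 140]

slope, intercept = fit_line(weeks, counts)
forecast_week_6 = slope * 6 + intercept
print(forecast_week_6)  # 150.0
```

The forecast is only as good as the assumption that the past pattern continues, which is why such models need monitoring for drift.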
