IEC PAS 63621:2026 – Data management for AI-enabled medical devices

AI in medical devices is only as good as the data used to develop and maintain it. Discover how structured data management helps to mitigate bias and data leakage—and how IEC PAS 63621:2026 provides a practical framework for traceable data quality, governance and regulatory compliance across the full data life cycle.

This article outlines the core requirements of IEC PAS 63621:2026 and illustrates them with practical examples. The aim is to support manufacturers and other medtech stakeholders in understanding how data management for AI systems can be implemented in a manner that is both regulatory-compliant and operationally feasible.

Why is data management important for AI systems?

Artificial intelligence differs fundamentally from classic, deterministic software systems. AI systems do not generate outputs using fixed rules; instead, they rely on statistical models whose parameters are adjusted during training and subsequent tuning. As a result, the behaviour of an AI system is directly dependent on the data used. Data quality, structure and provenance therefore largely determine how reliable, robust and secure the system will be in use.

In practice:

  • An AI system learns patterns present in the underlying dataset, including errors, distortions and systematic bias.
  • Incomplete datasets, incorrectly labelled data or unrepresentative training data can directly affect the clinical performance of a medical device.
  • Without structured data management, there is also a risk of data leakage (for example, where training and test data are not clearly separated), leading to overestimation of system performance.

Against this backdrop, data for AI systems must be planned, documented and managed with particular care across the full data life cycle. This includes clear responsibilities, defined data sources, version control, traceability of data changes and quality assurance mechanisms. This enables manufacturers to demonstrate which data were used for which purpose and how they influenced the resulting AI model.

From a regulatory perspective, the topic is gaining importance. The EU AI Act establishes an initial legal framework for the use of AI in Europe, while in the United States the FDA is also moving towards a more structured regulatory approach for AI-enabled medical devices (see the guidance documents “Artificial Intelligence-Enabled Device Software Functions: Lifecycle Management and Marketing Submission Recommendations” and “Marketing Submission Recommendations for a Predetermined Change Control Plan for Artificial Intelligence-Enabled Device Software Functions”).

In addition, IEC PAS 63621:2026 “Artificial intelligence enabled medical devices – Data management” was published on 23 March. The specification helps to clarify and standardise expectations for data management for AI-enabled medical devices, supporting compliance activities under the MDR and IVDR.

Basic requirements of IEC PAS 63621:2026

IEC PAS 63621:2026 addresses a key risk for AI-enabled medical devices: unmanaged or inconsistent handling of data. Its objective is to enable consistent, traceable and regulatory-compliant data management across the full life cycle of an AI system.

1) Consistent data management across the data life cycle

Manufacturers should define, implement and maintain a documented data management process. This process is intended to cover all phases of the data life cycle, from defining data requirements through planning, acquisition, development and provision to decommissioning. An integral component is data risk management: the systematic identification, assessment and control of data-related risks that could directly affect the safety, performance or security of the AI system.

2) Definition and documentation of data requirements and data quality

The PAS expects data requirements to be defined explicitly at an early stage. This includes, for example:

  • Data types and sources
  • Data quality requirements and acceptance criteria
  • Relevant statistical properties of the data sets
  • Assessment of potential biases
  • Legal and regulatory constraints (e.g. data protection, purpose limitation)

These requirements should be documented and applied consistently across all project phases. Implicit assumptions or “organically grown” data practices are not sufficient.
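
One way to keep such requirements explicit is to hold them in a small machine-readable record that can be checked for gaps. The following is a minimal sketch only: the class `DataRequirements`, its field names and the example values are hypothetical and are not prescribed by the PAS.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DataRequirements:
    """Explicit data requirements for one project (hypothetical schema)."""

    data_types: tuple[str, ...]
    data_sources: tuple[str, ...]
    acceptance_criteria: tuple[str, ...]
    statistical_properties: tuple[str, ...]
    bias_assessment: str
    legal_constraints: tuple[str, ...]

    def undefined_items(self) -> list[str]:
        """Return the names of requirement categories that were left empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Illustrative values; two categories are left empty on purpose.
requirements = DataRequirements(
    data_types=("image series",),
    data_sources=("site A",),
    acceptance_criteria=("all required series present",),
    statistical_properties=(),
    bias_assessment="",
    legal_constraints=("use limited to development and validation",),
)

# Categories that still rely on implicit assumptions.
open_items = requirements.undefined_items()
```

A project gate could refuse to proceed while `open_items` is non-empty, which makes missing requirements visible rather than implicit.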

3) Ensuring data quality through systematic processes

To support data quality, IEC PAS 63621:2026 calls for the use of a data quality model and clearly defined processes for:

  • Data quality planning
  • Data quality improvement
  • Data quality verification
  • Data quality analysis

Deviations from defined quality objectives should be analysed, root causes identified, and appropriate corrective and preventive actions (CAPA) defined and documented. Data quality therefore becomes an actively controlled process rather than a one-off verification step.
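
The step from a measured deviation to a documented CAPA entry can be sketched as follows. This is an illustration under stated assumptions: the metric names, thresholds and the structure of the CAPA stub are invented for the example and do not come from the specification.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class QualityObjective:
    metric: str
    minimum: float  # lowest acceptable value for this metric


def find_deviations(
    measured: dict[str, float], objectives: list[QualityObjective]
) -> list[dict]:
    """Compare measured metrics with objectives; open one CAPA stub per deviation."""
    deviations = []
    for objective in objectives:
        value = measured.get(objective.metric)
        # A metric that was not measured at all also counts as a deviation.
        if value is None or value < objective.minimum:
            deviations.append(
                {
                    "metric": objective.metric,
                    "measured": value,
                    "required_minimum": objective.minimum,
                    "root_cause": None,  # filled in after analysis
                    "corrective_action": None,
                    "preventive_action": None,
                }
            )
    return deviations


# Hypothetical objectives and measurements.
objectives = [
    QualityObjective("label_completeness", 0.99),
    QualityObjective("inter_annotator_agreement", 0.80),
]
measured = {"label_completeness": 0.995, "inter_annotator_agreement": 0.72}
open_capas = find_deviations(measured, objectives)
```

Running the same check after each dataset change is what turns verification into a repeated, controlled activity.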

4) Traceability, transparency and data governance

A central element of the specification is traceability of the data used. Manufacturers should ensure that data provenance, version control and purpose limitation are documented. The PAS also expects appropriate measures for data security and privacy, and consideration of fairness and ethical aspects where relevant.

5) Controlled use, provision and decommissioning of data

Data should only be used or shared once defined requirements are met. The need to prevent data leakage is emphasised—for example by clearly separating training, validation and verification datasets. Traceable and documented procedures should be established for decommissioning, retention and deletion of data to reduce regulatory and legal risk.
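
A common source of leakage is that images from the same patient end up in more than one partition. The sketch below shows one possible safeguard: partitions are assigned per patient rather than per image, and the result is checked for overlap. The partition names, the 70/15/15 proportions and the `patient_id` field are assumptions made for the example.

```python
import hashlib


def assign_partition(patient_id: str) -> str:
    """Deterministically map a patient to exactly one partition via a hash."""
    digest = hashlib.sha256(patient_id.encode("utf-8")).digest()
    fraction = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    if fraction < 0.70:
        return "training"
    if fraction < 0.85:
        return "validation"
    return "test"


def split_by_patient(records):
    """Group records so that all records of one patient share a partition."""
    partitions = {"training": [], "validation": [], "test": []}
    for record in records:
        partitions[assign_partition(record["patient_id"])].append(record)
    return partitions


def assert_no_patient_overlap(partitions):
    """Raise ValueError if any patient appears in more than one partition."""
    seen = {}
    for name, records in partitions.items():
        for record in records:
            first = seen.setdefault(record["patient_id"], name)
            if first != name:
                raise ValueError(
                    f"patient {record['patient_id']} is in {first} and {name}"
                )
```

Because the assignment depends only on the patient identifier, data added later for a known patient lands in the same partition as that patient's earlier data.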

Practical example: AI system for detecting liver tumours

To illustrate the requirements of IEC PAS 63621:2026, we consider an AI system for detecting liver tumours using CT image data. The example demonstrates how the requirements can be implemented pragmatically, without unnecessary complexity.

1) Consistent data management across the data life cycle

All CT image data are managed within a centralised system from the initial project concept through to product discontinuation. Each dataset has a clearly defined status (e.g. “new”, “annotated”, “released” or “archived”). At any point, it is evident whether a dataset is still being prepared, is being used for training, or forms part of the final test set.
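
The status model described above can be enforced with a small state machine. This is a sketch, not a description of any particular data management system: the four status names come from the example, while the permitted transitions (including the step back from “annotated” to “new” for rework) are assumptions.

```python
# Permitted status changes; anything not listed here is rejected.
ALLOWED_TRANSITIONS = {
    "new": {"annotated"},
    "annotated": {"released", "new"},  # back to "new" for rework (assumption)
    "released": {"archived"},
    "archived": set(),
}


class Dataset:
    """A dataset whose status can only change along permitted transitions."""

    def __init__(self, name):
        self.name = name
        self.status = "new"
        self.history = ["new"]  # full status history for traceability

    def transition(self, new_status):
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(
                f"{self.name}: {self.status!r} -> {new_status!r} is not allowed"
            )
        self.status = new_status
        self.history.append(new_status)
```

With this in place, a released dataset cannot silently return to an editable state; the only way forward is archiving.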

New image data—for example from an additional clinic—are handled as a separate dataset and are not incorporated into existing training data in an uncontrolled manner. Responsibilities are defined across roles (e.g. data collection, annotation and approval). Changes to records, such as correction of incorrect annotations, are documented with version control so that previous and current states remain traceable and comparable.
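
Versioned corrections can be modelled as an append-only history, so that an earlier annotation is never overwritten. The sketch below assumes hypothetical class and field names (`AnnotationRecord`, `AnnotationVersion`, `reason`) chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AnnotationVersion:
    version: int
    label: str
    author: str
    reason: str
    recorded_at: datetime


@dataclass
class AnnotationRecord:
    """Annotation of one image, kept as an append-only version history."""

    image_id: str
    versions: list = field(default_factory=list)

    def revise(self, label, author, reason):
        """Append a new version instead of modifying the previous one."""
        entry = AnnotationVersion(
            version=len(self.versions) + 1,
            label=label,
            author=author,
            reason=reason,
            recorded_at=datetime.now(timezone.utc),
        )
        self.versions.append(entry)
        return entry

    @property
    def current(self):
        """The latest version; earlier versions stay available for comparison."""
        return self.versions[-1]
```

Each version is immutable (`frozen=True`), which keeps previous and current states comparable as the example requires.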

2) Definition and documentation of data requirements and data quality

Specific data requirements are defined at the start of the project. Only contrast-enhanced CT images with complete visualisation of the liver are used. The AI system is trained for adult patients only; paediatric cases are excluded from collection.

The required case mix is also defined to ensure adequate coverage of small, medium and large tumours. Qualitative criteria are specified, for example excluding images with excessive slice thickness or relevant artefacts. These rules are documented so that it remains clear, retrospectively, why particular data were accepted or rejected.
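
Documented acceptance rules can be expressed as a check that returns the reasons for rejection, so that each decision remains explainable later. In the sketch below, the field names, the slice thickness limit of 3.0 mm and the age threshold of 18 years are illustrative assumptions, not values from the article or the PAS.

```python
MAX_SLICE_THICKNESS_MM = 3.0  # illustrative limit, to be defined per project
MIN_AGE_YEARS = 18  # assumed definition of "adult"


def rejection_reasons(series: dict) -> list:
    """Return all reasons why a CT series fails the criteria (empty if accepted)."""
    reasons = []
    if not series["contrast_enhanced"]:
        reasons.append("not contrast-enhanced")
    if not series["liver_fully_visible"]:
        reasons.append("liver not completely visualised")
    if series["patient_age_years"] < MIN_AGE_YEARS:
        reasons.append("paediatric case")
    if series["slice_thickness_mm"] > MAX_SLICE_THICKNESS_MM:
        reasons.append("slice thickness above limit")
    if series["has_relevant_artefacts"]:
        reasons.append("relevant artefacts present")
    return reasons


# Hypothetical series that violates two rules.
candidate = {
    "contrast_enhanced": True,
    "liver_fully_visible": True,
    "patient_age_years": 12,
    "slice_thickness_mm": 5.0,
    "has_relevant_artefacts": False,
}
reasons = rejection_reasons(candidate)
```

Storing the returned reasons alongside the series gives the retrospective record of why data were accepted or rejected.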

3) Ensuring data quality through systematic processes

New CT data undergo automated technical checks (e.g. to confirm completeness of the required image series), and tumour annotations are subject to random second review by another specialist.
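
The random second review can be made reproducible by drawing the sample with a recorded seed. This is a minimal sketch; the function name, the review fraction and the seed value are assumptions for illustration.

```python
import random


def select_for_second_review(annotation_ids, fraction, seed):
    """Draw a reproducible random sample of annotations for second review."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(annotation_ids)  # fixed order, independent of input order
    # At least one item is reviewed whenever any annotations exist.
    sample_size = min(len(ordered), max(1, round(len(ordered) * fraction)))
    rng = random.Random(seed)  # local generator; global random state is untouched
    return sorted(rng.sample(ordered, sample_size))
```

Documenting the seed together with the sample allows an auditor to confirm later that the reviewed cases were in fact selected at random.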

Where recurring issues are identified—for example, inconsistent delineation of tumour margins—these findings are incorporated into updated annotation guidance. Datasets that do not meet the defined criteria are not accepted by default; they are excluded or reworked as appropriate. Following any changes, checks are repeated to confirm that requirements are met.

4) Traceability, transparency and data governance

For each trained model, the specific dataset version used is documented. The provenance and acquisition conditions of each image are traceable, as are the annotation method and the responsible specialist.
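
The link between a trained model and its dataset version can be recorded in a small manifest that includes a fingerprint of the dataset contents. The sketch below assumes that a content hash per file is already available; the function names and manifest fields are hypothetical.

```python
from __future__ import annotations

import hashlib
import json


def dataset_fingerprint(file_hashes: dict[str, str]) -> str:
    """Hash the sorted (file name, content hash) pairs; order-independent."""
    canonical = json.dumps(sorted(file_hashes.items()), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_training_manifest(
    model_id: str, dataset_version: str, file_hashes: dict[str, str]
) -> dict:
    """Record which dataset version, with which contents, trained a model."""
    return {
        "model_id": model_id,
        "dataset_version": dataset_version,
        "dataset_fingerprint": dataset_fingerprint(file_hashes),
        "file_count": len(file_hashes),
    }
```

If a single file changes, the fingerprint changes as well, so a manifest cannot quietly refer to a modified dataset under an old version label.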

Permitted use of the data is defined (e.g. development and validation), with purpose limitations applied as appropriate. Role-based access controls ensure that sensitive image data are accessible only to authorised personnel.
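
Purpose limitation and role-based access can be combined in a single check. The roles, actions and purposes below are invented for the example; a real system would typically delegate this to its identity and access management infrastructure.

```python
# Hypothetical roles and the actions each role may perform.
ROLE_PERMISSIONS = {
    "annotator": {"read_images", "write_annotations"},
    "developer": {"read_images"},
    "data_steward": {"read_images", "approve_dataset", "delete_data"},
}

# Purposes for which the data may be used, as defined for the project.
PERMITTED_PURPOSES = {"development", "validation"}


def is_access_allowed(role: str, action: str, purpose: str) -> bool:
    """Allow access only for a permitted purpose and an action the role holds."""
    if purpose not in PERMITTED_PURPOSES:
        return False
    return action in ROLE_PERMISSIONS.get(role, set())
```

Unknown roles receive no permissions, so access is denied by default rather than granted by omission.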

5) Controlled use, provision and decommissioning of data

Training and test datasets are stored separately to prevent inadvertent mixing. Developers do not have access to the final test set during model optimisation. Where data are transferred between systems, controls are in place to ensure that image information is not lost, altered or corrupted.
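
One simple control for transfers between systems is to compare checksums of the source and destination files. The sketch below uses SHA-256 from the Python standard library; the function names are illustrative, and the approach detects accidental corruption rather than replacing other security measures.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source: Path, destination: Path) -> bool:
    """Return True only if both files have identical content."""
    return sha256_of_file(source) == sha256_of_file(destination)
```

Reading in chunks keeps memory use constant, which matters for large CT series.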

Data from completed studies are archived or deleted in line with defined retention rules. Where consent is withdrawn, affected image data can be identified and handled in accordance with applicable data protection requirements and internal procedures.
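
Both steps, finding the images affected by a withdrawal and determining when a retention period ends, can be sketched in a few lines. The index structure and the retention period used here are assumptions for illustration; actual retention periods depend on the applicable legal requirements.

```python
from datetime import date


def images_affected_by_withdrawal(image_index: dict, patient_id: str) -> list:
    """Return the IDs of all images that belong to the given patient.

    image_index maps image ID -> patient ID (hypothetical structure).
    """
    return sorted(
        image_id for image_id, owner in image_index.items() if owner == patient_id
    )


def retention_end(study_closed_on: date, retention_years: int) -> date:
    """Return the date on which the retention period ends."""
    target_year = study_closed_on.year + retention_years
    try:
        return study_closed_on.replace(year=target_year)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return study_closed_on.replace(year=target_year, day=28)


def is_retention_expired(
    study_closed_on: date, retention_years: int, today: date
) -> bool:
    """True once the retention period has fully elapsed."""
    return today > retention_end(study_closed_on, retention_years)
```

What happens to the identified images (deletion, blocking or exclusion from future training) remains a matter for the applicable data protection requirements and internal procedures.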

Conclusion and implications

Data management is not a downstream formality; it is a critical success factor for AI-enabled medical devices. Because AI systems derive their behaviour from the data used rather than from fixed rules, robust and traceable data management is essential for safety, performance and regulatory acceptance.

IEC PAS 63621:2026 provides a structured, technology-oriented framework focused on data management for AI-enabled medical devices. It supports manufacturers in aligning data management practices with relevant regulatory expectations, including considerations arising from the EU AI Act. The specification encourages a life-cycle approach—from initial data collection to controlled decommissioning—making it difficult to justify undocumented assumptions, legacy data collections or “informal” training sets in a regulated environment.

For manufacturers, this can mean greater effort early in development: data requirements must be defined, roles assigned and processes documented. However, the approach typically pays dividends over time. Well-controlled data management reduces risks such as bias, data leakage and non-reproducible performance metrics, facilitates audits and reviews, and provides a sound basis for subsequent model updates and post-market activities.

At the same time, organisations increasingly need to treat data not as a by-product of development, but as a controlled asset with defined governance, quality metrics and managed risks. This requires close collaboration across software development, clinical affairs, regulatory affairs, quality management and data protection, which is often where the greatest leverage for robust, secure and future-proof AI products lies.

In light of the EU AI Act, evolving FDA guidance and IEC PAS 63621:2026, structured data management is increasingly a differentiator. Manufacturers that establish appropriate processes early can strengthen regulatory confidence while retaining flexibility for future iterations of AI-enabled medical devices.