In recent years, the advent of large language models (LLMs) has driven a rapid evolution in how domain knowledge is leveraged to build more sophisticated and intelligent systems. Expectations have risen accordingly, accelerating the shift toward semantic web technologies, which are becoming de facto standards for new businesses and services. Today, data have emerged as an invaluable asset, more so than ever before.
This transformation has largely been fueled by freely available data on the web. Naturally, one might wonder: how much more could be achieved if knowledge creation could also flourish on highly protected, non-public data? Critical infrastructures are prime examples of such siloed domains, and finding a way to enable the responsible use of these protected data spaces is crucial.
The reasoning is straightforward: critical infrastructures comprise massive installations where resource usage often scales enormously. By their nature, they provide extensive benchmarks across various technologies, generating high-quality, unbiased, and representative feedback. This, in turn, can lead to the development of more accurate and better-targeted AI models. However, exposing the inner workings of such infrastructures introduces significant risks. Therefore, a new paradigm must be established—one that facilitates AI model development while ensuring maximum protection at every stage, from data discovery to training and subsequent model usage.
PAROMA-MED: A Secure Model for AI in Sensitive Domains
The PAROMA-MED project has pioneered a comprehensive protection framework in the medical domain, where patient data represent the most sensitive assets. In the project’s solutions, which are currently under evaluation, protective measures extend across the entire data lifecycle and all processing workflows.
A fundamental principle of the project is that data always remain within the security perimeter of their originating domain. This does not diminish their utility, as Data Space methodologies allow for controlled discovery and advertisement within well-defined ecosystems, enforced by rigorous attestation mechanisms. These mechanisms rely on hardware-assisted Trusted Execution Environments (TEEs) to provide verifiable evidence of integrity and compliance with predefined ecosystem requirements. As a result, data retain their intrinsic value while maximizing their utility within a trusted network.
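To illustrate how such attestation could gate participation in the ecosystem, the sketch below shows a verifier admitting a node only if its measured workload is on an allow-list and the evidence is fresh. This is a minimal, hypothetical example: the names (AttestationEvidence, verify_evidence, EXPECTED_MEASUREMENTS) and the HMAC-based signature check are illustrative stand-ins, not the project’s actual attestation format, which in practice builds on the TEE vendor’s quote and certificate chain.

```python
# Hypothetical sketch: admit a node to the data-space ecosystem only if its
# TEE attestation evidence matches the ecosystem's expected measurements.
# All names are illustrative; the HMAC check stands in for a vendor-rooted
# signature scheme.
import hashlib
import hmac
from dataclasses import dataclass

# Allow-list of code measurements (hashes) the ecosystem accepts.
EXPECTED_MEASUREMENTS = {
    "fl-trainer-v1": "9f2c...",  # placeholder digest of an approved training image
}

@dataclass
class AttestationEvidence:
    workload_id: str   # which approved workload the node claims to run
    measurement: str   # hash of the code/config actually loaded in the TEE
    nonce: bytes       # challenge issued by the verifier (freshness)
    signature: bytes   # signature over the evidence (vendor-rooted in practice)

def verify_evidence(evidence: AttestationEvidence,
                    nonce: bytes,
                    verification_key: bytes) -> bool:
    """Return True only if the evidence is fresh, authentic, and the measured
    workload is on the ecosystem's allow-list."""
    expected = EXPECTED_MEASUREMENTS.get(evidence.workload_id)
    if expected is None or evidence.measurement != expected:
        return False  # unknown or tampered workload
    if not hmac.compare_digest(evidence.nonce, nonce):
        return False  # stale or replayed evidence
    mac = hmac.new(verification_key,
                   evidence.measurement.encode() + evidence.nonce,
                   hashlib.sha256).digest()
    return hmac.compare_digest(mac, evidence.signature)
```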
Furthermore, all data processing occurs strictly within the storage domains, eliminating unnecessary data transfers. By adopting “code-to-data” paradigms such as federated learning, data remain undisclosed at all times. Even trained models, once produced, follow the same stringent principles: they are protected, never exposed to uncontrolled access, and safeguarded against inference attacks. During inference tasks, the model is deployed alongside the data it processes, so applications interact with it through APIs rather than integrating it directly into their runtime environment.
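As a minimal sketch of the code-to-data idea, the example below implements one round of federated averaging in which only weight updates leave each domain; the raw data never do. The names (federated_round, train_locally) are illustrative assumptions, not PAROMA-MED’s actual interfaces.

```python
# Minimal federated-averaging sketch (one common "code-to-data" pattern).
# Each data owner exposes only a train_locally() callable that runs inside
# its own security perimeter; raw records are never shipped to the aggregator.
from typing import Callable, List, Sequence, Tuple

LocalTrainer = Callable[[Sequence[float]], Tuple[List[float], int]]

def federated_round(global_weights: Sequence[float],
                    local_trainers: Sequence[LocalTrainer]) -> List[float]:
    """One round: send the global weights to each domain, train locally on
    undisclosed data, and aggregate the returned weights by sample count."""
    updates, sizes = [], []
    for train_locally in local_trainers:
        # Executes inside the data owner's domain (e.g. within a verified TEE).
        new_weights, n_samples = train_locally(global_weights)
        updates.append(new_weights)
        sizes.append(n_samples)
    total = sum(sizes)
    # Sample-weighted average of the locally trained weights (FedAvg-style).
    return [
        sum(w[i] * n for w, n in zip(updates, sizes)) / total
        for i in range(len(global_weights))
    ]
```

In such a scheme the aggregator never sees training records, and, combined with the attestation step sketched above, each local training step can be required to run inside a verified execution environment before its update is accepted.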
Expanding the Model to Critical Infrastructures and Supply Chains
This approach to safeguarding sensitive domains can be extended to critical infrastructure ecosystems, enabling controlled data exposure while ensuring that underlying resources remain secure. It also holds promise for full-scale supply chains, industrial environments, and other sensitive infrastructures, facilitating knowledge creation without compromising security or privacy.
By prioritizing verifiable adherence to policies and regulations, heterogeneous domains can collaborate safely without the need to relax security constraints. End-to-end workflows can be executed with hardware-backed assurances of software integrity, enabling a new era of trusted, privacy-preserving AI applications across high-security domains.
Further information
- PAROMA-MED project website: https://paroma-med.eu/