Internet Security Auditors Blog

Tokenization in environments with PCI DSS compliance requirements

Written by Javier Leonardo Robles Pallares | Dec 11, 2025 3:09:32 PM
When a company undertakes a PCI DSS compliance project, the first task is to define the scope of applicability—in the words of the standard, the Cardholder Data Environment (CDE). The best way to reduce the PCI DSS scope for systems that process, store, or transmit card data is quite simple: do not use card data. However, this is not always possible.

If it becomes necessary to store the Primary Account Number (PAN), it must be protected wherever it is stored (PCI DSS 4.0.1 requirement 3.5.1). PCI DSS establishes four methods to ensure PAN data remains unreadable:

- One-way hashes based on strong cryptography applied to the full PAN.
- Truncation.
- Index tokens.
- Strong cryptography with associated key management processes and procedures.

Starting from a sample card number (the examples below use the well-known Visa test PAN 4111 1111 1111 1111), we will review each method and the result of applying it.

One-way hashes based on strong cryptography

These are cryptographic functions used to protect sensitive data, such as credit card data, by converting it into a fixed-length string that cannot feasibly be reversed to its original form; in other words, the original value cannot be obtained from the hash. Any small change in the input value generates a completely different hash.

To choose appropriate hashing algorithms for these implementations, it is important to rely on industry-accepted standards such as NIST SP 800-107 (Revision 1).

Example of hashing (one-way hash):
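
A minimal Python sketch, assuming the test PAN above and a key issued by a key-management system. Because PCI DSS 4.x requires hashes of the PAN to be keyed cryptographic hashes (requirement 3.5.1.1), the sketch uses HMAC-SHA-256 rather than a bare digest:

```python
import hashlib
import hmac
import os

# Well-known test PAN, not a real card number.
pan = "4111111111111111"

# Secret key for the keyed hash. In production it would be issued and
# protected by a key-management system, never hard-coded or regenerated
# per run as it is here.
key = os.urandom(32)

# HMAC-SHA-256 returns a fixed-length digest that cannot feasibly be
# reversed to recover the PAN.
digest = hmac.new(key, pan.encode(), hashlib.sha256).hexdigest()
print(digest)  # changes completely if a single digit of the PAN changes
```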

Truncation

This method removes or replaces part of the card number so that the full number is not stored and therefore cannot be maliciously used if captured. It is important to note that a hash cannot be used to stand in for the truncated segment of the PAN. Unlike strong cryptography, which hides the PAN but allows its recovery with the key, or tokenization, which replaces it with a value mapped to the full PAN elsewhere, truncation permanently removes segments of the PAN in storage, making recovery impossible.

Example of truncation:
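
A sketch of one truncation format accepted by the PCI SSC (first six and last four digits retained for a 16-digit PAN); the removed middle digits are discarded permanently, not merely masked for display:

```python
def truncate_pan(pan: str) -> str:
    """Keep the first six and last four digits; the removed middle
    digits are never stored, so the full PAN is unrecoverable."""
    return pan[:6] + "X" * (len(pan) - 10) + pan[-4:]

print(truncate_pan("4111111111111111"))  # 411111XXXXXX1111
```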

Index Tokens (Tokenization)

Tokenization is a security technique that replaces sensitive card information (especially the PAN) with a non-sensitive value called a token, which has no intrinsic value and no mathematical relationship to the original data, rendering it useless to an attacker in the event of a breach.

Example of tokenization:
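
A minimal sketch of index tokens: the token is random, so it carries no mathematical relationship to the PAN, and the mapping lives only in a lookup table (here an in-memory dict standing in for a hardened vault):

```python
import secrets

# In-memory stand-in for a token vault; a real vault would hold the
# PAN encrypted in a hardened, access-controlled database.
vault: dict[str, str] = {}

def tokenize(pan: str) -> str:
    """Replace the PAN with a random token that has no intrinsic value."""
    token = secrets.token_hex(16)
    vault[token] = pan
    return token

def detokenize(token: str) -> str:
    """Recover the PAN only via the vault lookup."""
    return vault[token]

token = tokenize("4111111111111111")
print(token)              # e.g. '3f9c...'; useless to an attacker by itself
print(detokenize(token))  # 4111111111111111
```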

Strong cryptography with associated key management processes and procedures

This method consists of implementing encryption with modern, secure algorithms, sufficiently long keys, and a strict and secure process for managing those keys, making card data useless to unauthorized parties. This is essential for protecting cardholders from potential fraud.

An industry-accepted guide for choosing cryptographic algorithms and mechanisms is NIST SP 800-175B (Revision 1). It is important to remember that a strong algorithm is useless if the keys are weak or compromised. If more detail is required on how to manage cryptographic keys, NIST SP 800-57 Part 1 (Revision 5) provides the necessary information.

Example of robust encryption:
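
A minimal sketch using AES-256-GCM from the third-party cryptography package. The key is generated in place purely for illustration; in practice it would be created and guarded under the key-management processes described in NIST SP 800-57:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)  # managed by a KMS/HSM in practice
aesgcm = AESGCM(key)

pan = b"4111111111111111"
nonce = os.urandom(12)  # must be unique for every encryption under a key

ciphertext = aesgcm.encrypt(nonce, pan, None)
print(ciphertext.hex())

# Unlike hashing or truncation, the operation is reversible with the key.
print(aesgcm.decrypt(nonce, ciphertext, None).decode())  # 4111111111111111
```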

If there is a business need to store the full card data for later use—for example, recurring payments, subscriptions, integration with multiple payment gateways, or reporting to oversight entities—then encryption or tokenization must be chosen. If PCI DSS provides four options, why rule out hashing and truncation? The answer is simple: those two methods protect the data irreversibly. The full PAN can never be recovered from a hash or a truncated card number, so the business need would not be met.

Returning to the need to store full card data for future processes, the decision between strong cryptography or tokenization remains.

Which approach should be selected? Which is better?

Strong cryptography may be beneficial when:

▪️The applications have very high transactional volume; local encryption/decryption may be more efficient than calls to a tokenization service.
▪️Direct control over data and security processes is required; however, this demands mature security teams and processes.
▪️Integration with legacy systems is required and these systems adapt more easily to encryption than to the implementation of tokenization service calls.

Tokenization generally offers greater benefits in terms of:

▪️Reduction of PCI DSS scope: Tokenization is superior when the main objective is minimizing the compliance perimeter. Systems that handle only tokens fall out of PCI DSS scope, provided they reside outside the CDE network.
▪️Risk mitigation: In the event of a security breach, attackers obtain only tokens with no commercial value outside the system.
▪️Multi-system referencing: When multiple applications or systems need to reference the same card data, tokenization allows them to work with tokens while the sensitive data remains centralized in one secure system.
▪️Compliance reduction via service providers: When a PCI DSS-certified tokenization service provider handles card data capture and associated processes.

Tokenization

Now that we have reviewed the benefits of tokenization in environments with card data, we will examine different implementation models and their main characteristics.

Vault-Based Tokenization

Vault-based tokenization refers to storing sensitive data in a secure vault, where it is encrypted and replaced with a token. This token acts as a reference to the original data but does not reveal any sensitive information. The vault becomes the central point for managing and controlling access to sensitive data. This is the most traditional method. In this system, the PAN is stored in a highly secure database called a "token vault".

Operation:

When the real card number is received, the tokenization service generates a unique token (usually alphanumeric). The real card data is stored encrypted using strong cryptography inside the vault along with its corresponding token. Only the token circulates through the organization’s systems.

To process payments, the vault is queried to retrieve the real card data.
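
Combining the two earlier sketches gives a minimal picture of this flow: the PAN is stored encrypted under strong cryptography inside the vault, and only the random token circulates through the organization's systems:

```python
import os
import secrets
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # vault master key, ideally in an HSM
aesgcm = AESGCM(key)
vault: dict[str, tuple[bytes, bytes]] = {}  # token -> (nonce, encrypted PAN)

def tokenize(pan: str) -> str:
    token = secrets.token_urlsafe(16)       # the only value other systems see
    nonce = os.urandom(12)
    vault[token] = (nonce, aesgcm.encrypt(nonce, pan.encode(), None))
    return token

def detokenize(token: str) -> str:
    """Called only when a payment must actually be processed."""
    nonce, ciphertext = vault[token]
    return aesgcm.decrypt(nonce, ciphertext, None).decode()
```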

Advantages:

▪️Maximum security by centralizing sensitive data.
▪️Full control over the tokenization process.
▪️Simplified PCI DSS compliance.
▪️Tokens are reversible when necessary.

Disadvantages:

▪️Requires robust and often costly infrastructure (hardware and software).
▪️The vault becomes a single point of failure.
▪️Higher operational complexity.
▪️Full responsibility for data security rests with the organization.

Vaultless Tokenization

Also known as cryptographic tokenization, this approach does not require a token vault database. Instead, secure cryptographic devices are used to replace sensitive data with a unique token. This method is often considered more efficient and secure due to the absence of a vault that could be compromised. Sensitive data is never stored; only the unique token is used for identification or recovery purposes, eliminating the risk of data breaches related to stored data.


Operation:

A cryptographic algorithm (such as AES) with a secret key is used. The card number is encrypted with this key to generate the token. Since no database storing token-to-PAN mappings exists, retrieving the original PAN requires decrypting the token with the same key. This highlights the critical role of cryptographic devices in vaultless tokenization.
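
A minimal sketch of that idea using deterministic authenticated encryption (AES-SIV from the cryptography package): the same PAN and key always produce the same token, so no mapping database is needed. Note that this simple form does not preserve the PAN's format; that is what FPE, discussed next, adds:

```python
from cryptography.hazmat.primitives.ciphers.aead import AESSIV

# Deterministic: identical inputs yield identical tokens, which is what
# removes the need for a token-to-PAN mapping database.
key = AESSIV.generate_key(bit_length=512)  # held in an HSM/KMS in practice
siv = AESSIV(key)

def tokenize(pan: str) -> str:
    return siv.encrypt(pan.encode(), None).hex()

def detokenize(token: str) -> str:
    # Recovery is simply decryption with the same key; there is no vault.
    return siv.decrypt(bytes.fromhex(token), None).decode()

token = tokenize("4111111111111111")
print(token)              # same token for the same PAN and key every time
print(detokenize(token))  # 4111111111111111
```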

Advantages:

▪️Preserves data format; suitable for systems needing the original data structure (e.g., databases or legacy systems).
▪️No database to protect or that could be compromised.
▪️Less infrastructure required.
▪️Easier scalability.
▪️Lower operational cost.

Disadvantages:

▪️Security depends entirely on protecting cryptographic keys.
▪️If the key is compromised, all tokens are exposed.
▪️Less flexibility in token format.
▪️Key rotation may be more complex.

Vaultless Tokenization in Cloud Environments

Vaultless tokenization commonly relies on Format-Preserving Encryption (FPE), a specialized encryption method that maintains the original data format. In a tokenization process, FPE provides:

▪️Format preservation: Keeps the length and structure of the original data (e.g., a 16-digit PAN remains 16 digits).
▪️Reversibility: Unlike vault-based tokenization, FPE allows deterministic reversal using a cryptographic key.
▪️Security: High cryptographic security while maintaining data usability.

Common FPE standards:
▪️FF1 (NIST SP 800-38G)
▪️FF3-1 (the revised version of FF3)
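
As an illustration of format preservation, the sketch below uses the third-party pyffx package. One caveat: pyffx implements a simplified FFX construction, not a validated FF1 or FF3-1 implementation, so it serves only to show the behavior, not as production guidance:

```python
import pyffx  # pip install pyffx; simplified FFX, NOT validated FF1/FF3-1

# Digit-preserving cipher over 16-character strings.
fpe = pyffx.String(b"secret-key", alphabet="0123456789", length=16)

token = fpe.encrypt("4111111111111111")
print(token)               # another 16-digit string: length and format preserved
print(fpe.decrypt(token))  # 4111111111111111, deterministic reversal with the key
```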

For organizations using public cloud infrastructures, each provider offers specific key-management and encryption services to facilitate vaultless tokenization.

The following table summarizes cloud-provider services relevant to vaultless tokenization:

| Component | AWS | Microsoft Azure | Google Cloud Platform |
|---|---|---|---|
| Cryptographic Key Management | AWS Key Management Service (KMS) | Azure Key Vault | Cloud Key Management Service (Cloud KMS) |
| Secrets Management | AWS Secrets Manager | Azure Key Vault | Secret Manager |
| Parameter Storage | AWS Systems Manager Parameter Store | Azure App Configuration | Cloud Runtime Configuration API |
| Serverless Functions | AWS Lambda | Azure Functions | Cloud Functions |
| API Management | Amazon API Gateway | Azure API Management | Cloud Endpoints |
| Authentication and Authorization | AWS IAM | Microsoft Entra ID | Identity and Access Management (IAM) |
| Auditing and Logging | AWS CloudTrail | Azure Monitor / Azure Audit Logs | Cloud Audit Logs |
| Monitoring and Metrics | Amazon CloudWatch | Azure Monitor | Cloud Monitoring |
| Network and Security | AWS VPC / Security Groups | Azure Virtual Network / NSG | VPC / Firewall Rules |
| Database (if required) | Amazon DynamoDB / RDS | Azure Cosmos DB / SQL Database | Cloud Firestore / Cloud SQL |


Roles of the components in a vaultless tokenization environment:

▪️Cryptographic Key Management: Generates, stores, and rotates the master keys used by the tokenization algorithms.
▪️Secrets Management: Stores credentials, access tokens, and other secrets needed for service communication.
▪️Parameter Storage: Saves application configurations such as tokenization algorithms, token format policies, and operational parameters.
▪️Serverless Functions: Execute the tokenization and detokenization logic.
▪️API Management: Exposes secure tokenization/detokenization endpoints.
▪️Authentication and Authorization: Controls access to the tokenization services.
▪️Auditing and Logging: Records all tokenization operations for regulatory compliance.
▪️Monitoring and Metrics: Tracks performance, availability, and system health.
▪️Networking and Security: Isolates tokenization traffic and enforces network-level controls.
▪️Database (optional): Used minimally for logs or metadata, never for token-to-value mappings.

Each component works together to create a secure ecosystem in which sensitive data can be tokenized without maintaining mapping databases, using deterministic algorithms based on cryptographic keys.
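
To make that interplay concrete, here is a hypothetical sketch of the serverless piece on AWS (the secret name pan-tokenization-key and the event shape are invented for illustration): the function pulls the tokenization key from Secrets Manager and applies the same deterministic AES-SIV scheme shown earlier, while API Gateway, IAM, and CloudTrail provide the exposure, access-control, and audit layers around it:

```python
import base64

import boto3
from cryptography.hazmat.primitives.ciphers.aead import AESSIV

secrets_client = boto3.client("secretsmanager")

def _tokenization_key() -> bytes:
    # Hypothetical secret holding a base64-encoded AES-SIV key; the key
    # never appears in code, configuration files, or logs.
    response = secrets_client.get_secret_value(SecretId="pan-tokenization-key")
    return base64.b64decode(response["SecretString"])

def handler(event, context):
    """Tokenization endpoint exposed through API Gateway; IAM restricts
    who may invoke it and CloudTrail records every invocation."""
    siv = AESSIV(_tokenization_key())
    if event.get("action") == "tokenize":
        return {"token": siv.encrypt(event["pan"].encode(), None).hex()}
    if event.get("action") == "detokenize":
        return {"pan": siv.decrypt(bytes.fromhex(event["token"]), None).decode()}
    raise ValueError("unsupported action")
```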

Conclusions

Tokenization can reduce PCI DSS scope because systems storing tokens instead of real card data fall outside PCI requirements. However, this depends heavily on correctly isolating card data flows across all components of the environment, working alongside the QSA.

It is crucial to understand that although tokenization reduces PCI DSS scope, it does not eliminate compliance responsibilities. Organizations must still protect the initial data-capture process and ensure that tokenization systems comply with appropriate security standards.

When using tokenization in the cloud, the entire system depends on the cloud provider’s key-management service. If the KMS fails or is compromised, the entire solution is affected. The shared responsibility model must not be forgotten—the organization remains ultimately responsible for secure configuration, access management, and the correct implementation of cryptographic algorithms.


