In the first part of this article, we reviewed the timeline of the key milestones in the emergence of voice assistants, as well as the technologies on which they are based.
Current Uses of Voice Assistants
Voice assistants have become key tools in many areas of daily life, from the home to professional environments. In smart homes, these devices allow users to control lights, adjust temperature, and manage security, improving comfort and energy efficiency. For day‑to‑day organization, they offer reminders, alarms, and calendars, helping users manage tasks easily and effectively.
They are also an advantage in e‑commerce, where they simplify ordering and provide personalized recommendations, streamlining the shopping experience. In addition, they offer instant access to information and entertainment—such as news, music, and movies—tailored to the user’s preferences. Their role in education is increasingly relevant, supporting language learning, answering students’ questions, and assisting with various tasks. They also enable people with physical or visual disabilities to control their environment through voice commands, contributing to autonomy and inclusion.
New applications continue to emerge as well. For example, in translation, Google Assistant’s “interpreter mode” translates conversations between languages in real time, which is particularly useful for travel and multicultural settings. Another developing area is the possibility of using voice assistants to facilitate contract signing: although most contracts currently require digital signatures or biometric verification, voice authentication may eventually enable identity verification for such purposes.
With these and other emerging uses, voice assistants are constantly adapting and expanding their capabilities to meet users’ needs across different areas, offering accessibility and personalization like never before.
Security and Privacy Risks
As voice assistants become increasingly integrated into daily life, security and privacy risks are becoming more evident:
➡️ Mass Data Collection: Voice assistants store commands and, in some cases, entire conversations, raising concerns about third‑party access to this information. In 2018, it was revealed that an Amazon Echo (Alexa) device accidentally recorded and sent a couple’s private conversation to one of their contacts [17]. This happened when Alexa misinterpreted a command and activated recording without the users’ knowledge. The couple only found out when the contact who received the audio notified them, highlighting serious concerns about privacy and data security in voice assistant ecosystems.
➡️ Voice Profile Creation: By analyzing unique vocal characteristics (tone, rhythm, intonation), these devices can build user profiles that turn the voice into a biometric identifier and may be used to track individuals. Some companies already use voice biometrics for authentication: Endesa and Mutua Madrileña, for example, let customers identify themselves by voice in their contact‑center processes [12], cutting the average verification time from 90 seconds to about 5. At the same time, Microsoft’s VALL‑E system can clone a voice convincingly from a 3‑second sample [10][11][15]. The risk that such voice profiles could be exploited to impersonate users is especially alarming given that cybercriminals can now generate convincing voice patterns from samples of under 5 seconds, often obtained without the victim’s awareness.
➡️ Vulnerability to Attacks: Network‑connected assistants and other smart devices can be targeted by cyberattacks, compromising personal data and even physical safety. Some of the most notable vulnerabilities include:
- Voice Command Injection: Attackers can embed commands, hidden within songs or even white noise, that go unnoticed by users but trigger unauthorized actions such as opening websites, making or recording calls, enabling airplane mode, or even altering driving parameters in smart vehicles. Academic researchers have been studying these vulnerabilities for years [4][13].
- Voice Spoofing: Attackers can impersonate legitimate users with cloned voices generated by different technologies, tricking systems into performing unauthorized actions, much like CEO‑fraud attacks [5].
- Cloud Data Interception: Intercepting data uploaded to the cloud for processing could allow cybercriminals to carry out the types of attacks mentioned above.
➡️ Unauthorized Commercial Use: Without proper controls, voice data may be used for unauthorized commercial purposes such as personalized advertising or selling user profiles. This can violate existing regulations like the GDPR and directly impact user privacy. In 2021, Amazon was fined 746 million euros by Luxembourg’s data protection authority over its data‑processing practices [16].
➡️ Future Threats Related to Quantum Computing: Quantum computing could eventually break current public‑key algorithms such as RSA and significantly weaken symmetric ciphers such as AES, exposing sensitive data that has been or is being handled by voice‑processing systems. More information on this topic can be found in a dedicated article previously published on this blog [14].
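To make the command‑injection risk above more concrete, the sketch below illustrates, in a deliberately simplified way, the amplitude‑modulation idea behind the DolphinAttack technique [4]: a voice command is modulated onto an ultrasonic carrier that humans cannot hear, and the nonlinearity of a device’s microphone demodulates it back into the audible band. All signal values and frequencies here are illustrative, not taken from the paper:

```python
import numpy as np

fs = 192_000                      # sample rate high enough to represent ultrasound
t = np.arange(0, 0.05, 1 / fs)    # 50 ms of signal

# Stand-in for the spoken command: a 1 kHz tone instead of real speech.
command = np.sin(2 * np.pi * 1_000 * t)

# Ultrasonic carrier at 30 kHz, above the ~20 kHz limit of human hearing.
carrier = np.sin(2 * np.pi * 30_000 * t)

# Classic amplitude modulation: the command rides on the inaudible carrier.
modulated = (1 + 0.8 * command) * carrier

# Verify that the transmitted energy sits around 30 kHz, outside the audible range.
spectrum = np.abs(np.fft.rfft(modulated))
freqs = np.fft.rfftfreq(len(modulated), 1 / fs)
peak_hz = freqs[np.argmax(spectrum)]
print(f"dominant frequency: {peak_hz:.0f} Hz")
```

On real hardware the attack additionally relies on the microphone’s nonlinear response to demodulate the command back into the audible band; this sketch only shows why the signal in the air is inaudible to the user.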
Protection Measures
Users can protect their privacy by taking the following actions, among others:
➡️ Adjusting voice assistant settings. For example, disabling continuous activation (“Hey Alexa” or “Ok Google”) when it is not needed.
➡️ Reviewing and periodically deleting stored recordings.
➡️ Limiting data collection through available privacy options.
➡️ Configuring secure networks (Wi‑Fi with strong passwords).
➡️ Enabling additional authentication such as multifactor verification.
➡️ Keeping device software up to date.
➡️ Securing access to recordings and verifying their legitimate use when interacting with assistants in legal or commercial processes.
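As a small illustration of the “strong passwords” point above, the following Python sketch generates a high‑entropy Wi‑Fi passphrase using the standard‑library `secrets` module; the function name, length, and character set are illustrative choices, not a prescribed standard:

```python
import secrets
import string

def strong_wifi_password(length: int = 20) -> str:
    """Generate a random high-entropy password suitable for a Wi-Fi network.

    Uses `secrets`, which draws from the OS cryptographic RNG,
    rather than `random`, whose output is predictable.
    """
    alphabet = string.ascii_letters + string.digits + "-_!@#%&*"
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(strong_wifi_password())
```

A 20‑character password over this ~70‑symbol alphabet gives well over 120 bits of entropy, far beyond what dictionary or brute‑force attacks on a home network can reach.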
Regulations
Some of the legal frameworks that affect the use and development of voice assistants in Spain include:
➡️ Intellectual Property Law (LPI) [6]
➡️ Law on Information Society Services and Electronic Commerce (LSSI‑CE) [7]
➡️ General Data Protection Regulation (GDPR) [2]
➡️ Organic Law 3/2018 of December 5 on the Protection of Personal Data and Guarantee of Digital Rights [8]
➡️ Telecommunications Law [9]
➡️ ePrivacy Regulation (currently in proposal) [1]
➡️ Artificial Intelligence Regulation [3]
Conclusions
Alongside their benefits, these technological advances have also brought a series of risks to both information security and individuals. The uncontrolled mass collection of personal data, the fraudulent creation of voice‑based profiles, and vulnerability to attacks in the digital world are among today’s most significant challenges. In addition, the unauthorized commercial use of our data and unauthorized access to sensitive information are increasingly common and concerning.
For this reason, while we benefit from these innovations, it is essential that we take steps to protect our privacy. Companies must ensure greater transparency and security, and users need to be aware of how to safeguard their personal data. At the same time, an updated legal framework is necessary to regulate these technological developments so that we can continue enjoying technology without putting our security at risk.
[2] Official Journal of the European Union. (May 4, 2016). Regulation (EU) 2016/679. Retrieved from https://www.boe.es/doue/2016/119/L00001-00088.pdf
[3] Official Journal of the European Union. (July 12, 2024). Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence. Retrieved from https://www.boe.es/buscar/doc.php?id=DOUE-L-2024-81079
[4] Zhang, G., et al. (October 30, 2017). DolphinAttack: Inaudible Voice Commands. CCS ’17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. Retrieved from https://dl.acm.org/doi/pdf/10.1145/3133956.3134052
[5] INCIBE. (January 9, 2024). Suplantación del CEO utilizando la técnica de inteligencia artificial deepvoice [CEO impersonation using the deepvoice AI technique]. Retrieved from https://www.incibe.es/linea-de-ayuda-en-ciberseguridad/casos-reales/suplantacion-del-ceo-utilizando-la-tecnica-de-inteligencia-artificial-deepvoice
[6] Jefatura del Estado. (April 23, 1996). Real Decreto Legislativo 1/1996, de 12 de abril, por el que se aprueba el texto refundido de la Ley de Propiedad Intelectual. Retrieved from https://www.boe.es/buscar/act.php?id=BOE-A-1996-8930&tn=1&p=20220330
[7] Jefatura del Estado. (July 12, 2002). Ley 34/2002, de 11 de julio, de servicios de la sociedad de la información y de comercio electrónico. Retrieved from https://www.boe.es/buscar/act.php?id=BOE-A-2002-13758&tn=1&p=20230509
[8] Jefatura del Estado. (December 6, 2018). Ley Orgánica 3/2018, de 5 de diciembre, de Protección de Datos Personales y garantía de los derechos digitales. Retrieved from https://www.boe.es/buscar/act.php?id=BOE-A-2018-16673
[9] Jefatura del Estado. (June 29, 2022). Ley 11/2022, de 28 de junio, General de Telecomunicaciones. Retrieved from https://www.boe.es/buscar/act.php?id=BOE-A-2022-10757
[10] Meng, L., et al. (July 2024). Autoregressive Speech Synthesis without Vector Quantization. Retrieved from https://www.microsoft.com/en-us/research/publication/autoregressive-speech-synthesis-without-vector-quantization/
[11] Microsoft. (2024). VALL-E. Retrieved from https://www.microsoft.com/en-us/research/project/vall-e-x/
[12] Navarra Capital. (May 26, 2023). Veridas implanta su biometría de voz en Mutua Madrileña y Endesa [Veridas deploys its voice biometrics at Mutua Madrileña and Endesa]. Retrieved from https://navarracapital.es/veridas-implanta-su-biometria-de-voz-en-mutua-madrilena-y-endesa/
[13] Carlini, N., et al. (August 10–12, 2016). Hidden Voice Commands. 25th USENIX Security Symposium. Retrieved from https://people.eecs.berkeley.edu/~daw/papers/voice-usenix16.pdf
[14] Prieto Carballo, J. A. (January 19, 2023). La Seguridad de la Información en la era de la Computación Cuántica [Information Security in the Era of Quantum Computing]. Retrieved from https://blog.isecauditors.com/2023/01/seguridad-de-la-informacion-en-la-era-de-la-computacion-cuantica.html
[15] Chen, S., et al. (June 2024). VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers. Retrieved from https://www.microsoft.com/en-us/research/publication/vall-e-2-neural-codec-language-models-are-human-parity-zero-shot-text-to-speech-synthesizers-2/
[16] SiliconANGLE. (July 30, 2021). Amazon ordered to pay $887M fine over data misuse. Retrieved from https://siliconangle.com/2021/07/30/amazon-ordered-pay-887m-fine-data-misuse/
[17] The Guardian. (May 25, 2018). Amazon’s Alexa recorded private conversation and sent it to random contact. Retrieved from https://www.theguardian.com/technology/2018/may/24/amazon-alexa-recorded-conversation