Editor’s note: This is part three of a series on generative AI in cybersecurity.

Part one: Understanding and Leveraging Generative AI in Cybersecurity

Part two: The Intersection of Generative AI and Cybersecurity

Part four: CISO’s Guide: Six Steps to Start Adopting AI

As discussed in parts one and two of this series, generative AI has the potential to supercharge security operations. The benefits of applying AI can affect the full threat detection, investigation, and response lifecycle. As your organization looks to take advantage of this emerging technology, you should consider three key questions:

  • Is your organization prepared to architect a solution itself, or will you look for a partner to do the heavy lifting?
  • Are you planning to leverage generic models or instead build customized models infused with your own data?
  • How do you intend to balance the power of generative AI with potential risks, including output validation and data governance?

In this blog, we’ll explore these questions and their implications for security operations.

Should You Build or Buy?

There are two approaches to implementing a generative AI solution for security operations: build your own or pay for a third-party solution. Making this choice requires careful consideration. The primary factors in play here are cost, risk, and accuracy.

Cost areas fall into two buckets: expertise and operations.

  • In a perfect world, you would have internal resources with deep knowledge of both generative AI and security operations. However, these resources can be hard to come by since generative AI is a relatively new field. This is particularly true if your organization doesn’t have an internal data science team.
  • Operating your models can be costly as well, as there is a considerable amount of overhead involved, especially if you are leveraging generic models, which charge on a per-use basis. There are ways to mitigate this—we’ll dive into that below.

Risk is primarily focused on data—both in determining whether you have the right data to train the models and in maintaining the privacy of that data. In the context of security operations, this data may include historical incidents, high-value targets, network topologies, standard operating procedures, and more. Generally, this information is stored in a variety of different formats or may not be well documented to begin with. As with any sensitive data, you must consider the privacy implications of sharing it with an untrusted partner or sending it to an external, generic model for processing and storage. Regulatory pressure adds a further risk to consider, but it is beyond the scope of this blog.

Accuracy is a critical factor in any data science project, especially when it involves generative AI. As we discussed previously in this series, generative AI models have the potential to “hallucinate,” or essentially make up answers. While these hallucinations may take several forms, ultimately, they impact the model’s accuracy and the organization’s ability to trust its output.

Unfortunately, the reality is that 50% accuracy isn’t good enough. If the model is only correct half the time, it’s essentially useless in real-world applications. There’s a very high bar that must be met for the model to provide any value, and its accuracy must be continuously validated to ensure it remains above a specified threshold.
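To make this concrete, here is a minimal sketch of what continuous accuracy validation might look like: compare model output against a labeled validation set and flag when accuracy falls below a required bar. The threshold value and the label format are illustrative assumptions, not prescribed values.

```python
ACCURACY_THRESHOLD = 0.90  # example bar; set per your own risk tolerance

def validate_accuracy(predictions, labels, threshold=ACCURACY_THRESHOLD):
    """Return (accuracy, passed) for a batch of model outputs."""
    correct = sum(p == l for p, l in zip(predictions, labels))
    accuracy = correct / len(labels)
    return accuracy, accuracy >= threshold

# Example: a model that is right only half the time fails validation.
acc, passed = validate_accuracy(
    ["phishing", "benign", "phishing", "benign"],
    ["phishing", "phishing", "benign", "benign"],
)
print(acc, passed)  # 0.5 False
```

Running a check like this on a recurring schedule, rather than once at deployment, is what keeps the model above the bar over time.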

Our Take

When deciding between building your own generative AI solution or leveraging a third-party platform, you should first conduct a cost-benefit analysis. If you have the right in-house resources and expertise to create a tailored solution that meets your specific needs, then building it yourself might be the right option. However, if you lack resources or expertise, or if your cybersecurity needs are more general, choosing a third-party solution can save time, money, and effort. Ultimately, you must weigh the pros and cons based on cost, risk, accuracy, and your organization’s individual needs.

Generic vs. Customized Models

Having decided whether to build a solution in-house or purchase a third-party tool, you’ll now need to consider the underlying models being used. These fall into two categories: generic and customized. Generic models, such as GPT-4, have been widely adopted because they are shared, publicly available resources. Customized models, by contrast, are dedicated instances, hosted either locally or in your private cloud environment, and trained to increase accuracy for specific tasks.

Foundation models are the core building blocks of generative AI. Each model has a unique set of advantages and disadvantages, depending on how it was built—the number of tokens, the number of parameters, and the quantity and quality of the training data. Generally, larger foundation models provide more versatility (they can be adapted to many use cases) but come at a higher operating cost and cannot be fine-tuned.

One workaround is to take a smaller, open-source model and conduct additional training to refine its output. This is what we mean by a “customized” model, as you can train it on an internal data set and for a particular use case. A significant benefit to customized models is the ability to infuse them with the institutional knowledge of your organization. Depending on the use case, it may make sense to have multiple customized models, all tuned for very specific tasks.

Let’s review several examples of when it makes sense to leverage both generic and customized models within the context of security operations.

When to Use Generic Models

  • Generic models, such as GPT-4, have a lower barrier to entry, as they can be quickly accessed via API. This makes them a great first step toward experimenting with generative AI.
  • Generic models are a shared resource, which means they are limited to their initial training data plus the small amount of context that can be added within the prompt itself. As a result, they are well suited for generalized tasks, such as incident summarization or serving as an interactive chatbot.
  • On the other hand, generic models do not allow for fine-tuning to specific tasks and may be cost prohibitive to operate at scale.

When to Use Customized Models

  • Customized models can be infused with internal data and fine-tuned to increase accuracy for specific tasks. This makes them a great option if your organization is looking to increase the performance of its large language models (LLMs) past what can be accomplished with in-context learning.
  • Smaller, task-specific customized models perform well as building blocks within a larger workflow or data pipeline. With each block tuned to its respective task, you can increase performance while decreasing operating costs. Examples of these tasks may include log parsing, historical correlation, incident deduplication, and disposition.
  • While great for specific tasks, fine-tuning of these customized models may decrease general performance for tasks outside the scope of the tuning data set.
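The building-block pattern above can be sketched as a simple triage pipeline: parse raw logs, correlate against history, then deduplicate. The three functions below are plain-Python stand-ins for fine-tuned, task-specific models, so only the data flow is real; the model logic is an illustrative assumption.

```python
def parse_log(raw: str) -> dict:
    """Stand-in for a log-parsing model."""
    src, _, msg = raw.partition(": ")
    return {"source": src, "message": msg}

def correlate(event: dict, history: list) -> dict:
    """Stand-in for a historical-correlation model."""
    event["related"] = [h for h in history if h["source"] == event["source"]]
    return event

def deduplicate(events: list) -> list:
    """Stand-in for an incident-deduplication model."""
    seen, unique = set(), []
    for e in events:
        key = (e["source"], e["message"])
        if key not in seen:
            seen.add(key)
            unique.append(e)
    return unique

history = [{"source": "fw01", "message": "denied outbound 445"}]
raw_logs = ["fw01: denied outbound 445", "fw01: denied outbound 445"]
events = deduplicate([correlate(parse_log(r), history) for r in raw_logs])
print(len(events))  # 1
```

Because each stage is independent, any single block can later be swapped for a fine-tuned model without rewriting the rest of the pipeline.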

Generic vs. Customized: Our Take

Ultimately, models drive output and deliver the desired outcomes. With all the hype around generative AI and LLMs, not enough attention has been placed on the foundational components, including the underlying models. Both generic and customized models can provide value, but they must be used effectively. A hybrid, multi-model approach is likely best for complex use cases, though finding the right balance will be largely dependent on the task at hand.

Other Things to Consider

“With great power comes great responsibility.” Generative AI may not have much in common with Spider-Man, but this saying holds true. Leveraging AI technology responsibly requires careful consideration of the following:

Output Validation at Scale

As generative AI becomes further ingrained within daily operations and business-critical applications, validating its output becomes ever more necessary. To validate output, define a grading rubric that identifies what “good” looks like, including measures of both accuracy and completeness. You could also deploy a second grading model whose sole purpose is to generate these scores. To combat hallucinations, incorporate hallucination detection and removal into the larger workflow. This process essentially compares input and output pairs, checking for anomalies and responding accordingly.
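A minimal sketch of that validation workflow follows: score each output against a completeness rubric, then flag possible hallucinations by checking whether entities in the output actually appear in the input. Both functions are illustrative heuristics standing in for a dedicated grading model, not a production implementation.

```python
def grade(output: str, required_facts: list) -> dict:
    """Rubric score: fraction of required facts the output covers."""
    covered = [f for f in required_facts if f.lower() in output.lower()]
    return {"completeness": len(covered) / len(required_facts)}

def flag_hallucinations(input_text: str, output_entities: list) -> list:
    """Return entities in the output that never appear in the input."""
    return [e for e in output_entities if e.lower() not in input_text.lower()]

incident = "Host WS-042 beaconed to 203.0.113.9 after a phishing email."
summary = "WS-042 compromised via phishing; beaconing to 203.0.113.9."
score = grade(summary, ["WS-042", "phishing", "203.0.113.9"])
flags = flag_hallucinations(incident, ["WS-042", "10.0.0.5"])
print(score["completeness"], flags)  # 1.0 ['10.0.0.5']
```

Outputs that score below the rubric threshold, or that carry flagged entities, can be routed back for regeneration or to a human analyst.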

Data Governance

One of generative AI’s greatest abilities is to ingest, synthesize, and reference massive amounts of data. While incredibly useful, this ability can also pose a significant risk.

To improve data security, avoid using free platforms like ChatGPT, particularly for sensitive information. Instead, you can use either generic or customized models, provided they are dedicated instances rather than a shared resource. This further mitigates the risk of potential data leakage.

Measuring Success

Ultimately, the objective of any generative AI project should be to benefit the organization in a clear, measurable way. For security operations specifically, this may take the form of increased analyst efficiency, improved threat detection, and reduced mean time to resolution (MTTR). This can be measured either against historical data or through A/B testing with a subset of users. User feedback is extremely valuable, as it increases buy-in and overall adoption. This feedback can also be used for reinforcement learning and further fine-tuning of the models.
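As a concrete example of the A/B measurement described above, here is a minimal sketch comparing MTTR between a baseline analyst group and an AI-assisted group. The resolution times are invented for illustration, in minutes.

```python
def mttr(resolution_minutes: list) -> float:
    """Mean time to resolution for a set of incidents."""
    return sum(resolution_minutes) / len(resolution_minutes)

baseline = [120, 90, 150, 100]   # incidents handled without AI assistance
assisted = [60, 45, 80, 55]      # incidents handled with AI assistance

improvement = (mttr(baseline) - mttr(assisted)) / mttr(baseline)
print(f"{improvement:.0%} MTTR reduction")  # 48% MTTR reduction
```

The same calculation works against a historical baseline if A/B testing with live users is impractical.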

Wrapping Up

Generative AI can greatly improve security operations by speeding up threat detection, investigation, and response. When creating an AI strategy, consider whether to build your own solution or use a third-party platform, whether to select generic or customized models, and how to handle AI-related risks. Due to the expenses, challenges, and risks of developing a generative AI strategy and building it yourself, most organizations are likely to benefit from adopting a security operations platform infused with generative AI, such as ReliaQuest GreyMatter. This approach saves time while ensuring you follow best practices in this evolving field.