
From Raw to Insightful: Expert Advice for Transforming Data with OCSF and Cribl

BLUF: Protips to expedite the implementation of OCSF with Cribl in your SOC.

The Open Cybersecurity Schema Framework (OCSF) is an open-source data model for common security events and logs. Its development is sustained by the significant efforts of contributors (personas) who are seeking a user-friendly and transferable cybersecurity schema. 

Centralizing data to a standard schema simplifies queries and interactions with the data, leading to quicker insights and decision-making. While it requires considerable work, it is beneficial once established.

Keep reading to learn more about the OCSF “personas” and how the Mapper persona utilizes Cribl to map fields to OCSF swiftly. Let’s get started.

What are OCSF Personas?

Personas refer to the individuals who perform the tasks – also known as users, air breathers, or human beings. We expect AI to take over some of these functions in the future, but for now, we rely on carbon-based lifeforms to perform them:

Author – Members of the open-source community who manage the classes, categories, and objects that constitute the OCSF.

Producer – Contributors who produce events in the OCSF format.
Note: AWS has done an excellent job mapping their events, but you’ll need to use Amazon Security Lake to access them in OCSF format.

Mapper – Data Engineers with a blend of Security Engineer and Threat Detection skills. They understand event context and can determine where events fit in the schema.

Consumer – Detection Engineers and SOC Analysts who reap the benefits of a single taxonomy to query logs and get results.

How OCSF and Cribl Work 

In this architecture, we assume a working, configured Cribl instance. If not, SOI can help get you there. Here are the high-level steps to implement the solution (the overall flow is sketched after this list):

  • OCSF Field Mappings – Map the source data to OCSF: both Class and Category mappings as well as field-level mappings.
  • Cribl Preprocessing Pack – Tag all data with the Class and Category for more efficient routing and scalability.
  • Vendor Packs – Create vendor-specific packs for field-level transformation to OCSF data structures.
  • Destinations – Make sure your destinations support OCSF and use schema-validation practices.
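
At a high level, the data path looks like this (a simplified sketch; route and pack names are illustrative):

  Source → OCSF Preprocessing Pack (sets __ocsf_class and __ocsf_category)
         → Routes (filter on the __ocsf_* fields and vendor)
         → Vendor Pack (field-level mapping to OCSF objects)
         → Destination (OCSF-aware, with schema validation)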

OCSF Field Mapping in 3 Steps

These initial three steps are performed in a knowledge-repository tool such as Confluence or Notion and serve as your Data Catalog. This documentation identifies the Class and Category, then maps the source fields to OCSF fields (an example catalog entry follows the steps below).

  1. Collect sample data.
  2. Review the data content and map it to an OCSF Class and Category.
  3. Review events and map source fields to OCSF Class Fields and Objects.
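
For illustration, a catalog entry for a firewall login event might look like the following. The sourcetype and source field names are hypothetical; always confirm attribute paths against the published OCSF schema:

  Source: acme:firewall (hypothetical sourcetype)
  OCSF Category: Identity & Access Management (3) | Class: Authentication (3002)

  Source Field    OCSF Field
  user            user.name
  src_ip          src_endpoint.ip
  dest_ip         dst_endpoint.ip
  outcome         status
  _time           time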

Context is crucial here. For example, the F5 product line is typically mapped to the “Network Activity” category since it’s a network device. However, we want to thoroughly analyze the events: the F5s will provide Web Application Firewall events, which map to the “Findings” class, and they will also generate console logs, none of which are Network Activity and which map closer to Authentication. Taking the time to thoroughly review the context of the logs and how they are generated is critical to effective mappings.

Once all your fields are mapped, it’s time to set up Cribl!

OCSF Preprocessing Pack

Assuming you have a working Cribl environment and can receive data today, there is a pre-processing pack that lets you quickly tag data with the proper Class and Category based on a “key” field.

Pro-Tips: 

  • The goal of the pre-processing is to assign the Cribl internal fields __ocsf_class and __ocsf_category to each event for routing through the rest of the pipeline.
  • For events where the “key” field is the Splunk sourcetype, create a single pipeline and use the Lookup function to add the OCSF fields (a minimal sketch follows this list).
  • Certain events require special handling. For instance, AWS eventNames are mapped to their respective categories and classes: the key is extracted and additional logic is applied. This is handled in a separate pipeline, where a Regex Extract function pulls out the eventName and a lookup then maps the value to its class and category.
  • When data is in a structured format, use the Parser function to parse early so that more fields are available for routing and filtering.
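
As a minimal sketch of the sourcetype-keyed approach, a lookup file like the one below can drive the Lookup function to set both internal fields in one pass (the file name and rows are hypothetical):

  # ocsf_sourcetype_map.csv (hypothetical)
  sourcetype,__ocsf_category,__ocsf_class
  pan:traffic,Network Activity,Network Activity
  wineventlog:security,Identity & Access Management,Authentication

If you prefer an expression over the Lookup function, an Eval function can apply the same mapping with C.Lookup:

  // Assumes the hypothetical CSV above is uploaded as a Cribl lookup
  __ocsf_class = C.Lookup('ocsf_sourcetype_map.csv', 'sourcetype').match(sourcetype, '__ocsf_class')
  __ocsf_category = C.Lookup('ocsf_sourcetype_map.csv', 'sourcetype').match(sourcetype, '__ocsf_category')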

Vendor Packs

Each vendor has its own pack for field-level mappings. Within each pack is a pipeline associated with the OCSF Category. This keeps the mapping logic organized and makes review and documentation easy.

 A few notes:

  • Map source field names to OCSF where possible. This works when the data is in JSON, CSV, or KV format and can be extracted by the Parser function (a minimal mapping sketch follows this list).
  • Leveraging a pre-built model can expedite the process for sources without field names (e.g., Cisco ASA). The Splunk Technology Add-on parses all the data into the Splunk Common Information Model, which can then be mapped to the OCSF fields.
  • Most OCSF objects (e.g., Enrichment, Metadata, User) can be copied from one pack to another.
  • To standardize the process, use a templated Cribl pack with common functions.
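
As an illustration, once the Parser function has extracted fields, an Eval function can move them into OCSF object paths. Each line below is one Eval row (field name = value expression); the source field names and vendor are hypothetical:

  // Hypothetical parsed firewall login event
  category_uid = 3                                       // Identity & Access Management
  class_uid = 3002                                       // Authentication
  user.name = user
  src_endpoint.ip = src_ip
  status = outcome == 'success' ? 'Success' : 'Failure'
  metadata.product.vendor_name = 'Acme'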

Now Get to the Chopper!

Once the data comes out of the Vendor Pack, it is routed to a destination for analytical value. Just a few more pro-tips to consider when writing things out:

  • When writing to a filesystem destination with a partition, consider using the following partitioning expression for better management:
/${__ocsf_category}/${__ocsf_class}/${C.Time.strftime(ContextTimeStamp ? ContextTimeStamp : Date.now()/1000, '%Y/%m/%d/%H')}/${sourcetype}
  • When writing to Splunk:
    • Consider changing the sourcetype by appending :ocsf to the end.
    • Use the JSON format for easier Splunk parsing.
  • Utilize an “unmatched” route so that anything with __ocsf_category still null goes to a repository to be reviewed at a later date (see the sketch after this list).
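
As a sketch, both of those tips reduce to one-line Cribl expressions:

  // Filter expression for the “unmatched” route
  __ocsf_category == null

  // Eval value expression for sourcetype before a Splunk destination
  sourcetype = `${sourcetype}:ocsf`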

If you’re looking to implement complex data solutions to support security and risk mitigation, SOI Solutions can bridge the gap from ideas to successful integration. We work with numerous companies implementing highly scalable and cost-effective solutions to get data from raw to insightful.

Reach out to our experts to schedule a demo now.
