But my ‘XYZ’ just added SOAR capability

How many vendors at RSA 2019 magically now do SOAR… I lost count (and I’m reeeally good at counting).

Unfortunately this trickled down into people minds, and I hear “but my endpoint will do SOAR”  (sorry, I’m not picking on endpoint specifically).

So let’s analyse the reality of ‘we added SOAR’

Source Agnostic

SOAR should be agnostic for where an alert/alarm/trigger comes from.

Example – If your Endpoint is also your SOAR platform, is it still as functional when the alert is generated in Amazon GuardDuty alerts, or from Jira tickets?

Integration Count

Ok great your ‘me too’ platform can integrate with MISP, ePO, ActiveDirectory, Cuckoo. But who has those exact technologies? Your toolsets will change and grow over time.

Example – A SOAR platform has hundreds on integrations. Anything less means a gap, and you will still do all the work yourself.


If solution ‘XYZ’ has workflow built in, is it designed around the functionality specific to that product?

Example – Would a Deception technology with SOAR understand and support workflow needed for Vulnerability management?


So if your “me too” SOAR solution…

  • can’t trigger from multiple sources
  • only integrates with 30% if your security stack
  • can’t handle half the workflows

…how can you get any meaningful reporting out of it?


I’ve heard of vendors saying ‘we do SOAR’ when in reality they just have integrations, and maybe you can change the order, but that’s not SOAR.

And I’ve not even covered: customisation, load balancing, RBAC, multi tenancy, threat intelligence tracking, custom IOC definitions, collaborative work spaces, and dozens more.


Removing insider threat from processes

Here is another interesting chat at RSA Conference this year. A gentleman approached me asking if we could help with their problem of moving data and insider threat.

His organisation policy makers were happy to use Cloud for standard business services, but not storing their sensitive data (he wouldn’t tell me the specifics). Anytime they wanted to move data from that ‘area’ of the network to the cloud they were refused by policy in case there was data leakage… you know… just to be safe.

His first problem I couldn’t help with, apparently encrypted VPN isn’t safe enough for transmission. Maybe they will end up with sneakernet and a suitcase + handcuffs.

The second problem though was a great use case for SOAR, and not one I’ve come across yet. The data source and data destination were from different vendors with no existing integration together. This means the process is very manual and potentially exposes sensitive data to the insider threat / operators.

So I demonstrated our playbook execution and how we communicate with end users. The final pseudo design we agreed on was:

  • A playbook that can be initiated by a schedule or by an inbound request
  • The playbook automatically restricts permissions of the ticket. Access is only granted with 2 pairs of eyes.
  • The playbook fetched the data from vendorA
  • The playbook then did some basic pattern matching against the data, file type checking, maybe push it through a DLP, and many more.
  • If the data was sensitive we can stop the process, flag the ticket, etc.
  • If the data was good we push it to the remote system and close the ticket.
  • However If the data was neither definitely good or definitely bad we can use CommunicationTasks to email a manager and the original ticket requester asking what to do? Proceed or stop?
  • Using our ComTask we can interactively engage the end user without exposing the data in question (see above)

To summarise, they can still do the process (quicker than before and with fewer mistakes), they’ve removed visibility to the data, but their workers still have the control to initiate and control the workflow. Pretty cool.

Thought not predominantly a SOC incident type, it shows that automation is automation, be as creative as you like.


RBAC != MultiTenancy

RBAC means having all data in one pile, relying on a policy you create, and sharing resources (e.g. execution).

Multi Tenancy means dedicating resources, storage to particular clients.

For SOAR, we find there are 3 main use cases for Multi Tenancy

  1. One organisation using MT for internal purposes having each Tenant per business unit
  2. A MSSP ‘black box’ offering
  3. A MSSP offering that allows end clients access to the data/processes

Depending on your use cases RBAC might be acceptable for #1

If the data is not sensitive, and you are 100% confident of never making mistakes RBAC might be acceptable for #2

But RBAC should not be used for #3. 

Imagine the consequence one of these quite simple mistakes

  • inbound data differs from normal is not mapped correctly
  • an analyst fat fingers a ticket and removes a tag/role
  • the platform config is reverted to a state where the rags/roles were different
  • a dev mistakenly maps an automation task to the wrong API credentials (pointing to the wrong destination server)
  • …..etc

If your data, processes, credentials are sensitive, make sure whatever you deploy is actually Multi Tenant, and not simply using RBAC.


(Credit Dan)

Getting Married with SOAR (Part1)

Last year, I proposed, and she said yes. The best day ever!

But later the same year ‘we’ agreed I was taking responsibility for the invites.

Part 1 of this series – Data, tickets and QR Code
Part 2 of this series – Printed invites for end users
Part 3 of this series – Reporting and dashboards

Data, tickets and QR Code

She said

You need to get all the address

It’s around 100 people

Email the data to the printers

Track responses

Are you even listening??

I heard

Collect data

Standardised schema



“wow Andy you’re so great”

I’ve previously covered “what makes a good playbook” and for me this ticks all the boxes

I thought about the final design and ended up at:

  • Interactive web frontend to using API for all the information for guests
  • Interactive RSVP
  • This will require each person use a unique code/password
  • Though creating online accounts is too complicated for Great Aunt Betty…
  • …so use a minimal URL including a unique code
  • And for mobile device convenience, a QR Image with the code baked in

With the needs identified I mocked up this workflow.

So here in part 1 of this 3 part series, are all the steps along with images!

Create a Demisto instance specifically for my Wedding

Design the schema for “guest” and map these to a “new ticket” form

Time to add all the guests (originally I started off typing these in manually, but I realised I’m lazy so I created a CSV, and then wrote a playbook to import and process each row)

The above playbook executes once creating dozens of tickets, one for each invitee. Each ticket then runs a playbook to prepare and process itself

The QR Code task calls the QRCodeMonkey integration. Here we give it a string “https://__url__/__uniquecode__” and it returns an image

Here is a QR Code generated by the automation (dummy data)

When I was happy the data looked right. I wrote a playbook to loop through every invitee and send an email to the printing company that contained the invitee names, address, unique URL and QRCode image as an attachment

Playbook which loops with counter
The loop

And here is the output, lots emails, each specific to a specific invitee.

Here is the real benefit. Should we make any changes to the URL, email, QRCode, or individual invitee we can make 1 simple change and then execute the playbooks again. All data is regenerated, reprocessed and automatically emailed out… all with 1 click !

Stay tuned for Part2 and Part3


Does your RFI demand a ‘good API’?

I’m starting to see a few of these, and I love it !!

True it’s a vague question to answer or measure, but as a customer the best way to be heard is at RFI stage. Vendors take note of every single RFI “non compliance”.

I recently linked to an article to explain the importance of API to senior leadership:

But I’m now seeing several RFI that demand it. I see too many solutions that still don’t have an API or an API that is a cheap after thought.

It’s 2019 everyone, keep putting that pressure on vendors to help us make our own lives easier.


Unit Testing Operational Playbooks

I was recently asked about unit testing, using a SOAR platform to routinely test its own playbooks for errors. I hope to implement each of these in the future as a POC, I’ll update with links when this happens.

Breaking it down I see three uses cases here.

In this article ticket and incident are used interchangeably

1: Unit testing a particular task

Individual tasks can be tested in isolation away from the full production playbook, we define specific input for each test and we know the expected output. Two ways to test a Task

i) check the task output (or context location) to see if the data exists/has length/has correct values
ii) for tasks with no output, check immediately after a task whether Exit Status is 0 or 1 (similar to Linux)

For this we need

  • The task to test
  • Immediately after we check ExitStatus
  • A task to validate output
    • This process is wrapped in a parent playbook #A that loops, each time passing in new test criteria
    • Playbook #A summarises the results and updates itself for future reporting

2: Unit Testing a Subplaybook

Whilst not technically mandatory, sub playbooks will have input/output defined (otherwise what’s the point in them?). We can call this subplaybook in our Test playbook, specifying the input, then watching the Subplaybook for it’s formal output.

For this we need

  • The subplaybook to test (with input and output configured) #B
    • A parent playbook #C that calls #B with new input and tests each round of output
      • #C is called from playbook #D that defines the array of test data

3: Unit Testing a ‘parent’ playbook

As a ‘parent’ playbook is never called from another playbook it’s input will be passed at ticket creation (meaning one ticket per test unlike above where 1 ticket can iterate through many values in a loop). Likewise at the end data is not passed out anywhere so we have to query the ticket to test execution.

To test, depending on the playbook, we might consider:

  • Initiate the incident, wait 2 minutes and check for status to see whether the playbook finished, or if it’s stuck (Error or human input required)
  • Like above we can check some ticket fields for expected output
  • When the ticket has ended we check the execution run time for any anomalies

For this we need

  • The playbook to test #E
    • A parent playbook #F that can create other tickets setting input parameters
      • #F is called from playbook #G that defines the array of test data and is scheduled to run once every day.

Hopefully in the not too distant future I’ll get these working and uploaded here for reference.