Does your RFI demand a ‘good API’?

I’m starting to see a few of these, and I love it!

True, it’s a vague thing to answer or measure, but as a customer the RFI stage is the best place to be heard. Vendors take note of every single RFI “non-compliance”.

I recently linked to an article explaining the importance of APIs to senior leadership.

But I’m now seeing several RFIs that demand it. I still see too many solutions that don’t have an API, or that have an API that was a cheap afterthought.

It’s 2019, everyone: keep putting that pressure on vendors to help us make our own lives easier.


Unit Testing Operational Playbooks

I was recently asked about unit testing: using a SOAR platform to routinely test its own playbooks for errors. I hope to implement each of these in the future as a POC; I’ll update with links when this happens.

Breaking it down, I see three use cases here.

In this article, “ticket” and “incident” are used interchangeably.

1: Unit testing a particular task

Individual tasks can be tested in isolation, away from the full production playbook: we define specific input for each test, and we know the expected output. There are two ways to test a task:

i) check the task output (or context location) to see whether the data exists, has length, and holds the correct values
ii) for tasks with no output, check immediately after the task whether the Exit Status is 0 or 1 (as in Linux)

For this we need:

  • The task to test
  • A check of ExitStatus immediately afterwards
  • A task to validate the output
    • This process is wrapped in a parent playbook #A that loops, passing in new test criteria each time
    • Playbook #A summarises the results and updates itself for future reporting
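The loop-and-validate pattern above can be sketched in plain Python. Everything here is illustrative, not any platform’s real API: a task is modelled as a function returning an exit status and output.

```python
def validate_task_result(exit_status, output, expected=None):
    """Return True if the task passed its unit test.

    Style (ii): a non-zero exit status is always a fail (Linux-style, 0 == success).
    Style (i): if the task produced output, it must exist, be non-empty, and
    (if given) match the expected value.
    """
    if exit_status != 0:
        return False
    if output is None:              # task has no output: exit status alone decides
        return expected is None
    if len(output) == 0:            # output exists but is empty -> fail
        return False
    return expected is None or output == expected

def run_test_suite(task, cases):
    """Playbook #A in miniature: loop over test criteria and summarise."""
    results = [validate_task_result(*task(inp), expected=exp)
               for inp, exp in cases]
    return {"passed": sum(results), "failed": len(results) - sum(results)}
```

A real playbook #A would also write this summary back to itself for future reporting; here it is just returned.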

2: Unit Testing a Subplaybook

Whilst not technically mandatory, subplaybooks will have input/output defined (otherwise what’s the point in them?). We can call this subplaybook from our test playbook, specifying the input, then watching the subplaybook for its formal output.

For this we need:

  • The subplaybook to test (with input and output configured) #B
    • A parent playbook #C that calls #B with new input and tests each round of output
      • #C is called from playbook #D that defines the array of test data

3: Unit Testing a ‘parent’ playbook

As a ‘parent’ playbook is never called from another playbook, its input will be passed at ticket creation (meaning one ticket per test, unlike above, where one ticket can iterate through many values in a loop). Likewise, at the end, data is not passed out anywhere, so we have to query the ticket to test the execution.

To test, depending on the playbook, we might consider:

  • Initiate the incident, wait two minutes, and check the status to see whether the playbook finished or is stuck (on an error, or waiting for human input)
  • As above, we can check certain ticket fields for expected output
  • Once the ticket has ended, we check the execution run time for any anomalies

For this we need:

  • The playbook to test #E
    • A parent playbook #F that can create other tickets setting input parameters
      • #F is called from playbook #G that defines the array of test data and is scheduled to run once every day.
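A rough sketch of the #E/#F/#G arrangement. The platform’s ticket-creation and ticket-query calls are faked here as in-memory stubs (hypothetical names, not a real SOAR API); in practice #F would create real tickets and #G would be on a daily schedule.

```python
_TICKETS = {}  # stand-in for the SOAR platform's ticket store

def create_ticket(ticket_id, params):
    """Fake platform call: the 'playbook' runs instantly and stamps the ticket."""
    _TICKETS[ticket_id] = {"status": "closed",
                           "runtime_s": 3 + len(params),   # fake run time
                           "fields": dict(params)}
    return ticket_id

def check_ticket(ticket_id, expected_fields, max_runtime_s=120):
    """The three checks from the list above: finished, fields, run time."""
    t = _TICKETS[ticket_id]
    finished = t["status"] == "closed"   # not stuck on error / human input
    fields_ok = all(t["fields"].get(k) == v for k, v in expected_fields.items())
    runtime_ok = t["runtime_s"] <= max_runtime_s
    return finished and fields_ok and runtime_ok

def scheduled_playbook_g():
    """#G: the daily driver -- one ticket per test case, as input can only
    be passed at ticket creation."""
    cases = [({"severity": "low"},  {"severity": "low"}),
             ({"severity": "high"}, {"severity": "high"})]
    return [check_ticket(create_ticket("t%d" % i, inp), exp)
            for i, (inp, exp) in enumerate(cases)]
```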

Hopefully in the not-too-distant future I’ll get these working and uploaded here for reference.


What makes a good playbook

I’ve asked myself this many times: when is an automation a good solution? When is it not?

Let’s analyse how an analyst might perform a phishing query:

  • Copy paste the email into a new ticket
  • Inform the end user their request was acknowledged
  • Extract the URLs and copy-paste them into Threat Intel platforms
  • Extract the IPs and copy-paste them into Threat Intel platforms
  • Get the file hash and copy-paste it into Threat Intel platforms
  • Copy the file, upload to a sandbox, paste results back into the ticket
  • Check if the URL in the email was similar to your corporate domain
  • If not malicious, email the end user and close the ticket
  • If malicious then we do more….
  • Email the end user saying “yes”, update the ticket
  • Update severity to “high”
  • Query SIEM, query other mailboxes for known IOC

Let’s analyse the workflow:

  • Most of this is simply copy-pasting
  • The data structure is always the same
  • The process is always the same (and it’s extremely repetitive and boring!)
  • The analyst is logged on and interacting with half a dozen different interfaces (which leads to eye fatigue)
  • All the solutions involved (ticketing systems, threat intel, SIEM, mail…) have APIs

Let’s now analyse a potential workflow for insider data theft:

  • Analyse evidence to identify the Threat Actor
  • Identify scope of breach across network
  • Attempt to identify the intent
  • Inform the relevant parties (HR, Legal, etc), maybe include the Law
  • When appropriate, lock the user out of the relevant systems
  • Analyse potential losses (IP theft, PII theft, financial, reputational) and act accordingly
  • etc

Let’s analyse the workflow:

  • This is a process that happens infrequently (hopefully)
  • Whilst this process is standardised at a conceptual level (and can be represented in a playbook for process definition)…
  • …every run through will be completely different
  • You will often interact with different data, in different systems
  • Original notifications to teams will be the same, but every communication will be completely unique
  • Humans don’t have APIs
  • Intent, reputation and loss need dedicated human input to determine.

To be clear, there is great value in formally mapping this second scenario to a playbook, but I wouldn’t call it a primary use case for SOAR.


So what tasks are no-brainers for a SOAR platform? I believe it’s a process that…

  • …is always the ~same
  • …has many steps
  • …takes time to do
  • …is boring!
  • …can access/utilise APIs

I originally listed “…works across multiple platforms”, and whilst that’s fascinating to see in motion, I’ve taken it out because even SOAR enriching and empowering a single, isolated technology can be a great solution for the right use case.


OCR in a playbook

I recently hosted a demo where a playbook analysed a phishing email and, amongst other things (e.g. enriching using Active Directory, detonating files in many sandboxes, interacting with the end user…), compared IOCs against Threat Intel.

I was asked “but what happens if the phishing link is actually an image of a URL?”

Keeping in mind that SOAR is not an anti-phishing platform, and that our demo was more about automating and orchestrating across many different solutions…

…I accepted the challenge.


First I had to find a platform that would do the OCR for me. You might have an enterprise instance in house, but I needed a free cloud API, and OCR.Space does exactly that [1].

Next was to build the integration, which in this case is lovely and simple (the full file is attached, but here are the two real lines of code):

    import base64, urllib, urllib2  # Python 2 stdlib; the `demisto` object is supplied by the platform

    # base64-encode the uploaded image, then POST it to OCR.Space as a data URI
    encoded = base64.b64encode(open(demisto.getFilePath(entryID)['path'], 'rb').read())
    r = urllib2.urlopen(url, urllib.urlencode({"apikey": apikey, "base64Image": "data:image/png;base64," + encoded}))

Now I need an image of a URL

Now I need a playbook. It should allow you to upload an image, decode the image, then set the output to something useful.

We can see the output of the OCR

And see how that ParsedText is then Set to the Details section of the Incident 🙂
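For reference, pulling that ParsedText out of the OCR.Space JSON response might look like this in Python 3. The field names (`ParsedResults`, `ParsedText`, `IsErroredOnProcessing`) are as documented by OCR.Space; verify them against the current API docs before relying on this:

```python
import json

def parsed_text(response_body):
    """Extract the recognised text from an OCR.Space JSON response,
    or return None if the service reported a processing error."""
    data = json.loads(response_body)
    if data.get("IsErroredOnProcessing"):
        return None
    results = data.get("ParsedResults") or []
    return "\n".join(r.get("ParsedText", "") for r in results)
```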

Other uses

For my demo I actually integrated this into the main phishing playbook for links.

How about tactical testing of data exfiltration?

How about swapping out OCR Space for a de-steganography tool?

All data decoded should also go through IOC extraction (hash, IP, domain, machine name, custom regex structures) and be indexed against other tickets.
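A minimal sketch of that extraction step. These regexes are simplified illustrations (no IPv6, no defanged indicators, naive domain matching), not production-grade IOC patterns:

```python
import re

# illustrative patterns only -- a real extractor needs far more care
IOC_PATTERNS = {
    "ip":     r"\b(?:\d{1,3}\.){3}\d{1,3}\b",
    "md5":    r"\b[a-fA-F0-9]{32}\b",
    "domain": r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b",
}

def extract_iocs(text):
    """Return deduplicated, sorted matches for each IOC type."""
    return {name: sorted(set(re.findall(pat, text)))
            for name, pat in IOC_PATTERNS.items()}
```

The extracted values could then be indexed against other tickets, or a custom regex added for internal machine-name structures.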

[1] Note: as I do not have a contract with OCR.Space, I would never send it any images that might contain sensitive data.