Bringing the Security Analyst into the Loop: From Human-Computer Interaction to Human-Computer Collaboration

[s2If is_user_logged_in()]

Researchers spent three days at the first of these two SOCs, and one day at the other. While visiting the two SOCs, researchers shadowed six different security analysts and met with one security leader and his direct reports. Visits, with permission, were taped using an audio recorder and transcribed afterward. Researchers did take pictures, although these cannot be shared as they contain client data. Research goals centered on the following three objectives:

  • Understand how security analysts currently monitor threats and analyze, diagnose, and triage security incidents, and what drives these behaviors;
  • Understand how analysts are and are not using Advisor today to help them meet their objectives and why; and
  • Identify how the team might improve Advisor so that security analysts can complete their investigations more efficiently.

Findings and recommendations from the ethnographic research were used to fuel an internal workshop that led to the identification of three user goals to guide the design and development of the next major Advisor release. The following user goal drove the reinterpretation of the graph: An L1 security analyst can view what really happened in their network for a potential incident and complete triage five times more efficiently.

After the workshop, additional user research was conducted to “validate” user needs associated with each of the identified goals and assess how different design concepts developed by the team did or did not help users achieve the stated goal. Most relevant to this case study are interviews with five security analysts recruited through respondent.io that focused on gathering user feedback on a set of alternative concepts, as well as discussions with eight additional security leaders and analysts from five different Advisor clients regarding the final design concept.

KEY FINDINGS

Competing for Analyst Mindshare

Finding #1: Security analysts are reluctant to incorporate new tools into familiar work routines, especially if they trust their existing tools and are effective in using them.

Security analysts have many tools and resources – open source, public, and commercial – at their disposal to help them monitor network traffic for suspicious behavior and activity. Besides QRadar, the research team witnessed analysts using an array of network security devices (e.g., antivirus, firewalls, intrusion detection and intrusion prevention systems), threat intelligence feeds, anomaly detection and user behavior analytics, network access controls, and application-, network-, host- and infrastructure-related log collection. Information overload is a real problem for security analysts, especially because many of these tools and data sources are not well integrated, forcing analysts to manually dig through these sources of data and correlate them.

With all the data they must collate and dig through, security analysts have developed their own practices and strategies, which include the use of popular free tools and data. QRadar Advisor competes with these existing tools and resources in the minds of analysts, and it doesn’t always win.

“I don’t know if I really use it [Advisor] that much, because I have so many other tools that I’m looking at on a daily basis.” — Security Analyst

The Need for the Human Element

Finding #2: Security analysts rely on their own personal experience and knowledge of their network to assess if an offense is evidence of a breach or a “false positive.”

The QRadar offenses investigated by analysts often are complicated, and the tools that they use are imperfect. Prior to starting an investigation, security analysts want to know which offenses to work on first. Offenses are not all equal in how critical they are to an organization, and not all offenses represent an actual security breach. Critical offenses are those that represent great harm to an organization, its reputation and digital assets. They often involve privileged users with system privileges or data access rights that others in the company don’t have. Imagine if a phishing attack successfully compromised the Chief Financial Officer’s laptop. That would be a critical security incident.

Sometimes offenses are “false positives,” however, meaning a breach did not actually occur. There are a number of reasons why false positives happen, including: the rules are not tuned well enough to recognize an action or event as benign, an application does not have access to all of the internal security data that is generated by a large network, and threat intelligence is not nuanced enough to distinguish benign URLs that happen to be hosted on an IP address deemed malicious. As one security analyst told the team:

“I’ve had in the past where you guys have flagged legitimate traffic as, you know, malicious, and once I go down to the URL level, and I look at your threat intelligence, you guys have flagged a different site. It’s hosted on the same IP, but I get 20 false positive offenses because there’s some article about some celebrity hosted on some website in India where it’s hosted on the same IP. And we operate in India, I’ve got staff, they’re allowed to read the news, and when they come online, they share the story … and I get a flood of offenses, and I go wild thinking like, ‘Oh crap, we’re getting like a mass infection event or something.’ And it turns out it’s not incorrect intel, but intel being incorrectly applied.” – Security Analyst
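The distinction the analyst draws, correct intel applied at the wrong level, can be illustrated with a minimal sketch. It is not how QRadar or Advisor match threat intelligence; the hostnames, IP addresses, and function names below are hypothetical and chosen only to show why matching on a shared IP flags benign traffic that matching on the host would not.

```python
from urllib.parse import urlsplit

# Hypothetical threat-intelligence entry: one malicious site, plus the
# shared-hosting IP address it resolves to (both values are made up).
MALICIOUS_HOSTS = {"malware.example"}
MALICIOUS_IPS = {"203.0.113.7"}

# Hypothetical proxy log events as (url, resolved_ip) pairs.
EVENTS = [
    ("http://news.example/celebrity-story", "203.0.113.7"),  # benign, same IP
    ("http://malware.example/payload", "203.0.113.7"),       # the flagged site
    ("http://intranet.example/home", "198.51.100.20"),       # unrelated
]


def flag_by_ip(events):
    """Flag every URL whose resolved IP appears in the intel feed."""
    return [url for url, ip in events if ip in MALICIOUS_IPS]


def flag_by_host(events):
    """Flag only URLs whose hostname appears in the intel feed."""
    return [url for url, _ in events if urlsplit(url).hostname in MALICIOUS_HOSTS]


if __name__ == "__main__":
    print("IP-level matches:  ", flag_by_ip(EVENTS))
    print("Host-level matches:", flag_by_host(EVENTS))
```

In this toy data the IP-level check flags the news article along with the malicious site, while the host-level check flags only the latter. The analyst's flood of offenses corresponds to the first behavior.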

Security analysts believe that there is no solution, powered by AI or not, that can completely know their network like they do. Not surprisingly, then, security analysts are suspicious of claims around automation and of AI omniscience: “Trust but verify” is a mantra the team has heard over and over in working with security analysts. Security analysts recognize that software is imperfect, and they see themselves as filling in the gaps of their security tools by providing the “human element.”

“You have rules that caused the action to fire. In most any kind of programming, you cannot account for all variables. That’s why you still have to have the human element to this, because it could be a benign thing between local and local. But it could easily be remote to local or local to remote with the same type of activity.” — Security Analyst

Prioritizing Immediate Versus Potential Threats

Finding #3: Security analysts are more focused on protecting their organization’s security posture from immediate threats than hunting down potential threats.

In conducting ethnographic research, Advisor researchers discovered that security analysts focus more on identifying “what really happened” during a security incident than “what could have happened.” The work of analysts consists of “putting together the trail to determine what happened or caused the issue.” Things that “might have happened” or “could have happened but didn’t” are simply of secondary importance for them.

“That’s the whole point of the [SIEM] analyst. You have to analyze this data and come up with what’s going on. You have to be an archaeologist of IT as you mine the information.” — Security Analyst

“In my field, ultimately it’s making sense of a lot of information and trying to glean what caused the incident generally after the fact. It’s a lot of firefighting.” — Security Analyst

Because analysts are so focused on the highest priority incidents, most of them do not feel that they have the time (or the mandate) to hunt for threats in their network proactively. This prioritization of immediate over potential threats has had a direct impact on security analysts’ approach to Advisor and its knowledge graph. At the time of the research, analysts perceived Advisor as a tool for “threat hunters” that “have the time … to keep delving.”

“This here [graph] gives the customer … the chance to look at these other IPs because they have time, they have resources, to look at this and further research it. We are dealing with events that are occurring.” — Security Analyst

In the eyes of security analysts, their job is different from that of “threat hunters”: “An analyst’s job is purely to look at the security posture, the security stance. Was that a breach? Was there an issue?”

A Confusing Knowledge Graph

Finding #4: Security analysts, especially less experienced analysts, do not know how to interpret the graph and thus do not understand the value it brings to their work.

Spending time in the SOCs, the research team concluded that limited adoption and usage of Advisor was the result of not one but several factors. Unfortunately, not all of these factors could be addressed by the Advisor team. For example, network topologies are often out-of-date, and, as a result, QRadar does not have an accurate or comprehensive view of the entire network. Solutions to this challenge were deemed out of scope for the project. The research team, however, did believe that there was one issue that could be addressed to great effect. Security analysts, the lead researcher argued, did not see value in the graph because the graph was confusing and didn’t present information in a way that answered the questions analysts pose in determining the nature and extent of a possible breach.

On the one hand, security analysts’ decision not to launch an Advisor investigation can be seen as the result of their interpretation of how Advisor works and the information it provides.

“My understanding is that it’s an assistant to pull QRadar info in so you don’t have to go through all of this QRadar information …so with QRadar being pulled in, if you get this message here [in the Insight paragraph of Advisor] saying we found nothing, then you’re not clicking on Investigate, it’s all working background.” – Security Analyst

On the other hand, the research also suggests that analysts are hesitant to use Advisor because of the complexity of the knowledge graph and their difficulty in knowing how to use and interpret the contained information.

Analysts, the research team discovered, want a solution that brings together all of the disparate information they usually have to look up manually and presents it in such a way that they can quickly answer the following questions:

  • Was a connection made from inside the network (by a computer, a device, an application, etc.) to an IP or URL that is associated with threat actors and attacks, or was it blocked?
  • If a connection was made, is it a local-local connection or a local-external connection?
  • If a local-external connection was made, what local assets are involved, and are they critical assets (e.g., the computer of the company’s Chief Financial Officer)?
  • If a local-external connection was made, was malware actually executed by a user?
  • What type of attack (e.g., malware, phishing, denial of service) is being used against the network?
  • Is this an evolving attack or something that has been contained?
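One way to picture the questions above is as fields on a single record that a tool could fill in for the analyst. The following is an illustrative sketch only, not Advisor's data model: the `Offense` class, its field names, the `LOCAL_NETWORKS` range, and the `triage_summary` helper are all assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass

# Assumption: the organization's "local" address space. Real networks differ.
LOCAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def is_local(ip: str) -> bool:
    """Return True if the address falls inside one of the local networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in LOCAL_NETWORKS)


@dataclass
class Offense:
    """Hypothetical record holding the facts an analyst looks up by hand."""
    source_ip: str          # assumed to be an asset inside the network
    destination_ip: str
    blocked: bool           # was the connection blocked?
    critical_asset: bool    # e.g., the CFO's computer
    malware_executed: bool  # did a user actually run the malware?
    attack_type: str        # e.g., "malware", "phishing", "denial of service"
    contained: bool         # contained, or still evolving?


def triage_summary(offense: Offense) -> dict:
    """Answer the triage questions from the list above for one offense."""
    if offense.blocked:
        connection = "blocked"
    elif is_local(offense.destination_ip):
        connection = "local-local"
    else:
        connection = "local-external"
    return {
        "connection": connection,
        "critical_asset_involved": offense.critical_asset,
        "malware_executed": offense.malware_executed,
        "attack_type": offense.attack_type,
        "status": "contained" if offense.contained else "evolving",
    }


if __name__ == "__main__":
    example = Offense("10.1.2.3", "203.0.113.7", False, True, True, "phishing", False)
    print(triage_summary(example))
```

The sketch only restates the questions in structured form. Gathering the underlying facts from disparate tools is the hard part, and that is the manual work analysts described.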
[/s2If]
