Learning from Equifax: information security and corporate culture


Before it's a technical problem, information security is a business problem. The recent breach at Equifax makes me wonder about the company's corporate culture. Here are some musings...


Introduction

[Image Source: pixabay.com, License: CC0, Public Domain]

The other day, Equifax announced that one of their web applications had been compromised from May until July, with a potential impact on 143 million Americans. Given that the population of the US is in the neighborhood of 320 million, and that many of those 320 million are children, it's probably safe to say that nearly every family in the US was affected by this breach. In the wake of the breach, bankinfosecurity writes:

The scale of the Equifax breach means that every SSN in the United States - together with the accompanying name - must be presumed to be public knowledge, and thus should not be used to validate anyone's identity, ever again, security experts warn.

It has been reported that the hackers may have exploited a known vulnerability for which a patch was already available, that Equifax knew (or should have known) about the vulnerability, but that it hadn't yet undertaken the effort required to apply the patch. About that possibility, bankinfosecurity writes:

If attackers exploited an unpatched flaw, there's no excuse, Viewpost's Pierson says. "This all comes back to sound security development coding practices, active application scanning and testing, and integrating security into the engineering and development processes to make web applications more resilient," he says. "Really, it is the back to the basics of mitigating the OWASP Top 10 and SANS Top 20 vulnerabilities in your web application and make security the job of every engineer backed by a robust security and infrastructure team."

Having spent much of my time in the last five years reading the academic literature about technical debt, I quickly recognized the scenario. To a security professional like the above-quoted Pierson, there may be "no excuse", but in the real world, where people are balancing competing risks and benefits, it isn't particularly surprising to find an example of deferred maintenance in a large business. The literature is replete with these cases. For example, in his 2012 ACM article, Managing Technical Debt, Eric Allman (of sendmail fame) tells us the story of a crisis that was caused by deferred maintenance on UC Berkeley's mail server:

For example, when U.C. Berkeley’s CalMail system went down in November 2011, the problem was traced to deferred maintenance—in particular, the decision to postpone updating the system even though it was known to be near capacity. One disk in a RAID died, shortly followed by a second, and the cost of rebuilding the array reduced capacity sufficiently to create a crisis. Murphy’s law needs to be taken into consideration when deciding how much technical debt to accept. In the CalMail case, individual hardware failures were expected in the base design, but multiple failures, happening during a historically high usage spike, created a condition that was not quickly resolvable. According to Berkeley’s associate vice chancellor for information technology and chief information officer, Shelton Waggener, “I made the decision not to spend the million dollars to upgrade CalMail software for only 12 months of use given our plan to migrate to new technology. We were trying to be prudent given the budget situation, (but) in retrospective [sic] it would have been good to have invested in the storage upgrade so we would have avoided this crisis.” This is a case where technical debt was taken on intentionally but turned out to be a bad gamble. Had the system survived that 12-month window, the school likely would have saved $1 million during a budget crunch.

I don't know a lot about the particulars of what happened to Equifax, but as a veteran IT worker, I would like to explore some common business practices that might have contributed to the failure. So let's assume that the compromise was caused by deferred maintenance. Here we are in 2017, and everyone knows about the importance of patching, and yet it seems that Equifax may have neglected this activity. How can this still happen?

To explore that question, I will discuss three topics in the following sections: staffing levels, sales commitments, and application stakeholders and decision-makers.

Note that none of the information that follows reflects on my current employer. I am merely synthesizing my knowledge from the academic literature, from the industry at large, and from experiences with previous employers.

[Image Source: pixabay.com, License: CC0, Public Domain]

Staffing Levels

Resolving technical debt depends on having the people and tools to get the job done. There is a tension, however, between the cost of staffing for the "worst case" and the cost of staffing to keep the lights on. To explain by way of analogy, in the 1990s, I was involved in procuring some frame relay circuits for one of my employers. The vendor's sales team went to great lengths to explain to us that there were three information rates to consider:

  • Committed information rate: The rate of traffic that the vendor would guarantee.
  • Burst information rate: The rate of traffic that your network would transmit during times of peak usage.
  • Mean information rate: The average rate of traffic on your network.

A common purchasing mistake was to pay for a committed information rate that would accommodate your network's mean information rate, but not its burst information rate. This would often lead to intolerable slowdowns during times of peak traffic (ironically, the times when network reliability is needed most). When deciding on capacity, to keep your users happy, you had to make sure that you purchased a committed rate adequate to handle nearly all of your anticipated peak traffic periods.
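
To make the arithmetic concrete, here is a toy sketch in Python; the traffic profile and rates are invented for illustration, not taken from any real network:

```python
# Toy illustration with made-up numbers: why buying a committed
# information rate (CIR) equal to the *mean* traffic rate fails at peak.
hourly_mbps = [2, 2, 3, 5, 9, 14, 18, 16, 10, 6, 3, 2]  # hypothetical daily profile

mean_rate = sum(hourly_mbps) / len(hourly_mbps)  # 7.5 Mbps
burst_rate = max(hourly_mbps)                    # 18 Mbps

cir = mean_rate  # the common purchasing mistake: pay only for the average

for hour, demand in enumerate(hourly_mbps):
    over = demand - cir
    if over > 0:
        # Traffic above the CIR is eligible for delay or discard, so users
        # see slowdowns at exactly the busiest (most important) hours.
        print(f"hour {hour:2d}: demand {demand:2d} Mbps exceeds CIR by {over:.1f} Mbps")
```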

I think that a similar phenomenon happens with IT staffing in some businesses. In many businesses, it is my impression that staffing levels are controlled not by IT management, but by accounting and HR departments. When they set their staffing levels, they look at what is needed to handle the day-to-day needs of the business, not the times of crisis. This means that, like the business that purchased inadequate bandwidth for peak traffic, when a crisis arises, the staffing level is insufficient.

In contrast, a secure business will set its staffing levels to handle the peaks, not the averages.

[Image Source: pixabay.com, License: CC0, Public Domain]

Sales Commitments

A common theme of frustration among co-workers at several past employers has been the persistent habit of sales teams of making promises that the operations teams cannot fulfill. For example, at one former employer, people were constantly frustrated by unrealistic uptime commitments for a particular customer's network devices.

When this happens, the operations team experiences sustained pressure to rebalance workloads in order to meet the unrealistic commitment, and amazingly, they usually get it done. However, this typically has two negative side effects:

  • The solution that the operations team devises often meets the letter of the unrealistic commitment, but not the spirit. This leaves a customer feeling cheated, and workers feeling unsatisfied.
  • By redirecting staff and resources to meet the unrealistic commitment, other customers with less stringent commitments often experience a decline in levels of service.

Coupled with the staffing constraints discussed in the first section, regulatory requirements in the IT security arena, such as Sarbanes-Oxley and the Payment Card Industry Data Security Standard (PCI DSS), might also trigger this sort of maladjustment. For example, is it possible that Equifax's security efforts were hyper-focused on satisfying regulatory requirements, and that this particular web application took a back seat because it wasn't in the area of focus?

A secure business needs to make sure that sales commitments and regulatory requirements do not divert resources and attention away from systems that aren't covered by premium sales commitments or regulations.

[Image Source: pixabay.com, License: CC0, Public Domain]

Application Stakeholders and Decision-Makers

I think it's safe to assume that the deferred maintenance was not caused by laziness or a desire for inferior security. Instead, there must have been some sort of competing incentives in play. Simply put, the business rules dictated that other priorities were more important than applying patches to the particular application that was compromised, and as a result, business activities before the breach were misdirected.

One common competing need is the need for availability. In most cases, downtime is needed in order to apply and test new patches. It's possible that the business rules pertaining to availability were in conflict with the business desire for security. The people who are in charge of scheduling and approving downtime are often agents for the customers, who want the system to be up; they are frequently not the people who own platform security. To carry it a step further, were there any regulatory availability requirements that interfered with downtime for maintenance?

Another common competing business need is the desire to apply patches in test environments before applying them in production. This practice creates a lag from when the patch is released, through its application in one or several test environments, until it can finally be applied in production. And of course, this scheduling delay can be compounded by staffing shortages and stringent availability requirements.
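
As a back-of-the-envelope sketch, here is how those delays can compound; every stage duration below is invented for illustration:

```python
# Hypothetical patch-promotion pipeline; every stage adds days during
# which production remains exposed to a publicly known vulnerability.
stages = [
    ("wait for the next test-environment change window", 7),
    ("soak and regression-test in the test environment", 14),
    ("wait for an approved production downtime window", 21),
]

total = 0
for name, days in stages:
    total += days
    print(f"{days:3d} days: {name}")
print(f"{total:3d} days of production exposure after the vendor releases the patch")
```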

Finally, as in Allman's e-mail server example, the competing need may be simple dollars and cents. Did the equipment in question have all of the prerequisites it needed to accept the patches? If not, the task is more complicated than just patching the application. Instead, the application needs to be redesigned and replaced. This process can be expensive in terms of capital and personnel.

A secure business needs to ensure that competing business needs don't discourage or prevent good security practices.

Conclusion

One final point that I'd like to make is that even with ideal security practices, it is still possible for a system to be compromised. Given the continuing effectiveness of social engineering attacks like phishing and pretexting, I concluded after the Target and Yahoo breaches hit the news that any business, governmental entity, or other organization of a certain size should now assume that it has already been compromised, and should make contingency plans for the day when the intrusion is eventually discovered. This Equifax announcement does nothing to disabuse me of that notion.

I have no inside knowledge, so this discussion is necessarily speculative, but I think it's a worthwhile exercise. It's very easy to think of information security as a technology problem, but as I have tried to convey here, technology is only one factor. Every organization that depends on information security should spend some time examining its culture, incentives, processes, and practices for places where security efforts are being discouraged, impeded, or precluded.

I no longer remember which textbook it was, but in the 1980s, my "Systems Analysis" textbook listed the life cycle of a computer system as follows:

  • Survey
  • Study
  • Define
  • Select
  • Acquire
  • Design
  • Construct
  • Deliver
  • Maintain

Unfortunately, to this day, it seems that too many decision-makers still have not learned that "maintain" needs to be included in the costing plans for a computer system.

In addition to examining staffing practices, sales commitments, and application stakeholders and decision-makers, there are many other practices that a forward-looking, security-conscious business should examine. These three aspects of business are just the ones that struck me as important at the moment.


Thank you for your time and attention.

Steve Palmer is an IT professional with three decades of professional experience in data communications and information systems. He holds a bachelor's degree in mathematics, a master's degree in computer science, and a master's degree in information systems and technology management. He has been awarded 3 US patents.
Follow: @remlaps
RSS for @remlaps, courtesy of streemian.com.

