How DevOps and Models Enhance Behavioral Detection

By Aaron Botsis

In an earlier article, Behavioral Threat Monitoring Without Models, I explained how you could use our Cloud Sight product to deploy a pre-trained behavior model on newly deployed systems. For the fourth installment of this SecDevOps series, I’m going to talk about how to further integrate security into DevOps processes and how these models work together in the bigger picture.

What are these models?

Ok, ok. I keep saying “models”, but what are they? How do they work? And most importantly, why do they matter?

We can use models to detect changes in system behavior with algorithms and math; Cloud Sight actually builds several different types of these models. The great thing is that the models don't even need to be that complicated. Why?

If you have any data scientist friends:

1. They’ll tell you that more data beats a better algorithm.

2. Wait, data scientists have friends?

So what can we do with more data? Let’s start with “processes with network activity”. For any group of servers, Cloud Sight builds a list of processes that are talking on the network. Once it’s finished learning, it starts to monitor for new processes. This is a simple but extremely effective technique to identify behavior variations. In fact, it’s so effective that in a 28M event sample set of accept(2) and connect(2) system calls, we saw just 321 unique executable names across our customers! We can apply similar techniques for other data such as process owner, parent process name, etc.
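To make the "processes with network activity" idea concrete, here is a minimal sketch of a learn-then-monitor model. It is not Cloud Sight's implementation; the class name, method names, and sample executable names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkProcessModel:
    """Hypothetical model: which executables talk on the network in one server group."""

    known: set = field(default_factory=set)
    learning: bool = True

    def observe(self, exe: str) -> bool:
        """Record one accept/connect event; return True if it should raise an alert."""
        if self.learning:
            # While learning, everything we see becomes part of the baseline.
            self.known.add(exe)
            return False
        # After learning, anything outside the baseline is a deviation.
        return exe not in self.known

    def finish_learning(self) -> None:
        self.learning = False


model = NetworkProcessModel()
for exe in ["nginx", "sshd", "nginx", "chef-client"]:  # illustrative training events
    model.observe(exe)
model.finish_learning()

alerts = [exe for exe in ["nginx", "nc", "sshd"] if model.observe(exe)]
print(alerts)  # ['nc']
```

The same shape works for other attributes mentioned above, such as process owner or parent process name: swap the executable name for whatever value you want to baseline.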

Why this is Good

Back in the dark ages, it was difficult to ensure a group of systems that were functionally similar actually behaved in a consistent and similar way. But then there was light. Thanks to DevOps and configuration management, system behavior is now a fairly consistent (and measurable) thing. Web servers that are all configured the same actually do the exact same thing, the exact same way. This is an epic win for security, my hipster brethren!

“I took this system to its maximum potential. I created the perfect system!”  


“Epic Win?”

Totally. Here’s why: Imagine these models can be created, destroyed and tested programmatically, alongside your existing development processes.

We can start by training these models during our continuous integration tests. We know the environment is pristine, and we’re already testing all of the things. Why not teach our models which behavior is “good” while we’re at it? It’s like a self-generating, infrastructure-wide whitelist.

Now we can apply those behaviors to systems we deploy for production. Anything that deviates from what we tested is likely an intrusion. But even if it’s not, it could inform us of imperfections in the system. Maybe we forgot to test something. Maybe there’s a corner case that only affects production for some reason. Maybe something’s running away because of an unidentified failure elsewhere in the system, consuming precious elastic resources.
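As a rough sketch of that train-in-CI, enforce-in-production flow: the event fields (user, exe), the JSON format, and the function names below are assumptions chosen for the example, not a real Cloud Sight interface.

```python
import json


def train_baseline(ci_events):
    """Build a baseline from CI events: the sorted unique (user, exe) pairs."""
    return sorted({(e["user"], e["exe"]) for e in ci_events})


def save_baseline(baseline):
    """Serialize the baseline so it can ship alongside the build artifact."""
    return json.dumps(baseline)


def deviations(serialized, prod_events):
    """Return production events whose (user, exe) pair never appeared in CI."""
    allowed = {tuple(pair) for pair in json.loads(serialized)}
    return [e for e in prod_events if (e["user"], e["exe"]) not in allowed]


# Illustrative data only.
ci_events = [
    {"user": "www-data", "exe": "nginx"},
    {"user": "app", "exe": "python3"},
    {"user": "www-data", "exe": "nginx"},
]
artifact = save_baseline(train_baseline(ci_events))

prod_events = [
    {"user": "app", "exe": "python3"},
    {"user": "www-data", "exe": "sh"},
]
flagged = deviations(artifact, prod_events)
print(flagged)  # [{'user': 'www-data', 'exe': 'sh'}]
```

A flagged event is not automatically an intrusion. As described above, it may just as well point at something the tests never exercised.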

Finally, once we’ve iterated and ironed everything out, we can add automated chaos-monkey style remediation to the mix. When a system deviates from its expected behavior, quarantine and replace it automatically.
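Here is one way the quarantine-and-replace step could look, reduced to its bookkeeping. The fleet dictionary, the state names, and the launch_replacement callback are hypothetical; in a real environment they would map onto your provisioning tooling.

```python
def remediate(fleet, deviating, launch_replacement):
    """Quarantine each deviating host and add a fresh replacement.

    fleet maps host name -> state ("active" or "quarantined").
    launch_replacement is a caller-supplied function (hypothetical) that
    provisions a new host and returns its name.
    """
    updated = dict(fleet)  # leave the caller's view of the fleet untouched
    for host in deviating:
        if updated.get(host) != "active":
            continue  # unknown or already quarantined: nothing to do
        updated[host] = "quarantined"  # keep it around for forensics
        updated[launch_replacement(host)] = "active"
    return updated


fleet = {"web-1": "active", "web-2": "active"}
result = remediate(fleet, ["web-2"], lambda host: host + "-replacement")
print(result)
# {'web-1': 'active', 'web-2': 'quarantined', 'web-2-replacement': 'active'}
```

Quarantining rather than terminating is a deliberate choice in this sketch: the deviating system is evidence, and you probably want to look at it before it disappears.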

Bringing it all Together

It used to be that “deploy” meant “run make install”. The number of interaction points between applications was minimal and easy to grok. Today's infrastructures are more complex than ever, and DevOps is showing huge value in quick iteration. Thanks to configuration management, applications and the infrastructure supporting them are more consistent than ever. So it only makes sense to leverage behavioral monitoring to iterate quickly without forgetting lessons learned from the past while protecting the infrastructure at the same time.

Stay tuned for next week’s SecDevOps blog post featuring Chris Wysopal, Veracode’s CTO, on code analysis as part of CI.

Who Gets Access to Production?

By Sam Bisbee, CTO

This is the third installment in our new series of weekly blog posts that dives into the role of SecDevOps. This series looks into why we need it in our lives, how we may go about implementing this methodology, and real life stories of how SecDevOps can save the Cloud.

Remote access to production machines is a long-contested battlefield that has only gotten uglier since the rise of Software as a Service, which has obliterated the line between building the system and running the system. This caused new methodologies to be enacted, the most popularly touted being DevOps, which is really just an awful way of communicating that everyone is responsible for running the system now. One critical implementation detail that smaller SaaS companies have always understood due to hiring constraints is that the entire technical staff is required to be on call. Yes, even the engineers, developers, or whatever else you call them.

The New Policy

“Lock out the developers” is not an acceptable policy anymore. Developers inherently build better systems when they experience running them. Who would allow a bug to linger if it continuously woke them up throughout the night? This pain was not felt widely enough in the previous “throw it over the wall to operations” world. I can sense desperation rising from the PMs over their kanban story velocity, “If an engineer is on call, then they won’t be able to write code!” While this statement is factually accurate, the sentiment is not.

First, operations has an equally important and lengthy work queue. Second, those paging alerts are likely the most important bugs regardless of whether they’re an uncaught exception (engineering issue) or RAID alarm (operational issue). This typically confounds those new to the SaaS world because they have not fully grasped the ramifications of the Service with a capital “S”. The Service is always on and is the product through which you deliver value. This is one of the best examples of how SaaS companies are so much different culturally and operationally than companies that “ship” product. You are not running an IT department.

Don’t Overcorrect

This remote access policy may seem like an overcorrection, which is why proper controls are critical. One of the most cited fears for granting more people access is the lack of change control. When you apply this fear to developers, what people really mean is that they are afraid of hot patches. This is completely and utterly reasonable.

Hot patches decrease visibility into the system, slowing down or outright preventing the ability to debug. The worst-case scenario is a hot patch actually damaging the system or corrupting user data, which is exponentially more likely due to the lack of testing. The technical community should fully understand by now that “it worked on my laptop” or “it shouldn’t do that” are not reasonable statements when releasing. The only true prevention for hot patching, especially when implementing a populist remote access policy, is to create a frictionless release mechanism. Make it trivial for your teams to build, test, and initiate a staggered release into any of your environments. Ideally your build server is testing every push to your master git branch and anyone can promote a successful build from that server.
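For a feel of what a "staggered release" can mean in practice, here is a small sketch: start with a single canary, grow each batch, and stop as soon as a batch looks unhealthy. The batch sizing, the deploy and healthy callbacks, and the host names are assumptions for illustration, not a prescription.

```python
def staggered_batches(hosts, first=1, factor=2):
    """Split hosts into growing batches: a small canary first, then larger waves."""
    if first < 1 or factor < 1:
        raise ValueError("first and factor must be at least 1")
    batches, size, start = [], first, 0
    while start < len(hosts):
        batches.append(hosts[start:start + size])
        start += size
        size *= factor
    return batches


def release(hosts, deploy, healthy):
    """Deploy batch by batch; halt on the first batch that fails its health check.

    deploy and healthy are caller-supplied functions (hypothetical here).
    Returns (hosts deployed so far, whether the release completed).
    """
    done = []
    for batch in staggered_batches(hosts):
        for host in batch:
            deploy(host)
        done.extend(batch)
        if not all(healthy(host) for host in batch):
            return done, False
    return done, True


hosts = [f"web-{n}" for n in range(1, 8)]
print(staggered_batches(hosts))
# [['web-1'], ['web-2', 'web-3'], ['web-4', 'web-5', 'web-6', 'web-7']]

deployed, ok = release(hosts, deploy=lambda h: None, healthy=lambda h: h != "web-3")
print(deployed, ok)  # ['web-1', 'web-2', 'web-3'] False
```

The point is less the exact batch sizes and more that the mechanism is cheap to invoke. If releasing this way is easier than SSHing in and editing a file, the hot patch loses its appeal.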

Trust but Verify

If frictionless releases are our trust, then accordingly we must verify. Enter monitoring. Techniques such as the Pink Sombrero are good (digital sombreros are better), but you must introduce continuous security monitoring into your environment. For ages there have been tools and techniques that do this, but most teams do not employ them because of their complexity, outdated implementation (taking hashes of your entire multi-TB filesystem in an IO bound cloud or virtual environment is asinine), and volume of false positives. It does not have to be so complicated though. For example, alerting when a user other than chef changes files in your production server’s application directory is an easy first step that a team of any size can easily grasp.
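The "user other than chef" rule really is that simple. A minimal sketch, assuming file-change events arrive as dictionaries with user and path fields, and that the application lives under /srv/app (both are assumptions for this example):

```python
from pathlib import PurePosixPath

APP_DIR = PurePosixPath("/srv/app")  # hypothetical application directory
ALLOWED_USERS = {"chef"}             # only configuration management may write here


def unexpected_changes(events):
    """Return file-change events inside APP_DIR made by anyone not in ALLOWED_USERS."""
    alerts = []
    for event in events:
        path = PurePosixPath(event["path"])
        # Compare whole path components so /srv/application is not mistaken for /srv/app.
        inside = path == APP_DIR or APP_DIR in path.parents
        if inside and event["user"] not in ALLOWED_USERS:
            alerts.append(event)
    return alerts


events = [
    {"user": "chef", "path": "/srv/app/config.yml"},
    {"user": "alice", "path": "/srv/app/lib/handler.py"},
    {"user": "alice", "path": "/home/alice/notes.txt"},
    {"user": "root", "path": "/srv/application/readme.txt"},
]
print(unexpected_changes(events))
# [{'user': 'alice', 'path': '/srv/app/lib/handler.py'}]
```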

For those who are concerned about access to customer data, whether it be PII or something less toxic, this remote access policy does not apply to that data, as it should live in a segregated environment. They are also likely concerned with passing audits, and the prospect of listing their entire technical team as having production access is not intriguing. In such scenarios, non-operators should be locked out of production unless they are on rotation. Adding and revoking their SSH public key from the gateway on-demand can make controlled access easier.
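On-demand key management can be sketched as two small, idempotent operations on the contents of the gateway's authorized_keys file. The key strings below are placeholders, not real keys, and reading or writing the actual file (plus tying this to your on-call schedule) is left out on purpose.

```python
def grant(authorized_keys, public_key):
    """Return authorized_keys content with public_key present exactly once."""
    key = public_key.strip()
    lines = [line.strip() for line in authorized_keys.splitlines() if line.strip()]
    if key not in lines:
        lines.append(key)
    return "\n".join(lines) + "\n"


def revoke(authorized_keys, public_key):
    """Return authorized_keys content with every copy of public_key removed."""
    key = public_key.strip()
    lines = [line.strip() for line in authorized_keys.splitlines() if line.strip()]
    kept = [line for line in lines if line != key]
    return "\n".join(kept) + ("\n" if kept else "")


# Placeholder keys for illustration only.
base = "ssh-ed25519 AAAA-OPS-PLACEHOLDER ops@example\n"
dev_key = "ssh-ed25519 AAAA-DEV-PLACEHOLDER dev@example"

on_call = grant(base, dev_key)      # developer goes on rotation
off_call = revoke(on_call, dev_key) # rotation ends
print(off_call == base)  # True
```

Because both operations are idempotent, they are safe to run from a scheduled job at every rotation boundary.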

You Get What You Need

All of this is to say that collectively we are still trying to figure out the security balance in the technical community. Too often people want security, but see it as prohibiting productivity so they punt. This is unfortunate for the obvious reasons, but also because properly operationalized security begins to enhance the developer’s and operator’s experience. Tools are leveraged that make the system easier to run and control. Different monitoring solutions are installed that make the system easier to debug and verify. And, everyone gets access to production.

Stay tuned next Wednesday for our fourth installment in this series as we continue to dive deeper. Until then, be sure to check out our first and second posts in the series.

Threat Stack Names Executives As Company Brings Innovative Cloud Security Service to Market

We're excited to announce today that we have added several key members to our management team.  Sam Bisbee has joined as CTO; Chris Gervais as VP, Engineering; and Pete Cheslock as Senior Director, Operations and Support.

“Threat Stack is at a really exciting point in time, as we come off a highly successful beta program and prepare to launch Cloud Sight into the market. Our management team has deep experience across enterprise, cloud, SaaS and security, and a track record for successfully bringing innovation to market.  We’re thrilled to have attracted an all-star team.”

- Doug Cahill, CEO of Threat Stack

About the Executive Team

Sam Bisbee is a senior technologist who brings experience and expertise in delivering highly scalable distributed systems via SaaS.  Most recently Sam was CXO of Cloudant, a leader in Database as a Service (DBaaS) technology; before that he held key technology positions at Bocoup and Woopid.

Chris Gervais has led technology teams developing large, scalable, enterprise-grade solutions and bringing SaaS offerings to market.  Before Threat Stack, Chris was CTO and SVP, Engineering at LifeImage, a platform for securely sharing medical images; and VP, Engineering at Enservio, a SaaS application and analytics platform for insurance carriers.

Pete Cheslock has a record of supporting SaaS customers with highly reliable and scalable solutions.  Pete was previously Director of DevTools at Dyn, a provider of network traffic management and assurance solutions; and before that he was Director of Technical and Cloud Operations for Sonian, a cloud-based archiving platform.

“This team is a testament to Threat Stack’s unique technology and the big problem that it addresses. Elastic and dynamic infrastructures, and the services that run on them, are really difficult to monitor and protect.  Threat Stack has cracked the code.”

- Chris Lynch, Chairman of the Board and a partner at Atlas Venture

Our flagship product, Cloud Sight™, is the first and only intrusion detection SaaS offering purpose-built to provide elastic cloud infrastructure with comprehensive protection, detection and response against malicious threats.  Cloud Sight has been in a highly active beta program which resulted in multiple customer case studies, including Populi, a cloud-based college administration platform, and University of Hawaii at Manoa College. Beta participation ranged from SaaS vendors to MSPs and enterprises running in most major cloud service providers including Amazon Web Services and Rackspace, as well as in private and hybrid-cloud deployments. Cloud Sight will be commercially available this fall.

Interested in trying out Threat Stack? Request an invite to our beta: 



The Case for Continuous Security

By Pete Cheslock

This is the second post in our new series of weekly blog posts that dives into the role of SecDevOps. This series looks into why we need it in our lives, how we may go about implementing this methodology, and real life stories of how SecDevOps can save the Cloud.

DevOps is a term that has absolutely blown up in the last 5 years.  As someone who’s been involved with that community from the earlier days, it’s been interesting to watch the conversations around DevOps evolve over time.  Many people had an immediate adverse reaction towards Yet Another Buzzword -- especially when the core concepts that people described as being “DevOps” were things that many people had already been doing for years.  (I’m not going to bother getting into the specifics of “what is DevOps” since there is already a plethora of blog posts that you can easily find on it.)

One of the core tenets of what people consider to be “DevOps” is to shorten the feedback loop in your development cycles.  By reducing the amount of time for those feedback loops, your teams can iterate more quickly on changes and ship those features to your customers sooner. This tenet ties in directly with Agile methodologies utilized by software engineering teams. With the advent of easily accessible cloud infrastructure, and with the various operational tooling around those new infrastructure providers reaching a new level of maturity, we are now seeing a world where “DevOps” is mainstream.  For companies starting new product development initiatives, using some form of Configuration Management is now table stakes to iterate quickly. Additionally, we see more and more companies shed their physical data center presence in order to leverage the flexibility and accessibility of public compute resources provided by companies like Amazon, Microsoft and Google.  

The inherent nature of these IaaS providers is to make it as easy as possible to provision systems to meet your infrastructure needs -- and to do so very quickly.  Speed to market is a major competitive advantage that many companies are leveraging through the concept of Infrastructure as Code.  Provisioning hundreds or thousands of compute instances in mere minutes is now considered an everyday activity.  Everyone wants to move fast.  

Continuous Integration. Continuous Deployment.  But who (or what) is continually monitoring the state of your operational security?

We now have a world where your junior system administrator is able to make a small change to a Chef Recipe, Puppet Manifest, or maybe an Ansible Playbook, and deploy it to production within minutes.  But what is the scope of that change?  System Administrators don’t want to be slowed down by the security team.  They don’t want their configuration management changes to be passed through a Change Control Board.  They want to change a variable, open a pull request, and once merged, they want their operational tooling to do the rest.  They want their change to hit production servers as soon as possible.  

This is where SecDevOps, or SecOps, comes into play. (Let’s ignore the fact that it’s just as silly of a buzzword as “DevOps”). If DevOps seeks to value empathy between teams that traditionally had different incentives for their positions (Devs valuing constant change, Ops valuing stability), SecDevOps seeks to evoke the same outcome with your Security teams and the rest of the business.  

When you are in a world where you are continually deploying change, you need to move towards a world where you are continually monitoring the security implications of those operational changes.  Oftentimes, there is no single person at your company who is able to say with absolute certainty which changes to your infrastructure pose additional risks to your security posture.  And if you have a traditional network security organization that is manually reviewing and approving changes to production, you’ve now introduced the newest bottleneck in your organization.

It’s this conversation that excites me the most about joining Threat Stack.  As a technical operations veteran of the last 15 years, this is the most important (and exciting) problem to solve in many organizations.  Having the opportunity to help build a product that will enable companies to continue to break down operational silos while improving the speed in which they are able to track and respond to security incidents is an absolute dream job for me.

I see SecDevOps as the qualifier for this discussion.  How do you improve your security monitoring and response times, while maintaining your ability to continually deploy changes? These are hard problems to solve, and we are all excited to be in this unique position where we can actively help companies solve them.

Stay tuned next Wednesday for our third installment in this series as we dive deeper into the technical integrations that make SecDevOps happen. And in case you missed it, you can check out our inaugural post here.

About Pete

Pete Cheslock is the Senior Director of Operations and Support at Threat Stack.  Previously, he was the head of automation and release engineering at Dyn, managing and deploying to mission-critical global DNS infrastructure.  Prior to Dyn, Pete was the Director of Technical Operations for Amazon-backed cloud archiving company Sonian. You can follow Pete at @petecheslock on Twitter.




Why SecDevOps Will Save The Cloud

This is the first part of a new series of weekly posts that will dive into the role of SecDevOps. This series looks into why we need it in our lives, how we may go about implementing this methodology, and real life stories of how SecDevOps can save the Cloud.

The world has changed. I feel it in the water, I feel it in the Earth. I feel it in the packet loss. This is the age of “the Cloud”. We were not without our skeptics, but we knew what was happening. A revolution was on our doorstep, and we wanted it all. We wanted it yesterday.

Configuration management, automation, orchestration, continuous integration and delivery. New concepts were born, titles were given, and philosophies of win floated around the web like confetti after a New Year’s Eve celebration. We weren’t sure where we were going, but we knew where we didn’t want to be: Configuration drift, tedious provisioning of systems, lack of acceptance and unit tests. Our fears were real, and we sought answers.

DevOps is born. “This is the solution we’ve been searching for,” we proclaimed! As if millions of voices suddenly cried out in terror, and were suddenly silenced… The end is far from nigh. In fact, it has only just begun.

“What is a DevOp?” you may ask. We’ve all heard the jargon, the marketing pitches, but what is it really? The answer, at its core, is quite simple. “DevOps” is not a team, nor is it an organizational role. “DevOps” is a philosophy of collaboration.

“In the long history of humankind (and animal kind, too) those who learned to collaborate and improvise most effectively have prevailed.” -- Charles Darwin

For years we sectioned off teams. Developers to the left, Operations to the right. Security teams... where did they hide? Who knows, really? (Reports have been made of Security Engineers haunting the halls of office buildings, lamenting about zero-day exploits and security patch upgrades.) Applications and services were developed and passed over the wall to the Operations team where they did what they could to piece things together and create a working environment. It was how we “got shit done.” Yet, something had always been missing. Where was the bottleneck? How do we optimize our development and deployment pipelines? Things need to be faster! Mush! Mush! Fellow Engineers!

DevOps unite: Infrastructure as code took the community by storm. Various Configuration Management solutions started making themselves available, code was written, and progress was made. But something was still missing -- something of incredible value. With all of these new tools at our disposal, teams began pushing out code changes faster than ever before, but something was astray.

Security! Where have you been, where have you gone? Were we foolish enough to believe that these progressive methodologies would save us from something so integral to our success? Why have we forsaken you?

The cloud has left us questioning our surroundings. Who has access, what are the controls, what services are publicly available and which are safely kept behind “locked” doors? What is our risk, and how efficiently was it assessed? If you have yet to ask yourself these questions, it will only be a matter of time before you are one of the Lost.

Suddenly, a new methodology appeared in the distance… SecDevOps. What is it? Where did it come from? Is this just another silly marketing initiative? No, this is natural progression. The move to the Cloud leaves many of us with questions, and the most important one of them being: Without complete ownership of our systems and their supporting environments, how do we protect ourselves?

SecDevOps, or SecOps, is a natural extension of DevOps. The rate of change we see today leaves very little room for Security teams to properly assess risk in our application and infrastructure code. While the mission of our Security colleagues has never been to slow down the process, without bringing them into the fold, we will continue to be at risk of ever-looming threats.

By integrating our Security tool-chains into our DevOps pipeline, we can effectively mitigate our risks and continue our journey towards a secure, automated infrastructure.

In next Wednesday's follow up post, we will start to explore some of the basic principles of the SecDevOps methodology and how it can be operationalized for fun and profit. Over the course of the series, we will hear from a few notable guests on their experiences, and get their take on the SecDevOps movement and why we need it.

Threat Stack Introducing SecDevOps at AWS Summit New York

Next Thursday, we will be at the AWS Summit 2014 in New York meeting with AWS users from across the country -- many of whom are our own customers -- as well as leading the discussion around the intersection of Security, Development and Operations and what that means for continuous monitoring in EC2.

As EC2 users increasingly look to incorporate security into their DevOps deployment practices, Threat Stack’s continuous monitoring plays a critical role in protecting cloud deployments. By integrating security best practices and DevOps activities, we’ve come to a new methodology: SecDevOps. A natural extension of DevOps, it will improve the security of entire EC2 environments. We’re looking forward to sharing our best practices at this summer’s AWS Summit in New York.

Going to the summit? Simply email us your availability for July 10th and we’ll schedule a dedicated time with you to discuss best practices for securing your Linux environment with SecDevOps.

Threat Stack + AWS


After speaking with AWS customers across the country, it’s clear that businesses using AWS not only want but need technologies like Threat Stack’s Cloud Sight that are built specifically for the cloud in order to fulfill their portion of the shared responsibility security model. It’s business-critical today to have solutions that solve multiple needs, providing visibility across an entire cloud environment. Heterogeneous cloud solutions like Cloud Sight allow companies working in or moving to the cloud to do so with confidence. We’ll show you how at the Summit.  

Email us your availability to schedule a meeting with our team.

Event Details

AWS Summit 2014: New York

When: July 10 from 11:00am to 7:00pm

Where: Booth #331 at the Javits Convention Center, 655 W 34th St, New York, NY

What: Talk with our founding team and see demonstrations of our cloud security monitoring platform, Cloud Sight.

At Booth #331, you’ll learn how Cloud Sight:

  • Is a security solution built for the needs of both Operations and Security

  • Provides intrusion detection, purpose built for EC2

  • Eliminates the need for multiple point tools

Book your meeting with us now -- space is limited!


Behavioral Threat Monitoring Without Models

One of the great things about the cloud is the ability for companies to grow and shrink their infrastructure elastically to meet varying levels of demand. What many people don’t think about is how to secure this sprawl of cloud compute instances. As new systems are deployed, how do you enforce a policy on them? How do you look for anomalous behavior when an instance hasn’t been up long enough to determine a baseline?

Cloud Sight has solved this problem from day 1 with our policy framework. Our policies encompass all attributes of an instance’s security posture: alert rules, file integrity rules, firewall rules, so many rules! But also, each policy has a unique, learned behavior model associated with it. For example, an Apache web server process doesn't usually fork /bin/sh. When our agent is activated, the instance’s baseline is already established from its peers which enables us to immediately start watching for anomalies.
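The Apache example can be sketched as a per-policy model of which child processes each parent normally starts. This is only an illustration of the idea, not how Cloud Sight stores its models; the class, the process names, and the event shape are assumptions.

```python
from collections import defaultdict


class ForkModel:
    """Hypothetical per-policy model: which children each parent process usually starts."""

    def __init__(self):
        self.children = defaultdict(set)

    def learn(self, parent, child):
        self.children[parent].add(child)

    def is_anomalous(self, parent, child):
        # .get avoids creating an entry for parents we have never seen.
        return child not in self.children.get(parent, set())


# Train the policy's model on behavior observed from existing peers.
web_policy = ForkModel()
for parent, child in [("apache2", "apache2"), ("apache2", "php-cgi"), ("sshd", "bash")]:
    web_policy.learn(parent, child)

# A brand-new instance in the same policy is checked against that model right away.
print(web_policy.is_anomalous("apache2", "php-cgi"))  # False
print(web_policy.is_anomalous("apache2", "/bin/sh"))  # True
```

Because the model belongs to the policy rather than to any one instance, a server that booted a minute ago is held to the same baseline as its long-running peers.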

Since we automate a lot of our own infrastructure (as do many of our customers) we realized it’d be useful to put an agent into a policy (and subsequently activate it) when the agent is deployed. So we got to work and a sprint or two later, we hammered it out.

See how it works below:


Choosing a policy will append the policy argument to the deployment key at the end of the URL, and you’re off to the races.

To begin selecting your own policy, log in now. Have a question? Send us an email.