Managing production environments to prevent data leaks

Nick Joyce
Published in Real Kinetic Blog
4 min read · Dec 2, 2019

Update: Further research has demonstrated that People Data Labs did not own the IP as listed previously, but it does appear to be a very similar dataset. This article was updated to reflect these new details.

There have been several significant data leaks in the past couple of weeks. I wanted to take some time to dissect one of them and talk through some preventative measures you can put in place to help minimize the risk to your organisation.

Exposed Elasticsearch cluster

Data Viper recently found a publicly exposed, unauthenticated 4TB Elasticsearch cluster. The cluster contained personally identifiable information (PII), such as names, email addresses, phone numbers, and social media profiles (LinkedIn, Facebook, Twitter, and GitHub), of over 1.2 billion (that’s with a b) people.

There seems to be no hack here; it would appear that the Elasticsearch cluster was exposed purely by accident. The consequences for people’s data and the company involved are massive. Due to the sensitive nature of the data contained in the leak, high-value targets such as CEOs, CFOs, etc. need to increase their digital security. Additionally, trust in the company hosting that trove of data has been forever marred.

Deployment of infrastructure can be simple. It’s relatively easy to find a script or set of instructions for nearly anything via Stack Overflow or blog articles. Follow the tutorial and your infrastructure is up and running. The problem with this, however, is that these types of articles are typically meant to be used as examples only, NOT for production usage. A production configuration is likely to require a significantly higher time investment. Infrastructure only matures through a deep understanding of the system and through running it in production: detecting issues, making changes, and repeating the process.

Some of the topics that must be considered when deploying a production-grade Elasticsearch cluster include:

  • Load balancing
  • Volume/storage management
  • High availability/auto-scaling
  • Authentication and authorisation
  • Logging
  • Monitoring
  • Infrastructure management (patching, upgrading)
  • Backups and disaster recovery
  • Zero-downtime deploys (for version upgrades, etc.)

This list is not meant to be exhaustive but it should provide an idea of the complexity that is involved when “productionising” a system.
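
To make the authentication item concrete, here is a minimal Python sketch of a check you could run against your own cluster endpoints. It assumes that a node without security enabled answers an anonymous GET on its HTTP port with a 200, while a secured one answers with a 401 or 403; verify that against your own version and any proxy in front of it. The function names and the example URL are hypothetical and are not part of any Elasticsearch tooling.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def classify_status(status: int) -> str:
    """Map the HTTP status of an anonymous request to a coarse verdict."""
    if status == 200:
        return "open"            # answered without credentials
    if status in (401, 403):
        return "auth-required"   # credentials were demanded or refused
    return "unknown"


def fetch_status(url: str, timeout: float = 3.0) -> Optional[int]:
    """Return the HTTP status of an anonymous GET, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; the status code is still what we want.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def audit(url: str, fetch: Callable[[str], Optional[int]] = fetch_status) -> str:
    """Classify one endpoint; `fetch` is injectable so it can be tested offline."""
    status = fetch(url)
    if status is None:
        return "unreachable"
    return classify_status(status)


if __name__ == "__main__":
    # Hypothetical endpoint; only probe systems you own or are authorised to test.
    print(audit("http://localhost:9200/"))
```

An "open" result from outside your network is the situation described above. An "unreachable" result from outside combined with "auth-required" from inside is closer to what you want.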

Best practices for managing production infrastructure

Managing production infrastructure at scale can be time-consuming. Here are some best practices to help you sleep at night:

  • Keep it simple. Complexity is a breeding ground for mistakes. Motivated hackers can leverage mistakes to gain unintended access to the system or exfiltrate data. This design principle can help to manage complexity.
  • Use managed services where possible. Most cloud vendors provide managed services, such as databases like RDS on AWS or Cloud SQL on GCP. Focus on delivering business value rather than spending time managing a system that the vendor employs experts to run for you. This philosophy can also be applied when deciding between serverless and traditional IaaS (i.e. VMs).
  • Before deploying new infrastructure, conduct review meetings where engineers from all backgrounds have the opportunity to ask questions around architecture, security, data management, etc.
  • Manage deployments through Infrastructure as Code (IaC). Repeatable, reliable infrastructure deployment is critical when there is a production outage. In such high-pressure, time-sensitive events, mistakes are easy to make — having a high degree of confidence in the ability to deploy infrastructure is vital. We have captured our best practices on how we do Infrastructure as Code for Real Kinetic and our clients.
  • Review the intended changes to production infrastructure before they are applied. This one works particularly well with IaC, as pull/merge requests can be used to elicit feedback from other engineers who may have more in-depth expertise in a particular system. It also serves to spread knowledge across the organisation.
  • Assume an adversarial approach when building and deploying infrastructure. There will be people/scripts searching the internet looking for any potentially exposed systems that can be used to gain a foothold. Your organisation needs to find these vulnerabilities before they do. Understand what the security implications are for the changes that are going to be deployed into production.
  • Perform regular penetration testing. This can be done both internally, in regularly scheduled sessions for the team, and externally, by engaging a third-party expert to identify risks.
  • All systems should log by default. All logs should be sent to a centralised system where they can be indexed and queried in aggregate in case of an incident. Ideally, these logs are immutable and cannot be changed by directly accessing disks.
  • Pragmatic monitoring. So often we see organisations that have monitoring alerts being sent to support staff on a regular, sometimes hourly basis. This quickly leads to alert fatigue, and response times get dramatically worse. Automate systems so that a human only receives an alert notification when action needs to be taken.
  • Set up auditing and scanning systems that scan all public IPs owned by your organisation and report back what is exposed. Alert if anything unexpected changes. Sites like Shodan regularly scan the entire internet to understand what is being deployed where. These are used by researchers and script-kiddies alike to find “interesting” and potentially vulnerable systems.
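
The last item, scanning your own public IPs and alerting on surprises, can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a replacement for a real scanner: it only tries plain TCP connections, the addresses come from the 203.0.113.0/24 documentation range, and the port list and allowlist are hypothetical values you would replace with your own inventory.

```python
import socket
from typing import Callable, Dict, Iterable, Set


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def scan(
    hosts: Iterable[str],
    ports: Iterable[int],
    probe: Callable[[str, int], bool] = port_is_open,
) -> Dict[str, Set[int]]:
    """Probe every host/port pair and return the open ports per host."""
    ports = list(ports)
    observed: Dict[str, Set[int]] = {}
    for host in hosts:
        open_ports = {port for port in ports if probe(host, port)}
        if open_ports:
            observed[host] = open_ports
    return observed


def unexpected_exposure(
    observed: Dict[str, Set[int]], expected: Dict[str, Set[int]]
) -> Dict[str, Set[int]]:
    """Return open ports that are missing from the allowlist, per host."""
    findings: Dict[str, Set[int]] = {}
    for host, open_ports in observed.items():
        extra = open_ports - expected.get(host, set())
        if extra:
            findings[host] = extra
    return findings


if __name__ == "__main__":
    # Hypothetical inventory; only scan addresses your organisation owns.
    hosts = ["203.0.113.10", "203.0.113.11"]
    ports = [22, 443, 5432, 9200]            # 9200 is the default Elasticsearch HTTP port
    allowlist = {"203.0.113.10": {443}}      # what you intend to expose
    for host, extra in unexpected_exposure(scan(hosts, ports), allowlist).items():
        print(f"ALERT {host}: unexpected open ports {sorted(extra)}")
```

Run from outside your network on a schedule, the output of a check like this is a reasonable candidate for an alert: it only fires when something you did not intend to expose becomes reachable.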

If the company hosting the exposed cluster adopts these best practices, it will go a long way towards mitigating the kinds of issues it has experienced recently.

Conclusion

Data is an extremely valuable commodity these days. Some companies appear to be all too lackadaisical in processing and managing that data. Efforts like the GDPR and CCPA are attempting to ensure that companies take the appropriate steps to store and protect your data. If your organisation gathers data from the EU or California, you need to be aware of these pieces of legislation — being found in breach can result in substantial fines.

Correctly managing your infrastructure in production is part of the solution to ensuring data is not intentionally or unintentionally leaked. Real Kinetic has extensive experience and knowledge in managing infrastructure in production. If you need some help or advice, please reach out.

Cloud herder. Code monkey. Wood worker. Husband. Human. Managing Partner at Real Kinetic.