Athena sidesteps the traditional data pipeline, enabling advanced analysis directly on the data, with no need for preprocessing or specialized analytics software. At the same time, however, it also sidesteps many of your organization’s existing security measures. Athena gives analysts direct access to sensitive data on S3, and the insights they derive may be just as confidential. Your organization must have proper visibility into who performs Athena queries, why, and whether they are authorized to access the data. The use of Athena may have compliance implications as well. Get more background on Athena security in this detailed blog post.
To use Athena:
1. Go to the AWS Management Console.
2. Point Athena at any relevant data stored in your S3 bucket.
3. Use SQL to run any ad hoc queries. You’ll be able to get results in seconds.
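The same flow can also be scripted. Below is a minimal sketch, assuming a client object that behaves like `boto3.client("athena")`. The helper name `run_athena_query`, the polling interval, and any database or bucket names you pass in are illustrative assumptions, not part of the original walkthrough.

```python
import time

# States after which Athena will not change the query status any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Start an Athena query and wait until it reaches a terminal state.

    `client` is assumed to behave like boto3.client("athena").
    Returns a (query_execution_id, final_state) tuple.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return query_id, state
        time.sleep(poll_seconds)
```

With real credentials, a call might look like `run_athena_query(boto3.client("athena"), "SELECT 1", "my_database", "s3://my-results-bucket/prefix/")`, where the database and bucket names are placeholders.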
Athena is a serverless service, so there is no infrastructure to set up or manage. You pay only for the queries you run, and the service scales automatically. Athena runs queries in parallel and returns results quickly, even when analyzing large datasets or running complex queries. Here are some important ways to maximize security when using Athena as part of your overall AWS security strategy.
Logging and monitoring in Athena
Athena activity is recorded in AWS CloudTrail, including:
- Actions performed in Athena by any IAM role, user, or AWS service
- Calls to the Athena API
- Actions on the Athena console

- Feature activation
- Configuration changes
- Connection to S3 buckets
It is also possible to trigger a rule on API calls in CloudTrail to generate custom CloudWatch events.
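As an illustration of such a rule, here is a minimal sketch of an event pattern that matches Athena API calls recorded by CloudTrail. The function names, the rule name you would pass in, and the choice of `StartQueryExecution` as the event to watch are assumptions made for this example. The `create_rule` helper needs boto3 and AWS credentials and is not executed here.

```python
import json


def athena_api_call_pattern(event_names):
    """Build an event pattern matching Athena API calls recorded by CloudTrail."""
    return {
        "source": ["aws.athena"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": ["athena.amazonaws.com"],
            "eventName": list(event_names),
        },
    }


def create_rule(rule_name, event_names):
    """Create the rule in your account (requires boto3 and credentials)."""
    import boto3  # imported lazily so the pattern builder has no dependencies

    events = boto3.client("events")
    return events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(athena_api_call_pattern(event_names)),
        State="ENABLED",
    )


# Hypothetical choice: watch for every query that gets started.
pattern = athena_api_call_pattern(["StartQueryExecution"])
print(json.dumps(pattern, indent=2))
```

A rule on its own does nothing until you attach a target to it, such as an SNS topic or a Lambda function that forwards the event to your security tooling.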
Improving visibility with XDR
While Athena provides basic capabilities for logging and monitoring, it is difficult to connect these logs to traditional security tools. There is no physical machine or VM on which security teams can install an agent, so traditional security tools cannot manage or control the usage of Athena, and Athena activity is not visible to them. The same is true for S3 buckets. To achieve a holistic view of activity from serverless tools like Athena, you’ll need a security paradigm that can work with any data source, whether or not it relies on traditional agent-based tooling. Extended detection and response (XDR) is one such paradigm. While XDR might seem like overkill just to protect Athena, consider that your organization uses other cloud-native technologies, which are similarly difficult to monitor and secure. XDR addresses security concerns across multiple cloud services on AWS and other clouds.
A major threat vector for Athena, or any analytics service, is the interception of communication by attackers, for example, through man-in-the-middle (MitM) attacks or session hijacking. To reduce the chances of attackers exfiltrating data pulled by Athena, you can use two security measures:
Securing S3 data you need to query with Athena
Here is a four-step process you can use to secure an S3 storage bucket queried by Athena:
1. Block public access: your source S3 bucket should NOT be publicly accessible. Unless you want the bucket to be public, do not enable this option. You can change it for each bucket directly from the AWS console.
2. Encrypt all data in your S3 bucket: apply encryption at rest on the bucket. You can encrypt the bucket from the AWS console or encrypt your source files. Three encryption options are available: SSE-S3 lets S3 manage your encryption key; SSE-KMS uses server-side encryption with a key managed in the AWS Key Management Service (KMS); and CSE-KMS encrypts data on the client side with a KMS key. Ideally, you should use SSE-KMS, which lets you control access to the key.
3. Encrypt your query results: Athena stores all query results in a pre-configured S3 location called an S3 staging directory. Encrypting the source bucket and source files does not encrypt query results, so you need to encrypt the staging directory as well to cover all data at rest. Use a different key for query results than for your stored data, so that one compromised key does not threaten all data.
4. Encrypt your AWS Glue Data Catalog: the Data Catalog contains all Athena table definitions, among other things. Once the catalog is encrypted, Athena table definitions are encrypted (but not the data itself).
Controlling access to encrypted data
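Returning briefly to the query-result encryption step above, here is a minimal sketch of a result configuration that asks Athena to encrypt results with SSE-KMS under a dedicated key. The helper name and the key identifiers you pass in are hypothetical. The dictionary follows the `ResultConfiguration` shape accepted by boto3's `start_query_execution`.

```python
def encrypted_result_configuration(output_location, results_kms_key, source_kms_key=None):
    """Build an Athena ResultConfiguration that encrypts results with SSE-KMS.

    If the key used for the source data is supplied, refuse to reuse it,
    so that one compromised key does not expose both data sets.
    """
    if source_kms_key is not None and results_kms_key == source_kms_key:
        raise ValueError("use a separate KMS key for query results")
    return {
        "OutputLocation": output_location,
        "EncryptionConfiguration": {
            "EncryptionOption": "SSE_KMS",
            "KmsKey": results_kms_key,
        },
    }
```

The returned dictionary can be passed as the `ResultConfiguration` argument when starting a query, or used as a template for the equivalent workgroup setting.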
Bucket policies can help you fine-tune access to your source data. A bucket policy stipulates who gets access to a certain S3 bucket and which actions they are allowed to perform on its content. For example, you can use a policy to prevent certain users from decrypting the data. You can also set a bucket policy that allows only identity and access management (IAM) users of certain AWS accounts to access the bucket. That way, even if an unauthorized user obtains the KMS key that encrypts your bucket, they still cannot read the contents, because the bucket policy denies that identity access.
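To make this concrete, here is a minimal sketch of a bucket policy that denies every S3 action to principals outside an allow-list of AWS accounts. The function name, bucket name, and account IDs are placeholders, and the use of the `aws:PrincipalAccount` condition key is one possible way to express the restriction, not necessarily the approach the author had in mind.

```python
import json


def restrict_bucket_to_accounts(bucket_name, allowed_account_ids):
    """Build a bucket policy that denies access from any other AWS account."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyPrincipalsOutsideAllowedAccounts",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {
                    "StringNotEquals": {
                        "aws:PrincipalAccount": list(allowed_account_ids)
                    }
                },
            }
        ],
    }


# Placeholder bucket name and account ID, for illustration only.
policy = restrict_bucket_to_accounts("example-athena-source", ["111122223333"])
print(json.dumps(policy, indent=2))
```

A blanket Deny like this can also affect AWS services that access the bucket on your behalf, so try it on a non-production bucket first.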
Access control for Athena queries
Unlike traditional databases, Athena does not support user accounts. To control access to Athena, you must use IAM policies, including the two AWS-managed IAM policies for Athena. You should also create two custom IAM policies, one for each of the following types of Athena users:
- Power-user policy: grants the user permission to create, modify and delete Athena objects such as databases, views and tables.
- Analyst user policy: does not provide any administrative privileges.
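The two custom policies could be sketched as follows. The action lists are assumptions about what each role needs: the Athena and AWS Glue action names are real IAM actions, but the exact selection, the `"Resource": "*"` scope, and the omission of S3 permissions are simplifications for this example. Scope the resources down to specific workgroups, databases, and tables before using anything like this.

```python
# Actions an analyst needs to run queries and read table metadata.
ANALYST_ACTIONS = [
    "athena:StartQueryExecution",
    "athena:StopQueryExecution",
    "athena:GetQueryExecution",
    "athena:GetQueryResults",
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetTable",
    "glue:GetTables",
    "glue:GetPartitions",
]

# Extra actions for power users: create, modify, and delete databases and
# tables (Athena views are stored as tables in the Glue Data Catalog).
POWER_USER_EXTRA_ACTIONS = [
    "glue:CreateDatabase",
    "glue:UpdateDatabase",
    "glue:DeleteDatabase",
    "glue:CreateTable",
    "glue:UpdateTable",
    "glue:DeleteTable",
]


def build_policy(actions):
    """Wrap a list of IAM actions in a single-statement policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": "*"}
        ],
    }


analyst_policy = build_policy(ANALYST_ACTIONS)
power_user_policy = build_policy(ANALYST_ACTIONS + POWER_USER_EXTRA_ACTIONS)
```

Keeping the power-user policy as a strict superset of the analyst policy makes it easy to see which permissions are administrative.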
After creating these policies, do the following:
I hope this will help you develop a security strategy for serverless tools in your organization.
Sources
Athena Security: Data Protection, Monitoring, and Secure Access, Satori Blog