Amazon Athena, a service that allows SQL queries to be run on data in Amazon S3, can be accessed through DBeaver. By leveraging the JDBC driver and AWS Single Sign-On (SSO), access to Athena can be managed with greater ease. The steps provided below will guide through setting up this connection.
Prerequisites
- DBeaver: The DBeaver tool must be installed. Downloads are available here.
- AWS CLI: AWS CLI is required for authentication via SSO. If not installed, follow the instructions here.
- AWS SSO Configured: Single Sign-On must be set up for the relevant AWS account. Details on configuring AWS SSO can be found here.
- Athena Access: The necessary permissions to query Amazon Athena and access S3 storage must be ensured.
Step 1: Configure AWS CLI for SSO
To authenticate using AWS SSO, AWS CLI needs to be configured first.
- The following command should be run to start the configuration process:
aws configure sso
- Several prompts will appear, and the following details should be provided:
- SSO Start URL: This will be the SSO start URL given by the organization (e.g., https://mycompany.awsapps.com/start).
- SSO Region: The AWS region where SSO is active must be provided (e.g., us-east-1).
- SSO User: The user will need to log in using their SSO credentials.
- Profile Name: A profile name such as
sso-athena-profile
should be chosen.
- Upon successful configuration, the AWS CLI profile will be linked to SSO, enabling authentication for future services.
- To refresh credentials, the command below should be used:
aws sso login --profile sso-athena-profile
Step 2: Download the Athena JDBC Driver
The Athena JDBC driver is required to establish the connection.
- Visit the Amazon Athena JDBC Driver page to download the appropriate version.
- The driver should be saved in an accessible location, as its path will be needed in the next steps.
Step 3: Open DBeaver and Add Athena Connection
- DBeaver should be opened, and a new connection can be added by selecting Database > New Database Connection.
- A search for “Athena” will display the option for Amazon Athena. If it doesn’t, the Add File option should be used to locate the previously downloaded JDBC driver.
- Navigate to the directory where the JDBC driver was saved and select the
.jar
file.
- Navigate to the directory where the JDBC driver was saved and select the
- After adding the driver, Athena should be selected, and the process can proceed by clicking Next.
Step 4: Configure Connection Settings
- General Settings:
- AWS Region: The AWS region where Athena data is stored should be selected (e.g., us-east-1).
- S3 Staging Directory: Enter the path to the S3 bucket that Athena will use to store query results (e.g.,
s3://your-athena-results-bucket/
).
- Driver Properties:
- OutputLocation: This should point to the same S3 bucket as the staging directory.
- Workgroup: (Optional) Enter the workgroup if using a custom Athena workgroup.
- AWS SSO Authentication:
- In the Connection Settings, navigate to the Driver Properties tab.
- Configure the following properties:
- AwsCredentialsProviderClass: Set this to
com.simba.athena.amazonaws.auth.profile.ProfileCredentialsProvider
. - AwsProfile: Enter the AWS CLI profile name created earlier (e.g.,
sso-athena-profile
). - AwsRegion: Specify the correct AWS region (e.g.,
us-east-1
).
- AwsCredentialsProviderClass: Set this to
- Test Connection:
- The Test Connection button can be used to verify if the connection to Athena is successful.
- If everything works, click Finish to save the connection.
Step 5: Query Data from Athena
- Upon successful connection, the Athena database and its tables will appear in the DBeaver Database Navigator.
- SQL queries can be run by right-clicking on the connection and choosing SQL Editor.
- Data stored in S3 can be queried through standard SQL commands.
Troubleshooting
- Invalid Credentials: The credentials should be refreshed by running
aws sso login
if authentication fails. - JDBC Driver Issues: Updating the JDBC driver often resolves compatibility problems.
Conclusion
By following the steps in this guide, a secure connection to Amazon Athena using DBeaver, AWS SSO, and JDBC can be established. Querying data stored in S3 can then be easily performed using DBeaver’s robust features.