AWS: Athena + SageMaker

Integrate Athena and SageMaker

2. Create a SageMaker notebook instance and attach an IAM role

3. Add the athena:StartQueryExecution and athena:GetQueryExecution permissions to the default SageMaker policy

By default, Athena policies are not attached to the default SageMaker role.

4. Add S3 and Athena access to the SageMaker role
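Steps 3 and 4 can also be done from code instead of the console. The following is a minimal sketch, not the only way to set this up: the helper names, the policy name "AthenaFromSageMaker", and the exact list of S3 actions are assumptions for illustration. athena:GetQueryResults is included on the assumption that the client fetches result rows through that call. Access to the bucket that holds the table data and to the Glue Data Catalog is likely needed as well and is not shown.

```python
import json


def build_athena_policy(staging_bucket):
    """Build an inline policy document (hypothetical helper, not an AWS API)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # The two Athena actions named in step 3, plus GetQueryResults
                # (assumption: needed to read result rows back).
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                ],
                "Resource": "*",
            },
            {
                # Read/write access to the query-result staging bucket (step 4).
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:GetObject",
                    "s3:PutObject",
                ],
                "Resource": [
                    f"arn:aws:s3:::{staging_bucket}",
                    f"arn:aws:s3:::{staging_bucket}/*",
                ],
            },
        ],
    }


def attach_policy(role_name, policy, policy_name="AthenaFromSageMaker"):
    """Attach the document to a role as an inline policy (needs IAM rights)."""
    import boto3  # imported here so building the document needs no AWS SDK

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy),
    )


policy = build_athena_policy("deb-sagemkaer-output")
print(json.dumps(policy, indent=2))
# attach_policy("<your-sagemaker-role-name>", policy)  # role name is a placeholder
```

Printing the document first makes it easy to review the permissions before attaching them to the role.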

5. The SageMaker instance starts in the “Pending” state (it takes a few minutes to become “InService”)
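If the instance is created from a script, the wait in step 5 can be automated by polling its status. This is a rough sketch under assumptions: wait_until_in_service is a hypothetical helper, the polling interval and limit are arbitrary, and the describe function is passed in so the loop can be tried with a stand-in before pointing it at boto3's describe_notebook_instance.

```python
import time


def wait_until_in_service(describe, name, poll_seconds=15, max_polls=40,
                          sleep=time.sleep):
    """Poll until the notebook instance reports "InService".

    describe is any callable that accepts NotebookInstanceName=... and returns
    a dict with a "NotebookInstanceStatus" key, for example
    boto3.client("sagemaker").describe_notebook_instance.
    """
    for _ in range(max_polls):
        status = describe(NotebookInstanceName=name)["NotebookInstanceStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"notebook instance {name} failed to start")
        sleep(poll_seconds)
    raise TimeoutError(f"notebook instance {name} not in service after polling")


# Stand-in for the real API so the loop can be run without AWS access.
_fake_statuses = iter(["Pending", "Pending", "InService"])


def fake_describe(NotebookInstanceName):
    return {"NotebookInstanceStatus": next(_fake_statuses)}


result = wait_until_in_service(fake_describe, "demo", sleep=lambda s: None)
print(result)

# With real credentials (instance name is a placeholder):
# import boto3
# sm = boto3.client("sagemaker", region_name="us-east-1")
# wait_until_in_service(sm.describe_notebook_instance, "<notebook-name>")
```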

6. Install PyAthena and run the Python code

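# In a notebook cell, the leading "!" runs pip as a shell command
!pip install pyathena

from pyathena import connect
import pandas as pd

# s3_staging_dir is the S3 location where Athena writes query results
conn = connect(s3_staging_dir='s3://deb-sagemkaer-output', region_name='us-east-1')

# Run the query and load the result into a DataFrame
df = pd.read_sql("SELECT * from debtest.deb_startrek_data;", conn)
df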
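pd.read_sql works here because the PyAthena connection follows the Python DB-API, so the same connection can also be queried through a plain cursor when pandas is not needed. The sketch below assumes only the generic DB-API calls (cursor, execute, description, fetchall). query_to_records is a hypothetical helper, and an in-memory sqlite3 database with a made-up table stands in for Athena so the example runs without AWS access.

```python
import sqlite3


def query_to_records(conn, sql):
    """Run a query on any DB-API connection and return rows as dicts."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        # The first field of each description entry is the column name.
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()


# Stand-in connection and illustrative data (not the real Athena table).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_table (name TEXT, ship TEXT)")
demo.execute("INSERT INTO demo_table VALUES ('Picard', 'Enterprise-D')")

records = query_to_records(demo, "SELECT * FROM demo_table")
print(records)

# With the Athena connection from the notebook:
# records = query_to_records(conn, "SELECT * FROM debtest.deb_startrek_data")
```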

More details are in AWS Blog here